Midterm - SC 545
Study Guide
Question | Answer |
---|---|
test | a measurement device or technique used to quantify behavior or aid in the understanding or prediction of behavior |
item | specific stimulus to which a person responds overtly; this response can be scored or evaluated |
psychological test, or educational test | a set of items that are designed to measure characteristics of human beings that pertain to behavior |
scales | relate raw scores on test items to some defined theoretical or empirical distribution |
individual tests | tests that can be given to only one person at a time |
group test | test that can be administered to more than one person at a time by a single examiner |
achievement | previous learning |
aptitude | refers to the potential for learning or acquiring a specific skill |
intelligence | refers to a person's general potential to solve problems, adapt to changing circumstances, think abstractly, and profit from experience |
human ability | behaviors that reflect either what a person has learned or the person's capacity to emit a specific behavior; includes achievement, aptitude, and intelligence |
personality tests | measure typical behavior --- traits, temperaments, and dispositions; related to the overt and covert dispositions of the individual |
structured personality tests | provides a self-report statement to which the person responds (objective) |
projective personality tests | provides an ambiguous test stimulus; response requirements are unclear |
reliability | refers to the accuracy, dependability, consistency, or repeatability of test results |
validity | refers to the meaning and usefulness of test results |
test administration | the act of giving a test |
interview | a method of gathering information through verbal interaction, such as direct questions |
test battery | two or more tests used in conjunction |
representative sample | one that comprises individuals similar to those for whom the test is to be used |
mental age | a measurement of a child's performance on the test relative to other children of that particular age group |
traits | relatively enduring dispositions (tendencies to act, think, or feel in a certain manner in any given circumstance) that distinguish one individual from another |
factor analysis | method of finding the minimum number of dimensions (characteristics, attributes) to account for a large number of variables |
inferences | logical deductions about events that cannot be observed directly |
descriptive statistics | methods used to provide a concise description of a collection of quantitative information |
inferential statistics | methods used to make inferences from observations of a small group of people known as a sample to a larger group of individuals known as a population |
nominal scales | not really scales at all, their only purpose is to name objects |
ordinal scales | scale with the property of magnitude but not equal intervals or an absolute 0, allows a person to rank items |
Give an example of an ordinal scale | ranking people by height or weight |
interval scale | scale has the properties of magnitude and equal intervals but not absolute 0 |
Give an example of an interval scale | Fahrenheit scale or Celsius scale |
ratio scale | has all three properties, magnitude, equal intervals, and absolute 0 |
Give an example of a ratio scale | Kelvin scale |
frequency distribution | displays scores on a variable or a measure to reflect how frequently each value was obtained |
class interval | the demarcations along the x axis |
percentile rank | answers the question "What percent of the scores fall below a particular score?" |
mean | arithmetic average score in a distribution |
standard deviation | an approximation of the average deviation around the mean; the square root of the average squared deviation around the mean |
variance | average squared deviation around the mean |
Z score | difference between a score and the mean, divided by the standard deviation |
McCall's T | standard deviation is set at 10, mean is set at 50 |
quartiles | points that divide the frequency distribution into equal fourths |
median | second quartile - 50th percentile |
interquartile range | the interval of scores bounded by the 25th and 75th percentiles; the middle 50% of the distribution |
deciles | use points that mark 10% rather than 25% intervals |
stanine system | converts any set of scores into a transformed scale, which ranges from 1 to 9, standard nine |
norms | performances by defined groups on particular tests |
tracking | the tendency to stay at about the same level relative to one's peers |
norm-referenced test | test that compares each person with a norm |
criterion-referenced test | describes the specific types of skills, tasks, or knowledge that the test taker can demonstrate such as mathematical skills |
scatter diagram | a picture of the relationship between two variables |
correlation coefficient | a mathematical index that describes the direction and magnitude of a relationship |
regression line | defined as the best fitting straight line through a set of points in a scatter diagram |
In a negative correlation, high scores on the x variable are associated with what on the y variable? | lower scores on the y variable |
intercept | the value of Y when X is 0; the point at which the regression line crosses the Y axis |
example of a true dichotomous variable | gender - male/female or yes/no answers |
type of correlation coefficient used to find the association between two sets of ranks | Spearman's Rho |
type of correlation coefficient used to correlate a dichotomous variable (two categories) and a continuous variable | biserial r for an artificial dichotomy; point biserial r for a true dichotomy |
what is the variance of 1, 2, and 3 | 1 (using the sample formula, which divides by N - 1; dividing by N gives about .67) |
x and y correlated .8. What is the coefficient of alienation of this relation? | .6 |
when talking about errors in terms of psychological testing, what are we referring to | some inaccuracy in our measurements |
classic test theory assumes | that each person has a true score that would be obtained if there were no errors in measurement |
We can get an idea of how much measurement error is present in a score through | the standard error of measurement |
In the domain sampling model, the error that is being considered is the error caused by | using a limited number of items to represent a larger and more complicated construct - sample |
The Federal Government guidelines require a test to be | reliable before one can use it to make employment or educational placement decisions |
The difference between two typing tests reflects | practice effects, a form of carryover effects |
Administering two equivalent forms of a test, in counterbalanced order, to a group of people on the same day to assess reliability is | parallel forms reliability |
The Spearman-Brown formula corrects for deflated reliability because of | the split-half method (each half contains only half as many items as the full test) |
An example of the most conservative estimate of split-half reliability | Cronbach's coefficient alpha |
Difference scores are created by | subtracting one test score from another |
the agreement between a test score and the construct it is presumed to measure is referred to as | validity |
validity refers to | "does it measure what it is supposed to measure" |
Which type of validity requires that test items provide an adequate representation of the conceptual domain they are designed to cover? | content-related evidence for validity |
If a variable has a restricted range, it is difficult to estimate a validity coefficient due to a lack of | variability in both the predictor and the criterion |
researcher seeking to develop a measure of depression cites a moderate correlation between her measure and another as evidence of validity; this is | construct validity evidence |
discriminant and convergent evidence provide evidence for | construct-related validity |
Cronbach and other authors have argued that all types of validity are really categories of | evidence |
the type of validity that subsumes all other types of validity is called | construct validity |
which of the following statements is true | It is logically impossible that a totally unreliable test is valid |
attitude scale ranging from strongly disagree to strongly agree | Likert scale |
one method that is not used often because scoring is very time consuming is the | visual analogue scale |
the optimal item difficulty of a 6-alternative test is what | about .58 (halfway between chance, 1/6 ≈ .17, and 1.0; .585 when chance is rounded to .17) |
the proportion of test takers that get a good item correct increases as a function of | the total test score (ability level), as shown by a rising item characteristic curve |
proponents of criterion-referenced tests have criticized item analysis procedures because | they do not help children learn; they only show what the students have learned |
.93 of top students and .89 of bottom students answer the same question correctly; the instructor should not use this question because | it is not a good question; high- and low-scoring students pass it at almost the same rate, so it barely discriminates between them |
peaked conventional tests present items that | are in the middle range of difficulty, best suited to test takers of average ability |
The chances that low-ability test takers will obtain each score is called the | guessing threshold |
in most situations a good test should contain items | that span the complete range of difficulty, from easy to hard |
the effects of examiners' expectations upon test scores shows that | there is a correlation between scores and expectations - expectancy effects or Rosenthal effects |
Studies on the effect of reinforcement upon intelligence test performance by African Americans show | that culturally appropriate verbal reinforcement led to higher scores |
reliability and accuracy are highest when someone is checking on the observers | reactivity |
data in behavioral observation studies have sometimes been found to be biased in the direction of the observer's own beliefs | contrast effect (form of drift) |
approach used to remove the effect of uncontrolled variability | partial correlation |
advantages of using computer-assisted test administration | easier to give; adaptability; open answers |
research on integrity tests suggests | that the validity of them is questionable |
worry, emotionality, and lack of self-confidence | test anxiety |
statement used to comfort or support an interviewee | reassuring |
transitional phrase | phrase used to move along the interview |
verbatim playback | repeating the words directly back to the interviewee |
level one response | response that does not have anything to do with the conversation |
confrontation should be used with great caution | in cases where it could cause a problem, because it is a direct approach |
tendency to judge specific traits on the basis of a general impression | halo effect |
research indicates people are more apt to talk about or explore themselves at deeper levels when .... responses are used | open-ended |
oldest approach to investigating human intelligence | psychometry |
when sets of diverse ability tests are administered to large, unbiased samples, almost all of the correlations are positive, a phenomenon known as | positive manifold |
main improvement of the 1908 Binet-Simon scale was the introduction of | the concept of mental age |
the scale that used the terms idiot, imbecile, and moron | 1905 Binet-Simon scale |
version of the Binet scale that first utilized a large, geographically diverse sample | 1972 |
improvements of the 1937 Scale | extended the age range and included an alternate equivalent form |
most significant psychometric problem of the 1937 Scale | the reliability coefficients were higher for older subjects than for younger ones |
the deviation IQ became necessary | to solve the problem of differential variation in IQs |
First version of the Stanford Binet to include non-whites | 1972 |
In the 2003 edition the verbal and nonverbal scales are | equally weighted |
major criticism of the Binet scale by Wechsler | did not have validity/reliability in older ages |
Not true of Wechsler Scales | They are invalid for adults. |
components of Wechsler's definition of intelligence | act purposefully, think rationally, and deal effectively with the environment |
subtest that measures short term and auditory memory | digit span subtest |
Verbal IQ - mean = ; standard deviation = | 100; 15 |
subtest that measures ability to learn an unfamiliar task, visual-motor dexterity, and degree of persistence | digit symbol coding |
validity of the WAIS-III rests on | its correlation with earlier versions of the Wechsler tests |
attempts to measure how quickly your mind works | processing speed index |
evaluating relatively large differences between subtest scaled scores is | pattern analysis |
An improvement in the WISC-IV over the WISC-III is in its use of | empirical data to identify item biases |
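
Several of the statistical cards above (variance, Z score, McCall's T, coefficient of alienation, the Spearman-Brown correction, and optimal item difficulty) reduce to short formulas. The Python sketch below is one way to check them by hand. The function names are made up for this guide and do not come from any testing library, and the example inputs (a score of 60, a half-test correlation of .70) are hypothetical values chosen only for illustration.

```python
import math
import statistics


def z_score(x, mean, sd):
    """Difference between a score and the mean, divided by the standard deviation."""
    return (x - mean) / sd


def mccall_t(z):
    """McCall's T: rescale a Z score to a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z


def coefficient_of_alienation(r):
    """Square root of 1 - r^2: a measure of non-association between two variables."""
    return math.sqrt(1 - r ** 2)


def spearman_brown(r_half, n=2):
    """Estimate reliability for a test n times as long as the one that produced r_half.

    With n=2 this is the usual split-half correction.
    """
    return n * r_half / (1 + (n - 1) * r_half)


def optimal_item_difficulty(n_alternatives):
    """Halfway between chance performance (1 / alternatives) and 1.0."""
    chance = 1 / n_alternatives
    return (1 + chance) / 2


scores = [1, 2, 3]
sample_variance = statistics.variance(scores)       # divides by N - 1
population_variance = statistics.pvariance(scores)  # divides by N

z = z_score(60, mean=50, sd=10)              # hypothetical score of 60
t = mccall_t(z)
alienation = coefficient_of_alienation(0.8)  # r = .8 from the card above
corrected = spearman_brown(0.70)             # hypothetical half-test correlation
difficulty = optimal_item_difficulty(6)      # 6-alternative item

print(sample_variance, round(population_variance, 2))
print(z, t)
print(round(alienation, 2), round(corrected, 2), round(difficulty, 3))
```

The two variance calls show why the card's answer of 1 depends on the formula: dividing by N - 1 gives 1, while dividing by N gives about .67.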