Psych t&m chap. 4a&b
Chapter 4 notes.
Question | Answer |
---|---|
a test is _______ to the extent that inferences made from it are appropriate, meaningful, and useful. if a test measures what it is supposed to measure, this constitutes _________. | Valid, validity |
____________ is determined by the degree to which the questions, tasks, or items on a test are representative of the universe of the behavior the test was designed to sample. | content validity |
_____ is the extent to which scores on the test are related to some criterion, i.e. how strong the correlation is between the score and what it is trying to predict. ex: SAT scores & college GPA | Criterion-related Validity |
Cross-sectional: the predictor and criterion variables are both measured at the same time | Concurrent |
Longitudinal: the experimenter first scores the predictor, and then at a later time scores the criterion | Predictive |
The average test taker's opinion on whether the test looks like it is measuring what it is supposed to measure. Ex: The test taker would be very upset if they were going in to take a psychology test and the test had math questions on it. | Face Validity |
there is a correlation between test and criterion. there is also a specific way of calculating this correlation. This is called ______, and its symbol is _____. | Validity coefficient, r_xy |
the error to be expected in the predicted criterion score | standard error of estimate |
the purpose of psychological testing is not the measurement itself, but the decision making from scores that comes from the measurement | decision theory |
if you predict that someone or something will succeed and it does succeed it is called a _______ | correct prediction |
predict something or someone will succeed and in fact they fail is called a ______ | false positive |
predict something or someone will fail and they in fact succeed is called a ______ | false negative |
when the prediction is correct it is called a ________ | hit |
when a prediction is incorrect it is called a ________ | miss |
theoretical, intangible quality or trait in which individuals differ. | construct |
a test designed to measure a construct must estimate the _______ | existence of an underlying characteristic such as leadership ability, based on a limited sample of behavior. |
There are two characteristics that all psychological constructs possess | 1. The construct cannot be operationally defined. 2. A network of suppositions can be derived from existing theory about the construct. |
pertains to psychological tests that claim to measure complex, multifaceted, and theory-bound psychological attributes such as psychopathy, intelligence, leadership ability, and the like | construct validity |
there is no one test that can measure | construct validity |
extent to which the scale does in fact correlate with related measures. | convergent validity |
extent to which the scale does not correlate with unrelated measures (the correlation between unrelated measures should be zero or low). | discriminant validity |
calls for the assessment of two or more traits by two or more methods. | Multitrait-multimethod matrix |
systematic experimental design for simultaneous confirmation of convergent and discriminant validity | Multitrait-multimethod matrix |
if a test is internally consistent, and measures a single construct then its items will be | homogeneous |
as an experimenter you want to select items that form a | homogeneous scale |
when there is an appropriate change in scores as you grow older, this is known as ___________ | developmental changes |
to show that, on average, persons with different backgrounds and characteristics obtain theory-consistent scores on the test. | theory-consistent group differences |
when you show that test scores change in the appropriate direction and amount in reaction to planned or unplanned events | Theory-consistent intervention effects |
specialized statistical technique that is particularly useful for investigating construct validity | factor analysis |
the purpose of factor analysis is to | identify the minimum number of determiners or factors required to account for the intercorrelations among a battery of tests. |
a correlation between an individual test and a single factor. | factor loading |
accurate identification of patients who have a syndrome | sensitivity |
has to do with accurate identification of normal patients. | specificity |
We want both sensitivity and specificity to be _____ | high |
the recognition that the decision to use a test involves social, legal, and political considerations that extend far beyond the traditional questions of technical validity | extravalidity concerns |
does use of this test result in better patient outcomes, or more efficient delivery of services | test utility |
test construction consists of ____ stages | 6 |
Kaufman and Kaufman (1983) 1. Measure intelligence from a strong theoretical and research basis 2. Separate acquired factual knowledge from the ability to solve unfamiliar problems 3. Yield scores that translate to educational intervention 4. Include | defining the test |
where the numbers only serve as category names, or simplified forms of naming | nominal scale |
this is a form of ordering or ranking. | Ordinal scale |
Equal sized units or intervals | Interval scale |
all the characteristics of an interval scale, but also possesses a conceptually meaningful zero point at which there is a total absence of the characteristic being measured | ratio scale |
the six stages of test construction are: | 1. defining the test 2. selecting a scaling method 3. constructing the items 4. testing the items 5. revising the test 6. publishing the test |
if we asked a panel of expert neurologists to list patient behaviors associated with different levels of consciousness. that would be called _______ | rankings of experts |
procedure for obtaining a measure of absolute item difficulty based on results for different age groups of test takers. | method of absolute scaling |
presents the examinee with five different responses ordered on an agree/disagree or approve/disapprove continuum | Likert scale |
a Likert scale is also referred to as a | summative scale |
the scale in which respondents who endorse one statement also agree with milder statements pertinent to the same underlying continuum; basically, when you agree with the most intense statement you are most likely going to agree with the less serious statements | Guttman scale |
test items are selected for a scale based entirely on how well they contrast a criterion group from a normative sample | method of empirical keying |
all scale items correlate positively with each other and also with the total score of the scale; also known as internal consistency | rational scale construction |
there are three questions to be asked when constructing the items they are : | 1. should item content be homogeneous or varied ? 2. what range of difficulty should the items cover? 3. how many initial items should be constructed? |
enumerates the information and cognitive tasks on which examinees are to be assessed. | Table of specifications |
when the examinee must choose between two equally desirable or undesirable options. | forced-choice methodology |
useful tool for identifying items that should be altered or discarded | Item difficulty index |
is a useful tool in the psychometrician's quest to identify predictively useful test items. | item-validity index |
also known as an item response function, is a graphical display of the relationship between the probability of a correct response and the examinee's position on the underlying trait measured by the test. | item characteristic curve |
is a statistical index of how efficiently an item discriminates between persons who obtain high and low scores on the entire test | item discrimination index |
the practice of using the original regression equation in a new sample to determine whether the test predicts the criterion as well as it did in the original sample | cross validation |
a test predicts the relevant criterion less accurately with the new sample of examinees than with the original tryout sample | validity shrinkage |
where you can find technical data about a new instrument; you can also find information about item analyses, scale reliabilities, cross-validation studies, and so on. | Technical manual |
gives instructions for administering and also provides guidelines for test interpretation | user's manual |
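
Several cards above name quantities that can be computed directly: the validity coefficient, the standard error of estimate, sensitivity, and specificity. The Python sketch below is one way to work them out, not something taken from the chapter. The function names and the sample numbers are made up for illustration, and the standard error of estimate uses the common textbook form SD_y * sqrt(1 - r_xy^2), which is assumed here rather than quoted from these notes.

```python
import math
from statistics import mean, pstdev


def validity_coefficient(x, y):
    """Pearson correlation r_xy between test scores x and criterion scores y."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def standard_error_of_estimate(y, r_xy):
    """Expected error in the predicted criterion score (assumed textbook form):
    SD_y * sqrt(1 - r_xy^2), using the population SD of the criterion."""
    return pstdev(y) * math.sqrt(1 - r_xy ** 2)


def sensitivity(true_pos, false_neg):
    """Proportion of people who have the syndrome that the test identifies."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of normal (no-syndrome) people that the test identifies."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical data: five examinees' test scores and criterion scores.
    test_scores = [1, 2, 3, 4, 5]
    criterion = [1, 3, 2, 5, 4]
    r = validity_coefficient(test_scores, criterion)
    print("validity coefficient:", round(r, 2))
    print("standard error of estimate:",
          round(standard_error_of_estimate(criterion, r), 2))
    # Hypothetical screening counts: hits and misses in each group.
    print("sensitivity:", sensitivity(true_pos=45, false_neg=5))
    print("specificity:", specificity(true_neg=80, false_pos=20))
```

A higher validity coefficient gives a smaller standard error of estimate, which is why the two cards belong together: when r_xy is 1 the predicted criterion score has no expected error, and when r_xy is 0 the error is as large as the spread of the criterion itself.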