# Tests & Measurements

## Reliability
| Term | Definition |
|---|---|
| Alternate Forms | Different versions of the same test or measure |
| Parallel Forms | Two or more versions or forms of the same test where, for each form, the means and variances of observed test scores are equal |
| Parallel Forms Reliability | An estimate of the extent to which item sampling and other errors have affected test scores on two versions of the same test when, for each form of the test, the means and variances of observed test scores are equal |
| Alternate Forms Reliability | An estimate of the extent to which item sampling and other errors have affected scores on two versions of the same test |
| Coefficient alpha | A statistic widely employed in test construction and used to help derive an estimate of reliability; more technically, it is equal to the mean of all possible split-half correlations |
| Heterogeneity | The degree to which a test measures different factors; more generally, the state of having diverse content |
| Homogeneity | The degree to which a test measures a single trait |
| Inflation of range/variance | A phenomenon associated with reliability estimates wherein the variance of either variable in a correlational analysis is inflated by the sampling procedure used, so the resulting correlation coefficient tends to be higher |
| Restriction of range/variance | A phenomenon associated with reliability estimates wherein the variance of either variable in a correlational analysis is restricted by the sampling procedure used, so the resulting correlation coefficient tends to be lower |
| Inter-item consistency | The consistency or homogeneity of the items of a test, estimated by techniques such as the split-half method |
| Internal consistency estimate of reliability | An estimate of the reliability of a test obtained from a measure of inter-item consistency |
| Inter-scorer reliability | An estimate of the degree of agreement or consistency between two or more scorers (or judges, raters, or observers) |
| Item characteristic curve (ICC) | A graphic representation of the probabilistic relationship between a person’s level on a trait (or ability or other characteristic being measured) and the probability of responding to an item in a predicted way |
| Item response theory (IRT) | A system of assumptions about measurement (including the assumption that the trait being measured by a test is unidimensional) and the extent to which each test item measures the trait |
| Kuder-Richardson Formula 20 (KR-20) | A statistical method for estimating the internal consistency reliability of a test, particularly when the items are dichotomous (e.g., right/wrong or yes/no) |
| Reliability | The extent to which measurements are consistent or repeatable; also, the extent to which measurements differ from occasion to occasion as a function of measurement error |
| Reliability coefficient | General term for an index of reliability or the ratio of true score variance on a test to the total variance |
| Split-half reliability | An estimate of the internal consistency of a test obtained by correlating scores on two equivalent halves of a single test administered once |
| Test-retest reliability | An estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test |
| True Score | A value that, according to classical test theory, genuinely reflects an individual’s ability (or trait) level as measured by a particular test |
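Several of the estimates defined above can be computed directly from an item-response matrix. The sketch below (not from the source) illustrates split-half reliability with the Spearman-Brown correction, coefficient alpha, and KR-20; the function names and the 6-examinee, 4-item score matrix are hypothetical, and population (ddof=0) variances are assumed throughout, under which KR-20 equals alpha for dichotomous items.

```python
# Illustrative reliability estimates from an examinees-by-items score matrix.
# Hypothetical example; assumes population (ddof=0) variances.
import numpy as np

def split_half_reliability(items):
    """Correlate odd-item and even-item half scores, then apply the
    Spearman-Brown correction to estimate full-length reliability."""
    odd = items[:, 0::2].sum(axis=1)
    even = items[:, 1::2].sum(axis=1)
    r_half = np.corrcoef(odd, even)[0, 1]
    return 2 * r_half / (1 + r_half)  # Spearman-Brown corrected

def coefficient_alpha(items):
    """Coefficient alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=0)
    total_var = items.sum(axis=1).var(ddof=0)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def kr20(items):
    """KR-20: the special case of alpha for dichotomous (0/1) items,
    with p*q standing in for each item's variance."""
    k = items.shape[1]
    p = items.mean(axis=0)     # proportion answering each item correctly
    q = 1 - p
    total_var = items.sum(axis=1).var(ddof=0)
    return (k / (k - 1)) * (1 - (p * q).sum() / total_var)

# Hypothetical right/wrong responses: 6 examinees x 4 items
scores = np.array([
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
])
print(round(coefficient_alpha(scores), 3))      # 0.667
print(round(kr20(scores), 3))                   # 0.667 (matches alpha: items are 0/1)
print(round(split_half_reliability(scores), 3)) # 0.479
```

Note how the split-half estimate differs from alpha: it depends on one particular division of the items, whereas alpha averages over all possible splits.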