# Stats Vocabulary

Skewed distribution | A distribution that is not symmetrical, but instead has a tail that trails off to the right or the left; the mean, median, and mode are generally not the same number, and most of the data are not gathered at the mean. |

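Interval scale | A scale of ordered values separated by equal-sized intervals, so that differences between values are meaningful; the zero point is arbitrary (for example, temperature in degrees Fahrenheit). |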

Mean | The average of all the data values (the sum of the scores divided by the number of scores); it can be a misleading measure of center because a single outlier can shift it substantially. |
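
A small sketch of how one outlier moves the mean, using made-up scores (the numbers are illustrative only); the median is shown alongside for contrast:

```python
from statistics import mean, median

scores = [4, 5, 5, 6, 7]        # made-up scores
with_outlier = scores + [60]    # the same scores plus one extreme value

mean_before = mean(scores)            # 27 / 5 = 5.4
mean_after = mean(with_outlier)       # 87 / 6 = 14.5
median_before = median(scores)        # middle score: 5
median_after = median(with_outlier)   # average of 5 and 6: 5.5

print(mean_before, mean_after, median_before, median_after)
```

The single extreme score nearly triples the mean while the median barely moves.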

Experimental method | Involves manipulating one variable, controlling other variables, and observing a second variable, which allows a conclusion about cause and effect. |

Dependent variable | The variable that is not manipulated, but that depends on the manipulated variable. This is usually what we are trying to measure. |

Range | Represents the amount of variability in the data, shown by subtracting the smallest value from the largest value. |

Inferential statistics | techniques that allow us to study samples and then make generalizations about the populations from which they were selected |

Positively skewed | when the tail of a distribution points towards the positive end on the x-axis (right-hand side) |

Median | the score that divides a distribution in half so that 50% of the individuals in a distribution have scores at or below the median |

Population | the set of all the individuals of interest in a particular study |

Point-biserial correlation | a special version of the Pearson correlation used to measure the relationship between two variables in situations where one variable is regular, numerical scores but the second only has two values |
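
As a rough illustration, the point-biserial correlation can be obtained by coding the two-valued variable as 0 and 1 and applying the ordinary Pearson formula. The helper `pearson_r` and the data below are hypothetical, written only for this sketch:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squares and products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

scores = [3, 5, 4, 8, 9, 7]   # regular numerical scores (made up)
group = [0, 0, 0, 1, 1, 1]    # two-valued variable coded 0/1

r_pb = pearson_r(scores, group)
print(round(r_pb, 4))
```

Which category gets the code 1 is arbitrary, so swapping the codes only flips the sign of the result.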

Interquartile range | a range that ignores extreme scores and instead focuses on the range covered by the middle 50% of the distribution |
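
A brief sketch comparing the range with the interquartile range on made-up data containing one extreme score. Quartile conventions differ between textbooks and software; this example simply uses the default method of Python's `statistics.quantiles`, which may not match a hand calculation exactly:

```python
from statistics import quantiles

data = [1, 3, 4, 6, 7, 8, 10, 12, 50]   # made-up scores; 50 is extreme

value_range = max(data) - min(data)     # largest minus smallest

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1                           # spread of the middle 50%

print(value_range, iqr)
```

The range is dominated by the extreme score, while the interquartile range reflects only the middle half of the data.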

Degrees of freedom | the number of scores in a sample that are independent and free to vary; because the sample mean places a restriction on the value of one score in the sample, there are n-1 degrees of freedom for the sample |
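
The n-1 restriction can be seen in a short sketch with a made-up sample: once the sample mean is known, the last score is fully determined by the other n-1 scores, and the sample variance divides the sum of squares by n-1:

```python
from statistics import variance

sample = [2, 4, 4, 6, 9]        # made-up sample
n = len(sample)
sample_mean = sum(sample) / n   # 25 / 5 = 5.0

# Given the mean and the first n-1 scores, the last score is not free to vary.
forced_last_score = n * sample_mean - sum(sample[:-1])   # 25 - 16 = 9.0

# Sample variance: sum of squared deviations divided by df = n - 1.
ss = sum((x - sample_mean) ** 2 for x in sample)         # 28.0
df = n - 1
sample_variance = ss / df                                # 7.0

print(forced_last_score, df, sample_variance)
```

The standard library's `statistics.variance` uses the same n-1 divisor.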

Descriptive statistics | statistical procedures used to summarize, organize, and simplify data |

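Nonparametric test | tests that do not state hypotheses in terms of population parameters and make few assumptions about the population distribution; suited to data measured on nominal or ordinal scales, for which means cannot be computed |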

Residual variance/error variance | the variance in a set of sample data that is not explained by the treatment or predictor; represents unexplained and uncontrolled differences between scores |

Normal distribution | a distribution of scores that is symmetrical, in which the mean, median and mode are all the same value |

Tail | the ends of the distribution of scores that represent the least-occurring scores |

Percentile rank | the percentage of scores in a distribution that are at or below a particular score; locates a single score relative to the rest of the scores |
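
A minimal sketch of a percentile rank, assuming the common convention that it is the percentage of scores at or below a given value (other conventions exist). The function name `percentile_rank` and the scores are made up for illustration:

```python
def percentile_rank(scores, value):
    """Percentage of scores that are at or below `value`."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)

scores = [55, 60, 65, 70, 70, 80, 85, 90, 95, 100]   # made-up scores

rank_of_70 = percentile_rank(scores, 70)    # 5 of 10 scores -> 50.0
rank_of_100 = percentile_rank(scores, 100)  # all scores -> 100.0

print(rank_of_70, rank_of_100)
```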

Central limit theorem | For any population with mean μ and standard deviation σ, the distribution of sample means for sample size n will have a mean of μ and a standard deviation of σ/√n, and will approach a normal distribution as n approaches infinity |
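
The theorem can be checked approximately by simulation. The sketch below assumes a made-up, strongly skewed population (exponential), a sample size of 30, and a fixed random seed; none of these choices come from the definition itself:

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable

# A skewed, clearly non-normal population (illustrative choice).
population = [random.expovariate(1.0) for _ in range(20_000)]
mu = mean(population)
sigma = pstdev(population)

n = 30  # sample size
# Draw many samples (with replacement) and record each sample mean.
sample_means = [mean(random.choices(population, k=n)) for _ in range(2_000)]

mean_of_means = mean(sample_means)      # should be close to mu
sd_of_means = pstdev(sample_means)      # should be close to sigma / sqrt(n)
predicted_sd = sigma / sqrt(n)

print(round(mu, 3), round(mean_of_means, 3))
print(round(predicted_sd, 3), round(sd_of_means, 3))
```

The two printed pairs should agree closely but not exactly, because the simulation uses a finite number of samples.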

Correlation | used to measure and describe a relationship between two variables -- without manipulation or control of the variables, and without attempt to justify the relationship |

Observed frequency | the number of individuals from the sample who are classified in a particular category (chi-square test) |

Power | the probability that a test will correctly reject a false null hypothesis -- the probability that the test will identify a treatment effect if one really exists |
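
A hedged sketch of a power calculation for one specific case: a one-tailed (upper-tail) z-test with known σ, where the treatment is assumed to shift the population mean from `mu0` to `mu1`. The function name `z_test_power` and all numbers are assumptions made for illustration:

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tailed z-test when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)        # boundary of the critical region
    shift = (mu1 - mu0) / (sigma / sqrt(n))       # treatment effect in standard-error units
    # Probability that the sample z-score lands beyond the boundary.
    return 1 - std_normal.cdf(z_crit - shift)

power_small_n = z_test_power(mu0=80, mu1=84, sigma=10, n=4)
power_large_n = z_test_power(mu0=80, mu1=84, sigma=10, n=25)

print(round(power_small_n, 3), round(power_large_n, 3))
```

With the same treatment effect, the larger sample gives the test more power. When there is no effect at all (`mu1 == mu0`), the function returns alpha, the probability of a Type I error.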

Point estimate | an estimate that uses a single number as the estimate of an unknown quantity; precise, but gives no indication of how much confidence can be placed in it |

Main effect | in a factorial ANOVA, the mean differences among the levels of one factor, averaged across the levels of the other factor(s) |

Interaction | between two factors -- occurs whenever the mean differences between individual treatment conditions, or cells, are different from what would be predicted from the overall main effects of the factors |

Parametric test | tests that concern parameters and require assumptions about parameters (t-tests, ANOVA, etc.) |

Sample | a set of individuals selected from a population, usually intended to represent the population in a research study |

Sampling error | the discrepancy, or amount of error, that exists between a sample statistic and the corresponding population parameter |

Correlational method | two different variables are observed to determine whether there is a relationship between them |

Independent variable | the variable that is manipulated by the researcher |

Nominal scale | A scale created purely by non-numerical information; categorical scale, like types of cars or authors. |

Ratio scale | An interval scale with an absolute zero point: the values are evenly spaced, and a value of zero means a complete absence of the variable, so ratios of values are meaningful. |

Symmetrical distribution | a distribution in which it is possible to draw a vertical line through the middle so that one side of the distribution is a mirror image of the other |

Negatively skewed | when the tail of a distribution points towards the negative end on the x-axis (left-hand side) |

Mode | the value with the most frequency |

Variance | the average squared distance from the mean |

Ordinal scale | A scale arranged by ranks, like the first five runners in a race, in which the order matters. |

Standard deviation | the square root of the variance |
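
A short worked sketch with made-up scores, treating them as a whole population (so the squared distances are averaged over N): the variance is the mean of the squared deviations, and the standard deviation is its square root.

```python
from math import sqrt
from statistics import pstdev, pvariance

scores = [1, 9, 5, 8, 7]              # made-up population of scores
N = len(scores)
mu = sum(scores) / N                  # 30 / 5 = 6.0

squared_distances = [(x - mu) ** 2 for x in scores]   # 25, 9, 1, 4, 1
pop_variance = sum(squared_distances) / N             # 40 / 5 = 8.0
pop_sd = sqrt(pop_variance)                           # about 2.83

print(pop_variance, round(pop_sd, 2))
```

The standard library functions `statistics.pvariance` and `statistics.pstdev` give the same population values.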

Z scores | the precise location of each X value within a distribution; the sign signifies whether the score is above the mean or below the mean; the number specifies the distance from the mean by counting the number of standard deviations |
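
A minimal sketch of the z-score transformation, z = (X - μ) / σ, and its inverse. The population values (μ = 60, σ = 5) and the function names are made up for illustration:

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def raw_score(z, mu, sigma):
    """Convert a z-score back to the original X value."""
    return mu + z * sigma

mu, sigma = 60, 5                    # hypothetical population parameters

z_above = z_score(70, mu, sigma)     # +2.0: two standard deviations above the mean
z_below = z_score(52.5, mu, sigma)   # -1.5: one and a half below the mean
x_back = raw_score(z_above, mu, sigma)   # 70.0

print(z_above, z_below, x_back)
```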

Critical region | composed of extreme sample values that are very unlikely to be obtained if the null hypothesis is true; boundaries are determined by the alpha level; the null hypothesis is rejected if the data fall in the critical region |
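
As one concrete case, the boundaries of the critical region for a two-tailed z-test can be found from the alpha level. The sketch assumes alpha = .05 and a test statistic that follows the standard normal distribution under the null hypothesis; the helper `in_critical_region` is hypothetical:

```python
from statistics import NormalDist

alpha = 0.05
# Two-tailed test: alpha is split equally between the two tails.
z_boundary = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

def in_critical_region(z, boundary=z_boundary):
    """True if the sample z-score is extreme enough to reject the null hypothesis."""
    return abs(z) > boundary

print(round(z_boundary, 2))
print(in_critical_region(2.5), in_critical_region(1.0))
```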

Interval estimate | in which a range of values is used as an estimate of an unknown quantity |

Random sample | requires that each individual in the population has an equal chance of being selected; probabilities must stay constant from one selection to the next |

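Body | the larger of the two sections created when a distribution is divided at a particular score (such as a z-score); the smaller section is the tail |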

Binomial distribution | shows the probability associated with each value of X from X=0 to X=n, where X is the number of times a particular outcome occurs in n independent two-outcome trials |
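
A small sketch that lists the binomial probability for each value of X from 0 to n, using the standard formula C(n, X) · p^X · q^(n-X) with q = 1 - p. The example of n = 4 tosses of a fair coin is made up for illustration:

```python
from math import comb

def binomial_distribution(n, p):
    """Probability of each X = 0..n, where X counts outcomes with probability p."""
    q = 1 - p
    return [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

# X = number of heads in n = 4 tosses of a fair coin (p = 0.5).
probabilities = binomial_distribution(n=4, p=0.5)

for x, prob in enumerate(probabilities):
    print(x, prob)   # 0.0625, 0.25, 0.375, 0.25, 0.0625
```

The probabilities cover every possible value of X, so they sum to 1.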

Hypothesis test | a statistical method that uses sample data to evaluate a hypothesis about a population |

Alternative hypothesis | states that there is a change, a difference, or a relationship for the general population; predicts that the independent variable does have an effect on the dependent variable |

Type I error | occurs when a researcher rejects a null hypothesis that is actually true; means that the researcher concludes that a treatment does have an effect when in fact it does not |
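
A simulation sketch of Type I errors, assuming a two-tailed z-test with alpha = .05 and a null hypothesis that is actually true (every sample is drawn from the hypothesized population). The population values, sample size, and seed are arbitrary choices for illustration:

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the sketch is repeatable

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
mu0, sigma, n = 100, 15, 25    # hypothetical population and sample size
trials = 4_000

rejections = 0
for _ in range(trials):
    # The null hypothesis is true: samples come from a population with mean mu0.
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    if abs(z) > z_crit:
        rejections += 1        # rejecting a true null: a Type I error

type_i_rate = rejections / trials
print(type_i_rate)
```

The proportion of false rejections should come out close to alpha, though not exactly equal to it in any finite simulation.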

Post-hoc tests | additional hypothesis tests that are done after an ANOVA to determine exactly which mean differences are significant and which are not |

Null hypothesis | states that in the general population there is no change, no difference, or no relationship; predicts that the independent variable has no effect on the dependent variable |

Type II error | occurs when a researcher fails to reject a null hypothesis that is really false; means that the hypothesis test has failed to detect a real treatment effect |

