Term | Definition |
Population | The entire group of entities (people, objects, or events) being studied |
Sample | A subset of the population of interest |
Descriptive Statistics | Summarizes the characteristics of a data set (a sample or a whole population) |
Inferential Statistics | Determines characteristics of a population based on observations made on a sample from the population |
Mean | Average of a distribution |
Median | The middle number of a ranked distribution |
Mode | Most frequent number in a distribution |
Nominal Data | Classified into mutually exclusive groups that lack intrinsic order |
Ordinal Data | Has values that can be ranked, so inferences can be made about relative magnitude, but the gaps between ranks are not necessarily equal |
Nominal Data Examples | Race, social security number, and sex |
Ordinal Data Examples | Letter grade or scale of 1 to 10 |
Interval Data | Ordered data with equal intervals between values but no true zero (0 is just an arbitrary point on the scale) |
Interval Data Examples | Test scores, temperature, or time on a clock |
Ratio Data | Has an ordered relationship, equal intervals, and a true zero (0 means none of the quantity), so ratios between values are meaningful |
Ratio Data Examples | Weight on a scale, ruler measurements, or salary earned |
Qualitative Variable | Also called a categorical variable; a variable that is not numerical (nominal or ordinal) |
Quantitative Variable | Variables that are measured on a numeric scale (interval or ratio) |
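Continuous Variable | Can take any value within a range, so it has an infinite number of possible values |
Continuous Variable Examples | A person's weight or age |
Discontinuous Variable | Can take only separate, distinct values; in the simplest (dichotomous) case, only two possible values |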
Discontinuous Variable Examples | Employed or unemployed |
Hypothesis Test | Allows for a determination of possible outcomes and the interrelationship between variables |
Null Hypothesis | H0, states that there is no statistically significant relationship between the two variables in the hypothesis
The reference; a statement one wants to reject |
Alternative Hypothesis | H1, proposes the relationship
Research hypothesis, a statement one wants to find support for
Main purpose is to reject the null
NEVER "accept" the alternative; the decision is ALWAYS to reject or fail to reject the null |
Normal Distribution | One that is symmetrical around the mean (bell curve) |
Skew to the Right | Has a few high numbers (outliers) that pull the tail to the right (positive skew) |
Skew to the Left | Has a few low numbers (outliers) that pull the tail to the left (negative skew) |
Range | Difference between highest and lowest scores in a distribution |
Variance | Average squared difference of scores from the mean of a distribution
Measures how far the numbers lie from the mean
Sum of the squared deviations from the mean divided by the number of observations (n - 1 when estimating from a sample) |
Standard Deviation | Square root of the variance |
Coefficient of Variation | A measure of relative variability
Calculated by dividing the standard deviation by the mean |
Standard Error | The standard deviation of a sampling distribution
Indicates the degree of sampling fluctuation |
Confidence Interval | Gives an estimated range of values that is likely to include an unknown population parameter
Width of the interval gives us an idea of how uncertain we are about the unknown parameter |
Chi Squared Test | Provides a measure of the amount of difference between two frequency distributions
Determines if there is a significant difference between expected and observed frequencies
Commonly used for probability distribution and inferential statistics |
Z-Score | Distance of a value from the mean, measured in standard deviation units
Allows one to determine the likelihood or probability that something will happen |
Z-Score, Typically used if... | Know the population standard deviation
Sample size is above 30 |
T-Score | Allows the comparison of the means of two groups to determine how likely it is that the difference between the two means occurred by chance |
T-Score, typically used if... | Do not know the population standard deviation
Sample size is under 30 |
ANOVA | Analysis of variance
Studies the relationship between two variables; the first variable must be nominal and the second interval |
Correlation | Tests the strength of the relationship between variables |
Correlation Coefficient | Indicates the type and strength of the relationship between variables, ranging from -1 to 1
The closer to -1 or 1, the stronger the relationship between variables; 0 indicates no linear relationship |
Regression | Test of the effect of independent variables on a dependent variable |
Sampling Error | Occurs when one has taken a sample from a larger population
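The sample is not fully representative of the population as a whole, creating a sampling error |
R² | The correlation coefficient squared; in simple linear regression, the proportion of variance in the dependent variable explained by the independent variable |
Non-Sampling Error | One that cannot be explained by the representativeness of the sample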
Can occur as a result of respondents misunderstanding a question or misreporting their answer |
Probability Sampling | Subset selected at random from the overall population, so every member has a known chance of being chosen
Most reliable, defensible and rigorous method
Random, systematic, stratified or cluster |
Non-Probability Sampling | Convenience (snowball survey)
Volunteer
Implementation |
Discrete Variable | Can take only a finite (or countable) number of values
Special case: binary or dichotomous (only two values) |
Distribution | Way to formalize values that are likely to be observed
Represent a distribution graphically or mathematically |
Reject the Null | Find evidence in the data, summarized as a statistic
If the value of the statistic is very different from what it would be under the null hypothesis, then reject the null |
Type 1 Error | Rejecting a null hypothesis that is actually true (a false positive); its probability is the significance level, alpha |
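
Several of the terms above (mean, median, mode, range, variance, standard deviation, coefficient of variation, standard error, z-score) are simple calculations, so a short worked example may help with review. The sketch below uses Python's standard `statistics` module. The `scores` list is made-up data for illustration only, and treating it as a whole population for the variance and standard deviation is an assumption, not something the definitions require.

```python
import math
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)            # 5
median = statistics.median(scores)        # 4.5 (average of the two middle values)
mode = statistics.mode(scores)            # 4 (appears three times)
value_range = max(scores) - min(scores)   # 7

# Population variance: sum of squared deviations from the mean, divided by n.
variance = statistics.pvariance(scores)   # 4
# Standard deviation: square root of the variance.
std_dev = statistics.pstdev(scores)       # 2.0
# Coefficient of variation: standard deviation divided by the mean.
coeff_var = std_dev / mean                # 0.4

# Standard error of the mean, treating the scores as a sample instead:
# sample standard deviation (n - 1 in the denominator) divided by sqrt(n).
std_error = statistics.stdev(scores) / math.sqrt(len(scores))

# Z-score of the value 9: distance from the mean in standard deviation units.
z_score = (9 - mean) / std_dev            # 2.0

print(mean, median, mode, value_range)
print(variance, std_dev, coeff_var, z_score)
print(round(std_error, 3))
```

A z-score of 2.0 says the value 9 lies two standard deviations above the mean, which is how the z-score definition above connects a raw value to its position in the distribution.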