# Unit 2 Vocab

### Collecting Data for Study

Term | Definition |
---|---|
Data | information that has been collected to represent real-life situations, usually in numerical form. |
population | In statistics, this is the entire group of interest from which the sample is drawn. |
poll | to poll the members of a group means to question them regarding a specific topic. |
sample | This is a specified part of a population, intended to represent the population as a whole. |
Random Sampling | involves choosing representatives by a chance process, such as rolling a die, rather than by the researcher's judgment. |
Stratified Sampling | involves choosing a proportional number of representatives from each of a number of subgroups of the initial population. |
Subgroups | are another name for strata (the plural of stratum). |
Cluster Sampling | involves dividing the population into groups (clusters) whose members are close to one another based on a particular factor such as location, age, color, or size, and then choosing entire clusters as representatives. |
Multi-Stage Sampling | involves narrowing down a field of representatives by successively applying multiple different sampling methods. |
simple random sample | a sample chosen by assigning a number to each member of the population under study and then using a random number generator to pick the members of the sample. |
stratum | a single category or sub-population out of a larger population. |
control group | a set of members deliberately kept apart from the treatment being studied, so as to provide an example of how the members appear if unchanged. |
estimate | to find an approximate answer that is reasonable or makes sense given the problem. |
representative sample | a smaller number of members of a population whose responses to events model those of the entire population. |
bias | refers to a systematic tendency for a study to favor a specific result regardless of the data, for example because of a desire to achieve that result. |
destructive study | requires that the sample be ruined for its intended use by the study itself. One example is crash testing to see whether cars are safe. |
outlier | an observation that lies an abnormal distance from other values in a random sample from a population. |
bell-curve | a common name for the normal distribution, after the shape of its graph. |
arithmetic mean | the sum of the values divided by the number of values; commonly known as the average in statistics. |
standard deviation | A measure of the spread of a data set. The larger the standard deviation, the more spread out the data are. |
demographic distribution | describes the relative numbers of different types of members of a sample or group. |
categorical variable | a variable that can take on one of a limited, and usually fixed, number of possible values, each of which names a category. |
quantitative variables | are any variables where the data represent amounts, such as height or age. |
graphs | visual displays of data. There are many types: pictograph, line graph, bar graph, circle graph, dot plot, etc. |
frequency | in statistics, this is the number of times a piece of data shows up. |
two-way frequency table | a way to display frequencies or relative frequencies for two categorical variables. |
marginal distribution | the distribution of a subset of the variables (often just one), ignoring the values of the others. In a two-way frequency table it is given by the row or column totals. |
conditional distribution | the distribution of values for one variable that exists when you specify the values of other variables. |
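
The last three terms are easier to see with numbers. The following is a minimal Python sketch, using only the standard library, that builds a two-way frequency table and then reads a marginal and a conditional distribution from it. The survey data (grade level and preferred pet) and all variable names are made up for illustration and do not come from the unit.

```python
from collections import Counter

# Hypothetical survey responses: (grade level, preferred pet).
responses = [
    ("9th", "dog"), ("9th", "cat"), ("9th", "dog"), ("9th", "dog"),
    ("10th", "cat"), ("10th", "cat"), ("10th", "dog"), ("10th", "cat"),
]

# Two-way frequency table: how often each (grade, pet) pair shows up.
two_way = Counter(responses)

# Marginal distribution of pet: relative frequencies over all grades.
pet_totals = Counter(pet for _, pet in responses)
marginal_pet = {pet: n / len(responses) for pet, n in pet_totals.items()}

# Conditional distribution of pet, given that the grade is "9th".
ninth = [pet for grade, pet in responses if grade == "9th"]
conditional_pet_given_9th = {
    pet: n / len(ninth) for pet, n in Counter(ninth).items()
}

print(two_way[("9th", "dog")])       # frequency of one cell in the table
print(marginal_pet)                  # share of each pet across everyone
print(conditional_pet_given_9th)     # share of each pet among 9th graders only
```

In this made-up data the marginal distribution splits evenly between dogs and cats, while the conditional distribution for 9th graders leans toward dogs, which shows why the two can differ.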