# Unit 2 Vocab

### Collecting Data for Study

| Term | Definition |
| --- | --- |
| data | Information that has been collected to represent real-life situations, usually in numerical form. |
| population | In statistics, the entire group of interest from which the sample is drawn. |
| poll | To poll the members of a group means to question them about a specific topic. |
| sample | A specified part of a population, intended to represent the population as a whole. |
| random sampling | Choosing representatives by a chance process, such as rolling a die, rather than by the chooser's judgment. |
| stratified sampling | Choosing a proportional number of representatives from each of a number of subgroups of the initial population. |
| subgroups | Another name for strata (singular: stratum). |
| cluster sampling | Choosing whole groups (clusters) of members that are close to one another on a particular factor, most often location, and using the members of the chosen clusters as the representatives. |
| multi-stage sampling | Narrowing down a field of representatives by successively applying multiple different sampling methods. |
| simple random sample | A sample chosen by assigning a number to each member of the population under study and then using a random number generator to pick the members. |
| stratum | A single category or sub-population out of a larger population. |
| control group | The set of members deliberately kept apart from the treatment in a study, so as to provide a baseline for how the members appear if unchanged. |
| estimate | To find an approximate answer that is reasonable given the problem. |
| representative sample | A smaller group of members of a population whose responses to events model those of the entire population. |
| bias | A systematic tendency for a study to favor a specific result regardless of the data, for example because of a desire to reach a particular conclusion. |
| destructive study | A study that requires the sample to be ruined for its intended use by the study itself. One example is a test to see whether cars are safe in a crash. |
| outlier | An observation that lies an abnormal distance from the other values in a random sample from a population. |
| bell curve | A common name for the normal distribution, after the shape of its graph. |
| arithmetic mean | Commonly known as the average: the sum of the values divided by the number of values. |
| standard deviation | A measure of the spread of a data set. The larger the standard deviation, the more spread out the data are. |
| demographic distribution | Describes the relative numbers of different types of members of a sample or group. |
| categorical variable | A variable that can take on one of a limited, and usually fixed, number of possible values. |
| quantitative variable | A variable whose data represent amounts, such as counts or measurements. |
| graphs | Visual displays of data. There are many types: pictographs, line graphs, bar graphs, circle graphs, dot plots, etc. |
| frequency | In statistics, the number of times a piece of data shows up. |
| two-way frequency table | A way to display frequencies or relative frequencies for two categorical variables. |
| marginal distribution | The distribution of one variable on its own, ignoring the other variables. In a two-way frequency table, it comes from the row or column totals. |
| conditional distribution | The distribution of values for one variable when the values of the other variables are specified. |
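
The sampling methods defined above can be compared side by side in a short Python sketch. Everything in it is assumed for illustration: the population of 60 students, the `(id, grade, homeroom)` layout, and the helper names `simple_random_sample`, `stratified_sample`, and `cluster_sample` are hypothetical, and this is only one reasonable way to code each method.

```python
import random

def simple_random_sample(population, k, rng):
    """Number every member, then let a random number generator pick k of them."""
    indices = rng.sample(range(len(population)), k)
    return [population[i] for i in indices]

def stratified_sample(population, stratum_of, fraction, rng):
    """Pick the same fraction of members from every stratum (subgroup)."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    chosen = []
    for members in strata.values():
        k = round(len(members) * fraction)
        chosen.extend(rng.sample(members, k))
    return chosen

def cluster_sample(population, cluster_of, n_clusters, rng):
    """Pick whole clusters at random and keep every member of each chosen cluster."""
    clusters = {}
    for member in population:
        clusters.setdefault(cluster_of(member), []).append(member)
    picked = rng.sample(sorted(clusters), n_clusters)
    return [m for c in picked for m in clusters[c]]

# Hypothetical population: 60 students described as (id, grade, homeroom).
rng = random.Random(0)  # fixed seed so the run is repeatable
students = [(i, 9 + i % 3, "room" + str(i % 6)) for i in range(60)]

srs = simple_random_sample(students, 12, rng)                   # 12 students, any grade
strat = stratified_sample(students, lambda s: s[1], 0.2, rng)   # 20% of each grade
clus = cluster_sample(students, lambda s: s[2], 2, rng)         # 2 whole homerooms
```

In this made-up population each grade has 20 students and each homeroom has 10, so the stratified sample takes 4 students per grade while the cluster sample takes every student from 2 homerooms. Chaining these helpers, for example choosing clusters first and then drawing a simple random sample inside each one, would be a form of multi-stage sampling.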

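The last three terms (two-way frequency table, marginal distribution, conditional distribution) are easiest to see with a small worked example. The sketch below uses invented survey responses; the grades, subjects, and variable names are assumptions made only for illustration.

```python
from collections import Counter

# Hypothetical survey responses: (grade, favorite subject).
responses = [
    ("9th", "math"), ("9th", "art"), ("9th", "math"), ("9th", "math"),
    ("10th", "art"), ("10th", "art"), ("10th", "math"), ("10th", "art"),
]

# Two-way frequency table: how often each (grade, subject) pair shows up.
table = Counter(responses)
total = sum(table.values())

# Marginal distribution of subject: relative frequencies, ignoring grade.
subject_counts = Counter(subject for _, subject in responses)
marginal_subject = {s: n / total for s, n in subject_counts.items()}

# Conditional distribution of subject, given that the grade is "9th".
ninth_total = sum(n for (g, _), n in table.items() if g == "9th")
conditional_9th = {s: n / ninth_total for (g, s), n in table.items() if g == "9th"}

print(marginal_subject)   # math and art each make up half of all responses
print(conditional_9th)    # among 9th graders, math is 0.75 and art is 0.25
```

The marginal distribution uses all 8 responses, while the conditional distribution divides only by the 4 responses from 9th graders, which is why the two sets of proportions differ.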