STATS VOCAB TEST 1
| Term | Definition |
|---|---|
| statistics | science of collecting, organizing, analyzing, and summarizing data to answer questions and draw conclusions |
| data | another word for information |
| population | entire group being studied (unlimited, specific) |
| sample | any subset from the population (as large as possible, as unbiased as possible) |
| parameter | a measure calculated from data collected from an entire population |
| statistic | a measure calculated from data collected from a sample |
| descriptive statistics (branch 1) | consists of collecting and organizing data to describe what is seen |
| inferential statistics (branch 2) | uses methods of analysis and probability to take results from a sample and extend them to answer questions about a population or about the future |
| variable | any characteristic of the population that can take on a random outcome |
| ordinal variable | those with a natural order ex: letter grades |
| nominal variable | those with no order ex: hair color, brands |
| quantitative variables | those that are numerical ex: age, height, weight, how many of something |
| discrete variables | those that are countable |
| continuous variables | those that are measured rather than counted (uncountable) ex: height, temperature, blood pressure; note that listing these values is usually tedious, so going straight to a graph is common |
| observational study | a researcher merely observes what is happening to answer questions (no manipulation) |
| Hawthorne effect | the tendency of people to change their behavior when they know they are being studied or watched |
| experimental study | a researcher manipulates a variable and looks to see how this affects other variables |
| lurking variables | those that affect the outcome of an experiment, but aren't being studied by the researcher |
| qualitative variables | those that can be arranged into categories ex: hair color, marital status, brands, letter grades |
| raw data | merely collected data with no order or arrangement |
| frequency distribution | a table that arranges data into classes with their frequencies (NOMINAL DATA) ex: blood type |
| Pareto chart | bar graph with bars in descending order of frequency |
| pie chart | circular graph with classes represented as wedges of the circle |
| bar graph | bars in class order (ORDINAL DATA) |
| dot plot | number line with a dot representing each data value |
| grouped classes | used when the data values vary too much; open-ended classes help to summarize the data |
| cumulative frequency (c.f) | the accumulation of frequencies as we go down the table |
| histogram | bar graph for continuous data in which bars are arranged according to the number line |
| stem and leaf plot | a vertical graph that cuts each data value into two pieces (a stem and a leaf) |
| truncate | to cut off without rounding ex: 3120 → 31, 4210 → 42 |
| open ended class | a class that includes more than one value and has no boundary on its low end or its high end ex: 7+ or 3+ |
| grouped frequency distributions | uses classes of grouped intervals of data |
| cumulative frequency | this is the sum of frequencies as we go down the table (note that you would not use this for categories) |
| histogram shapes | 1. bell shaped 2. uniform 3. skewed (to the left or the right) 4. bimodal 5. U shaped |
| measure | any calculation performed on a data set |
| measures of central tendency | calculates the center of a data set |
| mean | (arithmetic average) the sum of all data values / the total number of data points |
| median | middle value when the data is in order |
| mode | value that occurs most often (a data set can have 0, 1, or more modes) ex: "the average hair color is brown" |
| midrange | quick measure of the sum of the high and low divided by 2 (old method) |
| weighted mean | used when different data values have different levels of importance ex: GPA |
| percentiles | cut a data set into 100 equal intervals ex: 43rd percentile means that the data value is above 43% of other values, and below 57% of all other values |
| quartile | cut a set of data into 4 equal intervals |
| 5 number summary | includes the data set's (in order) low, Q1, Q2, Q3, and high ex: for 1, 2, 2, 5, 6, 7, 8, 8: L = 1, Q1 = 2, Q2 = 5.5, Q3 = 7.5, H = 8 |
| z scores | a measure that shows a data value's distance away from the mean, in standard deviations (used when comparing unrelated data) |
| measures of position | these measures compare 1 data value to the rest of the set |
| outlier | an extremely high or low data value compared to the rest of the set (sometimes an oddball or mistake) |
| range | measure of variation that is found by subtracting the low value from the high value |
| variance | the mean of the squared distances that the data values fall away from the mean |
| standard deviation | square root of the variance, that is, the square root of the mean of the squared distances of the data values from the mean |
| Chebyshev's rule | for any set of data, at least $(1 - 1/k^2) \times 100\%$ of the data values will fall within k standard deviations of the mean (for k > 1) |
| empirical rule | (ONLY BELL SHAPED DATA) about 68%, 95%, and 99.7% of the data values fall within 1, 2, and 3 standard deviations of the mean |
| measures of correlation | used when you have two sets of data and want to know whether there is a relationship; 1st variable: x (independent or explanatory, the variable being manipulated); 2nd variable: y (dependent or resultant variable) |
| scatter diagram | xy graph that plots each pair of data values as a point to look for a pattern; types: positive linear, negative linear, non linear, none |
| correlation coefficient | measure of the strength of the linear relationship between 2 variables |
| least squares regression line | this is the line of best fit for representing the correlation of the data |
| Beyond The Scope of Study | the study represents a certain sample of people and data, so the correlation should not be extended too far beyond them (do not extrapolate too much!) |
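
Several of the measures defined above (mean, median, 5 number summary, variance, standard deviation, z scores, Chebyshev's rule) can be checked by hand against a short script. The sketch below is only one way to do it, under some assumptions that are not in the table: the function names are made up for illustration, quartiles are found as the medians of the lower and upper halves of the ordered data (which reproduces the 5 number summary example above), and variance divides by the number of data points, matching the "mean of the squared distances" wording (a sample variance would divide by n - 1 instead).

```python
from math import sqrt


def mean(values):
    # Sum of all data values divided by the number of data points.
    return sum(values) / len(values)


def median(values):
    # Middle value of the ordered data; mean of the two middle values if n is even.
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    # Assumption: Q1 and Q3 are the medians of the lower and upper halves
    # (the overall median is left out of both halves when n is odd).
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]
    upper = ordered[(n + 1) // 2:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


def population_variance(values):
    # Mean of the squared distances from the mean (divides by n, not n - 1).
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def population_std_dev(values):
    # Square root of the variance.
    return sqrt(population_variance(values))


def z_score(x, values):
    # Distance of x from the mean, measured in standard deviations.
    return (x - mean(values)) / population_std_dev(values)


def chebyshev_min_percent(k):
    # Chebyshev's rule: at least this percent of values lie within
    # k standard deviations of the mean (meaningful for k > 1).
    return (1 - 1 / k**2) * 100


# Data set from the 5 number summary entry in the table.
data = [1, 2, 2, 5, 6, 7, 8, 8]
print(five_number_summary(data))  # (1, 2.0, 5.5, 7.5, 8)
print(chebyshev_min_percent(2))   # 75.0
```

Other textbooks and software use different quartile conventions, so Q1 and Q3 from a calculator or spreadsheet may differ slightly from the halves method used here.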