U1 Stats vocab
| Term | Definition |
|---|---|
| Variables | any characteristic of an individual |
| Categorical Variables | variables that place an individual in a specific group. Ex: what kind of car someone drives, what someone's favorite color is |
| Quantitative Variables | variables that take on a numeric value, for which adding and averaging make sense. Ex: height, age, wage |
| Discrete Data | only takes on specific, separate values that it makes sense to count (whole numbers and simple decimals). Ex: how many pets someone has, how many tardies they have |
| Continuous data | can take on any value within an interval, usually written as decimals; values that are measured or timed |
| Stacked Bar | form of a bar chart that shows the composition and comparison of multiple variables |
| Segmented Bar Chart | compares percentages of different variables |
| Mosaic Plot | a segmented bar chart with no spaces between the bars, unlike other bar graphs; each rectangle's area represents the joint frequency of one category from each of the two variables |
| Bar Chart | shows the true frequency (the raw count) of each category. Bars have spaces between them. |
| Pie Chart | uses relative frequency (percentage) |
| Joint relative frequency | ratio of frequency in a cell and the total number of data values |
| Marginal relative frequency | ratio of the sum of a row or column and the total number of data values |
| conditional relative frequency | ratio of a joint relative frequency and related marginal relative frequency |
| median | the middle value when the data are listed in order |
| Range | difference between your max and min |
| IQR (interquartile range) | Q3-Q1 |
| Outlier Bounds | upper: Q3 + 1.5*IQR; lower: Q1 - 1.5*IQR. Values outside these bounds are flagged as possible outliers. |
| Mean | average of all data points |
| Standard deviation | roughly the average distance of values from the mean. Ex: a higher standard deviation means the data points are, on average, farther from the mean |
| Data | information gathered from observations |
| Distribution | a list of what values a variable takes on and how often it takes on each one of those values |
| summary statistics | Mean, median, standard deviation, IQR, range. |
| 5 number summary | min, Q1, med, Q3, max |
| mean | the “average” of a data set, also known as the expected value; not resistant to outliers |
| median | the point at which 50% of the data is above and 50% of the data is below; resistant to outliers |
| range | the difference between the maximum and minimum values of a data set |
| IQR | the difference between the third and first quartiles of a data set |
| standard deviation | a measure of variability that describes a typical distance of each value from the mean: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ |
| Quartile | observations which fall at the 25th, 50th, and 75th percentiles of a data set |
| percentile | tells what percent of a data set falls below the given observation |
| box plot | a diagram that shows the quartiles as a box, with lines (whiskers) extending from the box to the lowest and highest values |
| modified box plot | a box plot that marks possible outliers individually with asterisks and ends each whisker at the most extreme value that is not an outlier |
| histogram | a graphical display of data using bars of different heights. In a histogram, each bar groups numbers into ranges. Taller bars show that more data falls in that range. A histogram displays the shape and spread of continuous sample data. Used for quantitative data. |
| stem & leaf | a display for discrete or continuous quantitative data that splits each value into a stem (the leading digits) and a leaf (the final digit). It can be used to organize data as they are collected, and it looks something like a bar graph turned on its side. |
| bar chart | uses parallel rectangular bars to represent the size, value, or rate of something, or to compare amounts across a number of different groups. Used for categorical data. |
| pie chart | a way of summarizing a set of nominal data or displaying the different values of a given variable |
| one-way table | a frequency table for a single categorical variable |
| two-way table | one way to display frequencies for two different categories collected from a single group of people |
| dot plot | Graphs a dot for each case against a single axis. |
| shape | describes the distribution (or pattern) of the data within a data set |
| unimodal | a frequency distribution that has only one peak. |
| bimodal | A bimodal distribution has two peaks |
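| symmetric | a distribution in which the two sides of the central value (such as the median or mean) are roughly mirror images of each other |
| right skew | the tail of the distribution stretches out to the right (toward higher values), with most of the data on the left |
| left skew | the tail of the distribution stretches out to the left (toward lower values), with most of the data on the right |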
| spread | the variation of the data |
| center | a typical value of a data point |
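
Several of the terms above (5 number summary, IQR, outlier bounds, standard deviation) are easier to remember once you compute them by hand. The sketch below is one possible way to do that in Python. The data set is made up for illustration, the helper names `five_number_summary` and `outlier_bounds` are hypothetical, and the quartiles are assumed to be the medians of the lower and upper halves of the sorted data (leaving out the middle value when the count is odd). Calculators and software sometimes use other quartile rules, so their answers may differ slightly.

```python
import math
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data, excluding the middle value when n is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )


def outlier_bounds(q1, q3):
    """Return (lower, upper) bounds using the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Made-up example data (not from the study set).
data = [2, 4, 4, 5, 7, 8, 9, 10, 25]

minimum, q1, med, q3, maximum = five_number_summary(data)
low, high = outlier_bounds(q1, q3)
outliers = [x for x in data if x < low or x > high]

# Sample standard deviation: sqrt of the summed squared distances
# from the mean, divided by n - 1.
mean = statistics.mean(data)
sample_sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (len(data) - 1))

print("5 number summary:", minimum, q1, med, q3, maximum)
print("range:", maximum - minimum, "IQR:", q3 - q1)
print("outlier bounds:", low, high, "outliers:", outliers)
print("mean:", mean, "sample SD:", sample_sd)
```

In this example the value 25 lies above the upper bound, so a modified box plot would mark it separately. It also pulls the mean above the median, which shows why the mean is not resistant to outliers while the median is.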