Unit 9 Data Vocab
In order to be proficient or distinguished in the data unit, students need to be
| Term | Definition |
|---|---|
| Measure of central tendency | Any of several methods used to find a central value of two or more numbers. Mean, median and mode are all measures of central tendency. |
| (arithmetic) mean | The average of a set of numbers: a central value found by adding the numbers and dividing by the "count" (how many numbers there are). |
| Lower Quartile | The first of the values that divide a sorted list of numbers into quarters (four parts). If it falls between two numbers, add them and divide by two. Also abbreviated Q1. |
| Median | The "middle" of a sorted list of numbers. Place the numbers in value order and find the middle. If the middle falls between two numbers, add them and divide by two. Also called Quartile 2 (Q2). |
| Upper Quartile | The third of the values that divide a sorted list of numbers into quarters (four parts). If it falls between two numbers, add them and divide by two. Also abbreviated Q3. |
| Interquartile Range (IQR) | The range from Quartile 1 to Quartile 3, found by subtracting Q1 from Q3 (Q3 − Q1). |
| Box Plots | A type of diagram showing minimum, quartiles (including median), and maximum. The minimum to Q1 is a line, then a box is created from Q1, Q2, and Q3, and another line extends to the maximum. |
| Outlier | A value that "lies outside" the rest of the data: it is much smaller or bigger than the other values in the set. Three standard deviations from the mean is a common cut-off in practice for identifying outliers. |
| Mean Absolute Deviation (MAD) | How far, on average, all values are from the mean. First find the mean (average): add all the numbers and divide by the count. Then find each value's absolute distance from the mean (its deviation). Finally, find the mean of those distances. |
| Skew (data) | When data has a "long tail" on one side or the other of a statistical plot. When data is not symmetrical. |
| Right Skewed | Also called positive skew, because the "long tail" is on the right. The tail pulls the mean to the right of the peak of the data. |
| Left Skewed | Also called negative skew, because the "long tail" is on the left. The tail pulls the mean to the left of the peak of the data. |
| Normal Distribution | Symmetrical distribution, where the mean is at the peak of the data |
| Bimodal | Having two peaks, or modes in a data set |
| Multimodal | Having multiple peaks or modes in a data set |
| Peak (data) | Where the mode of the data lies in a graphical representation of data |
| Uniform Data | Data in which the values occur about equally often across the entire data set, so a plot of the data looks flat with no clear peak. |
| Categorical Data | Data that can be divided into specific groups. |
| Quantitative Data | Data that can be counted (discrete) or measured (continuous) |
| Qualitative Data | Data that describes something; it is not measurable. |
| Bi-variate Data | Data for two variables (usually two types of related data) |
| Two-Way Table | A statistical table that shows the observed number or frequency of two variables in rows for one category and columns for the other category. |
| Joint Frequency | The count in a cell of a two-way table, where a row category and a column category meet. |
| Marginal Frequency | The row totals and column totals of a two-way table. |
| Conditional Relative Frequency | In a two-way frequency table, the joint frequency in a cell divided by the marginal frequency (total) of that cell's row or column. Usually the word "given" signals that this is needed. |
| Scatter plot | A graph of plotted points that show the relationship between two sets of data |
| y-intercept | The point where a line or curve crosses the y-axis of a graph. The value of a function when x = 0 |
| Line of best fit | A line on a graph showing the general direction that a group of points seem to follow. |
| Data Association | A function that gives the probable outcome for a given input, based on a best-fit representation of a data set. |
| Correlation | When two sets of data are linked together. We describe how strongly they are linked as perfect positive, high positive, low positive, none, low negative, high negative, or perfect negative. |
| Correlation Coefficient | A number between -1 and 1 that is calculated to represent how strongly two sets of data are linked. -1 represents a perfect negative relationship, 0 represents no linear relationship, and 1 represents a perfect positive relationship. |
| Causation | When a change in one variable directly brings about a change in another. Two sets of data can appear to have a relationship that is really based on a third variable or on coincidence; that is correlation without causation. |
| Minimum | The smallest value in a data set |
| Maximum | The largest value in a data set |
| Univariate Data | Data of one type or variable |
| Relative Frequency | How often something happens divided by the total of all outcomes. In a two-way table, it is the joint frequency of a cell divided by the total of the data. |
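
Several of the definitions above (median, lower and upper quartile, IQR, and mean absolute deviation) are step-by-step calculations, so a short Python sketch may help check hand work. The function names and the `scores` list are made up for illustration. The sketch assumes the "median of each half" way of finding quartiles, leaving the median out of both halves when the count is odd; other textbooks and software may use a slightly different rule and give slightly different quartiles.

```python
from statistics import mean, median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-each-half method.

    Assumption: when the count is odd, the middle value is left out
    of both halves. Needs at least two values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                                  # values below the median
    upper = data[half:] if n % 2 == 0 else data[half + 1:]  # values above it
    return median(lower), median(data), median(upper)


def mean_absolute_deviation(values):
    """Average distance of each value from the mean."""
    m = mean(values)
    distances = [abs(x - m) for x in values]  # absolute deviations
    return mean(distances)


# Hypothetical data set, used only to illustrate the steps.
scores = [2, 4, 4, 5, 7, 9, 11]

q1, q2, q3 = quartiles(scores)
iqr = q3 - q1  # interquartile range: Q3 minus Q1

print("Min:", min(scores), "Max:", max(scores))   # Min: 2 Max: 11
print("Q1, median, Q3:", q1, q2, q3)              # Q1, median, Q3: 4 5 9
print("IQR:", iqr)                                # IQR: 5
print("Mean:", mean(scores))                      # Mean: 6
print("MAD:", round(mean_absolute_deviation(scores), 2))  # MAD: 2.57
```

The five values printed first (minimum, Q1, median, Q3, maximum) are the same five values a box plot is drawn from.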