Term | Definition |
Box Plots | Diagrams that summarize data using the median, the upper and lower quartiles, and the extreme values (minimum and maximum). Each is constructed from the five-number summary of the data. |
Frequency | The number of times an item, number, or event occurs in a set of data. |
Histogram | A way of displaying numeric data using horizontal or vertical bars so that the height or length of each bar indicates frequency. The bars touch because the data are grouped into adjacent intervals with no gaps between them. |
Inter-Quartile Range (IQR) | The difference between the third and first quartiles of a data set (Q3 - Q1). |
Mean Absolute Deviation | The average distance of each data value from the mean. It gauges how different the data values are, on average, from the mean value. Formula: (total distance from the mean for all values) ÷ (number of data values). |
Mean | The “average” or “fair share” value for the data: the sum of the values divided by the number of values. |
Measures of Center | The mean and the median are both ways to measure this for a set of data. It is best to use the mean when there are no outliers in the data set. |
Measures of Spread (Variation) | The range and the mean absolute deviation are both common ways to measure this for a set of data. |
Median | The value for which half the numbers are larger and half are smaller. If there are two middle numbers, it is the arithmetic mean of the two middle numbers. |
Mode | The number that occurs the most often in a list. There can be one, more than one, or none. |
Outlier | A value that is very far away from most of the values in a data set. It will skew the mean. |
Population | A group of people, objects, or events that fit a particular description. |
Range | A measure of spread for a set of data. To find it, subtract the smallest value from the largest value in the set of data. |
Sample | A part of the population that we actually examine in order to gather information. |
Simple Random Sampling | Consists of individuals from the population chosen in such a way that every set of individuals of the same size has an equal chance of being the sample actually selected. Poor sampling methods can lead to misleading conclusions. |
Stem and Leaf Plot | A graphical method used to represent ordered numerical data. Once the data are ordered, the stem and leaves are determined. Typically the stem is all but the last digit of each data point and the leaf is that last digit. |
Q3 - Q1 | Finding the Inter-Quartile Range (IQR): subtract the first quartile from the third quartile. |
Add and divide by 2 | Finding the median when there are two numbers in the middle. |
Scatter Plots | Plots that use horizontal and vertical axes to show data points for two variables. The relationship between the two variables is called their correlation. |
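
Several of the definitions above (mean, median, mode, range, IQR, mean absolute deviation, five-number summary) are short calculations, so a minimal Python sketch may help make them concrete. The function names `five_number_summary` and `mean_absolute_deviation` and the `scores` list are hypothetical, chosen only for illustration. The quartiles here are taken as the medians of the lower and upper halves of the ordered data, which is one common classroom convention; other conventions can give slightly different quartile values.

```python
from statistics import mean, median, multimode


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum); needs at least two values."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # Assumed convention: with an odd count, the middle value
    # is left out of both halves before finding Q1 and Q3.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


def mean_absolute_deviation(values):
    """Average distance of each value from the mean."""
    center = mean(values)
    return mean([abs(x - center) for x in values])


# Hypothetical data set, used only for illustration.
scores = [2, 4, 4, 6, 6, 6, 8, 12]

minimum, q1, med, q3, maximum = five_number_summary(scores)
data_range = maximum - minimum          # Range: largest minus smallest
iqr = q3 - q1                           # IQR: Q3 - Q1
mad = mean_absolute_deviation(scores)   # Mean Absolute Deviation
modes = multimode(scores)               # List of the most frequent value(s)

print("mean:", mean(scores), "median:", med, "mode(s):", modes)
print("range:", data_range, "IQR:", iqr, "MAD:", mad)
```

For these scores there are two middle numbers (6 and 6), so the median is found by adding them and dividing by 2. The mean is 6, the range is 10, the IQR is 3, and the mean absolute deviation is 2.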