# Stats Chapter 4

### vocab

distribution slices up all the possible values of the variable into equal width bins and gives the number of values (or counts) falling into each bin
histogram (relative frequency histogram) uses adjacent bars to show the distribution of a quantitative variable; each bar represents the frequency (or relative frequency) of values falling in each bin
gap a region of the distribution where there are no values
stem-and-leaf display shows quantitative values in a way that sketches the distribution of the data
dotplot graphs a dot for each case against a single axis
shape described by: single vs. multiple modes, symmetry vs. skewness, outliers and gaps
center the place in the distribution of a variable that you'd point to if you wanted to attempt the impossible by summarizing the entire distribution with a single number (mean and median)
spread a numerical summary of how tightly the values are clustered around the center; IQR, STD DEV
mode a hump or local high point in the shape of the distribution of a variable; apparent location can change as scale of a histogram changes
unimodal (bimodal) having one (two) modes
uniform a distribution that is roughly flat
symmetric a distribution with two halves on either side of the center that look like mirror images of each other
tails are the parts that typically trail off on either side; distributions can be characterized as having long tails or short tails
skewed a distribution that is not symmetric, with one tail longer than the other (skewed LEFT if the longer tail is on the left, skewed RIGHT if it is on the right)
outliers extreme values that don't appear to belong with the rest of the data; can be unusual values that need more investigation or mistakes
median middle value, with half the data above it and half below; if N is even, it is the average of the two middle values; usually paired with the IQR
range max - min
quartile lower quartile (Q1): 1/4 data below it; upper quartile (Q3) 1/4 data above it; used with median it divides the data into 4 parts
Interquartile Range (IQR) difference between Q1 and Q3; Q3 - Q1 = IQR; reported along with the median
percentile the ith percentile is the number that falls above i% of the data
5-number Summary reports the minimum, Q1, median, Q3, and the maximum values
mean average; paired with STDDEV
resistant a calculated summary where outliers have a small effect
variance sum of squared deviations from the mean divided by the count minus 1
standard deviation square root of the variance; $s = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}}$; reported with the mean
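
To make the center and spread definitions above concrete, here is a minimal Python sketch. It is not from the chapter: the data set `scores` is made up, the function names are hypothetical, and the quartiles use the "median of each half" rule (middle value excluded when N is odd), which is only one of several conventions, so a calculator or textbook may report slightly different Q1 and Q3.

```python
import math
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumption: quartiles are the medians of the lower and upper halves,
    leaving the middle value out of both halves when N is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]          # values below the median position
    upper = data[half + n % 2:]  # values above the median position
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )


def sample_std_dev(values):
    """Square root of the variance, dividing by n - 1."""
    ybar = sum(values) / len(values)
    variance = sum((y - ybar) ** 2 for y in values) / (len(values) - 1)
    return math.sqrt(variance)


scores = [2, 4, 4, 5, 7, 9, 11]            # made-up data
low, q1, median, q3, high = five_number_summary(scores)
iqr = q3 - q1                              # IQR = Q3 - Q1
data_range = high - low                    # range = max - min

# Replace the largest value with an extreme one to see which summaries resist it.
with_outlier = [2, 4, 4, 5, 7, 9, 110]
_, q1_out, median_out, q3_out, _ = five_number_summary(with_outlier)

print("5-number summary:", (low, q1, median, q3, high))
print("IQR:", iqr, "range:", data_range)
print("mean:", statistics.mean(scores), "std dev:", sample_std_dev(scores))
print("with outlier, median:", median_out, "IQR:", q3_out - q1_out)
print("with outlier, mean:", statistics.mean(with_outlier))
```

In this example the median and IQR stay the same after the outlier is introduced, while the mean shifts a lot, which is what "resistant" means for the median/IQR pairing.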
