AP STATS CH3
| Term | Definition |
|---|---|
| bivariate data | Two-variable data set |
| Univariate data | One-variable data set |
| response variable (y) | measures an outcome of a study |
| explanatory variable (x) | may help predict or explain changes in response variable |
| Scatter plot | Shows the relationship between two quantitative variables measured on the same individuals. |
| Direction | A scatter plot can show a positive, negative, or no association |
| Form | A scatter plot can show a linear form or nonlinear form. In a scatter plot, the form describes the general shape or pattern of the relationship between the two variables being plotted. |
| Strength | A scatter plot can show a weak, moderate, or strong association. |
| Unusual features | Individuals that fall outside the overall pattern |
| Positive association | When values of one variable tend to increase as the values of the other variable increase |
| Negative association | When values of one variable tend to decrease as the values of the other variable increase |
| No association | When knowing the values of one variable doesn't help predict the values of the other variable |
| Outlier | A point that doesn't follow the pattern of the data and has a large residual |
| Influential point | Any point that, if removed, substantially changes the slope, y-intercept, r, r^2, or standard deviation of the residuals |
| Correlation | Measures the direction and strength of an association, for linear associations only; indicates a relationship between two variables |
| Causation | One variable directly causes a change in the other. |
| r | Correlation coefficient; tells you how strong the linear relationship is and whether it is positive or negative |
| r^2 | Coefficient of determination; the percent of variation in y that can be explained by x (e.g., 65% means 35% of the variation isn't explained by x) |
| Regression line | A line that models how a response variable y changes as an explanatory variable x changes. "y-hat=a+bx" |
| y-hat | predicted value of y |
| Extrapolation | Predicting outside of the interval of x values. The further we extrapolate, the less reliable the predictions. (ex:-10g of protein) |
| Residual (e for error) | The vertical difference between the actual (observed) value of y and the value of y predicted by the regression line. "y minus y-hat" |
| a | the y-intercept; predicted value of y when x=0 |
| b | the slope; amount by which the predicted value of y changes when x increases by 1 unit |
| Least-Squares Regression Line (LSRL) | The line that makes the sum of the squared residuals as small as possible |
| Residual plot | A scatter plot that displays the residuals on the vertical axis and explanatory variable on the horizontal axis |
| Standard deviation of the residuals (s) | Measures the size of a typical residual (avg. distance between the actual y values and the predicted y values). |
| High leverage points | An observation with an extreme value for one or more of the independent variables (x-values), influential in determining the slope of the LSRL. These points are not necessarily outliers in the y-direction; their impact comes from their position far out in |
| High leverage points | An observation that has an extreme value for the independent variable and strongly influences the slope of the regression line |
| Negative residual | The observed value falls below the LSRL (actual y is less than y-hat) |
| Positive residual | The observed value falls above the LSRL (actual y is greater than y-hat) |
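
To see how several of the terms above fit together (a, b, y-hat, residual, r, r^2, and s), here is a minimal Python sketch. The five data points are made up purely for illustration, and the function names `fit_lsrl`, `residuals`, and `residual_sd` are hypothetical helpers, not part of any course material or library. It assumes the usual formulas b = Sxy / Sxx, a = y-bar minus b times x-bar, and s = sqrt(sum of squared residuals / (n - 2)).

```python
import math


def fit_lsrl(xs, ys):
    """Return (a, b, r) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squared deviations and cross-products about the means
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                    # slope
    a = y_bar - b * x_bar            # y-intercept
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
    return a, b, r


def residuals(xs, ys, a, b):
    """Residual = observed y minus predicted y (y minus y-hat)."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def residual_sd(res):
    """Standard deviation of the residuals, using n - 2 in the denominator."""
    return math.sqrt(sum(e * e for e in res) / (len(res) - 2))


# Made-up bivariate data: x is explanatory, y is the response
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b, r = fit_lsrl(xs, ys)
res = residuals(xs, ys, a, b)
s = residual_sd(res)

print("y-hat = %.2f + %.2f x" % (a, b))
print("r = %.3f, r^2 = %.3f, s = %.3f" % (r, r ** 2, s))
for x, e in zip(xs, res):
    label = "positive" if e > 0 else "negative" if e < 0 else "zero"
    print("x = %d: residual %+.2f (%s)" % (x, e, label))
```

For these made-up points the fitted line has slope 0.6 and intercept 2.2, and r^2 works out to 0.6, so about 60% of the variation in y is explained by x. Plotting `res` against `xs` would give the residual plot described in the table.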