click below
click below
Normal Size Small Size show me how
Vocabulary Challenge
Simple Linear Regression Key Terms
| Term | Definition |
|---|---|
| Simple Linear Regression | a quantitative statistical method that allows us to find the best model for a linear relationship between the explanatory (independent) and predictor (dependent) variables |
| Linear Regression Model | DATA = FIT + RESIDUAL |
| Linear Regression Equation | Ypred = βo + β1*Xexp + ϵ |
| Explanatory Variable | the independent variable whose value can be used to predict the value of the predicted or response variable |
| Predicted Variable | the dependent variable whose value can be explained by the value of the explanatory or the predictor variable |
| Scatterplot | a graph that shows the relationship between two quantitative variables measured on the same individual (does not connect the points) |
| Slope | the estimated rate of change of the response variable per unit change of the explanatory variable |
| y-Intercept | the estimated value of the predicted variable when the value of the explanatory variable is zero |
| Residual | the difference between the observed value of y and the predicted value of y |
| Correlation Coefficient | used in statistics to determine how well the variables are related; a statistical measure of the linear correlation between the two given variables (also known as Pearson's r) |
| Least-Squares Regression Line | the best fit line that is calculated by minimizing the sum of the squares of the differences between the observed and the predicted values of the line |
| Assumptions | tests that must be met before moving forward with a linear regression |
| Linear Relationship | correlational relationship in which the data points cluster around a straight line |
| Variance | the expectation of the squared deviation of a random variable from its mean; informally measures how far a set of numbers are spread out from their mean |
| Homoscedasticity | The data is evenly spread out around the line of regression |
| Normality | used to determine if a data set is well-modeled by a normal distribution |
| Multicollinearity | occurs when the independent variables are highly correlated among themselves in a multiple regression |
| Tolerance | measures the influence of one independent variable on all other independent variables |
| Outlier | a value that "lies outside" (is much smaller or larger than) most of the other values in a set of data |
| Confidence Interval | a range of values so defined that there is a specified probability that the value of a parameter lies within it |