Biostatistics Final

Correlation and Regression

Purpose of Correlation and Regression Make inferences based on sample data that come in pairs. Determine whether there is a linear relationship between the two quantitative variables and describe it with an equation that can be used for predictions. Applies to two dependent populations (quantitative data).
Correlation The correlation coefficient measures the strength of the linear relationship between two quantitative variables. Variables must be quantitative (continuous or discrete). Use a scatter plot: X & Y are linearly related if the scatter of points can be approximated by a straight line.
Correlation Coefficient r measures the strength of the linear relationship between the paired x & y values in a sample; it is the linear correlation coefficient for a sample. Rho is the linear correlation coefficient for a population.
Correlation r = Sxy / (Sx * Sy). Sxy: covariance of x & y. Sx: standard deviation of x. Sy: standard deviation of y.
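The sample correlation coefficient is the covariance of x & y divided by the product of their standard deviations. A minimal sketch of that calculation follows; the function name `sample_correlation` and the data are made up for illustration and are not part of the original cards.

```python
import math

def sample_correlation(xs, ys):
    """Return r = Sxy / (Sx * Sy) for paired samples xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and standard deviations all divide by n - 1.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return s_xy / (s_x * s_y)

# Made-up paired data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = sample_correlation(xs, ys)
print(round(r, 4))  # 0.7746
```

Here r is positive and fairly close to 1, so these made-up points show a moderately strong positive linear relationship.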
Interpreting the Linear Correlation Coefficient r r is always between -1 & 1. If r is close to 0, there is little or no linear correlation between x & y. If r is close to -1 or 1, there is a strong linear correlation. A negative value indicates a negative (inverse) relationship; a positive value indicates a positive relationship. r measures both strength & direction.
Factors That Affect the Size of r Nonlinear relationship: linear correlation only measures the degree of linear relationship, so r may be 0 even though the two variables are strongly (but nonlinearly) related. Restricted range: restricting the range of X or Y typically reduces the magnitude of r.
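To illustrate the nonlinear case, the sketch below uses made-up data in which y is completely determined by x (y = x squared), yet r comes out as 0 because the relationship is not linear. The helper `sample_correlation` is a hypothetical name introduced here.

```python
import math

def sample_correlation(xs, ys):
    """Return the sample linear correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return s_xy / (s_x * s_y)

# y depends perfectly on x, but along a parabola rather than a line.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]
r = sample_correlation(xs, ys)
print(r)  # 0.0
```

A scatter plot of these points would show the curve immediately, which is one reason to plot the data before relying on r.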
Factors That Affect the Size of r Extreme scores: a single extreme score may produce evidence of correlation when none exists. Combining groups: there may be no correlation within either group, but combining them can give the illusion of a linear correlation. Combining groups can also change the direction of the correlation.
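The combining-groups effect can be sketched with two made-up groups that each have r = 0 on their own but sit in different regions of the plot. The data and the helper name `sample_correlation` are assumptions for illustration only.

```python
import math

def sample_correlation(xs, ys):
    """Return the sample linear correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return s_xy / (s_x * s_y)

# Group A: low scores on both variables, no correlation within the group.
xs_a, ys_a = [1, 2, 1, 2], [1, 1, 2, 2]
# Group B: high scores on both variables, no correlation within the group.
xs_b, ys_b = [5, 6, 5, 6], [5, 5, 6, 6]

r_a = sample_correlation(xs_a, ys_a)
r_b = sample_correlation(xs_b, ys_b)
r_all = sample_correlation(xs_a + xs_b, ys_a + ys_b)

print(r_a, r_b)         # 0.0 0.0
print(round(r_all, 3))  # 0.941
```

The large combined r reflects only the gap between the two groups, not a relationship inside either one.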
Correlation Testing hypotheses about rho: a single r can be tested to determine whether the corresponding rho differs from a hypothesized value. For H0: rho = 0, use t = r * sqrt(n-2) / sqrt(1-r^2) with df = n-2. CORRELATION DOES NOT PROVE CAUSATION. r measures how well the best-fitting straight line actually fits.
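For the common null hypothesis rho = 0, the usual test statistic is t = r * sqrt(n-2) / sqrt(1-r^2) with n-2 degrees of freedom. The sketch below computes it for assumed values of r and n; the function name and the numbers are hypothetical, and the resulting t would still need to be compared with a critical value from a t table.

```python
import math

def correlation_t_statistic(r, n):
    """Return (t, df) for testing H0: rho = 0 from a sample r and sample size n."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df

# Assumed sample results, for illustration only.
r = 0.6
n = 27
t, df = correlation_t_statistic(r, n)
print(round(t, 2), df)  # 3.75 25
```

A significant result here would only indicate that rho is unlikely to be 0; it would say nothing about whether x causes y.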
Assumptions For each value of X there is a normally distributed subpopulation of Y values. For each value of Y there is a normally distributed subpopulation of X values. The joint distribution of X & Y is a normal distribution. The variance of the Ys is the same at each value of X, and the variance of the Xs is the same at each value of Y (homoscedasticity).
R-Squared Coefficient of determination: the proportion of the variation in y that is explained by the linear relationship between x & y. R-squared = SSR/SST. Measures closeness of fit of the sample regression equation to the observed values of Y.
Regression Used to find the best-fitting straight line that relates the scores. Objective is to predict the value of one variable (the outcome) based on the value of another variable. Use a scatter plot. The best-fitting line minimizes the sum of the squared differences between y and yhat (actual minus predicted).
Regression SStotal (SST): variation in observed values of the response variable. SSregression (SSR): variation in observed values of the response variable explained by the regression. SSerror (SSE): variation in observed values of the response variable not explained by the regression. SST = SSR + SSE. Degrees of freedom: SSR has 1, SSE has n-2, SST has n-1.
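The three sums of squares can be computed directly from a fitted line. The sketch below uses made-up data and a hypothetical helper `sums_of_squares`; it checks that the total variation splits into explained and unexplained parts and that SSR/SST gives R-squared.

```python
def sums_of_squares(xs, ys):
    """Fit a least-squares line and return (SST, SSR, SSE)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * x for x in xs]
    sst = sum((y - mean_y) ** 2 for y in ys)                 # total variation
    ssr = sum((p - mean_y) ** 2 for p in predicted)          # explained by the line
    sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))   # left unexplained
    return sst, ssr, sse

# Made-up paired data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
sst, ssr, sse = sums_of_squares(xs, ys)
r_squared = ssr / sst

print(round(sst, 2), round(ssr, 2), round(sse, 2))  # 6.0 3.6 2.4
print(round(r_squared, 2))                          # 0.6
```

With n = 5 points, the degrees of freedom would be 1 for SSR, 3 for SSE, and 4 for SST.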
Least Squares Criterion The best-fitting straight line is the one that minimizes the sum of the squared deviations between the actual y values & the predicted values. That is, it minimizes SSE.
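As a rough illustration of the least squares criterion, the sketch below fits a line with the standard closed-form slope and intercept, then shows that nudging either value makes SSE larger. The data and function names are assumptions made for this example.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the line that minimizes SSE."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sse(xs, ys, intercept, slope):
    """Sum of squared deviations between actual and predicted y values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up paired data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
best = sse(xs, ys, a, b)

print(round(a, 2), round(b, 2))  # 2.2 0.6
print(round(best, 2))            # 2.4
# Any other line does worse, for example a steeper slope or a higher intercept.
print(sse(xs, ys, a, b + 0.1) > best)  # True
print(sse(xs, ys, a + 0.1, b) > best)  # True
```

The fitted equation could then be used for prediction, for example yhat = 2.2 + 0.6 * 6 = 5.8 at x = 6, keeping in mind that predictions outside the observed range of x are less trustworthy.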
Beta The population slope of the regression line. It is the parameter estimated by b, the slope of the sample regression line.
How Can You Tell A Regression Question From A Correlation Question? Look at the intent: prediction = regression; strength of relationship = correlation.
Created by: horsenerd09