2Qunatit
data analysis
| Question | Answer |
|---|---|
| useful graphs | scatterplot: gives a sense of the nature of the relationship |
| what to look for in a graph | relationship between two variables where one variable causes changes to another |
| location | where most of the data lies |
| spread | variability of the data, how far apart or close together it is |
| shape | symmetric, skewed, etc. |
| nature of relationship | existent/non-existent; strong/weak; increasing/decreasing; linear/non-linear |
| outliers in scatterplots | may represent unexplainable anomalies in the data; could reveal possible systematic structure worthy of investigation |
| causal relationship | relationship between two variables where one variable causes changes to another |
| explanatory variable | explains or causes the change; plotted on the x-axis |
| response variable | the variable that is changed; plotted on the y-axis |
| useful numbers | correlation and regression |
| formula for the correlation coefficient | $r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$ |
| xi or yi | the i-th observed value of the variable on the corresponding axis |
| xbar or ybar | mean of the values of the variable on the corresponding axis |
| sx or sy | standard deviation of the values of the variable on the corresponding axis |
| properties of r | close to 1 = strong positive linear relationship; close to -1 = strong negative linear relationship; close to 0 = weak or non-existent linear relationship |
| cautions about the use of r | only useful for describing linear relationships; sensitive to outliers |
| regression models | focus on general linear relationships between variables; negative = decrease |
| what regression modelling does | describes the behaviour of the response variable (the variable of interest) in terms of a collection of predictors (related variables, i.e. explanatory variables) |
| a linear framework is used to look at? | the relationship between the response and the regressors; formula: Y = α + βx, where α is the intercept and β is the slope |
| ideal model for linear framework in terms of responses and regressors | one unique response to one given regressor |
| real world model for linear framework in terms of responses and regressors | must approximate |
| statistical model | relates the response to physical model predictions; allows for better predictions and quantification of uncertainty concerning the response, to support decisions |
| what does regression analysis do? | finds the best relationship between responses and regressors for a particular class of models |
| experimenter controls predictors, why? | may be important for making inferences about the effect of predictors on response |
| course assumption | predictors are controlled in an experiment or at least accurately measured |
| define a good statistical model | fit, predictive performance, parsimony, interpretability |
| qualitative description of model | response = signal + noise; Y = α + βx + ε, where ε = noise |
| define signal | variation in the response explained in terms of predictors and a small number of unknown parameters; it is the systematic part of the model |
| define noise | residual variation left unexplained by the systematic part of the model; can be described in terms of unknown parameters |
| what does a good statistical model do to possibly large and complex data | reduces it to a small number of parameters |
| a model will fit well if | the systematic part of the model describes much of the variation in the response (low noise); a large number of parameters may be required to do this |
| define parsimony: | smaller number of parameters = greater reduction of data, more useful for making a decision |
| there is a cycle between what? | tentative model formulation, estimation of parameters and model criticism |
| a good model will | manage the balance between goodness of fit and complexity; provide a useful reduction of the data |
| model response variable in terms of a single predictor | yn = values of the response variable |
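
The correlation formula and the linear model Y = α + βx + ε from the cards above can be tried out with a short Python sketch. It is only an illustration under stated assumptions: the cards do not say how α and β are estimated, so ordinary least squares is assumed here, and the function names and the sample data are made up for the example.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Correlation coefficient r = 1/(n-1) * sum of products of standardised values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)


def fit_line(xs, ys):
    """Estimate intercept (alpha) and slope (beta) of Y = alpha + beta * x.

    Assumption: ordinary least squares, using beta = r * s_y / s_x
    and alpha = y_bar - beta * x_bar.
    """
    beta = correlation(xs, ys) * stdev(ys) / stdev(xs)
    alpha = mean(ys) - beta * mean(xs)
    return alpha, beta


def residuals(xs, ys, alpha, beta):
    """Noise estimates: observed response minus the systematic part alpha + beta * x."""
    return [y - (alpha + beta * x) for x, y in zip(xs, ys)]


# Made-up data: x is the explanatory variable, y is the response.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = correlation(xs, ys)
alpha, beta = fit_line(xs, ys)
print("r =", round(r, 4))
print("alpha =", round(alpha, 4), "beta =", round(beta, 4))
print("residuals =", [round(e, 4) for e in residuals(xs, ys, alpha, beta)])
```

A value of r close to 1 here matches the "strong positive linear relationship" card, and the residuals are the part of the response that the signal α + βx leaves unexplained. The sketch fails with a division by zero if all x values (or all y values) are identical, since the standard deviation is then 0.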