stats
| Question | Answer |
|---|---|
| What are the null and alternative hypotheses in one-way ANOVA? | H0: μ1 = μ2 = ... = μg (the population means of the g groups are all equal). Ha: μi ≠ μj for at least one pair (the population means are not all equal). |
| What does ANOVA stand for? | Analysis of variance |
| Type 1 Error | (False positive) Rejecting the null when it's really true |
| Type 2 Error | (False negative) Failing to reject the null when it's false |
| When would you use pnorm() in R instead of pt()? | (SD known) You are working with a z-test: the population standard deviation σ is known, OR the sample size is large (typically n ≥ 30), so the normal approximation is valid |
| When would you use pt() in R instead of pnorm()? | (SD unknown) You are working with a t-test: the population standard deviation is unknown and you are using the sample standard deviation (s), especially with small samples |
| How would you find a two-sided p-value for a hypothesis test in R? | z-test: 2 * (1 - pnorm(abs(z))); t-test: 2 * (1 - pt(abs(t), df)) |
| How would you find a right-tailed p-value? | z-test: 1 - pnorm(z); t-test: 1 - pt(t, df) |
| How would you find a left-tailed p-value? | z-test: pnorm(z); t-test: pt(t, df) |
| When do I use pnorm() vs pt()? | pnorm() → z-test (σ known OR large n); pt() → t-test (σ unknown, using s, small n) |
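| When to use a χ² test? | When testing association or independence between categorical variables |
| What is a p-value? | The probability, assuming the null hypothesis is true, of observing data at least as extreme as what was actually observed. Because it is a probability, it must be between 0 and 1 |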
| If CI excludes 0 (difference in means/proportions) this indicates | Statistically significant difference |
| What does ANOVA test? | H0:μ1=μ2=μ3=… |
| What is MSE? | Pooled variance (within-group variability) |
| Is F-test one-sided or two-sided? | One-sided (right tail only) |
| A two-proportion test uses what distributions? | The z-test (standard normal) is most common; the two-sided z-test is also equivalent to a χ² test with df = 1 |
| What does a boxplot show? | Median; Q1 and Q3; IQR; outliers. ❌ NOT the standard deviation |
| χ² p-value direction | Always right tail |
| If the sample size increases, what happens to the CI and SE? | The CI gets narrower; the standard error decreases |
| For Ha: μ1 < μ2, where is the p-value? | Left tail only |
| What is the standard error of the mean? | The standard deviation of the sample mean across repeated samples, i.e. how much the sample mean varies from sample to sample |
| What does chi square test for | A chi‑square test checks whether the pattern of categorical data you observe is meaningfully different from what you would expect if nothing interesting were happening. |
| Two-sided test, t = 1.73, df = 39 → R command? | 2 * (1 - pt(1.73, 39)) |
| What does a straight line in a QQ plot mean? | Approximately normal distribution |
| χ² test null hypothesis | Variables are independent |
| When to double a p-value? | For two-sided tests only; NOT for F or χ² tests |
| When is the pooled t-test reasonable vs Welch's? | When the population variances are similar; normality is still assumed |
| What makes a CI wider? | Larger σ; higher confidence level; smaller n (a larger n makes it narrower) |
| What does a higher R² mean? | A stronger linear relationship: a larger proportion of the variability in Y is explained by the model |
| When should I use the binomial probability formula? | Use binomial when ALL of these are true: (1) fixed number of trials (n is given); (2) only 2 outcomes (success/failure); (3) same probability each time (p stays constant); (4) trials are independent |
| Why is CI coverage a binomial problem? | Each confidence interval is like a trial: “Success” = the interval captures μ; “Failure” = it does not. The success probability is the confidence level |
| If you see: “Out of many confidence intervals, how many capture the true mean?” immediately think... | Use the binomial formula |
| What is a residual? | Residual = y − ŷ (actual − predicted) |
| What is the null hypothesis in regression? | H0: β1 = 0 (no linear relationship) |
| What does a large F-statistic indicate? | Between-group variation > within-group variation → evidence that the group means are not all equal |
| Is the ANOVA p-value one-sided or two-sided? | One-sided (right tail only) |
| If p < α in ANOVA, what do we conclude? | At least one mean is different, but not which one |
| What does pnorm(x, μ, σ) give? | P(X≤x) |
| How to compute P(X > x) in R? (right-tailed) | 1 - pnorm(x, μ, σ) |
| What does qnorm(p, μ, σ) give? | Value of x such that P(X≤x)=p |
| If X ∼ N(100, 5), where 5 is the standard deviation, how to find P(X > 120)? | 1 - pnorm(120, 100, 5) |
| What does qt(p, df) give? | The t-value with area p to its left (used to get critical t-values) |
| How to get χ² p-value? | 1 - pchisq(x, df) |
| How to get ANOVA p-value in R? | 1 - pf(F, df1, df2) |
| When do I use the t vs the normal distribution? | σ known → normal; σ unknown → t |
| Which tests are ALWAYS right-tailed? | χ² & F |
| Which function gives critical values? | Normal → qnorm(); t → qt() |
| If the ANOVA test concludes there is statistically significant evidence of a difference between groups, is the analysis complete? | No, because pairs of groups should be compared to see where the differences are |
| How do the Bonferroni intervals compare to the LSD intervals? | Bonferroni intervals are wider (more conservative) than LSD intervals. With LSD, each pairwise comparison uses a 95% confidence level. With Bonferroni, you divide α across the number of comparisons, so each individual interval uses a higher confidence level |
| What are the assumptions of ANOVA? | Independent simple random samples from each group, normally distributed within each group, with a common variance σ² |
| If we let F represent a random variable that has an F distribution with 5 and 8 degrees of freedom, what is P(F > 4.82)? | 1 - pf(4.82, 5, 8) |
| If we let F represent a random variable that has an F distribution with 6 and 4 degrees of freedom, what is the value a such that P (F > a) = 0.05? | qf(0.95, 6, 4) |
| T or F If the null hypothesis (and the assumptions) are true, then the test statistic in one-way ANOVA has an F distribution. | True |
| T or F In one-way ANOVA, we assume that the observations within each group are normally distributed, and that all groups have the same population variance. | True |
| T or F In one-way ANOVA, we assume that the observations within each group are normally distributed, and that all groups have the same population mean. | False. Equal population means is the null hypothesis being tested, not an assumption |
| T or F The test statistic in one-way ANOVA can be negative | False |
| T or F If the null hypothesis is false, then MSG will tend to be bigger than MSE | True |
| Can the F statistic ever be less than zero in ANOVA? | No, it is always non-negative |
| Y = β0 + β1X + ε, what are Y and X? | Y is the response variable; X is the explanatory variable |
| Y = β0 + β1X + ε, what are β0, β1 and ε? | β0 is the Y intercept; β1 is the slope of the line; ε is a random error term |
| If the units of Y are metres, and the units of X are seconds, what are the units of β̂0 and β̂1? | The units of β̂0 are metres (the same units as Y). The units of β̂1 are metres/second (units of Y / units of X) |
| What is the central limit theorem | When you take many random samples from any population (as long as it has a finite mean and variance), the distribution of the sample mean will become approximately normal (bell‑shaped) as the sample size gets large. |
| T or F β0 has the same sign as β̂1 | False. They could, but they don't have to |
| T or F If all data points fall perfectly on a line, then r = 1 or −1 | True |
| T or F The least squares regression line always passes through the point (0, 0) | False. The line will only pass through the origin if β̂0 = 0 |
| T or F The least squares regression line always passes through the point (X̄, Ȳ) | True |
| T or F If there is no linear relationship between Y and X, then r = 1. | False. With no linear relationship, r is close to 0 |
| What indicates that a simple linear regression model is reasonable for a dataset? | A straight‑line pattern fits the data well, residuals show no strong curvature, and deviations appear random rather than systematic. |
| Can simple linear regression be used when Y is a discrete count? | Yes, although the normality assumption is not strictly true. Regression can still provide a useful approximation if the relationship is roughly linear. |
| How do you interpret the slope in a simple linear regression model? | It represents the estimated change in the mean response Y for a one‑unit increase in X. |
| How do you interpret a change of c units in X? | Multiply the slope by c to estimate the change in the mean of Y. |
| What does a very small p‑value for the slope test (H₀: β₁ = 0) indicate? | Strong evidence of a linear relationship between X and Y. |
| What does a large p‑value for the slope test imply? | The sample does not provide evidence of a linear relationship between X and Y (this does not prove that none exists). |
| Let X₁, X₂, X₃ be independent Uniform(0, 50). What is P(X₁ > 20, X₂ > 20, X₃ > 20)? How do you solve it? | For each variable, P(X > 20) = (50 - 20)/50 = 0.6 (length of the interval above 20 over the total length). By independence, raise it to the power n = 3: ((50 - 20)/50)^3 = 0.216 |
| For what values of n and p does the normal approximation to the distribution of p̂ work best? | n is large and p is close to 0.5 |
| For what values of n and p is the normal approximation very poor? | n is small and p is close to 0 or 1 |
| What are the conditions to use normal approximation for proportions | |
| Given R regression output, how do you find the proportion of variance explained? | Near the bottom of the output, look for "Multiple R-squared" |
| How do you calculate the residual from R regression output? | Compute the predicted value ŷ = (intercept estimate) + (slope estimate) × (the given x), then subtract it from the given observed value: residual = y − ŷ |
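
The R one-liners from the cards above can be collected into one short script, which may make the tail conventions easier to compare side by side. This is only a sketch: the values of `z_stat`, `t_stat`, `df_t` and the chi-square statistic 3.84 are made-up inputs chosen for illustration, not results from any dataset. The F and normal examples reuse the numbers quoted in the cards.

```r
# Hypothetical test statistics (illustration only, not from real data)
z_stat <- 1.96   # z statistic: sigma known or large n
t_stat <- 1.73   # t statistic: sigma unknown, using s
df_t   <- 39     # degrees of freedom for the t-test

# Two-sided p-values: double the area beyond |statistic|
p_two_z <- 2 * (1 - pnorm(abs(z_stat)))
p_two_t <- 2 * (1 - pt(abs(t_stat), df_t))

# One-sided p-values for the t-test
p_right_t <- 1 - pt(t_stat, df_t)   # Ha uses ">"
p_left_t  <- pt(t_stat, df_t)       # Ha uses "<"

# Chi-square and F tests are always right-tailed: never double these
p_chisq <- 1 - pchisq(3.84, df = 1)  # 3.84 is a hypothetical statistic
p_f     <- 1 - pf(4.82, 5, 8)        # P(F > 4.82) with 5 and 8 df

# Critical values come from the q* (quantile) functions
t_crit <- qt(0.975, df_t)   # two-sided t critical value at alpha = 0.05
f_crit <- qf(0.95, 6, 4)    # value a such that P(F > a) = 0.05

# Normal probability: P(X > 120) for X ~ N(mean = 100, sd = 5)
p_norm <- 1 - pnorm(120, mean = 100, sd = 5)

# Three independent Uniform(0, 50) variables all exceeding 20
p_unif <- ((50 - 20) / 50)^3

print(c(p_two_z = p_two_z, p_two_t = p_two_t, p_chisq = p_chisq, p_f = p_f))
```

The upper-tail areas could also be written with `lower.tail = FALSE`, for example `pt(t_stat, df_t, lower.tail = FALSE)`, which gives the same quantity as `1 - pt(t_stat, df_t)`.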