AP STATS
| Question | Answer |
|---|---|
| Variable | A characteristic that can be measured or observed and that varies among individuals or objects within a study or dataset |
| Categorical Variable | Non-numerical groupings (gender, color, type of car) |
| Quantitative Variable | Numerical value (height, weight, age) |
| Two-Way Table | Relationships between two CATEGORICAL variables |
| Histogram | Height of bars is frequency, x-values are buckets |
| Back-to-Back Stemplot | Two stemplots joined by a shared stem, used to compare two groups |
| Five Number Summary | Min, Q1, Median, Q3, Max |
| IQR | Q3-Q1 |
| Marginal Distribution | One variable (just who likes fish) |
| Conditional Distribution | One variable, given another (girls who like fish) |
| Resistance | Resistant statistics are not easily influenced by outliers (median & IQR). Mean and standard deviation are NOT resistant. |
| Variance | How spread out a set of data is from its mean. s^2 (standard deviation squared). Can find standard deviation using lists & 1-Var Stats |
| Mosaic Plot | Segmented bar graph where the width of each bar also shows how much data is in that group |
| Frequency | The number of times a value occurs in a dataset (it's a #) |
| Relative Frequency | # of times / total # (it's a proportion or %) |
| Density Curve | Area under curve = 1 |
| Response Variable | Dependent, usually y value |
| Explanatory Variable | Independent, usually the x value |
| Positive Association | As x goes up, y tends to go up. r > 0 |
| Negative Association | x goes up, y goes down. r < 0 |
| Describing Form | RoughLY linear, slightLY curved (use -LY adjectives) |
| Describing Strength | How closely the points follow the form (weak, moderate, strong) |
| Correlation / Correlation Coefficient | r value, always between -1 and 1, where r = 1 or r = -1 is perfectly linear and r = 0 is no linear association |
| Least Squares Regression Line | Calculate via STAT -> CALC -> 8 (LinReg(a+bx)). L1 & L2 need to be set up as Explanatory & Response |
| Coefficient of Determination | r^2, the proportion of the variation in y that is explained by the regression line |
| Inference | Can only generalize to a population if the individuals taking part in the study were randomly selected from it |
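| Within one standard deviation of the mean (normal distribution) | About 68% of the data |
| Within two standard deviations of the mean (normal distribution) | About 95% of the data |
| Within three standard deviations of the mean (normal distribution) | About 99.7% of the data |
| Empirical Rule | 68% / 95% / 99.7% (for approximately normal distributions) |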
| Contingency Table | Two Way Table LOL |
| Description of a Scatter Plot | form, direction, strength, and unusual features. |
| Residual Plots | No pattern (random scatter around 0) = linear model is a good fit |
| Census | Data on EVERY SINGLE MEMBER of the population |
| Simple Random Sample (SRS) | Every individual, and every possible group of n individuals, is equally likely to be chosen - ex using a random number generator to select which ones to include in the sample, ignoring repeats |
| Stratified Random Sample | Divided into homogeneous strata (ex grade level), and then do an SRS within each stratum. Combine in the end to form the sample (Some from all) |
| Cluster Sample | Taking entire groups instead of individuals - ex wanting to learn about high school opinions on school lunch and talking to every student in 2/4 high schools instead of some students from 4/4 (All from some) |
| Systematic Random Sample | Pick a random starting point, then select individuals at a fixed periodic interval (every 5th person) |
| Bias | Systematically favoring one response over another (issue in the sampling METHOD, not RESPONSE) |
| Voluntary Response Bias | Sample won't represent the population when it's only people who choose to participate - NOT RANDOM |
| Question Wording Bias | The way the question is worded pushes people toward a particular answer |
| Convenience Bias | Talk to whoever is easy to reach, NOT RANDOM |
| Confounding Variable | Related to explanatory & influences response, creates FALSE association. ex eating more ice cream = more sunburn. Hot temp is confounding |
| 4 Components of a Well-Designed Experiment | 1. Compare at least two treatment groups (1 can be control), 2. Random assignment of treatments 3. Replication (more than 1 experimental unit per group) 4. Control of confounding variables |
| Completely Randomized Experiment Design | Randomly assigned treatments, balances the effects of confounding variables |
| Single-Blind Experiment | Either the subjects OR the researchers who interact with them don't know which treatment is being given |
| Double-Blind Experiment | Neither the subjects nor the people who interact with them know which treatment is being given |
| Control Group | No treatment OR placebo treatment |
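| Placebo Effect | Responding to a placebo treatment...lol |
| Randomized Block Design | Split subjects into blocks by a variable that will influence results (ex men & women), then randomly assign treatments WITHIN each block. Reduces variability caused by the blocking variable |
| Matched Pairs Design | Match subjects based on a relevant factor. Either one member of each pair gets the treatment and the other doesn't (randomly assigned), or each subject gets both treatments in random order |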
| Statistically Significant | Unlikely to happen by chance alone |
| Law of Large Numbers | More trials = observed results get closer to the expected value |
| Sample Space | All possible non-overlapping outcomes. Ex flipping coins - heads & tails is the sample space. |
| Complement | The event "not E". P(not E) = 1 - P(E) |
| P(AnB) when Independent (Multiplication Rule for Independent Events) | P(A) * P(B) |
| P(AnB) when Dependent (General Multiplication Rule) | P(A) * P(B\|A) OR P(B) * P(A\|B). ex - The probability of drawing a king and then drawing a queen is P(A) * P(B\|A) = (4/52) * (4/51) = 4/663 (since there are now only 51 cards and 4 of them are queens). |
| Mutually Exclusive / Disjoint Events | Can't both happen: P(A and B) = 0, so P(A or B) = P(A) + P(B) |
| P(A U B) (General Addition Rule) | P(A) + P(B) - P(A and B) |
| P(A\|B) | P(AnB)/P(B) |
| Independent Events | A occurring does not influence B occurring. ex: flipping a coin twice. P(A\|B) = P(A) & P(B\|A) = P(B) |
| Discrete Variable | ex: number of cars in a parking lot. Countable values (whole numbers) |
| Continuous Variables | ex: height of students. Can take any value in a range, uncountable |
| CSOCS | Interpreting Probability Graphs. Context, Shape (Symmetric, Unimodal, Skew, Uniform), Outliers, Center (Mean, Median, Mode), Spread (Range, IQR, Standard Deviation) |
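| Binomial Setting | Fixed number of independent trials, each with 2 outcomes (success or failure) and the same probability of success |
| Geometric Setting | Independent trials with the same probability of success, repeated until the first success |
| Sampling Distribution | The distribution of a statistic across ALL possible samples of a given size from the population |
| 10% Condition (Independence) | Sample needs to be LESS than 10% of the population. Ensures that sampling without replacement is approximately equal to sampling with replacement. n < 0.10N |
| Large Counts (Normal) | np & n(1-p) are both greater than or equal to 10 |
| Central Limit Theorem (Normal) | If the sample size is at least 30, the sampling distribution of the sample mean is approximately normal, whatever the shape of the population |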
| DF for Chi-Squared GOF | # of categories - 1 |
| DF for Chi-Squared Independence & Homogeneity | (# of rows - 1) * (# of columns - 1) |
| MOE | critical value * standard error of the statistic |
| Z* for a 90% confidence level | 1.645 |
| Z* for a 95% confidence level | 1.96 |
| Z* for a 99% confidence level | 2.576 |
| Z Formula | z-score: (x - μ) / σ. Critical value z*: use invNorm |
| T Formula | t statistic: (x̄ - μ) / (s / √n). Critical value t*: use invT with df = n - 1 |
| 1-Sided Alternative Hypothesis | Greater or less than the Null |
| 2-Sided Alternative Hypothesis | Not Equal to the Null |
| P-Value | The probability, assuming the null hypothesis is true, of getting a result at least as extreme as the one observed. P-value under α = reject null, over α = fail to reject null |
| Type I Error | incorrectly rejecting a true null hypothesis :( (first is truth) |
| Type II Error | failing to reject a false null hypothesis D: (II looks like an f) |
| GOF Expected Counts | n * p_i (sample size times the hypothesized proportion for category i) |
| Independence & Homogeneity Expected Counts | (row total) * (column total) / GRAND total |
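
A few of the formulas in the cards can be checked with a short script. This is a minimal Python sketch, not part of the original cards: the function names, the 2x2 table, and the p-hat = 0.5, n = 100 inputs are all made up for illustration. It covers the king-then-queen multiplication rule, expected counts for a chi-squared independence or homogeneity test, and the margin of error for a one-sample proportion (assuming the standard error sqrt(p-hat(1 - p-hat)/n)).

```python
from fractions import Fraction
from math import sqrt


def p_king_then_queen():
    """General multiplication rule: P(A) * P(B|A), drawing without replacement."""
    p_king = Fraction(4, 52)              # 4 kings in a full deck of 52
    p_queen_given_king = Fraction(4, 51)  # 4 queens left among 51 cards
    return p_king * p_queen_given_king


def expected_counts(table):
    """Expected count for each cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def margin_of_error_proportion(p_hat, n, z_star=1.96):
    """MOE = critical value * standard error (one-sample proportion, assumed)."""
    return z_star * sqrt(p_hat * (1 - p_hat) / n)


if __name__ == "__main__":
    print(p_king_then_queen())                    # exact fraction in lowest terms
    print(expected_counts([[20, 30], [30, 20]]))  # hypothetical 2x2 table
    print(margin_of_error_proportion(0.5, 100))   # hypothetical sample, z* for 95%
```

Using `Fraction` keeps the card probability exact, so it can be compared directly with the simplified fraction on the card instead of a rounded decimal.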