Stats
| Question | Answer |
|---|---|
| individual | Object described by data (can be person, animal, thing). |
| variable | Characteristic of an individual (e.g., age, gender). |
| categorical variable | Places individuals in groups (e.g., race, major). |
| numerical variable | Numbers where math makes sense (e.g., GPA, income). |
| continuous variable | Any value in a range (e.g., weight, sales). |
| discrete variable | Countable numbers (e.g., # of siblings). |
| binary variable | Only two outcomes (e.g., Yes/No, T/F). |
| ordinal variable | Ordered categories (e.g., grade level). |
| observational study | Observes without interfering (e.g., surveys). |
| experiment | Researcher imposes treatment to test cause-effect. |
| response variable | Outcome/result measured. |
| explanatory variable | Factor that may cause a change. |
| population | Entire group we want info about. |
| sample | Subset of population actually studied. |
| census | Attempts to study the entire population. |
| bias | Systematic favoring of outcomes. |
| convenience sample | Choosing easiest people to reach. |
| voluntary response sample | People choose to respond (often extreme opinions). |
| simple random sample (SRS) | Every individual, and every possible sample of the same size, has an equal chance of being selected. |
| parameter (p) | Number describing a population (usually unknown). |
| statistic (p̂) | Number describing a sample (known). |
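| margin of error | How far the sample statistic is likely to be from the population parameter; statistic ± margin of error gives the range where the truth likely falls. |
| confidence statement (95%) | 95% of all samples give a result within the margin of error of the truth. |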
| random sampling error | Variation in sample results due to chance. |
| nonsampling error | Errors not related to the sampling process; harder to control. Examples: processing errors, response errors, nonresponse. |
| undercoverage | When some groups in a population have zero chance of being selected. |
| sampling frame | List of individuals from which the sample is drawn. |
| erroneous inclusion | Units not in the population are included. |
| multiple inclusions | Units appear multiple times (e.g., multiple phone lines). |
| nonsampling processing error | Mistakes like incorrect arithmetic or data entry. |
| nonsampling response error | Incorrect responses due to lying, misunderstanding, or misreporting. |
| nonsampling nonresponse | Failure to obtain data from a selected individual. |
| ways to handle nonsampling errors (e.g., nonresponse) | Substitute similar households. Weight responses statistically. |
| question wording (effect) | Can influence responses (e.g., “football” could mean soccer or American football). |
| probability sample | Each possible sample has a known chance of being selected. |
| stratified sample | Divide population into groups (strata) → take SRS within each → combine. |
| Questions to Evaluate a Poll | Who conducted the survey? What was the population? How was the sample selected? How large was the sample (margin of error)? Response rate (%)? How were subjects contacted? When was the survey conducted? Exact questions asked? |
| causal inference | Randomized comparative experiments allow cause-and-effect conclusions. |
| lurking variable | A variable affecting results but not included as explanatory. |
| confounding | When effects of two variables cannot be separated. Solution: Compare two or more treatments. |
| randomized comparative experiment | Randomly assigns subjects to groups and compares two or more treatments. |
| double blind experiment | Neither subjects nor experimenters know which treatment is assigned. |
| nonadherers | Participants who do not follow assigned treatment. |
| instrument | Tool used to measure a variable. |
| valid measurement | Appropriately represents the property being measured. |
| predictive validity | Whether a measurement can predict success on related tasks. |
| random error | Variation when measuring same individual multiple times. Small random error → reliable measurement. |
| variance | Quantifies random error. Steps: (1) Find the mean of the n measurements. (2) Subtract the mean from each measurement and square each difference. (3) Average the squared differences (divide their sum by n−1). |
| bar chart | Compares a quantity (such as a count or percent) across groups or categories. |
| box plot | Shows distribution and spread between groups using quartiles. |
| histogram | Displays distribution (shape, center, variability) of a quantitative variable. |
| scatter plot | Shows relationship between two quantitative variables. |
| pie chart | Shows how a whole divides into parts (percentages that add to 100). |
| pictogram | A bar graph using pictures instead of bars. |
| Pie Chart vs. Bar Chart | Pie Chart: Shows parts of a whole (must equal 100%). Bar Chart: Easier to compare categories or show only a few; more flexible and compact. |
| seasonal variation | Definition: A pattern that repeats at regular time intervals (e.g., monthly, yearly). Seasonally Adjusted Data: Has expected seasonal changes removed. |
| rounding error | Small differences (like totals adding to 99.9% or 100.1%) caused by rounding numbers early. |
| histogram description terms | Center: Middle of the data (mean or median). Variability: How spread out the data is. Symmetric: Both sides mirror each other. Skewed Right: Tail extends to the right (higher values). Skewed Left: Tail extends to the left (lower values). |
| stemplot | Split each number into a stem (all digits except last) and leaf (last digit). Write stems vertically (smallest at top). Write leaves to the right of stems in order. Purpose: Displays distribution and preserves actual data values. |
| median | Definition: Middle value that splits data into two equal halves. How to Find: Arrange numbers in order; middle value (or average of two middles if even count). |
| five number summary | Includes: Minimum, Q1, Median (M), Q3, Maximum. Visualized with: Boxplot. |
| standard deviation | Definition: Roughly the average distance of data points from the mean (the square root of the variance). Purpose: Measures how spread out the data is. Note: Larger SD = more variability. |
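
The variance, standard deviation, median, and five number summary cards describe calculations that are easier to follow with numbers. The following is a minimal Python sketch, not a definitive implementation. The function names and the `weights` data are hypothetical, made up for illustration. It also assumes one common quartile convention (Q1 and Q3 are the medians of the lower and upper halves, leaving out the middle value when the count is odd); other conventions give slightly different quartiles.

```python
import math
import statistics


def variance_by_steps(measurements):
    """Sample variance following the card's steps (divides by n - 1)."""
    n = len(measurements)
    if n < 2:
        raise ValueError("need at least two measurements")
    # Step 1: find the mean of the n measurements.
    mean = sum(measurements) / n
    # Step 2: subtract the mean from each measurement and square the difference.
    squared_diffs = [(x - mean) ** 2 for x in measurements]
    # Step 3: divide the sum of squared differences by n - 1.
    return sum(squared_diffs) / (n - 1)


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    excluding the middle value when the number of observations is odd.
    """
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )


# Hypothetical repeated measurements, invented for this example.
weights = [2, 4, 4, 4, 5, 5, 7, 9]

var = variance_by_steps(weights)
sd = math.sqrt(var)  # standard deviation is the square root of the variance
summary = five_number_summary(weights)

print("variance:", round(var, 3))
print("standard deviation:", round(sd, 3))
print("five number summary:", summary)
```

The hand-written variance should agree with `statistics.variance` from the standard library, which also divides by n−1, so either can be used to check the other.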