AP Stats Exam review

Question Answer
Categorical Data Data described in words or labels rather than numbers, e.g. color, political party, gender
Quantitative data Data that is numerical; it has an average, we can talk about its spread, and it can be graphed.
Bar Charts For categorical data, use a y-axis; graph the totals or percentages
Pie Charts For categorical data, breaks down one total into subgroups, adds to 100%, always percentages
Misleading Graphs Used to exaggerate results by having the y-axis start at a convenient point
Marginal Distributions The numbers in the margins of a two-way table (row and column totals), out of the overall total.
Segmented Bar Graph A bar graph where various categories are spread among multiple groups
Simpson's Paradox When subgroups show one relationship yet the overall data shows the reverse relationship. Happens when subgroups are unbalanced
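A minimal Python sketch of Simpson's Paradox. The counts are illustrative (successes, attempts) for two hypothetical treatments "A" and "B" split into "small" and "large" cases; the names `data`, `rate`, and `overall` are assumptions made for this example.

```python
# Illustrative counts: (successes, attempts) per treatment and subgroup.
# Treatment A is mostly used on "large" cases, B mostly on "small" ones,
# so the subgroups are unbalanced.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, attempts):
    """Success proportion for one subgroup."""
    return successes / attempts

def overall(groups):
    """Success proportion after combining all subgroups of one treatment."""
    successes = sum(s for s, _ in groups.values())
    attempts = sum(n for _, n in groups.values())
    return successes / attempts

for size in ("small", "large"):
    print(size, round(rate(*data["A"][size]), 3), round(rate(*data["B"][size]), 3))
print("overall", round(overall(data["A"]), 3), round(overall(data["B"]), 3))
```

A has the higher success rate inside each subgroup, yet B has the higher rate once the subgroups are combined.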
Stemplots Like a histogram but still shows actual values. Can split the stems, or do a back-to-back stemplot to compare two distributions
Dotplots Works well for small data sets. Each data point is shown as a dot above its value on a number line.
Distribution A way of describing a variable by what values it takes on and with what frequency
SOCS S: Shape of the graph (symmetric, bimodal, skewed, uniform) O: Outliers, any unusual values C: Center, mean or median S: Spread, five number summary
Skewness AKA stretched either left or right
Histogram Similar to a bar graph but for quantitative data; each value goes in the bar whose left edge is directly to the left of it.
Percentiles The percentage of data at or below a particular value
Effect of Shape on Center Symmetric: mean = median; Skewed Left: mean is less than median; Skewed Right: mean is greater than median
Five number Summary Min, Q1, median, Q3, Max
Interquartile Range IQR is Q3-Q1
Boxplots A graph of the five number summary
Boxplot Outlier Rule Anything above Q3+1.5(IQR) or below Q1-1.5(IQR)
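The five number summary, IQR, and outlier rule can be sketched in Python as follows. Quartiles here are the medians of the lower and upper halves (excluding the median when n is odd), which is one common classroom convention; software may use other quartile rules. The function names and the `scores` list are made up for this example.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # Skip the middle value when n is odd so it is in neither half.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def outliers(values):
    """Values above Q3 + 1.5(IQR) or below Q1 - 1.5(IQR)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]

scores = [2, 4, 5, 7, 8, 9, 10, 12, 30]
print(five_number_summary(scores))  # (2, 4.5, 8, 11.0, 30)
print(outliers(scores))             # [30]
```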
Standard Deviation Roughly estimates how far, on average, the data values are from the mean
Resistant Statistics A statistic is resistant if the inclusion of an outlier (or strong skewness) will not affect it, or will barely affect it
Variance Is equal to (SD)²
Probability Distributions Like a distribution but the frequencies that the values of that variable take on are expressed as proportions
Ogives AKA Cumulative frequency graphs Shows how much of the data is at or below a particular data value
z values The number of standard deviations a data point is from the mean, AKA standardized scores
Linear Transformations New data = a + b(original data), or y = a + bx. If you add a constant "a" to the data set, only the measures of center and location change. If you multiply by a constant "b", then all summary measures change
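A short Python sketch of z-scores and linear transformations, using a small made-up data set with mean 5 and population SD 2. The helper names `z_score` and `transform` are assumptions for this example.

```python
from statistics import mean, pstdev

def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

def transform(values, a, b):
    """Apply new = a + b * original to every value."""
    return [a + b * x for x in values]

data = [2, 4, 4, 4, 5, 5, 7, 9]    # mean 5, population SD 2
shifted = transform(data, 10, 1)   # add 10: center moves, spread does not
scaled = transform(data, 0, 3)     # multiply by 3: center and spread both change

print(mean(data), pstdev(data))
print(mean(shifted), pstdev(shifted))
print(mean(scaled), pstdev(scaled))
print(z_score(9, mean(data), pstdev(data)))
```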
Density Curves A graph where the curve is on or above the x-axis and area = proportion, so the total area under the curve = 100%
Discrete If the values can be counted or listed completely
Continuous If the values are uncountable or have an innumerable amount of possibilities; unlimited outcomes
The normal distribution A symmetrical density curve, Measures the proportion of data per interval of standard deviation
The empirical rule In a normal distribution, the percent of data encompassed within ±1, ±2 and ±3 SD of the mean: 68%, 95%, 99.7%
The standard normal distribution Has a mean of zero and an SD of one
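The empirical rule percentages can be checked against the standard normal distribution with Python's standard library. This is only a sketch; `proportion_within` is a name chosen for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Area under the standard normal curve between -k and +k SD."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```

The three areas come out to roughly 0.68, 0.95, and 0.997.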
Scatterplot Shows relationships between two quantitative variables by pairing two related values as coordinates and graphing them
Explanatory Variables AKA input; helps predict the response variable, goes on the x-axis
Response Variable AKA Output; the variable which is a result of the input or explanatory variable, goes on the y-axis
Graph variable one against variable two means Variable one is on the y-axis and variable two is on the x-axis
DOSS D: Direction, Upward? Downward? Positive or negative? O: Outliers, unusual points S: Shape, Linear? Curved? S: Strength, Weak? Strong? Moderate?
R-value A unitless number that simply describes the strength and direction of the relationship between two quantitative variables.
Correlation vs. Causation Correlation does not imply causation; the variables may be related because of confounding variables
Correlation Is for two quantitative variables
Association For categorical variables
Interpret Slope For each additional one unit increase (x), the predicted change in (y) will be (slope)
Interpret y-int If x is zero then we predict that y will be y-int
Least Squares Regression Line AKA LSRL. The one line that makes the sum of all the squared residuals as small as possible
Residuals The differences between the actual y coordinates and the LSRL predictions (actual - predicted)
Residual Plot A scatterplot that pairs the residuals with the x-values of the data. Shows how large the residuals are (errors)
R-squared The proportion of variation in y that is explained by the linear relationship with x
Interpret r-squared (r²)% of the variation in y can be explained by the linear relationship with x
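The LSRL, residuals, and r-squared cards can be tied together with a small hand-rolled Python sketch. The data (`hours`, `scores`) are made up, and the function names are assumptions for this example rather than any library's API.

```python
from statistics import mean

def lsrl(xs, ys):
    """Return (intercept, slope) of the least squares regression line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def residuals(xs, ys):
    """Actual y minus predicted y for every point."""
    a, b = lsrl(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def r_squared(xs, ys):
    """Proportion of the variation in y explained by the line."""
    y_bar = mean(ys)
    ss_res = sum(e ** 2 for e in residuals(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
intercept, slope = lsrl(hours, scores)
print(round(intercept, 2), round(slope, 2))   # 2.2 0.6
print(round(r_squared(hours, scores), 2))     # 0.6
```

Here the slope reads as: for each additional hour, the predicted score goes up by about 0.6, and about 60% of the variation in scores is explained by the linear relationship with hours.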
On the scatterplot; Outlier Is any point that has a large residual after the line has been computed for the data
On a scatterplot; Influential point Any point that if taken out would have a large effect on the slope or correlation
Statistic Any value that describes or summarizes a sample
Parameter Any value that describes or summarizes a population
Census The data is the entire population
Sampling Frame The part of the population from which the sample was actually drawn
Bias A sampling method is biased if we suspect the method used will produce estimates that are predictably too high or too low compared to the population
Bias is not a sample ____ issue, but a sampling ____ issue Size; method
Voluntary Response Bias When subjects aren't randomly chosen, but rather choose for themselves whether they will provide data.
Non response Bias Has random selection but a meaningful proportion of those selected did not respond
Response Bias When the data itself is highly suspect or inaccurate (ex. illegal activity)
Under coverage Bias When the sampling method never gives a subgroup a chance to be in the sample
Convenience Sample Using data that is simple or easy to gather
Simple Random Sample SRS; Each item in the population has the same chance of being selected. All groups of the same size are equally likely to be the sample
How to SRS 1) Assign each item in the population a number, mix them up, then draw out the desired number of subjects OR 2) Use a random number table or a calculator.
Stratified Sample Use if you think there might be a confounding variable associated with the variable of interest. Group subjects into "strata", then randomly sample from each stratum in the proportion that it makes up of the population
Stratified samples have ___ variation from sample to sample than SRS Less
Cluster Sampling Break down the population into "clusters" of mixed subgroups, then randomly select which clusters to sample from.
Systematic Sample Where you sample every kth item after randomly selecting the starting item.
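The SRS, stratified, and systematic methods above can be sketched in Python with the standard `random` module. The population of 100 numbered students and the two strata are hypothetical, and the function names are assumptions for this example.

```python
import random

def simple_random_sample(population, n, rng):
    """SRS: every group of n items is equally likely."""
    return rng.sample(population, n)

def stratified_sample(strata, n, rng):
    """Sample each stratum in proportion to its share of the population.
    Rounding may make the total differ slightly from n in general."""
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        k = round(n * len(members) / total)
        sample.extend(rng.sample(members, k))
    return sample

def systematic_sample(population, k, rng):
    """Random starting point among the first k items, then every kth item."""
    start = rng.randrange(k)
    return population[start::k]

rng = random.Random(0)               # seeded so the run is repeatable
population = list(range(1, 101))     # students numbered 1..100
strata = {"juniors": population[:60], "seniors": population[60:]}

print(simple_random_sample(population, 10, rng))
print(stratified_sample(strata, 10, rng))
print(systematic_sample(population, 10, rng))
```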
Observational Study You compare two or more groups according to some explanatory variable(s) and measure a response variable to "observe" the differences.
You ____ deduce a cause and effect relationship between the variables in an observational study Cannot
Experiments We control the application of the explanatory variables through random assignment of treatments to subjects.
We ___ conclude a cause and effect relationship between variables in an experiment. Can
Retrospective Observational Study Find subjects with desired response variables and look back to see how they differ in the explanatory variables
Prospective Observational Study Find subjects with desired explanatory variables and follow them into the future to see how they differ in the response variables
Good Experiments Have more than one treatment, random assignments, control of other variables and replication
Experimental Units Whatever is being randomly assigned to the treatments of the explanatory variables
Treatments Specific levels or combos of all levels of factors in an experiment
Control Group Establish a baseline; measure the placebo effect
Placebo A fake treatment to see if the mere participation in the experiment produces change in behavior
A placebo is ____ than just a control group that gets nothing Better
Blind Subjects don't know which treatment they get
Double-blind Researchers and subjects don't know who gets what treatment
Block Design Subjects are grouped by known similarities
Matched Pairs Paired with yourself or pairs of similar subjects
Completely Randomized No grouping ahead of time; subjects are randomly assigned to treatment groups
Confounding variables A variable associated with the explanatory variable that may help explain the association with the response or cause it
Cross-over matched pairs Each subject is paired with themselves and receives both treatments
Statistically Significant More than what would reasonably happen just due to chance; the p-value is less than alpha
P(A and B) = P(A) x P(B|A)
P(B|A) = P(B) if A and B are independent
P(A or B) = P(A) + P(B) - P(A and B)
P(A and B) = 0 if A and B are mutually exclusive
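The four rules above can be checked exactly on a standard 52-card deck. This Python sketch uses `Fraction` so no rounding is involved; the deck encoding and the helper names (`prob`, `is_heart`, and so on) are choices made for this example.

```python
from fractions import Fraction

# One card = (rank 1..13, suit letter); rank 13 stands for a king.
deck = [(rank, suit) for rank in range(1, 14) for suit in "SHDC"]

def prob(event):
    """Exact probability that one randomly drawn card satisfies event."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_heart(card):
    return card[1] == "H"

def is_spade(card):
    return card[1] == "S"

def is_king(card):
    return card[0] == 13

p_heart = prob(is_heart)
p_king = prob(is_king)
p_both = prob(lambda c: is_heart(c) and is_king(c))
p_either = prob(lambda c: is_heart(c) or is_king(c))
p_king_given_heart = p_both / p_heart   # conditional probability P(king | heart)

print(p_both == p_heart * p_king_given_heart)       # multiplication rule
print(p_king_given_heart == p_king)                 # suit and rank are independent
print(p_either == p_heart + p_king - p_both)        # addition rule
print(prob(lambda c: is_heart(c) and is_spade(c)))  # mutually exclusive: 0
```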
Independence Two events are independent if the probability of the second event is unchanged regardless of whether the first event is happening
sample space Listing of all the possible outcomes; the probabilities of all the outcomes sum to one, and each probability is between 0 and 1
probability distribution A listing of the sample space that includes the probability of each outcome
Complements Everything in the sample space besides that event
Conditional Probabilities P(B) vs. P(B|A) Basically a probability that factors in extra information, or an additional "condition"
Probability of A or B P(A or B) = P(A) + P(B) - P(A and B), also written as P(A U B)
Discrete Variables Can take on a finite number of outcomes; they can be listed
Continuous Variables Can take on an infinite number of outcomes, best described by intervals rather than specific outcomes
You can only combine variances for ______ ______ independent variables
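As an illustration of combining variances, this Python sketch enumerates two independent fair dice and shows that the variance of the sum (and of the difference) equals the sum of the individual variances. The `variance` helper is written for this example and treats every listed outcome as equally likely.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)   # one fair six-sided die

def variance(outcomes):
    """Population variance of a list of equally likely outcomes."""
    outcomes = list(outcomes)
    n = len(outcomes)
    mu = Fraction(sum(outcomes), n)
    return sum((x - mu) ** 2 for x in outcomes) / n

var_one = variance(faces)
var_sum = variance(a + b for a, b in product(faces, faces))
var_diff = variance(a - b for a, b in product(faces, faces))

print(var_one, var_sum, var_diff)   # 35/12 35/6 35/6
```

Note that the variances add even for the difference of the two dice; standard deviations do not add.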
Binomial Distributions conditions 1) exactly two outcomes of interest for each trial, 2) a fixed number of trials, 3) same probability of success for each trial, 4) each trial is independent of the others
Binomial PDF Probability of exactly k successes out of n trials
Binomial CDF Probability of k or fewer successes out of n trials
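A minimal Python sketch of the binomial PDF and CDF, mirroring what a calculator's binompdf and binomcdf commands compute. The function names and argument order here are this example's own choices.

```python
from math import comb

def binom_pdf(n, p, k):
    """P(exactly k successes in n trials)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(k or fewer successes in n trials)."""
    return sum(binom_pdf(n, p, i) for i in range(k + 1))

# Ten flips of a fair coin: exactly 3 heads, and 3 or fewer heads.
print(binom_pdf(10, 0.5, 3))   # 0.1171875
print(binom_cdf(10, 0.5, 3))   # 0.171875
```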
10% condition If the sample is less than 10% of the population then sampling without replacement is "close enough" to having the same probability of success on each trial
Large Counts Condition np and n(1-p) are both greater than or equal to 10
Geometric distributions Have a fixed probability of success, each trial is independent, exactly two outcomes of interest BUT we're interested in our first success being on a certain trial
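For the geometric setting, a short Python sketch of the probability that the first success lands on trial k, and on or before trial k. The function names are assumptions for this example, and the example event (rolling a six) is made up.

```python
def geom_pdf(p, k):
    """P(first success is on trial k): k - 1 failures, then one success."""
    return (1 - p) ** (k - 1) * p

def geom_cdf(p, k):
    """P(first success is on or before trial k)."""
    return 1 - (1 - p) ** k

p_six = 1 / 6
print(round(geom_pdf(p_six, 3), 4))   # first six on the third roll
print(round(geom_cdf(p_six, 3), 4))   # first six within three rolls
```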
Sampling Distributions Describes the set of all possible values a statistic can have for a given sample size. They are very predictable in the long run even though each sample statistic is unpredictable
Statistics are said to be _____ if in the long run they average out to equal the population parameter unbiased
The _____ (SD) of a sample statistic is only dependent on the sample size, not the population size variability
A sampling distribution of a sample proportion is ____ ____ if np and n(1-p) are both greater than or equal to 10 approx. normal
Central Limit Theorem The sampling distribution of the sample mean is approx. normal when n is greater than or equal to 30, whatever the shape of the population
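A small simulation sketch of a sampling distribution in Python. The population is a made-up, strongly right-skewed set of 10,000 values; samples are drawn with replacement for simplicity, and all names (`sample_means`, `population`, `means`) are assumptions for this example.

```python
import random
from statistics import fmean, pstdev

def sample_means(population, n, reps, rng):
    """Means of `reps` random samples of size n (drawn with replacement)."""
    return [fmean(rng.choices(population, k=n)) for _ in range(reps)]

rng = random.Random(42)                                      # seeded for repeatability
population = [rng.expovariate(1.0) for _ in range(10_000)]   # right-skewed
means = sample_means(population, n=30, reps=2_000, rng=rng)

# Unbiased: the sample means average out to the population mean.
print(round(fmean(population), 3), round(fmean(means), 3))
# Variability depends on sample size: SD of the means is about sigma / sqrt(n).
print(round(pstdev(population) / 30 ** 0.5, 3), round(pstdev(means), 3))
```

A histogram of `means` would look roughly bell-shaped even though the population itself is skewed.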
Confidence Intervals Estimate population parameters (means, proportions) by giving an interval where we believe the parameter might lie and how confident we are about it
Significance tests assess the weight of the evidence against the claimed value of the population parameter being true
We assume our null hypothesis is ___ unless we find significant evidence to the contrary true
Point estimate Giving your sample statistic as best specific estimate of an unknown population parameter
General Formula for z-int Statistic ± multiplier × (SE)
Multiplier Same as critical value
Interpret a confidence interval We can be __% confident that the actual (parameter of interest) lies between __ and ___ (units)
Conditions for z-int 1) random sample, 2) population at least 10 times the sample size, 3) at least 10 successes and 10 failures
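The general formula can be sketched for a one-proportion z-interval in Python. The survey numbers (520 successes out of 1000) are invented, the function name is an assumption, and the conditions are taken as already checked.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(successes, n, confidence=0.95):
    """statistic ± multiplier × SE for a single proportion."""
    p_hat = successes / n                                      # point estimate
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)    # critical value
    se = sqrt(p_hat * (1 - p_hat) / n)                         # standard error
    return p_hat - z_star * se, p_hat + z_star * se

low, high = one_prop_z_interval(520, 1000)
print(round(low, 3), round(high, 3))   # 0.489 0.551
```

Read as: we can be 95% confident that the actual population proportion lies between about 0.489 and 0.551.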
ACDC A: Announce, What test? Ho? Ha? P? M? Alpha? C: Conditions, Known or assumptions? D: Do, Procedure? P=? df? z? t? C: Conclude, Interpret? Reject Null? Sig Evidence?
Conditions for t int 1) random sample, 2) population at least 10 times the sample size
The normal Condition Either the population is known to be normal, n is greater than or equal to 30 or we graph the data and there is no strong skewness or outliers
We use a t-int when there are __ random variables 2
Degree of freedom DF, for one sample methods is df=n-1
Samples ____ prove any hypothesis to be true or false cannot
p-value The probability of getting your test statistic or one more extreme, assuming the null is true
If the p-value is small reject the null
Null Hypothesis Ho, always has =
Alternate hypothesis Ha, will involve an inequality sign
The ___ the p-value, the more significant the statistical evidence is against the null hypothesis smaller
Type one error The null is actually true but we reject it
Type two error the alternate is true but we do not conclude that
power of a test The probability of not making a type 2 error. So the alternate is true and we conclude that
one-proportion-z-test conditions 1) random sample, 2) sample is less than 10% of population, 3) np and n(1-p) are both at least 10
The effect of a two tailed test is typically double the p-value
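A hedged Python sketch of a one-proportion z-test, showing how the one-tailed and two-tailed p-values relate. The data (60 successes in 100 trials against a null of 0.5) are invented, and the function name and `alternative` labels are this example's own.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, p_value) for Ho: p = p0, using p0 in the standard error."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf
    if alternative == "greater":
        p_value = 1 - cdf(z)
    elif alternative == "less":
        p_value = cdf(z)
    else:  # two-sided: both tails count as "more extreme"
        p_value = 2 * (1 - cdf(abs(z)))
    return z, p_value

z, p_one = one_prop_z_test(60, 100, 0.5, alternative="greater")
_, p_two = one_prop_z_test(60, 100, 0.5)
print(round(z, 2), round(p_one, 4), round(p_two, 4))   # 2.0 0.0228 0.0455
```

With alpha = 0.05 both versions reject the null here, but the two-tailed p-value is exactly double the one-tailed one.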
conditions for one-sample t-test 1) random sample, 2) sample is less than 10% of population, 3) population approx. normal or n is at least 30
Matched pairs t-test Just like a one-sample t-test but on the differences within each pair. Looks for significant evidence that the mean difference is not zero
Conditions for 2-prop-z-int 1) each is a random sample, 2) each population is at least 10 times its sample size, 3) each sample has at least 10 successes and 10 failures
Conditions for 2-prop-z-test 1) each is a random sample, 2) each population is at least 10 times its n, 3) at least 10 successes and 10 failures per sample
Pooled proportions Gives the best estimate of the actual proportion: total successes over total sample size, (x1 + x2)/(n1 + n2)
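A sketch of how the pooled proportion feeds a two-proportion z-test, written in Python. The counts (60 of 100 versus 45 of 100) are invented, the function name is an assumption, and the test shown is two-sided with the conditions taken as met.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Return (pooled proportion, z, two-sided p-value) for Ho: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)    # combine both samples under the null
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_pool, z, p_value

p_pool, z, p_value = two_prop_z_test(60, 100, 45, 100)
print(round(p_pool, 3), round(z, 2), round(p_value, 3))
```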
Degree of freedom for 2-sample t-int is n1 + n2 - 2 (when the two variances are pooled)
Goodness of fit test We take one sample and look at the distribution of a single categorical variable
Homogeneous distribution test We take samples from several populations and categorize data by a single variable
Test of Independence We take one sample and then classify data by two categorical variables
GOF Hypothesis Null is the proposed distribution is correct. Alt is the distribution is incorrect
Conditions for all x² tests 1) random sample rule, 2) 10% condition, 3) large counts rule (all expected counts at least 5)
x² tests use ___ to show what should be true if the distributions are correct expected values
Df for x² Categories - 1 for goodness of fit; (rows - 1)(columns - 1) for two-way tables
Hypothesis for x² test of homogeneous distributions Null is the distributions of your categorical variable are the same for each population. Alt is the distributions are not the same
X² test of independence hypothesis Null is no association, Alt is association
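A minimal Python sketch of the goodness of fit calculation: expected values, the x² statistic, and the degrees of freedom. The observed counts and the claimed equal proportions are invented, and the function name is an assumption. The p-value is left out because the standard library has no chi-square distribution; a table or calculator would supply it.

```python
def chi_square_gof(observed, expected_props):
    """Return (statistic, df, expected counts) for a goodness of fit test."""
    n = sum(observed)
    expected = [n * p for p in expected_props]   # what the null predicts
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1                       # categories - 1
    return statistic, df, expected

observed = [18, 22, 30, 30]
statistic, df, expected = chi_square_gof(observed, [0.25, 0.25, 0.25, 0.25])

print(expected)                            # [25.0, 25.0, 25.0, 25.0]
print(all(e >= 5 for e in expected))       # large counts rule
print(round(statistic, 2), df)             # 4.32 3
```

With df = 3 the 5% critical value is 7.815, so a statistic of 4.32 would not give significant evidence against the proposed distribution.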
Hypothesis test for the slope Null: there is no linear relationship between x and y. Alt: there is a linear relationship
A slope of zero signifies no linear relationship
Linear relationship between quantitative variables Conditions 1) random sample, 2) 10% condition, 3) approx. normal residuals
Df for linear regression relationship n-2
Linear regression relationship formula stat ± multiplier × (SE)
Created by: Avery4