Maths Y12 Stats

Maths Autumn Y12

Question Answer
Assumption for probability question with interpolation Assume the data are uniformly distributed within each class
What to label outer box of Venn diagram with S
Mutually exclusive Can't happen at the same time. P(A ∩ B) = 0, so P(A ∪ B) = P(A) + P(B) [events that are not mutually exclusive need -P(A ∩ B) at the end]
How to phrase probability P(X = 5) = 0.7 (capital X for the random variable)
Independent events When one event has no effect on whether the other happens. P(A ∩ B) = P(A) x P(B). The test works both ways: if the equation is not satisfied, the events are not independent
Discrete uniform distributions When all the probabilities are the same
How to give probability distribution in table form Top row is x: 1, 2, 3, etc. (the values of x). Bottom row is P(X = x). Define the variable, e.g. X = score on the die rolled
Probability mass function e.g. P(X = x) = 1/8 if x = 0, 3 (and so on for the other values). Make sure to include the final case '0 otherwise'
A biased 4-sided die is rolled. The number on the bottom face is the random variable X. Given P(X = x) = k/x, find k First draw the table. The values of P(X = x) will be k/1, k/2, etc. Probabilities sum to 1, so k/1 + k/2 + k/3 + k/4 = 1, giving 25k/12 = 1 and k = 12/25
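A quick way to check the working on the card above is to let exact fractions do the sum. This is only a sketch: the names `values`, `k` and `distribution` are made up for illustration, and Python's `fractions` module stands in for working by hand.

```python
from fractions import Fraction

# Faces of the four-sided die from the card.
values = [1, 2, 3, 4]

# P(X = x) = k/x and the probabilities must sum to 1, so
# k * (1/1 + 1/2 + 1/3 + 1/4) = 1.
reciprocal_sum = sum(Fraction(1, x) for x in values)
k = 1 / reciprocal_sum  # 12/25

# Full probability distribution as exact fractions.
distribution = {x: k / x for x in values}
```

The dictionary `distribution` holds the bottom row of the table, and its values add up to exactly 1.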
'Write down the sample space' {0, 1, 2, 3, 4}
Probability 2/3 of winning each of 5 games. What's the probability of winning exactly 3? 5C3 [number of ways to do it] x (2/3)^3 x (1/3)^2 [chance of one particular arrangement: 3 wins and 2 losses]
Requirements to use binomial distribution Fixed number of trials. 2 possible outcomes. Probability of success is the same each time. Trials are independent
What to write for every question using binomial distribution for probability X ~ B(n, p). The first part means 'X is binomially distributed'. n = number of trials, p = probability of success
If X ~ B(n, p), then P(X = r) = nCr x p^r x (1-p)^(n-r) This is the probability of getting exactly r successes. nCr = number of ways of choosing r successes from n trials. p^r = probability of success to the power r. (1-p) = probability of failure. In the formula booklet
P(X <= 1) first line of working P(X = 0) + P(X = 1) = P(X <= 1) DONT FORGET x = 0 This line of working isn't needed for like 2 markers, where the only working is X ~ B(n, p) and P(X <= 1) = answer
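The binomial formula and the "P(X <= 1) = P(X = 0) + P(X = 1)" line of working can be checked with a few lines of Python. This is a minimal sketch: `binomial_pmf` and `binomial_cdf` are illustrative names, and the numbers reuse the "5 games, probability 2/3 of winning each" card.

```python
from math import comb

def binomial_pmf(n, p, r):
    """P(X = r) for X ~ B(n, p): nCr * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def binomial_cdf(n, p, r):
    """P(X <= r): add up P(X = 0), P(X = 1), ..., P(X = r)."""
    return sum(binomial_pmf(n, p, k) for k in range(r + 1))

# Winning exactly 3 of 5 games when each win has probability 2/3.
exactly_three = binomial_pmf(5, 2 / 3, 3)   # 80/243, about 0.329

# P(X <= 1) includes the x = 0 term.
at_most_one = binomial_cdf(5, 2 / 3, 1)     # 11/243, about 0.045
```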
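For questions about what value of r gives a certain chance of winning Use trial and error and show that it works, e.g. to find r with P(X > r) < 0.05, show P(X > 7) = 0.038 (and check the value before it does not work)
Always for probabilities Check the answer is a decimal, not a percentage. Write the probability statement in full, P(X = value) = answer, and write out any comparison as an inequality (e.g. 5 < 6)
Population parameter A statistical measure relating to a population, e.g. a mean. In a binomial test it is the lower case p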
Hypothesis A statement made about a population parameter
Null hypothesis (H subscript 0) The 'default' position which we usually initially assume to be true. For rolling a die and getting a 6, H0: p = 1/6
Alternative hypothesis (H subscript 1) Tells us about the parameter/situation if our null hypothesis turns out to be incorrect. For rolling a die and getting a 6, H1: p =/= 1/6. Not necessarily true, it just exists as the disagreement with H0
Test statistic The result of the experiment, which we use to test H0. If we tossed a coin 4 times and got 3 heads, then that value is the test statistic. The capital X
Test the claim made at 5% significance One-tailed hypothesis test: find the probability, assuming H0, of a result at least as extreme as the test statistic. If that probability is less than 5%, then 'we can reject H0 and accept H1'. Then ALWAYS CONTEXTUALISE, e.g. 'yes, the new drug is better'
loga(n) = x Means a^x = n
Multiplication law loga(xy) = loga(x) + loga(y)
Division law loga(x/y) = loga(x) - loga(y)
Power law loga(x^k) = k * loga(x)
How to solve equations with logs Consider it when the unknown appears in a power. Take logs of both sides. Bring powers down with the power law. Multiply out brackets. Collect the x terms together. Factorise to isolate x.
When to take ln When you need to 'cancel out' a power with base e. ln(e) = loge(e) = 1
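A worked example of the steps on the card above, using a made-up equation (3^x = 2^(x+1) is not from the original cards):

```latex
\begin{align*}
3^{x} &= 2^{x+1} \\
x\ln 3 &= (x+1)\ln 2 && \text{take logs, bring powers down} \\
x\ln 3 &= x\ln 2 + \ln 2 && \text{multiply out brackets} \\
x\ln 3 - x\ln 2 &= \ln 2 && \text{collect the } x \text{ terms} \\
x(\ln 3 - \ln 2) &= \ln 2 && \text{factorise} \\
x &= \frac{\ln 2}{\ln 3 - \ln 2} \approx 1.71
\end{align*}
```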
For answers with e Leave in terms of e unless asked for decimal
e^x can never Be negative (or zero), as e is a positive base so no power will make it negative
Differentiating e^kx The power doesn't reduce afterwards so answer is ke^kx Don't bring down the variable (x)
P = 160e^-0.006t P = density of pesticide t = days after application Interpret the meaning of 160 in this model Sub in t = 0. When t = 0, we get P = 160. t = 0 means no time has passed, so 160 is the initial density of pesticide in the area
P = 160e^-0.006t P = density of pesticide t = days after application Find dP/dt Remember the power doesn't change: dP/dt = 160 x (-0.006) x e^-0.006t = -0.96e^-0.006t
P = 160e^-0.006t P = density of pesticide t = days after application Interpret the significance of the sign of your answer to part c The sign is negative, so the gradient of P is negative for all t: the density of pesticide is always decreasing
What other points you HAVE to say for a hypothesis test Q 'X is the number of voters who support Mr Evans' when X ~ B. State H0 and H1 even for two-tailed. When stating these, also say 'p is the probability that a voter selected at random supports Mr Evans'. When X ~ B, beforehand say 'Under H0,'
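The three pesticide cards above fit together as one short piece of working. The lines below only restate the model from the cards and apply the rule for differentiating e^(kt):

```latex
\begin{align*}
P &= 160e^{-0.006t} \\
t = 0 &\implies P = 160e^{0} = 160 \\
\frac{dP}{dt} &= 160 \times (-0.006)\,e^{-0.006t} = -0.96\,e^{-0.006t} \\
e^{-0.006t} > 0 &\implies \frac{dP}{dt} < 0 \text{ for all } t
\end{align*}
```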
What is a population A set of items that are of interest
What is a census Measures every member of a population
What is a sample A selection of the population used to estimate information about the population as a whole
What is a sampling frame The source material or device from which a sample is drawn. It is a list of all those within a population who can be sampled
Census pros Completely accurate
Census cons Time consuming. Processing a lot of data takes a long time. Can't be used if the testing process would destroy the items or render them unusable
Sample pros Less time consuming Less data to process Fewer responses needed
Sample cons Sample might not be properly representative of the population
Types of random sample Must represent population Simple random sampling Systematic sampling Stratified sampling
Simple random sampling In a simple random sample, every element in the set has an equal chance of being selected. Involves assigning number to all members then generate random numbers
Systematic sampling Members chosen at regular intervals from an ordered list
Stratified sampling Population divided into groups (strata). A random sample is taken from each group in proportion to its size, so the sample should represent the whole population
When saying to increase sample size Say how, in context, e.g. 'increase the sample size by testing more harnesses'
e.g. of sampling frame for council asking residents about opinions. Identify the sampling units Sampling frame: a list of residents. Sampling units: each individual resident
For sample data Consider removing anomalies
Pros of simple random sampling Free of bias Easy + cheap to implement Every unit has equal selection chance
Cons of simple random sampling Not suitable for large populations or sample size A sampling frame is needed
Pros of systematic sampling Simple and quick to use Suitable for large samples and populations
Cons of systematic sampling Sampling frame needed Possible bias as units do not have equal chance of selection
Pros of stratified sampling Sample accurately reflects population Guarantees proportional representation of groups
Cons of stratified sampling Classification into groups is time consuming Selection within group has same issues as simple random sampling
For calculating stratified Find the total population. Multiply by the % sampled to find how many will be chosen in total. Then do (sample size)/(total population) and multiply by the population of each group
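The stratified calculation above can be sketched in a few lines. Everything here is illustrative: the function name `stratified_allocation` and the group sizes are invented, not taken from a real question.

```python
def stratified_allocation(group_sizes, sample_size):
    """Share a sample between groups in proportion to each group's size.

    Each share is rounded to the nearest whole person, so for awkward
    numbers the shares may not add up exactly to sample_size.
    """
    total = sum(group_sizes.values())
    return {
        group: round(sample_size * size / total)
        for group, size in group_sizes.items()
    }

# Hypothetical population of 250 with a 10% sample (25 people).
groups = {"Year 11": 120, "Year 12": 80, "Year 13": 50}
allocation = stratified_allocation(groups, 25)
# allocation == {"Year 11": 12, "Year 12": 8, "Year 13": 5}
```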
When describing pros and cons of particular sampling method Always refer to context of question
A 5 digit membership number where members ending 000 are selected is not systematic. Why? How to reduce bias? The first person is not selected at random and the required elements of the sample are not being chosen at regular intervals. To reduce bias, take a simple random sample using the list of members as the sampling frame
Non random sampling types Quota sampling Opportunity sampling
Quota sampling An interviewer/researcher selects a sample that reflects the characteristics of the group (e.g. interviewer may meet people to assess characteristics then choose sample from that) Once a quota has been filled, no more people in that group interviewed
Opportunity sampling Sample taken from people available at the time and who fit the criteria needed (e.g. people leaving a supermarket)
Pros of quota sampling Allows small sample to represent population No sampling frame required Quick and inexpensive Allows comparison between groups
Cons of quota sampling Potential for bias. Dividing the population into groups can take extra time. More in-depth studies need an increasing number of different groups. Some people may not be willing to take part
Pros of opportunity sampling Easy to carry out Inexpensive
Cons of opportunity sampling Unlikely to be proportional sample Researcher's ability can affect answer People might not want to be interviewed/asked
Asking first 50 people you see on Monday morning outside Tesco. Suggest 2 improvements Variety of places, variety of days/times
When finding class boundaries for continuous data (e.g. class 34 - 36) Careful - values have been rounded to the nearest mm (unless the classes were given as inequalities), so: class boundaries 33.5 and 36.5, midpoint 0.5(33.5 + 36.5) = 35, class width 3
What is discrete data Can only be certain values e.g. integers (unless an average of discrete data). "Bar charts to graph discrete data because the separate bars emphasize the distinct nature of each value"
What is continuous data Any value. "histograms and scatterplots because continuous spectrum"
Numerical data name and spelling Quantitative
Non-numerical data name and spelling Qualitative
What are class boundaries The max and min values that belong in a group
Describe how random numbers are used to select a sample Select ____ random numbers from computer generation. Ignore repeats. Match to people.
How to express a modal class example 300 - 500
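Median position for discrete data The (n + 1)/2 th value (e.g. for 1, 2, 3, 4, 5 it is the 3rd value)
Median position for continuous data The n/2 th value (it can fall between values, e.g. halfway between 36 and 37)
How to find mean of a histogram Find each class frequency (frequency density x class width), then do sum of (midpoint x frequency) / total frequency. It helps to write out a table with class intervals next to frequency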
When weird class boundaries are given for standard deviation Remember you're trying to simplify down to the 5 points you will sum
How to do standard deviation using calculator Menu 2 (stat) Calc SET 1 var freq: list 2 1 variable
What stuff do you need to say for a standard deviation question Sigma f x^2 = Sigma fx = Sigma f = Then plug in
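The "write down the three sums, then plug in" routine can be sketched as follows. This assumes the usual A-level formula, standard deviation = sqrt(Σfx²/Σf − (Σfx/Σf)²), which divides by n. The function name and the tiny data set are made up for illustration.

```python
from math import sqrt

def grouped_mean_sd(midpoints, freqs):
    """Mean and standard deviation from class midpoints and frequencies."""
    sum_f = sum(freqs)                                            # Sigma f
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))         # Sigma fx
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))    # Sigma fx^2
    mean = sum_fx / sum_f
    variance = sum_fx2 / sum_f - mean**2
    return mean, sqrt(variance)

# Hypothetical data: midpoints 1, 2, 3 with frequencies 1, 2, 1.
mean, sd = grouped_mean_sd([1, 2, 3], [1, 2, 1])
```

Here Σf = 4, Σfx = 8 and Σfx² = 18, so the mean is 2 and the variance is 18/4 − 2² = 0.5.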
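Outliers Either more than 2 standard deviations away from the mean, or MORE THAN Q3 + 1.5(IQR) or LESS THAN Q1 - 1.5(IQR). Use whichever rule the question gives
Continuous data how to find UQ and LQ LQ is at position n/4. UQ is at position 3n/4 (divide n by 4, then x3)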
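The quartile version of the outlier rule is easy to turn into a check. In this sketch the quartiles are passed in directly, because the way they are found depends on whether the data is discrete or continuous. The names and numbers are invented.

```python
def iqr_fences(q1, q3):
    """Lower and upper outlier limits: Q1 - 1.5(IQR) and Q3 + 1.5(IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data, q1, q3):
    """Values strictly below the lower limit or strictly above the upper one."""
    low, high = iqr_fences(q1, q3)
    return [x for x in data if x < low or x > high]

# Hypothetical data with Q1 = 10 and Q3 = 20, so the limits are -5 and 35.
flagged = find_outliers([3, 12, 15, 18, 40], q1=10, q3=20)
# flagged == [40]
```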
Discrete data how to find UQ and LQ For even n: Divide n by 4 (or 3n by 4 for UQ). If whole, quartile is between this value and the one above. If not whole, round up and take that data point. For odd n, use (n+1) instead of n
How to work out skews If median equidistant from quartiles, symmetrical. If median closer to LQ then positively skewed. If closer to UQ then negatively skewed.
What is the 10th percentile The value below which 10% of the data lie
How to find median and quartiles from histogram Find the position (data is continuous, so n/4, n/2, 3n/4). LQ = lower boundary of LQ class + (frequency still needed within that class / frequency density of LQ class)
How is standard deviation affected by the coding of data and what is data coding? Data coding is changing the form of data to make it simpler to work with. The mean is subbed into the coding formula as normal to convert between coded and original values. Standard deviation is not affected by + or - (everything is just shifted, the spread is the same), but it is scaled by any multiplication or division
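The effect of coding can be written out for a typical code. The form y = (x − a)/b with b > 0 is an assumed example, not taken from the card:

```latex
\begin{align*}
y &= \frac{x - a}{b} \\
\bar{y} &= \frac{\bar{x} - a}{b} \quad\Longrightarrow\quad \bar{x} = b\,\bar{y} + a \\
\sigma_y &= \frac{\sigma_x}{b} \quad\Longrightarrow\quad \sigma_x = b\,\sigma_y
\end{align*}
```

The subtraction of a shifts every value and leaves the spread alone, while the division by b rescales both the mean and the standard deviation.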
What is the modal group of a histogram The group with the highest frequency density
After excluding outliers from data, what things change? The range changes, but the median and quartiles are barely affected (treat them as the same as if there were no outliers)
When finding standard deviation, what must you take care to do with the average of x? Keep it super accurate because will have huge impacts on result
why does grouped data use Xm instead of Xi? It doesn't really need to be, but it shows it's using class midpoints instead of actual stuff
If you see lower case sigma(n - 1) or lower case sigma The n - 1 one is the same as s (sample standard deviation, dividing by n - 1). Plain lower case sigma is usually the population standard deviation (different equation, dividing by n)
what does Xi mean for standard deviation? Just a standin for whatever's on the x axis e.g. £
What to call bar graph with clusters of bars Compound bar graph
What to add in Q about usefulness of certain graphical methods So patterns over time can be identified more easily
When asked to identify value of change in a graph Say increase/decrease
Downsides of adding much higher data entries to a graph Would have to expand y axis of graph too. Would make small changes harder to detect
Pie chart purpose and why not suitable for time based population count Can be used to show how total quantity is divided into categories Total for time based count not meaningful due to some people being double counted.
Half of population male. 1/4 of population over 65. Can you work out how much of population is men over 65? No evidence they are independent. (can't assume that half of over 65s are male)
When describing correlations Mention strength Must say what the correlation is between e.g. negative correlation between population density and distance from capital
Regression lines (aka 'least squares regression line') A type of line of best fit: the straight line that minimises the sum of squares of the vertical distances of each point from the line. Can only be used to estimate the dependent variable for a known value of the independent variable, not vice versa (you would have to swap the axes and recalculate the line).
What form does a linear regression line take? y = a + bx Generally emphasis is on interpreting line in context, not working out a and b
R values for correlations -1 <= r <= 1. r = 1: perfect positive linear correlation. r = 0: no linear correlation. r = -1: perfect negative linear correlation
e.g. for why polling your friends might not yield good conclusions about age at which education/training ended vs hourly pay The ages are not varied and the sample is small. Also, if your friends are all young, those who have only just left education will currently earn less than those who have been in the workforce for longer, but that gap may not last
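At this level the emphasis is on interpreting a regression line rather than calculating it, but seeing where a, b and r come from can help. The sketch below uses the standard summary-statistic formulas b = Sxy/Sxx, a = ȳ − b·x̄ and r = Sxy/√(Sxx·Syy). The function name and data are invented.

```python
def regression_line(xs, ys):
    """Least squares line y = a + bx and correlation coefficient r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx            # gradient
    a = y_bar - b * x_bar      # y intercept
    r = s_xy / (s_xx * s_yy) ** 0.5
    return a, b, r

# Hypothetical points lying exactly on y = 1 + 2x, so r should be 1.
a, b, r = regression_line([0, 1, 2, 3], [1, 3, 5, 7])
```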
'Give an interpretation of the value of the gradient of this regression line' IN CONTEXT (must say), as the daily mean windspeed increases by 1 knot, the daily maximum gust increases by 1.82 knots
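Justify the use of a linear regression line in this case Strong positive linear correlation (points lie close to a straight line), so a line of best fit will be accurate
What does the y intercept represent on a linear regression line equation? The value of the dependent variable when the independent variable is 0, e.g. daily maximum gust is 7.23 knots when daily mean windspeed is 0
When calculating least squares regression line The distance is measured vertically (straight up/down from each point to the line), not perpendicular to the line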
What type of data set can you use regression lines on? Bivariate data sets
Extrapolation vs interpolation Interpolation is predicting inside the range of the given data. Extrapolation is predicting outside it, which gives a much less reliable estimate. (The y intercept still fixes the position of the line, but it can't be used as an estimate at independent variable = 0 if 0 is outside the data range)
5 pick 3 Permutation. how many ways to choose where order matters. n!/(n-r)! r objects chosen from set of n objects
5 choose 3 Choosing 3 letters where order doesn't matter n!/((n-r)!r!) [2 ns then 2 rs]
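The two counting formulas above can be checked against Python's built-in `math.perm` and `math.comb` (available from Python 3.8). The helper names `pick` and `choose` are just illustrative labels for the cards' formulas.

```python
from math import comb, factorial, perm

def pick(n, r):
    """Permutations: n! / (n - r)!  (order matters)."""
    return factorial(n) // factorial(n - r)

def choose(n, r):
    """Combinations: n! / ((n - r)! r!)  (order doesn't matter)."""
    return factorial(n) // (factorial(n - r) * factorial(r))

five_pick_three = pick(5, 3)       # 60, same as perm(5, 3)
five_choose_three = choose(5, 3)   # 10, same as comb(5, 3)
```

Each choice of 3 objects can be ordered in 3! = 6 ways, which is why 5 pick 3 is six times 5 choose 3.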
Binomial expansion formula (a+b)^n = a^n + (nC1)(a^(n-1))b + (nC2)(a^(n-2))b^2 + ... + b^n
Combinations definition a selection of a number of objects where order of selection isn't important
Permutations definition ORDERED arrangement of a number of objects
Find first 4 terms of binomial expansion in ascending powers of x Includes x^0
What is a correlation that looks like a quadratic called? Quadratic correlation
Defining positive correlation Always say it is LINEAR, e.g. 'positive linear correlation'
When comparing data Always reference SKEW, iqr, and median
'Expected result' The mean of X. Literally just np
Stuff to remember for hypothesis tests Define p. Define hypotheses. Define X. Assuming H0, X ~ B(n, p). Compare with the significance level, e.g. 0.04 < 0.05 so it is significant: sufficient evidence at the 5% significance level to reject H0 and accept H1. Contextualise.
What to add if the hypothesis test is two-tailed If two-tailed, say the H1 is =/= because the test is to investigate if the proportion is different rather than higher/lower
What is 'the p value' The probability, assuming H0 is true, of getting a result at least as extreme as the one observed
What is the critical region The range of values of the test statistic that would lead to rejecting H0. The first value inside the region is called the critical value. A two-tailed test has 2 critical regions. You can check whether the test statistic falls in the critical region. Expressed as inequalities
What is the actual significance level The probability of the test statistic falling in the critical region if H0 is true. For continuous distributions it is the same as the significance level. For discrete distributions it might differ
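The critical region and actual significance level cards can be illustrated with a one-tailed (upper tail) binomial test. This is a sketch under assumed numbers: H0: p = 0.5, n = 20 trials and a 5% significance level are invented for the example, and the function names are hypothetical.

```python
from math import comb

def binomial_pmf(n, p, r):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def upper_tail(n, p, c):
    """P(X >= c) for X ~ B(n, p)."""
    return sum(binomial_pmf(n, p, r) for r in range(c, n + 1))

def upper_critical_value(n, p, alpha):
    """Smallest c with P(X >= c) < alpha, or None if there is none."""
    for c in range(n + 1):
        if upper_tail(n, p, c) < alpha:
            return c
    return None

# Under H0, X ~ B(20, 0.5); test at the 5% level for H1: p > 0.5.
critical_value = upper_critical_value(20, 0.5, 0.05)   # 15
actual_level = upper_tail(20, 0.5, critical_value)     # about 0.0207
```

With these numbers the critical region is X >= 15, because P(X >= 14) is about 0.058 (too big) while P(X >= 15) is about 0.021. The actual significance level, 0.021, is below the nominal 5% because X is discrete.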
Created by: Pyrogearos2
 

 


