Question 1

Useful equation for union

Accepted Answer

P(AuB) = P(A) + P(B) - P(AnB)

If mutually exclusive, = P(A) + P(B) because P(AnB) = 0
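
A quick check of the addition rule by counting, as a minimal sketch. The die, the events A and B, and the helper `prob` are made-up examples, not from the cards.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # "even"
B = {4, 5, 6}   # "greater than 3"

def prob(event):
    """Probability of an event when every outcome is equally likely."""
    return Fraction(len(event), len(sample_space))

lhs = prob(A | B)                       # P(AuB), counted directly
rhs = prob(A) + prob(B) - prob(A & B)   # P(A) + P(B) - P(AnB)
print(lhs, rhs)
```

The overlap {4, 6} is counted twice in P(A) + P(B), which is why P(AnB) is subtracted once.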

Question 2

Equation for given

Accepted Answer

Probability of A given B = P(A | B) = P(AnB)/P(B)

Restrict the sample space to just B, then see what proportion of B is also in A

If independent, P(A | B) = P(A) [just work it out]

Be super careful to get it the right way around for P(B | A): the 'given' event goes underneath (in the denominator)

Question 3

Exhaustive

Accepted Answer

P(AuB) = 1 (the events together cover the whole sample space)

Question 4

What to always do for probability question

Accepted Answer

Draw venn diagram - really nice for working out the different sector values, and is especially vital for non independent

Question 5

Sample spaces

Accepted Answer

Should draw for 'X of 2 numbers' questions

For given:
number of applicable values in the restricted space/restricted sample space
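
A sketch of the restricted-sample-space idea for a two-dice question. The question itself (sum at least 10, given at least one six) is a made-up example; the point is that counting inside the restricted space agrees with P(AnB)/P(B).

```python
from fractions import Fraction
from itertools import product

# Full sample space: all 36 ordered pairs from two fair dice.
space = list(product(range(1, 7), repeat=2))

# B = "at least one die shows a 6" is the restricted sample space.
B = [o for o in space if 6 in o]
# AnB = outcomes in B where the sum is at least 10.
A_and_B = [o for o in B if sum(o) >= 10]

# Counting inside the restricted space.
by_counting = Fraction(len(A_and_B), len(B))
# The conditional probability law: P(AnB) / P(B).
by_formula = Fraction(len(A_and_B), len(space)) / Fraction(len(B), len(space))
print(by_counting, by_formula)
```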

Question 6

Does AnB' = (AnB)'

Accepted Answer

No

Question 7

(AuBuC)' =

Accepted Answer

A'nB'nC'

(the u also flips to n when expanding the complement)
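
Both complement cards can be checked with Python sets. This is only a sketch: the universal set U and the sets A, B, C are made up for illustration.

```python
# Hypothetical universal set and events.
U = set(range(1, 11))
A, B, C = {1, 2, 3}, {3, 4, 5}, {5, 6, 7}

def complement(s):
    """Everything in U that is not in s."""
    return U - s

# (AuBuC)' = A'nB'nC'
lhs = complement(A | B | C)
rhs = complement(A) & complement(B) & complement(C)
print(lhs == rhs)

# AnB' is NOT the same as (AnB)'
a_and_not_b = A & complement(B)
not_a_and_b = complement(A & B)
print(a_and_not_b == not_a_and_b)
```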

Question 8

A few pointers for hypothesis testing

Accepted Answer

Don't assume significance is 5% - check

Do both tails in calculator to choose most significant (most thorough is to check significance of > and < in the bcd menu when doing calc). Don't have to show why you chose - just write the smallest tail.
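
What the calculator is doing for the two tails, as a rough sketch. The values n = 20, p = 0.5 and the observed count 15 are made up, and `binom_pmf` is just a helper name.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical test: X ~ B(20, 0.5) under H0, observed value 15.
n, p, observed = 20, 0.5, 15

lower_tail = sum(binom_pmf(k, n, p) for k in range(0, observed + 1))   # P(X <= 15)
upper_tail = sum(binom_pmf(k, n, p) for k in range(observed, n + 1))   # P(X >= 15)

# The smaller tail is the one to compare against the significance level.
smallest = min(lower_tail, upper_tail)
print(round(lower_tail, 4), round(upper_tail, 4))
```

Both tails include the observed value itself, which is why they add up to more than 1.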

Question 9

Rules if A and B are independent

Accepted Answer

P(AnB) = P(A) x P(B)
P(A | B) = P(A)

Question 10

Rules if A and B are mutually exclusive

Accepted Answer

P(AnB) = 0
P(AuB) = P(A) + P(B)

Question 11

If you see 'given that'

Accepted Answer

IMMEDIATELY write out law for conditional probability (like even before finishing reading Q)

Vital for preventing mistakes and working marks

Question 12

If you see the words 'are independent'

Accepted Answer

IMMEDIATELY write out laws for independence (like even before finishing reading Q)

Vital for preventing mistakes and working marks

Question 13

nCr =

Accepted Answer

(n over r) = n!/(r!(n-r)!)
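
A small check of the formula against Python's built-in `math.comb`. The values n = 5, r = 2 are just an example.

```python
from math import comb, factorial

n, r = 5, 2  # example values

# The factorial formula: n! / (r! (n-r)!)
by_formula = factorial(n) // (factorial(r) * factorial(n - r))
# The built-in nCr
by_builtin = comb(n, r)
print(by_formula, by_builtin)
```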

Question 14

Hypothesis testing don't forget

Accepted Answer

Define the distribution: X ~ B(n, p)

Question 15

If random variable X is normally distributed, we write

Accepted Answer

X ~ N(μ, σ^2)

μ = mean
σ = standard deviation (so σ^2 = variance)

Question 16

Normal distribution averages

Accepted Answer

Symmetrical so mean = mode = median

Question 17

+- 1 standard deviation

Accepted Answer

Contains 68% of the data

(in formula booklet i think)

Question 18

+- 2 standard deviation

Accepted Answer

Contains 95% of the data

Question 19

+- 3 standard deviation

Accepted Answer

Contains 99.7% of the data

Question 20

+- 5 standard deviation

Accepted Answer

Contains all of the data (pretty much)
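
The percentages for 1, 2, 3 and 5 standard deviations can be checked with `statistics.NormalDist` from the standard library. A minimal sketch using the standard normal:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, mean 0 and standard deviation 1

# Proportion of the distribution within k standard deviations of the mean.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3, 5)}

for k, proportion in within.items():
    print(k, round(proportion, 4))
```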

Question 21

Ways to prove not normally distributed

Accepted Answer

Skews and discrete (positive skew means bump to the left)

Discrete not normal but can approximate

Question 22

What is the y axis on the normal distribution

Accepted Answer

Probability density (like frequency density: the area under the curve over an interval is the probability of lying in that interval, so the total area under the curve is 1)

Question 23

Straight line of the probability density

Accepted Answer

*Uniform* distribution

Question 24

When using distribution menu on calc

Accepted Answer

ALWAYS view value with the OPTN F1 - it truncates otherwise

Question 25

Using distribution menu for inverse normal

Accepted Answer

Can edit the p value to get the x value(s)

Question 26

e.g. Find c such that P(26<y<c) = 0.2

Accepted Answer

Draw graph.

P(Y<c) - P(Y<26) = 0.2, so P(Y<c) = 0.2 + P(Y<26)

Use calc for P(Y<26), then inverse normal to get c
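
The same steps as a sketch in Python. The card doesn't give the distribution of Y, so the mean 25 and standard deviation 4 here are made-up values.

```python
from statistics import NormalDist

# Assumed distribution (not given on the card): Y ~ N(25, 4^2)
Y = NormalDist(mu=25, sigma=4)

p_below_26 = Y.cdf(26)            # P(Y < 26), the calculator step
p_below_c = p_below_26 + 0.2      # P(Y < c) = 0.2 + P(Y < 26)
c = Y.inv_cdf(p_below_c)          # inverse normal gives c

print(round(c, 2))
```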

Question 27

e.g. Find d such that P(25-d < y < 25+d) = 0.6

Accepted Answer

Draw graph. The two outer tails are 0.2 each (by symmetry, assuming the mean is 25), so P(Y<25+d) = 0.8, which is easy on calc with inverse normal
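
A sketch of the symmetry trick. The standard deviation isn't on the card, so sigma = 4 is a made-up value, and the mean is assumed to be 25.

```python
from statistics import NormalDist

# Assumed distribution (not given on the card): Y ~ N(25, 4^2)
Y = NormalDist(mu=25, sigma=4)

# Middle area 0.6 leaves 0.2 in each tail, so P(Y < 25 + d) = 0.8.
upper = Y.inv_cdf(0.8)
d = upper - 25

# Check: the area between 25 - d and 25 + d should be 0.6.
middle = Y.cdf(25 + d) - Y.cdf(25 - d)
print(round(d, 3), round(middle, 3))
```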

Question 28

Probability

Accepted Answer

Not %

Question 29

Standard normal distribution

Accepted Answer

If X ~ N(μ, σ^2) and Z ~ N(0,1)

Then P(X < x) is the same as P(Z < (x - μ)/σ).

The transformation of x into (x - μ)/σ is called standardising

Question 30

P(Z < (x - μ)/σ)   [all of it] sometimes written as

Accepted Answer

Φ((x - μ)/σ)

Φ is a function meaning P(Z < ...), i.e. Φ(z) = P(Z < z). Allows for more coherent working when inverting.

ONLY FOR THE STANDARD NORMAL

Question 31

e.g. standardise X ~ N(10, 2^2)

P(X < 6)

Accepted Answer

Z ~ N(0,1)

(6-10)/2 = -2

P(Z<-2) = P(X<6)
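
A quick sketch showing that standardising doesn't change the probability, using the numbers from this card.

```python
from statistics import NormalDist

X = NormalDist(mu=10, sigma=2)   # X ~ N(10, 2^2)
Z = NormalDist()                 # Z ~ N(0, 1)

z = (6 - 10) / 2                 # standardised value, -2
p_x = X.cdf(6)                   # P(X < 6)
p_z = Z.cdf(z)                   # P(Z < -2)
print(z, round(p_x, 4), round(p_z, 4))
```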

Question 32

When to do standardising

Accepted Answer

Good for when mean and/or standard deviation are unknown

Also good for comparisons - Z just shows the number of standard deviations above/below the mean (which is zero for Z). P(1 < Z < 2) means the value is between 1 and 2 standard deviations above the mean

Question 33

e.g. Length may be modelled by normal distribution with standard deviation = 3. 15% of components longer than 16cm. Find the mean.

Accepted Answer

X ~ N(μ, 3^2)
P(X > 16) = 0.15
P(X < 16) = 0.85 (because inverse does up to a point)

Z ~ N(0,1)
P(Z < (16-μ)/3) = 0.85
(16-μ)/3 = Φ^-1(0.85) = 1.036 [making sure to use (0,1)]
so μ = 16 - 3 × 1.036 = 12.9 (3 s.f.)
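
The working above as a sketch in Python, using `NormalDist().inv_cdf` in place of the calculator's inverse normal.

```python
from statistics import NormalDist

sigma = 3
Z = NormalDist()              # standard normal N(0, 1)

z = Z.inv_cdf(0.85)           # area up to z is 0.85
mu = 16 - sigma * z           # rearranged from (16 - mu)/3 = z

# Check: with this mean, 15% of components should be longer than 16 cm.
p_longer = 1 - NormalDist(mu, sigma).cdf(16)
print(round(z, 3), round(mu, 2), round(p_longer, 2))
```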

Question 34

Φ^-1(0.8)

Accepted Answer

Means the area UP TO the result of this is 0.8, i.e. it only does P(Z < z)

Same style thing for normal inverse

Question 35

What step to do after finding the mean with the standard normal

Accepted Answer

Do a plausibility check: for this question the mean comes out about 1 standard deviation (3cm) below 16cm, which makes sense because Φ^-1(0.85) is 1.036, so 16cm should be close to one standard deviation above the mean.

Question 36

Normal distribution points of inflection

Accepted Answer

Points 1 standard deviation above and below mean

Question 37

Normal approximations

Accepted Answer

Discrete values.

Inclusive vs exclusive matters

Question 38

Requirements for Normal approximations

Accepted Answer

Steps in possible values are small compared to standard deviation.

Continuity corrections are made

Question 39

Continuity correction examples

Accepted Answer

(106 <= X <= 110) integers becomes (105.5 <= X <= 110.5) continuous

'less than 120' is P(X<119.5) [so it doesn't include the 120 itself]

'at least 110' is P(X > 109.5), to catch the values that round up to 110, so is 1 - P(X<109.5)

Question 40

Binomial to normal approx

Accepted Answer

B(n,p) -> N(np, σ^2), where σ = sqrt(np(1-p))
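
A sketch comparing an exact binomial probability with its normal approximation, including the continuity correction. The values n = 100, p = 0.4 and the cut-off 45 are made up.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.4   # hypothetical binomial parameters

# Exact P(X <= 45) for X ~ B(100, 0.4)
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(0, 46))

# Normal approximation Y ~ N(np, np(1-p))
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
# Continuity correction: 'X <= 45' for integers becomes 'Y < 45.5'
approx = Y.cdf(45.5)

print(round(exact, 4), round(approx, 4))
```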

Question 41

Distribution of sample means

Accepted Answer

For a random sample of size n taken from a random variable X with mean μ and variance σ^2, the sample mean Xbar (X with a bar above) is approximately normally distributed: Xbar ~ N(μ, (σ^2)/n)

Question 42

Sample means

Accepted Answer

e.g. the average of 3 numbers between 1 and 2 is a sample mean. Then tons of these sample means are made. We expect the sample means to vary about the population mean μ (the mean of Xbar is the same as the mean of the random variable X)

Question 43

Standard deviation of Xbar

Accepted Answer

sqrt(σ^2/n) = σ/sqrt(n)

Question 44

Increasing sample size effect on sample means

Accepted Answer

Increasing sample size means increasing amount of random values drawn to make each sample.

Sample means xbar (lower case for each one I think is the convention) will be closer to true mean μ, so standard deviation of Xbar reduces
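
A small simulation of the idea, as a sketch only. It assumes the 'numbers between 1 and 2' are drawn uniformly (that choice, the seed and the 20000 repeats are my assumptions), and compares the spread of the sample means with σ/sqrt(n).

```python
import random
from math import sqrt
from statistics import pstdev

random.seed(0)   # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n values drawn uniformly between 1 and 2 (assumed distribution)."""
    return sum(random.uniform(1, 2) for _ in range(n)) / n

sigma = sqrt(1 / 12)   # standard deviation of a uniform variable on an interval of width 1

results = {}
for n in (3, 30):
    means = [sample_mean(n) for _ in range(20000)]
    # observed spread of the sample means vs the predicted sigma/sqrt(n)
    results[n] = (pstdev(means), sigma / sqrt(n))
    print(n, round(results[n][0], 3), round(results[n][1], 3))
```

The larger sample size gives sample means that sit closer to μ = 1.5, so the standard deviation of Xbar drops.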

Question 45

Ways to prove normally distributed

Accepted Answer

Symmetry
Almost all data within 3 standard deviations of mean (give values of upper and lower bounds)
95% of data within 2 standard deviations from mean (give upper and lower)
Continuous

Question 46

X and Xbar

Accepted Answer

Don't confuse them. X is each randomly chosen item. Xbar is the mean of the items in a sample.

X might not be normally distributed, but Xbar will be approximately normal if the sample is large enough.

Question 47

Normal hypothesis testing e.g. Rods believed to be 30cm, with standard deviation of 1. A sample of 20 rods gives mean of 29.5

Set-up of answer

Accepted Answer

Only talking about one sample. Hypotheses are always in terms of a population parameter e.g. the mean

H0: μ = 30
H1: μ < 30

Let X be the length of the rod

Question 48

Normal hypothesis testing e.g. Rods believed to be 30cm, with standard deviation of 1. A sample of 20 rods gives mean of 29.5

Answer

Accepted Answer

Assuming H0, X ~ N(30, 1^2)
so Xbar ~ N(30, 1^2/20) -> Xbar ~ N(30, (sqrt0.05)^2)

Use the μ and σ values for P(Xbar < 29.5) = 0.0127 (table given in Q. The p value is the prob of the observed value or more extreme)

Conclude
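
The rods test as a sketch in Python. The card doesn't state the significance level, so the 5% used for the conclusion is an assumption.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 30, 1, 20     # H0 mean, population standard deviation, sample size
observed_mean = 29.5
significance = 0.05           # assumed, not given on the card

# Under H0, Xbar ~ N(30, 1^2/20), so its standard deviation is 1/sqrt(20).
Xbar = NormalDist(mu=mu0, sigma=sigma / sqrt(n))

# One-tailed test (H1: mu < 30): p value is P(Xbar < 29.5).
p_value = Xbar.cdf(observed_mean)
reject_h0 = p_value < significance
print(round(p_value, 4), reject_h0)
```

With these numbers the p value is about 0.0127, so at the assumed 5% level there is enough evidence to reject H0.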

Question 49

Why we did (sqrt0.05)^2

Accepted Answer

To remind us of std dev not variance - ALWAYS do

Question 50

Using critical region for normal hypothesis testing

Accepted Answer

Use inverse normal of the significance level to find the critical region then compare to the value.

Critical regions good for if testing multiple samples.

If looking for the upper critical value of a two-tailed 10% test, do inverse of 0.95 (5% in each tail)
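
A sketch of finding critical values with inverse normal, reusing the rods numbers (mean 30, standard deviation 1, n = 20). The significance levels are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Distribution of the sample mean under H0: Xbar ~ N(30, 1^2/20)
Xbar = NormalDist(mu=30, sigma=1 / sqrt(20))

# One-tailed test at 5% (H1: mu < 30): critical region is Xbar < lower_5
lower_5 = Xbar.inv_cdf(0.05)

# Two-tailed test at 10%: 5% in each tail, so use 0.05 and 0.95
two_tail_lower = Xbar.inv_cdf(0.05)
two_tail_upper = Xbar.inv_cdf(0.95)

observed_mean = 29.5
in_critical_region = observed_mean < lower_5
print(round(lower_5, 3), round(two_tail_upper, 3), in_critical_region)
```

Once the critical values are known, any number of sample means can be compared against them without recalculating a p value each time.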

Question 51

How to state critical values

Accepted Answer

Critical region is Xbar < 232.7 or Xbar > 247.3 (have to be separated because no value does both)

Question 52

Using critical region

Accepted Answer

Must prove with BOTH regions

Question 53

'Sample standard deviation'

Accepted Answer

Watch out for sample parameters rather than population

Question 54

When can we use sample standard deviation as an estimate for the whole population

Accepted Answer

If the population standard deviation is not given and n is large enough (30+)

Mention in your answer

We still divide by sqrt(n) to find the standard deviation of the sample means

Question 55

Function of binomial vs normal hypothesis test

Accepted Answer

Binomial is testing for a probability/proportion.

Normal is testing for a population mean using a sample

Question 56

r value

Accepted Answer

+ve r value means +ve gradient of lobf.

aka 'product moment correlation coefficient' PMCC

Question 57

r value hypothesis tests

Accepted Answer

H0 is always that the correlation coefficient is 0 (no correlation in the population).

They give you the critical values to do it with - essentially just choose the right tail

Question 58

PMCC vs SRCC

Accepted Answer

PMCC is the r value, but the SRCC (R means Rank) is replacing the raw data with its rank in the data set

Question 59

How to use SRCC

Accepted Answer

Replace the raw data with ranks. Calculate PMCC using these ranks. Interpret the coefficient (e.g. rho = 0.86) for the original data and draw conclusions about association

rho doesn't tell us about correlation - it tells us if the data is strongly *associated*
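
The steps above as a sketch: rank the data, then run PMCC on the ranks. The data is made up (roughly exponential) and the helpers `pmcc` and `ranks` are hypothetical names; `ranks` assumes there are no tied values.

```python
from math import sqrt

def pmcc(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank 1 = smallest value. Assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

# Made-up data that curves upwards (roughly exponential).
x = [1, 2, 3, 4, 5]
y = [2.7, 7.4, 20.1, 54.6, 148.4]

r = pmcc(x, y)                      # PMCC on the raw data
rho = pmcc(ranks(x), ranks(y))      # SRCC = PMCC on the ranks
print(round(r, 3), round(rho, 3))
```

Here y always increases with x, so the ranks match perfectly and rho is 1, while r is lower because the points don't lie on a straight line.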

Question 60

How to adjust SRCC for repeat data points

Accepted Answer

Need to be adjusted so data isn't pushed out of sync: tied values each get the average of the ranks they share

Question 61

r = 1

Accepted Answer

Perfectly straight line with positive gradient (not necessarily y = x, and it doesn't have to go through the origin)

Question 62

rho = 1

Accepted Answer

Must be a *strictly* increasing function

Question 63

r vs rho

Accepted Answer

r value for PMCC.

r for the rank correlation (the correlation between the rank numbers, but doesnt tell us about the data set)

rho for the SRCC of the original data

Question 64

If you have the equation of the lobf

Accepted Answer

Don't estimate by drawing on the graph, substitute into the equation instead

Maths Y13 Stats

Maths Spring Y13

Question	Answer
For something like an exponential graph, should r be used?	Probably not - would be too low. Ranks likely more reasonable.
If hypothesis says 'positive correlation'	Do SRCC and use rho for hypothesis ??? I dont know what this note means - check with someone smart if its correct