Statistics
| Question | Answer |
|---|---|
| Define population | The whole set of items that are of interest |
| Define sample | Some subset of the population intended to represent the population |
| Define sampling unit | Each individual thing in the population that can be sampled |
| Define sampling frame | A list of the sampling units of a population, individually named or numbered |
| Define census | Data collected from the entire population |
| Define simple random sample | -Each sample has an equal chance of selection -Each item has number -Random number generator |
| Advantages of simple random sample | -No bias -Easy -Cheap -Equal selection chance |
| Disadvantages of simple random sample | -Not suitable for large population -Sampling frame needed |
| Define systematic sample | -Elements ordered into list -Every kth element -k=pop size/samp size -Start at random number between 1 and k |
| Advantages of systematic sample | -Simple -Quick -Suitable for large populations |
| Disadvantages of systematic sample | -Sampling frame needed -Can introduce bias if sampling frame is not random |
| Define stratified sample | -Population divided into strata -Simple random sample for each group -Samp size/pop size sampled from each group -Used when population is large and divided into groups |
| Advantages of stratified sample | -Reflects population structure -Proportional representation within population |
| Disadvantages of stratified sample | -Population must be clearly classified into distinct strata -Selection within each stratum suffers from same disadvantages as simple random |
| Define quota sample | -Population divided into groups according to a given characteristic -Interviewer selects quotas to reflect the groups' proportions |
| Advantages of quota sample | -Small sample is still representative -Easy -Cheap -Comparable |
| Disadvantages of quota sample | -Can introduce bias -Population divided into groups -Non responses not recorded |
| Define opportunity sample | Sample taken from people who are available at the time and who meet the criteria |
| Advantages of opportunity sample | -Easy -Cheap |
| Disadvantages of opportunity sample | -Not representative -Dependent on researcher |
| What is the equation for a stratified sample | Number sampled from a stratum = stratum size x sample size/total population |
| What is the difference between qualitative and quantitative data | Qualitative- Descriptive Quantitative- Numerical |
| What is the difference between discrete and continuous data | Discrete- Only takes certain values Continuous- Can take any value in a range |
| How do we find outliers | Greater than Q3+k(Q3-Q1) Less than Q1-k(Q3-Q1) |
| Define cleaning the data | Removing outliers |
| What do we plot for cumulative frequency diagrams | End point against cumulative frequency |
| What is the equation for frequency density | Frequency density=(Frequency x k)/Class width |
| When do we use a histogram | Continuous data |
| When do we use a bar chart | Discrete data |
| What do we comment on when comparing data | -Measure of location -Measure of spread |
| What axis is the independent variable on | X |
| What axis is the dependent variable on | Y |
| Define bivariate | There are pairs of values for two variables |
| Define causal relationship | Change in one variable causes a change in the other |
| Define interpolation | Estimating a variable within the data range |
| Define extrapolation | Estimating a variable outside the data range |
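| What is the purpose of a regression line | To model the linear relationship between the variables; the least squares line minimises the sum of the squares of the residuals |
| When can we use regression lines | To estimate values within the data range (interpolation) |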
| Define mutually exclusive | If one event happens the other events can't happen |
| If events are mutually exclusive: P(A or B)= | P(A)+P(B) |
| Define independent events | One event has no effect on the other |
| If events are independent P(A and B)= | P(A) x P(B) |
| Define random variable | A variable whose value depends on the outcome of a random event |
| Define discrete uniform distribution | All probabilities are the same |
| ΣP(X=x)= | 1 |
| When can you model a random variable with a binomial distribution | -Fixed number of trials (n) -2 possible outcomes -Fixed probability of success (p) -Trials are independent of each other |
| P(X<Y)= | P(X≤Y-1) |
| P(X≥Y)= | 1-P(X≤Y-1) |
| P(X>Y)= | 1-P(X≤Y) |
| Define test statistic | The result of the experiment or the statistic that is calculated |
| Define null hypothesis | The hypothesis you assume to be correct |
| Define alternate hypothesis | Tells you about the parameter if your assumption is wrong |
| Define critical region | A region of the probability distribution which if the test statistic falls within it would cause you to reject the null hypothesis |
| Define critical value | The first value to fall in the critical region |
| What is the actual significance level | The probability of incorrectly rejecting the null hypothesis |
| What are the steps of a one tailed hypothesis test | -Formulate a model for test statistic -Identify suitable null and alternate hypotheses -Calculate the probability of test statistic being observed assuming null hypothesis is true -Compare to significance level -Write conclusion in context of question |
| What must you do for a two tailed hypothesis test | Halve the significance level, so that half is used at each tail |
| If y=ax^n then logy= | loga+nlogx |
| If y=kb^x then logy= | logk+xlogb |
| What does the PMCC describe | The strength and direction of the correlation |
| When can the PMCC be used | If there is LINEAR correlation |
| If we are hypothesis testing for correlation what are the null and alternate hypotheses | H0: ρ=0 H1: ρ≠0 |
| How do we write the probability that B occurs given that A has already occurred | P(B|A) |
| What is the rule for independent events and conditional probability | P(A|B)=P(A|B')=P(A) |
| P(A)+P(B)-P(A∩B)= | P(A∪B) |
| (P(B∩A))/(P(A))= | P(B|A) |
| What are the probability symbols for and and or | And=∩ Or=∪ |
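
A few of the formulas above can be checked numerically. The following is a minimal Python sketch, not part of the original cards: the function names are hypothetical, and the default `k = 1.5` for the outlier rule is an assumption (the card leaves k unspecified). It covers the stratified sample equation, the outlier rule, and the binomial rules P(X<Y), P(X≥Y) and P(X>Y), which assume Y is a whole number.

```python
from math import comb


def stratum_allocation(stratum_size, sample_size, population_size):
    """Number to sample from one stratum: stratum size x sample size / total population."""
    return stratum_size * sample_size / population_size


def outlier_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences; values outside them are outliers.

    k = 1.5 is an assumed default; use whatever k the question gives.
    """
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def binomial_cdf(y, n, p):
    """P(X <= y) for X ~ B(n, p), summing the individual probabilities."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(0, y + 1))


def binomial_less_than(y, n, p):
    """P(X < y) = P(X <= y - 1)."""
    return binomial_cdf(y - 1, n, p)


def binomial_at_least(y, n, p):
    """P(X >= y) = 1 - P(X <= y - 1)."""
    return 1 - binomial_cdf(y - 1, n, p)


def binomial_greater_than(y, n, p):
    """P(X > y) = 1 - P(X <= y)."""
    return 1 - binomial_cdf(y, n, p)


if __name__ == "__main__":
    # A stratum of 120 in a population of 600, with a total sample of 50.
    print(stratum_allocation(120, 50, 600))  # 10.0
    # Quartiles Q1 = 10 and Q3 = 20 with the assumed k = 1.5.
    print(outlier_fences(10, 20))  # (-5.0, 35.0)
    # X ~ B(10, 0.3): probability of at least 4 successes.
    print(binomial_at_least(4, 10, 0.3))
```

The example values (120, 600, 50 and so on) are made up for illustration. In an exam these probabilities would normally come from a calculator or tables; the sums here only show that the three inequality rules all reduce to P(X≤y).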