Statistics
Chapter 1
Statistics | The science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions. Statistics is also about providing a measure of confidence in any conclusions. |
Data | a fact or proposition used to draw a conclusion or make a decision. Data can be numerical or non-numerical. Data describes characteristics of an individual. |
One goal of statistics | To describe and understand sources of variability |
Population | the entire group of individuals to be studied |
Individual | person or object that is a member of the population being studied |
Sample | a subset of the population that is being studied |
Statistic | a numerical summary based on a sample (memory aid: Statistic = Sample) |
Inferential statistics | uses methods that take results from a sample and extend them to a population |
Parameter | a numerical summary of a population (memory aid: Parameter = Population) |
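The statistic/parameter distinction above can be sketched in a few lines of Python. The population here is made-up data (hypothetical ages), and the seed and sizes are assumptions chosen only for illustration.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of N = 1000 individuals (made-up data).
population = [rng.randint(18, 90) for _ in range(1000)]

# Parameter: a numerical summary of the whole population.
parameter = mean(population)

# Statistic: a numerical summary of a sample of size n = 50.
sample = rng.sample(population, 50)
statistic = mean(sample)

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

The two numbers will usually differ a little; that gap is the sampling error that inference has to account for.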
Process of Statistics | 1. Identify the research objective 2. Collect the data 3. Describe the data 4. Perform inference |
Variables | characteristics of the individuals within the population |
Qualitative or Categorical variables | allow for classification of individuals based on some attribute or characteristic |
Quantitative variables | provide numerical measures of individuals (can be added or subtracted) |
Discrete variable | a quantitative variable that has either a finite number of possible values or a countable number of possible values |
Continuous variable | a quantitative variable that has an infinite number of possible values that are not countable |
2 types of quantitative variables | Discrete and Continuous |
Levels of measurement of a variable | 1. Nominal level of measurement 2. Ordinal level of measurement 3. Interval level of measurement 4. Ratio level of measurement (mnemonic: NOIR) |
Nominal level of measurement (EXP: color of eyes) | *The values of the variable name, label, or categorize *Does not allow the values of the variable to be arranged in a ranked or specific order. |
Anecdotal | information being conveyed is based on casual observation, not scientific research. |
Descriptive Statistics | Consists of organizing and summarizing data. Descriptive statistics describe data through numerical summaries, tables and graphs. |
Ordinal level of measurement (EXP: shoe size) | *has properties of nominal level *naming scheme allows the values of the variable to be arranged in a ranked or specific order |
Interval level of measurement (EXP: thermometer) | *has properties of ordinal level of measurement *differences in the value of the variable have meaning *a value of ZERO does NOT mean the absence of the quantity *arithmetic operations like addition and subtraction can be done |
Ratio level of measurement (EXP:GPA) | *has properties of the interval level of measurement *ratios of the values of the variable have meaning *a value of ZERO DOES mean the absence of quantity *multiplication and division can be done |
Observational Studies | measure the value of the response variable without attempting to influence the value of either the response or explanatory variables. (This allows the researcher to claim only ASSOCIATION!) |
Response variable | What happens AT THE END OF THE EXPERIMENT OR OBSERVATION |
Explanatory variable | What factors you are looking at |
Designed experiment | a study in which a researcher assigns the individuals to a certain group, intentionally changes the value of the explanatory variable, and then records the value of the response variable for each group (ALLOWS THE RESEARCHER TO CLAIM CAUSATION!) |
Confounding variable | an explanatory variable that was considered in a study whose effect cannot be distinguished from a second explanatory variable in the study. (was considered, was measured, but could not for whatever reason be separated) |
Lurking variable | is an explanatory variable that was not considered in a study, but that affects the value of the response variable in a study. |
Types of Observational Studies | 1. Cross-sectional studies 2. Case-control studies 3. Cohort studies |
Cross-sectional study | Observational studies that collect information about individuals at a specific point in time, or over a very short period of time (a.k.a SNAP SHOT STUDIES) |
Case-control study | Observational studies that are retrospective: they require individuals to look back in time or require the researcher to look at existing records. In case-control studies, individuals who have certain characteristics are matched with those who do not. |
Cohort study | Observational study that first identifies a group of individuals to participate in the study, called the cohort. The cohort is observed over a long period of time and characteristics of the individuals are recorded, a.k.a. prospective or forward-looking studies. |
Census | a list of all individuals in a population along with certain characteristics of each individual |
Random sampling | the process of using chance to select individuals from a population to be included in the sample |
What letter is used to identify sample size? | n |
What letter is used to identify total population size? | N |
Simple random sampling | a sample of size n from a population of size N is obtained through simple random sampling if every possible sample of size n has an equally likely chance of occurring |
frame | a list of all the individuals in the population of interest numbered 1 to N. |
What are the 3 types of items we can use to obtain random numbers? | 1. Graphing calculator 2. Statistical software (stat crunch) 3. Random number table |
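As an illustration of the "statistical software" route, here is a minimal Python sketch of simple random sampling from a frame numbered 1 to N. The function name `simple_random_sample` and the frame size are hypothetical, chosen for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n individuals without replacement; every possible
    sample of size n is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical frame: individuals numbered 1 to N, with N = 30.
frame = list(range(1, 31))
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```

Passing a seed only makes the draw repeatable for checking; omit it for a fresh random sample each run.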
What are the 3 sampling methods? | 1. Stratified sample 2. Systematic sample 3. Cluster sample |
Stratified Sample | obtained by separating the population into nonoverlapping groups called strata and then obtaining a simple random sample from each stratum. The individuals within each stratum should be homogeneous (similar) in some way. |
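A rough Python sketch of stratified sampling as defined above: split the population into strata, then take a simple random sample from each. The helper `stratified_sample`, the student list, and the choice of class year as the stratum are all assumptions for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(individuals, stratum_of, n_per_stratum, seed=None):
    """Group individuals into nonoverlapping strata, then draw a
    simple random sample of n_per_stratum from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for ind in individuals:
        strata[stratum_of(ind)].append(ind)
    sample = []
    for name in sorted(strata):  # sorted only for a repeatable order
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample

# Hypothetical population: 20 students, 5 in each class year.
years = ["freshman", "sophomore", "junior", "senior"]
students = [(f"s{i:02d}", years[i % 4]) for i in range(20)]

sample = stratified_sample(students, lambda s: s[1], 2, seed=3)
print(sample)
```

Every stratum is guaranteed to appear in the sample, which a single simple random sample of the whole population does not guarantee.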
Systematic Sample | obtained by selecting every kth individual from the population. The first individual selected corresponds to a random number between 1 and k. |
5 Steps in systematic sampling: | 1. Approximate the population size, N 2. Determine the sample size desired, n 3. Compute N/n and round down to the nearest integer. This value is k 4. Randomly select a number between 1 and k. This number is called p 5. The sample will consist of individuals p, p+k, p+2k, ..., p+(n-1)k |
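The five steps above can be sketched in Python as follows. The function name `systematic_sample` and the values N = 100, n = 10 are assumptions for the example; individuals are identified by their position 1 to N.

```python
import random

def systematic_sample(N, n, seed=None):
    """Return the positions p, p+k, ..., p+(n-1)k for a population
    of size N and a desired sample size n."""
    if not 1 <= n <= N:
        raise ValueError("need 1 <= n <= N")
    rng = random.Random(seed)
    k = N // n                  # step 3: N/n rounded down
    p = rng.randint(1, k)       # step 4: random start between 1 and k
    return [p + i * k for i in range(n)]  # step 5

positions = systematic_sample(N=100, n=10, seed=7)
print(positions)
```

Because p is at most k, the last position p+(n-1)k is at most nk, which never exceeds N.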
Cluster Sample | obtained by selecting all individuals within a randomly selected collection or group of individuals. (divide class into 5 groups, randomly select a # 1-5, maybe 3, and then survey all individuals from group 3) |
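The class example above (5 groups, pick one at random, survey everyone in it) might look like this in Python. The function `cluster_sample` and the student labels are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters, then include every
    individual in each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [ind for name in chosen for ind in clusters[name]]
    return chosen, sample

# Hypothetical class divided into 5 groups of 4 students each.
clusters = {g: [f"student{g}-{j}" for j in range(1, 5)] for g in range(1, 6)}

chosen, sample = cluster_sample(clusters, 1, seed=2)
print("selected group:", chosen)
print("surveyed:", sample)
```

Note the contrast with stratified sampling: here the randomness picks groups, not individuals within groups.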
Convenience sample | a sample in which the individuals are easily obtained. CAUTION!! |
Multistage sampling (EXP: Nielsen) | 1. Divides the country into geographic areas (strata). The strata are typically city blocks in urban areas and geographic regions in rural areas; about 6,000 strata are randomly selected. 2. Sends representatives to the selected strata to list the households within each stratum; homes are then randomly selected. |
What are the 3 sources of bias? | 1.Sampling Bias 2.Response Bias 3.Nonresponse Bias |
Sampling Bias | occurs when the technique used to obtain the individuals in the sample tends to favor one part of the population over another |
Undercoverage | a form of sampling bias that occurs when the proportion of one segment of the population is lower in the sample than it is in the population. |
Nonresponse Bias | exists when individuals selected to be in the sample who do not respond to the survey have different opinions from those who do |
Response Bias | exists when the answers on a survey do not reflect the true feelings of the respondent. |
4 types of response bias: | 1. Interviewer error 2. Misrepresented answers 3. Wording of questions 4. Order of questions or words |
Data-entry error | incorrectly entered data in a database |
Nonsampling errors | errors that result from sampling bias, nonresponse bias, response bias, or data-entry error. Could be present in a complete census of the population |
Sampling error | an error that results from using a sample to estimate information about a population; a sample gives incomplete information about a population |
Experiment | a controlled study conducted to determine the effect that varying one or more explanatory variables, or factors, has on a response variable. |
Treatment | any combination of the values of the factors |
Explanatory variables | factors |
Experimental unit (or subject) | a person, object, or some other well-defined item upon which a treatment is applied |
Placebo | an innocuous medication, such as a sugar tablet, that tastes, looks and smells like the experimental medication |
Blinding | nondisclosure of the treatment |
Single-blind | an experiment in which the experimental unit (or subject) does not know which treatment he/she is receiving |
Double-blind | an experiment in which neither the experimental unit nor the researcher in contact with the experimental unit knows which treatment the experimental unit is receiving |
What are the steps in designing an experiment? | 1. Identify the problem to be solved (claim) 2. Determine the factors that affect the response variable 3. Determine the number of experimental units 4. Determine the level of each predictor variable and randomize 5. Conduct the experiment 6. Test the claim |
What is meant by determining the factors that affect the response variable? | Determining which factors are to be fixed (the control), which factors will be manipulated, and which factors will be uncontrolled. |
Why randomize? | Experimental units are randomly assigned to the treatment groups so that the effect of variables whose levels cannot be controlled is minimized. The idea is that randomization "averages out" the effect of uncontrolled predictor variables |
Replication | occurs when each treatment is applied to more than one experimental unit. Helps to assure that the effect of a treatment is not due to some characteristic of a single experimental unit. It's recommended that each treatment group have the same number of experimental units |
Completely randomized design | one in which each experimental unit is randomly assigned to a treatment. (BIG DEAL) |
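A minimal Python sketch of a completely randomized design, assuming 12 hypothetical subjects and 3 hypothetical treatments. Shuffling first and then dealing units out in turn keeps the groups the same size, in line with the replication advice above; the function name is an assumption.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle the experimental units, then deal them out to the
    treatments in turn so group sizes are as equal as possible."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    assignment = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        assignment[treatments[i % len(treatments)]].append(unit)
    return assignment

units = [f"subject{i}" for i in range(1, 13)]   # 12 hypothetical subjects
treatments = ["placebo", "low dose", "high dose"]

groups = completely_randomized(units, treatments, seed=5)
for treatment, members in groups.items():
    print(treatment, members)
```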
What are the 3 types of experiments? | 1. Completely randomized design 2. Matched-pairs design 3. Randomized block design |
Matched-pairs design | an experimental design in which the experimental units are paired up. The pairs are matched up so that they are somehow related. There are only 2 levels of treatment in a matched-pairs design |
Randomized block design | a design in which the experimental units are divided into homogeneous groups called blocks. Within each block, the experimental units are randomly assigned to treatments. |
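A randomized block design can be sketched in Python by repeating the random assignment separately inside each block. The blocks (two hypothetical age groups), the treatment labels, and the function name are assumptions for illustration only.

```python
import random

def randomized_block(blocks, treatments, seed=None):
    """Within each block, shuffle the units and deal them out to
    the treatments; returns unit -> (block, treatment)."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)
        for i, unit in enumerate(shuffled):
            assignment[unit] = (block_name, treatments[i % len(treatments)])
    return assignment

# Hypothetical blocks: units grouped by a shared characteristic.
blocks = {
    "under 40": ["u1", "u2", "u3", "u4"],
    "40 and over": ["o1", "o2", "o3", "o4"],
}
treatments = ["A", "B"]

plan = randomized_block(blocks, treatments, seed=11)
for unit, (block, treatment) in sorted(plan.items()):
    print(unit, block, treatment)
```

Each block ends up with both treatments equally represented, so the blocking characteristic cannot be mixed up with the treatment effect.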
Confounding | when the effects of two factors (explanatory variables) on the response variable cannot be distinguished. |