stat Midterm front

stat flashcards

Front	Back
Sample	Subgroup of the population
Sampling	Process of selecting sample from population
Random sampling	Independent selection
Descriptive vs. Inferential Statistics	– Descriptive: primary purpose is to describe some aspect of the data – Inferential: primary purpose is to infer (to estimate, to make a decision, or to test a hypothesis)
All inferential statistics have the following in common:	– use of some descriptive statistic – use of probability – potential for estimation – sampling variability – sampling distributions – use of a theoretical distribution – two hypotheses, two decisions, two types of error
Research defined	Structured Problem Solving
Scientific methods: steps (cyclic)	– 1. encounter and identify problem – 2. formulate hypotheses, define variables – 3. think through consequences of hypotheses – 4. design & run study, collect data, compute statistics, test hypotheses – 5. draw conclusions
Variable	entity that is free to take on different values
Independent variable (IV)	its values are manipulated by the researcher, comes first in time
Dependent variable (DV)	measured by researcher, follows the IV in time
Population	Target group for inference
Extraneous variable (EV)	controlled by researcher • randomization of subjects to groups • keep all subjects constant on EV • include EV in the design of the experiment
Predictor variable (PV)	comes first in time but there is no manipulation, analogous to IV.
Criterion variable (CV):	follows PV in time, analogous to DV.
Causal relationship:	IV causes the DV
Predictive relationship:	PV predicts the CV
2 Types of research	1. experimental 2. observational
True experiment	• manipulation of IV • randomization of subjects to groups • causal relationship between IV and DV
Observational research	• no manipulation • minimal control of EV • predictive relationship between PV and CV
Stem and Leaf Display	• The first digit(s) of a score form the stem, the last digit(s) form the leaf. • We want 10-20 total number of stems. • Number of stems per digit depends on total number of stems: can do 1, 2, or 5 stems per digit.
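A minimal Python sketch of the stem and leaf card, assuming non-negative whole-number scores, the tens digit(s) as the stem, the ones digit as the leaf, and one stem per digit. The function name `stem_and_leaf` is made up for illustration.

```python
from collections import defaultdict


def stem_and_leaf(scores):
    """Return the display as a list of text lines, one line per stem."""
    stems = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)  # first digit(s) = stem, last digit = leaf
        stems[stem].append(leaf)

    lines = []
    # Walk every stem between the lowest and highest so empty stems still show.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems[stem])
        lines.append(f"{stem} | {leaves}")
    return lines


for line in stem_and_leaf([12, 15, 21, 34, 38]):
    print(line)
```

Empty stems are kept on purpose so that gaps in the data remain visible in the display.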
Description With Statistics: aspects or characteristics of data that we can describe are:	– Middle – Spread – Skewness – Kurtosis
Other words that describe Middle	central tendency, location, center
Statistics that Measure middle are:	mean, median, mode • “Middle” is the aspect of data we want to describe. • We describe/measure the middle of data in a population with the parameter μ (‘mu’); we usually don’t know μ, so we estimate it with X-bar.
Other words that describe Spread	variability, dispersion, scatter
Statistics that Measure spread are:	range, variance, standard deviation, midrange • “Spread” is the aspect of data we want to describe. • Any statistic that describes/measures spread should have these characteristics: it should – Equal zero when the spread is zero. – Inc
Skewness	Departure from symmetry – Positive skewness = tail (extreme scores) in positive direction – Negative skewness = tail (extreme scores) in negative direction (The Few name the Skew)
Kurtosis	peakedness relative to normal curve
Sample Mean	• The sample mean is the sum of the scores divided by the number of scores, and is symbolized by X-bar: X-bar = ΣX/N. • For example, for X1=4, X2=1, X3=7: N=3, ΣX=12, and X-bar = ΣX/N = 12/3 = 4. • Characteristics: – X-bar is the balance point
Sample Median	• The median is the middle of the ordered scores, and is symbolized as X50. • Median position (as distinct from the median itself) is (N+1)/2 and is used to find the median. • Example: X1=4, X2=1, X3=7, then N=3. • Characteristic
Sample Mode	• The mode is the most frequent score. • Examples: – 1 1 4 7, the mode is 1. – 1 1 4 7 7, there are two modes, 1 and 7. – 1 4 7, there is no mode. • Characteristics: – Has problems: more than one, or none; maybe not in the mid
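The mean, median, and mode cards can be checked with a short Python sketch that reuses the card examples. The function names are made up for illustration. Averaging the two middle scores when N is even is the usual convention and is an assumption here, because the median card is cut off before that case.

```python
from collections import Counter


def sample_mean(scores):
    """X-bar: the sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)


def sample_median(scores):
    """X50: found by counting to the median position, (N + 1) / 2."""
    ordered = sorted(scores)
    position = (len(ordered) + 1) / 2  # 1-based position in the ordered scores
    if position.is_integer():
        return ordered[int(position) - 1]
    # Assumption: a position such as 2.5 means "average the 2nd and 3rd scores".
    return (ordered[int(position) - 1] + ordered[int(position)]) / 2


def sample_modes(scores):
    """All most-frequent scores; an empty list when no score repeats."""
    counts = Counter(scores)
    highest = max(counts.values())
    if highest == 1:
        return []  # e.g. 1 4 7 has no mode
    return sorted(value for value, count in counts.items() if count == highest)


print(sample_mean([4, 1, 7]))         # 4.0
print(sample_median([4, 1, 7]))       # 4
print(sample_modes([1, 1, 4, 7, 7]))  # [1, 7]
```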
Spread cont.	• We describe/measure the spread of data in a sample with the statistics: – Range = high score-low score. – Midrange, MR. – Sample variance, s². – Sample standard deviation, s. – Unbiased variance estimate, s². – s. • We des
Midrange (MR)	• Formula is MR=UH-LH – UH=upper hinge – LH=lower hinge – Hinges cut off 25% of the data in each tail • Hinge position is ([median position]+1)/2. – [median position] is the whole number part of the median position (remember, median p
Hinge position	([median position]+1)/2 – [median position] is the whole number part of the median position (remember, median pos.=(N+1)/2) • Use hinge position to count in from the tails to find the hinges.
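A small Python sketch of the hinge and midrange cards: it computes the median position, takes its whole-number part, derives the hinge position, and counts in from each tail. The function names are made up, and averaging the two neighbouring scores when the hinge position ends in .5 is an assumption (the card is truncated before that case).

```python
import math


def score_at(ordered, position):
    """Score at a 1-based position; a position ending in .5 averages its neighbours."""
    below = ordered[math.floor(position) - 1]
    above = ordered[math.ceil(position) - 1]
    return (below + above) / 2


def hinges_and_midrange(scores):
    """Return (LH, UH, MR) with MR = UH - LH, as defined on the card."""
    ordered = sorted(scores)
    median_position = (len(ordered) + 1) / 2
    hinge_position = (math.floor(median_position) + 1) / 2
    lower_hinge = score_at(ordered, hinge_position)        # count in from the low tail
    upper_hinge = score_at(ordered[::-1], hinge_position)  # count in from the high tail
    return lower_hinge, upper_hinge, upper_hinge - lower_hinge


print(hinges_and_midrange([1, 2, 3, 4, 5, 6, 7, 8, 9]))  # (3.0, 7.0, 4.0)
```

With nine scores the median position is 5 and the hinge position is 3, so the hinges are the third score from each end.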
Sample Variance, s², and Sample Standard Deviation, s	• Definitional formula: s²=Σ(X − X-bar)²/N, the average squared deviation from X-bar. • Sample Standard Deviation = s • Unbiased Variance Estimate, s²
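The definitional formula can be worked through in Python with the scores 4, 1, 7 from the sample mean card. This is only a sketch with made-up function names. It divides by N, as the card does. The divisor for the unbiased estimate is not shown on the card, so it is left out here.

```python
import math


def sample_variance(scores):
    """s²: the average squared deviation from X-bar (divides by N)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n


def sample_standard_deviation(scores):
    """s: the square root of the sample variance."""
    return math.sqrt(sample_variance(scores))


scores = [4, 1, 7]
# Deviations from X-bar = 4 are 0, -3, 3; their squares sum to 18; 18 / 3 = 6.
print(sample_variance(scores))  # 6.0
print(sample_standard_deviation(scores))
```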
Box-plots	• A pictorial description that uses a box to show the middle of the data and lines called whiskers to show the tails of a distribution.
3 Parts to Box Plot	1.) Box 2.) Whiskers 3.) Outliers
Box	– Upper end is at the UH, lower end is at the LH – Line across the middle is X50
Whiskers	– Whiskers are lines drawn from the ends of the box (the hinges) to adjacent values, UAV & LAV. – Adjacent values are the first real data values inside the inner fences. – Inner fences, upper and lower • Upper, UIF=UH+1.5MR • Lower, LIF= L
Outliers	Outliers: outside whiskers, marked with
Midrange (MR)	MR=UH-LH
z Scores	• The aspect of the data we want to describe/measure is relative position. • z scores are statistics that describe the relative position of something in its distribution.
Z score formula	z is (something minus its mean) divided by its standard deviation.
z score characteristics	– The mean of a distribution of z scores is zero. – The variance of a distribution of z scores is one. – The shape of a distribution of z scores is reflective: it is the same as the shape of the distribution of the Xs.
Characteristics of Normal Distributions	– Symmetric, continuous, unimodal. – Bell-shaped. – Scores range from -∞ to +∞. – Mean, median, and mode are all the same value. – Each distribution has two parameters, μ and σ².
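A short Python sketch of the z score cards: each score minus the mean, divided by the standard deviation, followed by a check that the resulting z scores have mean zero and variance one. The function name is made up, and the standard deviation uses the divide-by-N form.

```python
import math


def z_scores(scores):
    """Convert each score to z = (score - mean) / standard deviation."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    if sd == 0:
        raise ValueError("z is undefined when the standard deviation is zero")
    return [(x - mean) / sd for x in scores]


zs = z_scores([4, 1, 7])
z_mean = sum(zs) / len(zs)
z_variance = sum((z - z_mean) ** 2 for z in zs) / len(zs)
print(zs)
print(round(z_mean, 10), round(z_variance, 10))
```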
Use of Z score	• We use this distribution to get probabilities associated with a z score (probability, proportion, and area under the curve are synonymous). - look up z in table to find probabilities.
Correlation	– Defined as the degree of linear relationship between X and Y. – Is measured/described by the statistic r.
Regression	– Is concerned with the prediction of Y from X. – Forms a prediction equation to predict Y from X. – Uses the formula for a straight line, Y’=bX+a. – Y’ is the predicted Y score on the criterion variable. – b is the slope, b=ΔY/ΔX=rise/run. –
r=	r=ΣzXzY/N, the average product of z scores for X and Y – Works with two variables, X and Y – -1≤r≤1, r measures positive or negative relationships – Measures only the degree of linear relationship – r²=proportion of variability in Y that is e
r²=	proportion of variability in Y that is explained by X.
Correlation: Undefined	If there is no spread in X or Y, then r is undefined. Note that any z is undefined if the standard deviation is zero, and r=ΣzXzY/N.
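The correlation cards can be sketched in Python as the average product of paired z scores. The function names and the data are made up for illustration. A variable with no spread raises an error, which mirrors the "undefined" card.

```python
import math


def z_scores(values):
    """z = (value - mean) / standard deviation, using the divide-by-N form."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        raise ValueError("no spread: z scores (and therefore r) are undefined")
    return [(v - mean) / sd for v in values]


def correlation(xs, ys):
    """r: the average product of the paired z scores for X and Y."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must have the same number of scores")
    products = [zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys))]
    return sum(products) / len(xs)


xs = [1, 2, 3, 4, 5]  # hypothetical scores
ys = [2, 1, 4, 3, 5]
r = correlation(xs, ys)
print(round(r, 4))       # 0.8
print(round(r ** 2, 4))  # 0.64, the proportion of Y variability explained by X
```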
Population correlation coefficient,	ρ (rho)
regression cont.	• Linear only. • Generalize only for X values in your sample. • Actual observed Y is different from Y’ by an amount called error, e, that is, Y=Y’+e. • Error in regression is e=Y-Y’. • Many different potential regression
Line of Best Fit	The statistics b and a are computed so as to minimize the sum of squared errors, – Σe²=Σ(Y-Y’)² is a minimum. – This is called the Least Squares Criterion.
Partition total spread	– Total = Explained + Not Explained – This is true for proportion of spread and amount of spread. • Proportion: 1 = r² + (1-r²) • Amount: s²Y = s²Y·r² + s²Y(1-r²)
Probability	Defined as relative frequency of occurrence.
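A Python sketch tying together the least squares and partition cards. The slope and intercept formulas used here are the standard least squares solutions, not formulas quoted from the cards. The data and function names are hypothetical. The sketch then checks that the variance of Y splits into an explained part (variance of Y’) and an unexplained part (variance of e).

```python
def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Divide-by-N variance, matching the definitional formula on the cards."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def least_squares_line(xs, ys):
    """Return (b, a) for Y' = bX + a that minimise the sum of squared errors."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return b, a


def sum_squared_errors(xs, ys, b, a):
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4, 5]  # hypothetical scores
ys = [2, 1, 4, 3, 5]
b, a = least_squares_line(xs, ys)

predicted = [b * x + a for x in xs]                      # Y'
errors = [y - y_hat for y, y_hat in zip(ys, predicted)]  # e = Y - Y'

total = variance(ys)
explained = variance(predicted)
not_explained = variance(errors)
r_squared = explained / total

print(round(b, 4), round(a, 4))  # 0.8 0.6
print(round(total, 4), round(explained, 4), round(not_explained, 4))  # 2.0 1.28 0.72
```

Nudging b or a away from the computed values makes `sum_squared_errors` larger, which is what the Least Squares Criterion requires.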
Sample space	all possible outcomes of an experiment
Elementary event	a single member of the sample space
Event	any collection of elementary events
p(elementary event)	1/(total number)
p(event)	(number in the event)/(total number)
Conditional probability	• p(A|B)=(number in [A and B])/(number in B) • The probability of A in the redefined (reduced) sample space of B.
Big 3 Probability Rules	1. independence 2. multiplication, mutually exclusive 3. addition
Independence (1)	events A and B are independent if • p(A|B)=p(A) • The A probability is not changed by reducing the sample space to B.
Multiplication (And) Rule (2)	• p(A and B)=p(A)p(B|A)=p(A|B)p(B)
Mutually exclusive:	• Events A and B do not have any elementary events in common. • Events A and B cannot occur simultaneously. • p(A and B)=0
Addition (Or) Rule (3)	p(A or B)=p(A)+p(B)-p(A and B)
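The probability cards can be checked by counting, using one roll of a fair six-sided die as a hypothetical sample space. In this Python sketch, events are sets of elementary events, and `p` and `p_given` are made-up helper names. Note that `&` and `|` below are Python's set intersection ("and") and union ("or"), not the conditional bar.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # hypothetical: one roll of a fair die


def p(event):
    """p(event) = (number in the event) / (total number)."""
    return Fraction(len(event & sample_space), len(sample_space))


def p_given(a, b):
    """p(A|B) = (number in [A and B]) / (number in B)."""
    return Fraction(len(a & b), len(b))


A = {2, 4, 6}     # even roll
B = {1, 2, 3, 4}  # roll of four or less
C = {1, 3, 5}     # odd roll

print(p_given(A, B) == p(A))                       # independence: True
print(p(A & B) == p(A) * p_given(B, A))            # multiplication rule: True
print(p(A | B) == p(A) + p(B) - p(A & B))          # addition rule: True
print(p(A & C) == 0)                               # mutually exclusive: True
```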
The sampling distribution of X-bar	– Has the purpose of any sampling distribution: to obtain probabilities… – Has the definition of any sampling distribution: the distribution of a statistic. – Has specific characteristics: • Mean: μX-bar = μ • Variance: σ²X-bar = σ²/N • Shape i
Hypothesis testing	is the process of testing tentative guesses about relationships between variables in populations. These relationships between variables are evidenced in a statement, a hypothesis, about a population parameter.
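The mean and variance of the sampling distribution of X-bar can be verified exactly on a tiny example. This Python sketch assumes a hypothetical population of six equally likely scores and lists every possible sample of size N = 2 drawn independently (with replacement). Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product


def mean(values):
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def variance(values):
    """Divide-by-N variance of a complete list of values."""
    values = list(values)
    m = mean(values)
    return sum(((v - m) ** 2 for v in values), Fraction(0)) / len(values)


population = [1, 2, 3, 4, 5, 6]  # hypothetical population of scores
N = 2                            # sample size

# Every equally likely sample of size N, and the X-bar of each one.
sample_means = [mean(sample) for sample in product(population, repeat=N)]

mu = mean(population)
sigma_squared = variance(population)

print(mean(sample_means) == mu)                     # True
print(variance(sample_means) == sigma_squared / N)  # True
```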
Test statistic	a statistic used only for the purpose of testing hypotheses; e.g. zX
Assumptions	conditions placed on a test statistic necessary for its valid use in hypothesis testing; for zX, the assumptions are that the population is normal in shape and that the observations are independent.
Null hypothesis	the hypothesis that we test; Ho.
Alternative hypothesis	where we put what we believe; H1
Significance level	The standard for what we mean by a “small” probability in hypothesis testing; α. The significance level is the small probability used in hypothesis testing to determine an unusual event that leads you to reject Ho. – The significance level is sym
Directional vs. Non-Directional Hypothesis	>,<, or = • Directional hypotheses specify a particular direction for values of the parameter. – IQ of deaf children example: Ho: μ≥100, H1: μ<100. • Non-directional hypotheses do not specify a particular direction for values of the paramet
One- and two-tailed tests	– A one-tailed test is a statistical test that uses only one tail of the sampling distribution of the test statistic. – A two-tailed test is a statistical test that uses two tails of the sampling distribution of the test statistic.
Critical values	values of the test statistic that cut off α or α/2 in the tail(s) of the theoretical reference distribution.
Rejection values	the values of the test statistic that lead to rejection of Ho
p-Value Decision Rules	• Reject Ho if – ½ the SAS p-value <α, and – the observed zX is in the tail specified by H1.
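A hedged Python sketch of the decision rule for a one-tailed z test. It assumes that the reported p-value is two-tailed (which is why the card halves it), that the test statistic is z = (X-bar − μ0)/(σ/√N), and that σ is known. The function name, the sample numbers, and the default α = .05 are all hypothetical.

```python
import math
from statistics import NormalDist


def one_tailed_z_test(sample_mean, mu_0, sigma, n, tail, alpha=0.05):
    """Return (z, two_tailed_p, reject) for H1 in the 'lower' or 'upper' tail."""
    z = (sample_mean - mu_0) / (sigma / math.sqrt(n))
    # Two-tailed p-value: area beyond |z| in both tails of the standard normal.
    two_tailed_p = 2 * (1 - NormalDist().cdf(abs(z)))
    in_predicted_tail = z < 0 if tail == "lower" else z > 0
    # Card rule: half the two-tailed p-value below alpha AND z in H1's tail.
    reject = (two_tailed_p / 2 < alpha) and in_predicted_tail
    return z, two_tailed_p, reject


# Hypothetical data: H1 says the mean is below 100, and X-bar = 94 with N = 25.
z, p_value, reject = one_tailed_z_test(94, 100, 15, 25, tail="lower")
print(z, round(p_value, 4), reject)
```

The second condition matters: the same sample mean tested against an upper-tail H1 has the same small p-value but does not lead to rejection.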

Created by: kell5765
