Question 1

Concordant

Accepted Answer

A pair is concordant if 
(xi-xj)*(yi-yj) > 0

Question 2

Discordant

Accepted Answer

A pair is discordant if
(xi-xj)*(yi-yj) < 0

Question 3

Non-informative Censoring

Accepted Answer

We assume that censored individuals are at the same risk of failure as those who are still alive and uncensored; this implies that the censoring process is independent of survival time

Question 4

Turnbull estimator

Accepted Answer

Used for interval censored data, assumes non-informative censoring

Question 5

Interval censored data

Accepted Answer

When the time of death is known to fall between two time points, but the exact time is unknown. The nonparametric survival function for this type of data is not continuous. Left- and right-censored data are special cases of this

Question 6

Survival Function

Accepted Answer

Non-increasing function with S(0) = 1 and S(t) going to 0 as t goes to infinity (its values are probabilities). It is the complement of the CDF: S(t) = 1 - F(t), so F(t) = 1 - S(t)

Question 7

Right Censored Data

Accepted Answer

A subject's actual time-to-event is known only to occur after a given time (they dropped out, study ended)

Question 8

Left-censored Data

Accepted Answer

Arises when Ti is known only to occur before some censoring time

Question 9

Kaplan Meier Estimator

Accepted Answer

The limit of the life-table estimator as the intervals become so small that at most one distinct observation falls within each interval; used only for right-censored or exact data; step function that jumps down at each death time
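
A minimal sketch of the product-limit calculation behind the card above, assuming right-censored data coded as 1 = death observed, 0 = censored. The function name `kaplan_meier` and the sample data are made up for illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    events[i] is 1 for an observed death and 0 for a right-censored time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose recorded time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # The curve only steps down at death times.
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        # Deaths and censored subjects both leave the risk set.
        n_at_risk -= removed
    return curve


times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
km = kaplan_meier(times, events)
```

Censored subjects never cause a drop in the curve; they only shrink the risk set used at later death times.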

Question 10

Signed Rank Test (Wilcoxon Signed Rank Test)

Accepted Answer

Assume no ties and no zero differences
Calculate the paired differences and their absolute values
Rank the absolute values
Attach the original signs to the ranks
Compute SR+, the sum of the positive signed ranks
(Can be used for paired data)

Question 11

SR+ for a signed rank test

Accepted Answer

Observed value of the sum of the positive signed ranks

Question 12

P-value for a Signed Rank Test

Accepted Answer

Number of the 2^n equally likely sign assignments (n = sample size, i.e. number of pairs) whose SR+ is greater than or equal to the observed SR+ (one-sided), divided by 2^n
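
The enumeration over all 2^n sign assignments can be sketched as follows, assuming no ties and no zero differences. The function name `signed_rank_pvalue` and the differences are hypothetical.

```python
from itertools import product


def signed_rank_pvalue(diffs):
    """One-sided exact p-value for SR+ (assumes no ties, no zero differences)."""
    # Rank the absolute differences from smallest (1) to largest (n).
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0] * len(diffs)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    # Observed SR+: sum of the ranks attached to positive differences.
    sr_obs = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under H0 every one of the 2^n sign patterns is equally likely.
    count = 0
    for signs in product((0, 1), repeat=len(diffs)):
        sr = sum(r for r, s in zip(ranks, signs) if s)
        if sr >= sr_obs:
            count += 1
    return sr_obs, count / 2 ** len(diffs)


diffs = [1.5, -0.5, 2.0, 3.5]
sr_plus, p_value = signed_rank_pvalue(diffs)
```

Full enumeration is only practical for small n, since the number of sign patterns doubles with each added pair.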

Question 13

Ties and 0's in Signed Rank Test

Accepted Answer

Average ties as usual
Use a method of ranking with zeros or omit zeros (SAS omits zeros)

Question 14

Zeros in the Signed Rank Test

Accepted Answer

If there are not too many, ranking with or without them will give similar results

Question 15

Sign Test

Accepted Answer

(can be used for paired data, does not consider ranks)

Question 16

SN+ in the sign test

Accepted Answer

The number of positive differences; follows the binomial distribution with p=.5

Question 17

Sign Test Hypotheses

Accepted Answer

H0: θd = 0 vs. Ha: θd ≠ 0

Question 18

Exact P-value for Sign Test

Accepted Answer

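1 - binomcdf(n, probofsuccess, value of interest - 1) for the upper tail P(SN+ >= value of interest); binomcdf gives P(X <= value), so 1 - binomcdf(n, probofsuccess, value) is P(X > value)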
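
As a rough illustration, the upper-tail probability can also be summed directly from the binomial probabilities instead of using a calculator's binomcdf. The function name `sign_test_upper_pvalue` and the numbers (10 pairs, 8 positive differences) are assumptions for the example.

```python
from math import comb


def sign_test_upper_pvalue(n, s_obs, p=0.5):
    """P(SN+ >= s_obs) when SN+ follows Binomial(n, p)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(s_obs, n + 1)
    )


# 10 pairs, 8 positive differences (made-up numbers).
p_exact = sign_test_upper_pvalue(10, 8)
```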

Question 19

Large Sample Approximation for Sign Test

Accepted Answer

Z = (SN+ - n/2) / sqrt(n/4)
Then use normcdf on Z to get the p-value
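
A short sketch of the normal approximation, using the standard library's `NormalDist` in place of a calculator's normcdf. The function name `sign_test_z` and the sample numbers are illustrative only, and no continuity correction is applied here.

```python
from math import sqrt
from statistics import NormalDist


def sign_test_z(n, s_obs):
    """Z statistic and upper-tail p-value for the sign test."""
    z = (s_obs - n / 2) / sqrt(n / 4)
    p_upper = 1 - NormalDist().cdf(z)
    return z, p_upper


# 100 pairs, 60 positive differences (made-up numbers).
z, p_approx = sign_test_z(100, 60)
```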

Question 20

Advantages of the Sign Test

Accepted Answer

Easy to perform
Protects against outliers
Efficient for heavy tailed distributions
Don't need actual data, just signs of differences

Question 21

Disadvantages of Sign Test

Accepted Answer

Not as powerful as Signed Rank Test

Question 22

LogRank Test Assumption

Accepted Answer

The times of interest are either exactly known or right-censored

Question 23

Logrank Test Hypotheses

Accepted Answer

H0: F0(t) = F1(t) for all t (F is the CDF)
Since S(t) = 1 - F(t), this is equivalent to S0(t) = S1(t)
Ha: the survival functions are not equal (two-sided)
Ha could also be stated as an inequality between the survival functions that is strict for at least one time t

Question 24

Pearson Correlation Coefficient

Accepted Answer

Measures the strength and direction of the LINEAR relationship between two variables

Question 25

Hypotheses for Pearson Correlation Coefficient

Accepted Answer

H0: rho equals 0 (no correlation)
Ha: rho does not equal 0 (correlation)

Question 26

Estimated Slope and Pearson's CC

Accepted Answer

Estimated slope = r * (Sy / Sx)

Question 27

Assumptions for Pearson's CC

Accepted Answer

Quantitative/ Numerical Data
Linearity

Question 28

Permutation Test for Rho

Accepted Answer

Same hypotheses
n! possible permutations

Question 29

Steps for Permutation test for Rho

Accepted Answer

Calculate robs (observed correlation)
Permute y's among x's n! ways
Calculate r for each permutation
Get p-value

Question 30

P-value for Permutation test for Rho

Accepted Answer

P-value = (# of r values >= robs) / R, where R is the number of permutations we consider
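
The steps can be sketched for a small sample where all n! permutations can be listed. The helper names `pearson_r` and `permutation_pvalue`, the one-sided (>=) comparison, and the data are assumptions for illustration.

```python
from itertools import permutations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_pvalue(x, y):
    """One-sided p-value: share of permutations with r >= observed r."""
    r_obs = pearson_r(x, y)
    # Keep x fixed and permute the y's among the x's in all n! ways.
    rs = [pearson_r(x, perm) for perm in permutations(y)]
    # Small tolerance guards against floating-point noise in the comparison.
    count = sum(1 for r in rs if r >= r_obs - 1e-12)
    return r_obs, count / len(rs)


x = [1, 2, 3, 4]
y = [2.0, 4.1, 5.9, 8.3]
r_obs, p_value = permutation_pvalue(x, y)
```

Here both lists are strictly increasing, so only the original ordering reaches the observed r and the p-value is 1/24. For larger n, a random sample of R permutations would replace the full list.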

Question 31

Spearman Correlation Coefficient

Accepted Answer

Measures extent to which y increases with x by comparing the ranks of the x's (1 to n) with the ranks of the y's (1 to n)
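
A minimal sketch of the rank comparison, assuming no ties so that the shortcut formula 1 - 6*sum(d^2)/(n(n^2 - 1)) applies. The function names and data are hypothetical.

```python
def ranks(values):
    """Ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(x, y):
    """Spearman correlation from the squared rank differences (no ties)."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n**2 - 1))


x = [10, 20, 30, 40, 50]
rs_mixed = spearman(x, [1, 4, 9, 16, 5])
# A monotone but nonlinear relationship still gives a perfect rank correlation.
rs_monotone = spearman(x, [v**3 for v in x])
```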

Question 32

Benefits of Spearman's Correlation Coefficient

Accepted Answer

Doesn't need quantitative data, only need to be able to rank it
Doesn't need to look linear

Question 33

Kendall's Tau

Accepted Answer

A measure of association between two variables based on counts of concordant and discordant pairs

Question 34

Calculation Tau

Accepted Answer

1 point for Concordant Pairs
0 Points for discordant Pairs
.5 points for tied pairs
Total the points, then
Tau = 2 * (total points / (n choose 2)) - 1
Tau is approximately normally distributed
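
The point-scoring scheme can be sketched directly from the concordant/discordant definitions, (xi-xj)*(yi-yj) > 0 or < 0. The function name `kendall_tau_points` and the data are made up for illustration.

```python
from itertools import combinations


def kendall_tau_points(x, y):
    """Score each pair (1 concordant, 0 discordant, 0.5 tied) and return tau."""
    points = 0.0
    n_pairs = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        sign_check = (xi - xj) * (yi - yj)
        if sign_check > 0:
            points += 1.0  # concordant
        elif sign_check == 0:
            points += 0.5  # tied
        # discordant pairs add nothing
        n_pairs += 1
    return 2 * points / n_pairs - 1


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_points(x, y)
```

With 5 concordant pairs and 1 discordant pair out of 6, tau works out to 2*(5/6) - 1 = 2/3.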

Question 35

Chi Square Test

Accepted Answer

Individuals are cross-classified into the cells of a two-way table according to two categorical characteristics

Question 36

Hypotheses for Chi Square Test (SRS)

Accepted Answer

"Testing independence between rows and columns"
H0 : Pij = Pi.*P.j
Ha : Not H0 (not independent)
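
Under H0 the expected count in cell (i, j) is (row total * column total) / n, which is the sample version of Pij = Pi.*P.j. A hedged sketch of the usual Pearson chi-square statistic follows; the function name `chi_square_statistic` and the 2x2 table are assumptions for the example.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns are independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


table = [[20, 30], [30, 20]]
stat, df = chi_square_statistic(table)
```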

Question 37

Hypotheses for Chi Square Test (Stratified or Completely Randomized Design)

Accepted Answer

"Test of Homogeneity"
H0 : Pi|j = Pi|j' (no association between rows and columns)
Ha : Not H0

Question 38

Kruskal Wallis Test

Accepted Answer

Nonparametric rank test for comparing K treatments. The test statistic is equivalent to the F statistic applied to the ranks. Natural extension of the Wilcoxon rank-sum test (WRST) for location (center of distribution)

Question 39

Hypotheses for Kruskal Wallis Test

Accepted Answer

H0: F1(x) = F2(x) = ... = FK(x), where K is the number of treatments
Ha: At least one F(x) is different

Question 40

Permutation Distribution based on Kruskal Wallis Test stat

Accepted Answer

For large samples, approximately a Chi-squared distribution with k-1 DF.

Question 41

Pairwise Comparisons

Accepted Answer

# of Treatment Groups choose 2 = possible comparisons

Question 42

Bonferroni

Accepted Answer

Alpha Adjustment Technique... 
Alpha / (# of comparisons)

Question 43

Bonferroni in the NonParametric Setting

Accepted Answer

Sample mean ranks are used instead of sample means
N(N+1)/12, the sample variance of the ranks (no ties), is used instead of MSE

Question 44

Kruskal Wallis Test Steps

Accepted Answer

1. Rank the data across all treatment groups combined
2. Find the average rank within each group
3. Get the KW statistic
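
A sketch of these steps, assuming no ties and using the common form KW = 12/(N(N+1)) * sum of n_i * (mean rank_i - (N+1)/2)^2. The function name `kruskal_wallis` and the data are made up for illustration.

```python
def kruskal_wallis(groups):
    """Mean rank per group and the KW statistic (assumes no tied values)."""
    # Step 1: rank all observations together, ignoring group labels.
    pooled = sorted(v for g in groups for v in g)
    rank_of = {v: r for r, v in enumerate(pooled, start=1)}
    n_total = len(pooled)
    # Step 2: average the ranks within each group.
    mean_ranks = [sum(rank_of[v] for v in g) / len(g) for g in groups]
    # Step 3: compare each group's mean rank with the overall mean rank.
    grand_mean_rank = (n_total + 1) / 2
    kw = 12 / (n_total * (n_total + 1)) * sum(
        len(g) * (m - grand_mean_rank) ** 2 for g, m in zip(groups, mean_ranks)
    )
    return mean_ranks, kw


groups = [[1.1, 2.3, 3.0], [4.2, 5.5, 6.1], [7.4, 8.8, 9.9]]
mean_ranks, kw = kruskal_wallis(groups)
```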

Question 45

Possible Permutations in Kruskal Wallis

Accepted Answer

(Number of Obs!) / (n1!n2!...nk!)
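
The count above is quick to check numerically. The function name `n_regroupings` is hypothetical.

```python
from math import factorial


def n_regroupings(sizes):
    """N! / (n1! n2! ... nk!) for the given group sizes."""
    total = factorial(sum(sizes))
    for n in sizes:
        total //= factorial(n)
    return total


# Three groups of three observations each.
count = n_regroupings([3, 3, 3])
```

Even three groups of three give 1680 regroupings, which is why a random sample of permutations is often used for larger designs.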

Question 46

p-value for K-W Stat

Accepted Answer

Calculate the KW statistic for every possible permutation (KW*); p-value = (# of KW* >= observed KW) / (total permutations considered)

Question 47

Experiment-Wise Error Rate

Accepted Answer

The probability of committing at least one type 1 error across a set of multiple comparisons; it is greater than the type 1 error rate of a single 2 sample test

Question 48

Bonferroni

Accepted Answer

Alpha adjustment technique (new alpha is family-wise alpha / number of comparisons); can be too conservative
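
A small illustration that combines the pairwise-comparison count (number of groups choose 2) with the adjustment. The function name `bonferroni_alpha` and the numbers are assumptions for the example.

```python
from math import comb


def bonferroni_alpha(family_alpha, n_groups):
    """Number of pairwise comparisons and the per-comparison alpha."""
    n_comparisons = comb(n_groups, 2)
    return n_comparisons, family_alpha / n_comparisons


# 4 treatment groups at a family-wise alpha of .05.
n_comparisons, alpha_each = bonferroni_alpha(0.05, 4)
```

With 4 groups there are 6 comparisons, so each one is tested at .05/6, roughly .0083.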

Question 49

Protected LSD

Accepted Answer

Only run multiple comparisons if the F-test is significant

Question 50

Tukey's HSD

Accepted Answer

Based on the studentized range (Q) distribution; a difference is significant if the absolute difference in means is greater than qstat*sqrt(MSE/n)

Question 51

Tukey

Accepted Answer

Invented Box-plot, stem and leaf plot, HSD & q statistic

Question 52

Pooled Standard Deviation (Sp)

Accepted Answer

sqrt(MSE) ..(from ANOVA table)
Use all N points in the formula: Sp^2 = [(n1-1)S1^2 + ... + (nK-1)SK^2] / (N-K)

Question 53

Standard Error

Accepted Answer

Sp* sqrt(1/n1 + 1/n2)
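
A sketch of both quantities, assuming the pooled variance is the degrees-of-freedom-weighted average of the group variances with N-K in the denominator. The function names and data are hypothetical.

```python
from math import sqrt
from statistics import variance


def pooled_sd(groups):
    """sqrt of sum((n_i - 1) * S_i^2) / (N - K), i.e. sqrt(MSE)."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    within_ss = sum((len(g) - 1) * variance(g) for g in groups)
    return sqrt(within_ss / (n_total - k))


def standard_error(sp, n1, n2):
    """Standard error of the difference between two group means."""
    return sp * sqrt(1 / n1 + 1 / n2)


groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [5.0, 5.0, 8.0]]
sp = pooled_sd(groups)
se = standard_error(sp, 3, 3)
```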

Question 54

Fcrit for multiple comparisons

Accepted Answer

Use DF = N-K

Question 55

Bonferroni Confidence Interval

Accepted Answer

width for each pairwise comparison is always the same

Question 56

Reasons to adjust for multiple comparisons

Accepted Answer

Overall confidence level goes down
Probability of a family-wise type 1 error would go up

Question 57

Kolmogorov-Smirnov Test

Accepted Answer

Designed to detect differences in location (center), scale (variability), or shape of two distributions

Question 58

Two sample t-test has correct type1 error rate and highest power among unbiased tests if...

Accepted Answer

the populations are normal with unknown but equal variances

Question 59

Conclusion

Accepted Answer

A statement about the alternative hypothesis

Question 60

Permutation test

Accepted Answer

Any test that finds the p-value as the proportion of regroupings that lead to a statistic as extreme or more extreme than what was observed

Question 61

Test the median when

Accepted Answer

The only assumption met is SRS (normality is violated, small sample size, skewness and outliers)

Question 62

Power

Accepted Answer

probability of correctly rejecting the null; (1-Beta)

Question 63

Bernoulli Trial

Accepted Answer

a trial or experiment with two possible outcomes

Question 64

Ansari-Bradley

Accepted Answer

Test on variances, won't work if medians are different, Rank from both ends, C=Sum of group 1 ranks.

Question	Answer
Omnibus Test	A test designed to pick up differences among treatments regardless of the nature of the differences between them
Correction for Ansari-Bradley	If medians are different, make them equal w/ addition or subtraction, apply to entire data set and re-run test
Permutation Principle	Says that the permutation distribution is an appropriate reference distribution for determining the p-value for a test
The type 1 error for a T-test will be close to alpha for large samples from any continuous distribution because of...	the central limit theorem
Permutation	A rearrangement of objects in which the order does matter
Disadvantages of a permutation test	Can be time consuming, large sample sizes require a lot of regroupings
T-test vs. Wilcoxon	Heavy tailed distribution = Wilcoxon; light tailed = t-test
A symmetric distribution	the binomial distribution when Pi=.5 (left-skewed when Pi>.5; right-skewed when Pi<.5)
Combination	A rearranging of objects in which order does NOT matter
U	The number of pairs for which Xi > Yj (if x=y we add .5 to U)
Efficiency of a test A to B	eff(AtoB) = NB/NA ... if eff >1 then A requires a smaller sample size; if eff <1 then B does
Mann Whitney & Wilcoxon Rank Sum	Are equivalent in that they are a function of one another (W = (n1(n1+1))/2 + U, where n1 is the size of the X sample and W is the sum of the X ranks)
K-S test statistic	The maximum, over all x, of the absolute value of the difference between the two estimated (empirical) CDFs
n! / (n-r)!	The number of ways to arrange r things chosen from a total of n things, when order matters (nPr)
Characteristics of a binomial Experiment	1. There are n Bernoulli trials, where n is known in advance 2. X is the number of successes in n trials 3. The n trials are independent 4. The true probability (Pi) is the same for every trial
Nonparametric Methods	1.Binomial Distribution 2. Permutation 3. Bootstrap resampling 4. Smoothing and Non-least squares
(n choose r)	n!/(r!(n-r)!) if order doesn't matter....nCr
Independence	2 things are independent if the probability of one event occurring does not affect the probability of another event occurring
Theta.5 vs ThetaH	True Median vs Some hypothesized value
Advantages of a Permutation test	Used on small sample sizes, no normality assumption, no equal variances
tcdf function	tcdf(Tobt, 9999, DF) [multiply by two if you need two-sided]
binomCDF function	binomCDF(n,probability,value of interest) : does probability less than & equal to value of interest (times 2 for two sided)
CDF graph	Probability on the y axis, x values on the x axis; step function with open points at the right end of each step
Exact Test on Binomial Distribution	WITH A CALCULATOR
Approximate Method for the Binomial Distribution	Z = (B - .5n) / sqrt(.25n) : P(Z>=Zobt) <- plug into NormCDF
NormCDF function	(Zobt, 9999, mean,stddev)
Continuity Correction	B- or B+ .5 approximate method formula
advantages to the sign test	only looked at positive or negative signs (paired test so indicates type of difference btwn pairs)
Disadvantages to the sign test	Didn't account for magnitude of the differences

Stats 256 Exam