When finding the p-value for a two-sided hypothesis test

It is 2 × P(T or Z > |W|), where W is the observed value of the test statistic.

What are the 3 Axioms of Probability?

1) P(A) ≥ 0 for all events A. 2) P(S) = 1 (whole sample space). 3) If A1, A2,... are mutually exclusive, then P(A1 ∪ A2 ∪ ...) = P(A1) + P(A2) + ...

What is the Addition Rule P(A∪B)?

P(A∪B) = P(A) + P(B) − P(A∩B). Subtract the intersection to avoid double-counting the overlap between A and B.

What is De Morgan's Law?

(A∪B)^c = A^c ∩ B^c. The complement of a union equals the intersection of the complements. Flip the operation and complement each part.

What is Conditional Probability P(A|B)?

P(A|B) = P(A∩B) / P(B). Probability of A GIVEN B occurred. Numerator is the joint probability; denominator is the probability of the condition.

What is the Conditional Complement rule?

P(A^c | B) = 1 − P(A|B). The complement rule still holds inside a conditional — the probabilities of A and not-A given B must sum to 1.

What is the Law of Total Probability?

If {B1,...Bm} is a partition, then P(A) = P(A|B1)P(B1) + ... + P(A|Bm)P(Bm). Break a hard probability into weighted conditional probabilities over all cases.

What is Bayes' Theorem?

P(A|B) = P(B|A)·P(A) / P(B). Reverses conditional probabilities. P(A) is the prior; P(A|B) is the posterior after observing B.
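
The Law of Total Probability and Bayes' Theorem can be checked numerically. The sketch below is only an illustration: the function names are made up, and the figures (a 75%/25% supplier split with 1% and 2.5% defect rates) are the ones used in the supplier cards of this deck.

```python
def total_probability(likelihoods, priors):
    """P(A) = sum of P(A|Bi) * P(Bi) over a partition B1..Bm."""
    return sum(l * p for l, p in zip(likelihoods, priors))

def bayes(likelihood, prior, evidence):
    """P(B|A) = P(A|B) * P(B) / P(A)."""
    return likelihood * prior / evidence

priors = [0.75, 0.25]          # P(supplier A), P(supplier B)
defect_rates = [0.01, 0.025]   # P(D|A), P(D|B)

p_defective = total_probability(defect_rates, priors)
p_b_given_d = bayes(defect_rates[1], priors[1], p_defective)
print(round(p_defective, 5), round(p_b_given_d, 4))
```

The evidence term P(D) is computed first with the total-probability sum and then reused as the denominator in Bayes' Theorem.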

When are events A and B independent?

A and B are independent iff P(A∩B) = P(A)·P(B). Equivalently P(A|B) = P(A) — knowing B happened gives no information about A.

What is the Permutations formula P(n,k)?

P(n,k) = n! / (n−k)!. Ordered arrangements of k objects from n distinct objects WITHOUT replacement. Order matters.

What is the Combinations formula C(n,k)?

C(n,k) = n! / (k!(n−k)!). Ways to choose k objects from n WITHOUT replacement where order does NOT matter. Also written "n choose k."

What is permutations WITH replacement?

n^k. The number of ordered arrangements of k objects chosen from n, where each object can be reused. Order matters and repetition is allowed.
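
The three counting formulas can be compared side by side. This is a small sketch using Python's standard `math` module (`math.perm` and `math.comb` need Python 3.8 or later); the values n=5 and k=3 are arbitrary illustration values.

```python
import math

n, k = 5, 3  # arbitrary example: pick 3 objects out of 5

ordered_no_repeat = math.perm(n, k)     # n! / (n-k)!
unordered_no_repeat = math.comb(n, k)   # n! / (k! (n-k)!)
ordered_with_repeat = n ** k            # n^k

print(ordered_no_repeat, unordered_no_repeat, ordered_with_repeat)
```

Dividing the permutation count by k! gives the combination count, since each unordered selection corresponds to k! orderings.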

What is the Equiprobable Probability formula?

P(Q) = |Q| / |S|. When all outcomes are equally likely, probability = number of favorable outcomes divided by total outcomes in sample space S.

What is the Binomial Probability formula?

P(X=k) = C(n,k)·p^k·(1−p)^(n−k). Probability of exactly k successes in n independent trials each with success probability p.
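
A minimal sketch of the binomial formula in Python, assuming only the standard library. The helper name `binomial_pmf` is made up for illustration; the two examples reuse figures from the practice cards (10 candies with p=0.3, and Binomial(8, 0.2)).

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Exactly 4 successes in 10 trials with p = 0.3
p_four = binomial_pmf(4, 10, 0.3)

# At least 2 successes in 8 trials with p = 0.2, via the complement
p_at_least_two = 1 - binomial_pmf(0, 8, 0.2) - binomial_pmf(1, 8, 0.2)

print(round(p_four, 4), round(p_at_least_two, 4))
```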

What is the Bernoulli Distribution — PMF, mean, variance?

PMF: P(X=1)=p, P(X=0)=1−p. Mean = p. Variance = p(1−p). Models a single trial with two outcomes: success (1) or failure (0).

What is the Geometric Distribution — PMF, mean, variance?

PMF: PX(k) = p(1−p)^(k−1) for k=1,2,... Number of trials until the FIRST success. Mean = 1/p. Variance = (1−p)/p².

What is the Binomial Distribution — PMF, mean, variance?

PMF: PX(k) = C(n,k)·p^k·(1−p)^(n−k) for k=0,...,n. Counts successes in n independent trials. Mean = np. Variance = np(1−p).

What is the Pascal Distribution — PMF, mean, variance?

PMF: PX(k) = C(k−1,m−1)·p^m·(1−p)^(k−m) for k=m,m+1,... Trials needed for m successes. Mean = m/p. Variance = m(1−p)/p². Geometric is Pascal with m=1.
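
The Pascal PMF can be sketched in a few lines of Python. The function name `pascal_pmf` is hypothetical; the best-of-7 example (first team to 4 wins, p=0.6 per game) mirrors one of the practice cards.

```python
from math import comb

def pascal_pmf(k, m, p):
    """P(the m-th success occurs on trial k), for k >= m."""
    return comb(k - 1, m - 1) * p**m * (1 - p)**(k - m)

# Best-of-7 series ends in exactly 5 games: either side gets its 4th win in game 5.
p = 0.6
p_series_in_5 = pascal_pmf(5, 4, p) + pascal_pmf(5, 4, 1 - p)

# With m = 1 the Pascal PMF reduces to the Geometric PMF p(1-p)^(k-1).
geometric_check = pascal_pmf(3, 1, 0.4)

print(round(p_series_in_5, 4), round(geometric_check, 3))
```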

What is the Poisson Distribution — PMF, mean, variance?

PMF: PX(k) = λ^k · e^(−λ) / k! for k=0,1,2,... Models rare events. Mean = λ. Variance = λ. The parameter λ is both mean AND variance.

What is the Discrete Uniform Distribution — PMF, mean, variance?

PMF: PX(k) = 1/(b−a+1) for k=a,a+1,...,b. All integers equally likely. Mean = (a+b)/2. Variance = ((b−a+1)²−1)/12.

What is the Continuous Uniform Distribution — PDF, mean, variance?

PDF: fX(x) = 1/(b−a) for a ≤ x ≤ b, else 0. Mean = (a+b)/2. Variance = (b−a)²/12. All values in [a,b] are equally likely.

What is the Exponential Distribution — PDF, mean, variance?

PDF: fX(x) = λe^(−λx) for x ≥ 0. Models time between events. Mean = 1/λ. Variance = 1/λ². λ is the rate parameter; higher λ means shorter average wait.

What is the Erlang Distribution — PDF, mean, variance?

PDF: fX(x) = λ^n · x^(n−1) · e^(−λx) / (n−1)! for x ≥ 0. Sum of n independent Exponential(λ) r.v.s. Mean = n/λ. Variance = n/λ².

What is the Gaussian (Normal) Distribution — PDF and CDF?

PDF: fX(x) = (1/(σ√(2π)))·e^(−(x−µ)²/(2σ²)). CDF: FX(x) = Φ((x−µ)/σ). Mean = µ. Variance = σ². Φ is the standard Normal CDF.

What does Φ(z) mean and what is its complement rule?

Φ(z) = P(Z ≤ z) for Z ~ Normal(0,1). Complement rule: Φ(−z) = 1 − Φ(z). Standardize any Normal: z = (x−µ)/σ, then look up Φ.
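
Instead of a lookup table, Φ can be evaluated with the error function, using the identity Φ(z) = ½(1 + erf(z/√2)). The sketch below assumes only Python's standard `math` module; the helper names and the example (µ=450, σ=90, threshold 550, as in the bear cards) are for illustration.

```python
from math import erf, sqrt

def phi(z):
    """Standard Normal CDF, P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_cdf(x, mu, sigma):
    """CDF of Normal(mu, sigma) obtained by standardizing first."""
    return phi((x - mu) / sigma)

# Upper tail P(X > 550) for mu = 450, sigma = 90
upper_tail = 1 - normal_cdf(550, 450, 90)
print(round(upper_tail, 4))
```

Because this uses the unrounded z value, the result can differ in the third or fourth decimal from a table lookup with z rounded to two places.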

What is the relationship between PDF fX(x) and CDF FX(x)?

fX(x) = dFX(x)/dx. FX(x) = ∫[−∞ to x] fX(u)du. PDF is the derivative of CDF; CDF is the integral of PDF.

How do you compute P(a ≤ X ≤ b)?

P(a ≤ X ≤ b) = ∫[a to b] fX(x)dx = FX(b) − FX(a). Integrate the PDF over the interval, or subtract CDF values at the endpoints.

What is Expected Value E[X] for a discrete random variable?

E[X] = Σ x·PX(x) over all x in the range. Weighted average of all possible values, each weighted by its probability.

What is E[g(X)] for a continuous random variable?

E[g(X)] = ∫[−∞ to +∞] g(x)·fX(x)dx. Integrate g(x) weighted by the PDF. For discrete: E[g(X)] = Σ g(x)·PX(x).
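
When the integral is awkward by hand, E[g(X)] can be approximated numerically. This is a rough sketch with a simple midpoint rule (not a production integrator); the PDF f(y) = 4y(1−y²) on [0, 1] is taken from the practice cards, and the function names are made up.

```python
def integrate(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return width * sum(func(a + (i + 0.5) * width) for i in range(steps))

def pdf(y):
    """Example density, supported on [0, 1]."""
    return 4 * y * (1 - y**2)

total = integrate(pdf, 0, 1)                               # normalization check
mean = integrate(lambda y: y * pdf(y), 0, 1)               # E[Y]
second_moment = integrate(lambda y: y**2 * pdf(y), 0, 1)   # E[Y^2]
variance = second_moment - mean**2                         # shortcut formula

print(round(total, 4), round(mean, 4), round(variance, 4))
```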

What are the Linear Expectation and Variance rules?

E[aX+b] = a·E[X]+b. Var(aX+b) = a²·Var(X). Constants shift the mean but not variance; scaling multiplies variance by the square of the scale factor.

What is Variance Var(X) and its shortcut formula?

Var(X) = E[(X−µ)²] = E[X²] − µ². Average squared deviation from the mean. Shortcut: compute E[X²] then subtract the square of the mean.

What is the PMF Normalization condition?

Σ PX(x) = 1 over all x. All probabilities must sum to 1. For continuous: ∫ fX(x)dx = 1. A valid distribution must satisfy this.

What is the Joint CDF FX,Y(x,y)?

FX,Y(x,y) = P(X ≤ x, Y ≤ y). The probability that X is at most x AND Y is at most y simultaneously.

How do you find Marginal PDFs from a joint PDF?

fX(x) = ∫[−∞ to +∞] fX,Y(x,y)dy. Integrate out Y to get the marginal of X. Similarly fY(y) = ∫ fX,Y(x,y)dx. Removes dependence on the other variable.

What is the Conditional PDF fX|Y(x|y)?

fX|Y(x|y) = fX,Y(x,y) / fY(y). Density of X given Y=y. Divide joint density by the marginal of the conditioning variable Y.

When are X and Y independent (joint distributions)?

X and Y are independent iff fX,Y(x,y) = fX(x)·fY(y) for continuous, or PX,Y(x,y) = PX(x)·PY(y) for discrete. Joint equals the product of the marginals.

What is Linearity of Expectation for joint r.v.s?

E[g(X) + h(Y)] = E[g(X)] + E[h(Y)]. Always true regardless of whether X and Y are independent. Expectation distributes over sums.

What is Covariance Cov(X,Y)?

Cov(X,Y) = E[(X−µX)(Y−µY)] = E[XY] − µX·µY. Measures how X and Y vary together. Positive = move together; negative = move oppositely; zero = uncorrelated.

What is the Correlation Coefficient ρ(X,Y)?

ρ(X,Y) = Cov(X,Y) / (σX·σY). Normalized covariance; always in [−1,1]. ρ=1 perfect positive linear relationship; ρ=−1 perfect negative; ρ=0 uncorrelated.
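
Covariance and correlation can be computed directly from a joint PMF. The joint distribution below is hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
from math import sqrt

# Hypothetical joint PMF P(X=x, Y=y); the four probabilities sum to 1.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def expect(g):
    """E[g(X, Y)] under the joint PMF."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - mu_x * mu_y
var_x = expect(lambda x, y: x**2) - mu_x**2
var_y = expect(lambda x, y: y**2) - mu_y**2
rho = cov / sqrt(var_x * var_y)

print(round(cov, 2), round(rho, 2))
```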

What is Var(X+Y)?

Var(X+Y) = Var(X) + Var(Y) + 2·Cov(X,Y). If X and Y are independent, Cov=0, so Var(X+Y) = Var(X)+Var(Y).

What is E[Wn] and Var(Wn) for a sum of n i.i.d. r.v.s?

Wn = X1+...+Xn. E[Wn] = n·E[X]. Var(Wn) = n·Var(X). Std dev = σX·√n. The mean scales linearly with n but std dev only grows as √n.

What is the Sample Mean X̄ — its mean and variance?

X̄ = (1/n)·ΣXi. E[X̄] = µX (unbiased). Var(X̄) = Var(X)/n. Std dev = σX/√n. As n grows, X̄ becomes more concentrated around the true mean.

What is the Central Limit Theorem (CLT)?

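For large n, Wn = ΣXi is approximately Normal(n·µX, σX·√n), where the second parameter is the standard deviation. Equivalently, X̄ ≈ Normal(µX, σX/√n). Written: FWn(w) ≈ Φ((w−nµX)/(σX·√n)). Works regardless of the original distribution, provided the Xi are i.i.d. with finite variance.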

What is the Continuity Correction for CLT?

P(a ≤ W ≤ b) ≈ P(a−0.5 ≤ WCLT ≤ b+0.5). Used when W is a discrete integer-valued r.v. approximated by a continuous Normal. Expand interval by 0.5 on each side.
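
The effect of the continuity correction can be seen by comparing the Normal approximation against an exact Binomial sum. The sketch below uses hypothetical values (W ~ Binomial(100, 0.5), interval 45 to 55) and only the standard library.

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard Normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 100, 0.5      # hypothetical example: W ~ Binomial(n, p)
a, b = 45, 55
mu, sigma = n * p, sqrt(n * p * (1 - p))

# Exact P(a <= W <= b) from the Binomial PMF
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a, b + 1))

# Normal approximation without and with the 0.5 correction
plain = phi((b - mu) / sigma) - phi((a - mu) / sigma)
corrected = phi((b + 0.5 - mu) / sigma) - phi((a - 0.5 - mu) / sigma)

print(round(exact, 4), round(plain, 4), round(corrected, 4))
```

In this example the corrected approximation lands much closer to the exact value than the uncorrected one.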

What is the Weak Law of Large Numbers?

lim[n→∞] P(|X̄ − µX| < ε) = 1 for any ε > 0. The sample mean converges in probability to the true mean as n → ∞. Foundation of statistical estimation.

What is Bias of an estimator?

B(θ̂) = E[θ̂] − θ. Systematic error: how far the estimator's expected value is from the true parameter θ. Unbiased means B=0. Example: X̄ is unbiased for µ.

What is Mean Squared Error (MSE) of an estimator?

MSE(θ̂) = E[(θ̂−θ)²] = Var(θ̂) + [B(θ̂)]². Total error = variance + bias². When unbiased, MSE = Variance. Lower MSE means a better estimator.

What is the Sample Variance S² and why n−1?

S² = (1/(n−1))·Σ(Xi−X̄)². Dividing by n−1 (not n) makes S² an unbiased estimator of σ². The −1 corrects for measuring deviations from X̄, which is itself estimated from the same data, rather than from the true mean µ.

What is the Likelihood Function and MLE?

L(θ) = Π fX(xi;θ). The MLE is θ̂ = argmax L(θ) — the parameter making observed data most probable. Often maximize log L = Σ log fX(xi;θ) for easier computation.
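
As an illustration of maximizing the log-likelihood, the sketch below does a brute-force grid search for a Bernoulli parameter p. The data are made up, and the grid search is just one simple way to find the maximum; for Bernoulli data the closed-form MLE is the sample mean, which the search should reproduce.

```python
from math import log

data = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]  # made-up Bernoulli observations

def log_likelihood(p, xs):
    """log L(p) = sum of log P(X = x_i; p) for Bernoulli observations."""
    return sum(log(p) if x == 1 else log(1 - p) for x in xs)

# Candidate values 0.001, 0.002, ..., 0.999 (endpoints excluded to avoid log(0))
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=lambda p: log_likelihood(p, data))

print(p_hat, sum(data) / len(data))
```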

Stats and Prob

Question	Answer
X̄ (X bar)	sample mean
µ (mu, sometimes typed as u)	population mean
What are the critical values zα and zα/2?	P(Z > zα) = α and P(Z > zα/2) = α/2 where Z ~ Normal(0,1). So zα = Φ⁻¹(1−α) and zα/2 = Φ⁻¹(1−α/2). For 95% CI: α=0.05 and zα/2 ≈ 1.96.
What is a Confidence Interval for the mean using z?	P(X̄ − δ ≤ µ ≤ X̄ + δ) ≥ 1−α, where δ = zα/2·S/√n. A 95% CI uses z=1.96. Interpretation: 95% of such intervals constructed this way will contain the true µ.
When do you use Student's t instead of z for a CI?	When n is small AND σ is unknown. Use δ = tα/2,n−1·S/√n where t has n−1 degrees of freedom. The t-distribution has heavier tails; it approaches Normal as n→∞.
What is the Confidence Interval for variance σ²?	xl = (n−1)S²/χ²(α/2,n−1) and xh = (n−1)S²/χ²(1−α/2,n−1). Uses the chi-squared distribution with n−1 degrees of freedom. The interval is asymmetric.
How do you test H0: µ=µ0 vs H1: µ≠µ0 (two-sided)?	Test statistic W = (X̄−µ0)/(σ/√n). Reject H0 if |W| > zα/2. If σ unknown use S; if n small use tα/2,n−1. Deviations in either direction count as evidence.
How do you test H0: µ≤µ0 vs H1: µ>µ0 (one-sided upper)?	Test statistic W = (X̄−µ0)/(σ/√n). Reject H0 if W > zα. One-sided test — only large positive values of W provide evidence against H0.
How do you test H0: µ≥µ0 vs H1: µ<µ0 (one-sided lower)?	Test statistic W = (X̄−µ0)/(σ/√n). Reject H0 if W < −zα. One-sided test — only large negative values of W provide evidence against H0.
How do you compare two means H0: µA=µB vs H1: µA≠µB?	Test statistic W = (X̄A−X̄B) / √((σA²+σB²)/n). Reject H0 if |W| > zα/2. Assumes equal n; use S if σ unknown; use tα/2,2n−2 if n is small.
What is the Chi-squared goodness-of-fit test?	X² = Σ(Oi−Ei)²/Ei where Oi=observed count, Ei=expected count in bin i. Reject H0 (data fits the distribution) if X² > χ²(α,k−1) where k=number of bins.
What is the p-value in hypothesis testing?	The p-value is the lowest significance level α at which H0 would be rejected. If p-value < α, reject H0. A smaller p-value means stronger evidence against H0.
What is the Simple Linear Regression model?	Y = β0 + β1·X + ε. β0 = intercept (Y when X=0). β1 = slope (change in Y per unit increase in X). ε = random error with mean 0.
What are the OLS formulas for β̂0 and β̂1?	β̂1 = sxy/sxx where sxy=Σ(xi−x̄)(yi−ȳ) and sxx=Σ(xi−x̄)². β̂0 = ȳ − β̂1·x̄. The regression line always passes through (x̄, ȳ).
What are sxx, syy, and sxy in linear regression?	sxx = Σ(xi−x̄)² (spread of X). syy = Σ(yi−ȳ)² (spread of Y). sxy = Σ(xi−x̄)(yi−ȳ) (co-spread of X and Y). Building blocks of OLS regression.
What are predicted values ŷi and residuals ei?	ŷi = β̂0 + β̂1·xi (fitted value on the regression line). ei = yi − ŷi (residual = actual minus predicted). Residuals represent the unexplained part of Y.
What is r² in linear regression and what does it mean?	r² = sxy² / (sxx·syy). Proportion of variance in Y explained by X. Ranges 0 to 1. r²=1 means perfect linear fit; r²=0 means X explains none of the variation in Y.
What is the Standard Error of the slope Sβ1?	Sβ1 = √( (1/(n−2))·Σei² / Σ(xi−x̄)² ). Measures uncertainty in the slope estimate. Used in test: W = β̂1/Sβ1 to test if β1=0.
How do you test whether X and Y are correlated in regression?	H0: β1=0 vs H1: β1≠0. Test statistic W = β̂1/Sβ1. Reject H0 if |W| > zα/2. If n is small, use tα/2,n−2. Rejecting H0 means slope is significantly nonzero.
Mid1 A Q1 SB-a — A={X is even}={2,4,6}, B={X>4}={5,6}. What is (A∩B)^c?	First find A∩B = {6}, then take the complement. (A∩B)^c = {1,2,3,4,5}. Key: De Morgan's is NOT needed here — just complement the intersection directly.
Mid1 A Q1 SB-b — A={2,4,6}, B={5,6}, die roll. What theory do you need to find P(A|B)?	Conditional Probability: P(A|B) = P(A∩B)/P(B). A∩B={6}, so P(A∩B)=1/6. P(B)=2/6. Answer = (1/6)/(2/6) = 1/2.
Mid1 A Q1 SB-c — A={2,4,6}, B={5,6}, die roll. What theory do you need to find P(B|A)?	Conditional Probability: P(B|A) = P(A∩B)/P(A). A∩B={6}, P(A∩B)=1/6. P(A)=3/6. Answer = (1/6)/(3/6) = 1/3.
Mid1 A Q1 SB-d — Events A={2,4,6}, B={5,6}, C={1,2,3}. Which pairs are disjoint?	Two events are disjoint if their intersection is empty. B∩C = {} (no overlap between {5,6} and {1,2,3}). So B and C are disjoint. Check all pairs: A∩B={6}≠∅, A∩C={2}≠∅, B∩C=∅.
Mid1 A Q1 SB-e — A={2,4,6}, B={5,6} on a fair die. How do you prove A and B are independent?	Independence check: P(A)·P(B) must equal P(A∩B). P(A)=1/2, P(B)=1/3, P(A∩B)=1/6. Since (1/2)·(1/3) = 1/6 = P(A∩B), A and B ARE independent.
CONCEPT — How do you identify a Conditional Probability problem?	Look for the word "given" or the | symbol. Setup: you already know one event occurred, and want the probability of another. Formula: P(A|B) = P(A∩B)/P(B). Always find the joint and the conditioning event's probability.
CONCEPT — How do you identify an Independence vs Disjoint problem?	Disjoint: events CANNOT happen together — P(A∩B)=0. Independent: events don't AFFECT each other — P(A∩B)=P(A)·P(B). WARNING: disjoint events with nonzero probability are NEVER independent.
Mid1 A Q2 SB-a — X~Geometric(0.4). Calculate P(X<2 | X<3).	Conditional probability. P(X<2|X<3) = P(X<2 AND X<3)/P(X<3) = P(X=1)/P(X=1 or X=2). PX(1)=0.4, PX(2)=0.4(0.6)=0.24. Answer = 0.4/(0.4+0.24) = 0.625.
Mid1 A Q2 SB-b — Y~Binomial(8, 0.2). Calculate P(Y≥2).	Use complement: P(Y≥2) = 1 - P(Y<2) = 1 - [P(Y=0)+P(Y=1)]. P(Y=0)=C(8,0)(0.2)^0(0.8)^8=0.1677. P(Y=1)=C(8,1)(0.2)^1(0.8)^7=0.3355. Answer = 1 - 0.1677 - 0.3355 = 0.4968.
Mid1 A Q2 SB-c — Y~Binomial(8,0.2). Make 10 measurements of Y, find mean number of times Y=1.	This is a Binomial WITHIN a Binomial. Each measurement has P(Y=1)=0.3355. N = number of times Y=1 in 10 trials ~ Binomial(10, 0.3355). Mean = n·p = 10 × 0.3355 = 3.355.
CONCEPT — How do you identify a Geometric distribution problem?	Key phrases: "first success," "keep trying until," "number of trials until." PMF: PX(k)=p(1-p)^(k-1). Mean=1/p. The experiment stops as soon as you succeed.
CONCEPT — How do you identify a Binomial distribution problem?	Key phrases: "exactly k successes," "out of n trials," fixed number of trials, each trial is independent with same probability p. PMF: C(n,k)·p^k·(1-p)^(n-k). Mean=np.
CONCEPT — When should you use the complement to find a probability?	Use complement when computing "at least one" or "at least k" directly is hard. P(at least one) = 1 - P(none). P(Y≥2) = 1 - P(Y=0) - P(Y=1). The complement is often far fewer terms to calculate.
Mid1 A Q3 SB-a — 10 M&Ms chosen, P(blue)=0.3. What is P(exactly 4 blue)?	Binomial distribution with n=10, p=0.3. P(N=4) = C(10,4)·(0.3)^4·(0.7)^6 = 210·0.0081·0.1176 ≈ 0.2001.
Mid1 A Q3 SB-b — Choose 2 M&Ms, P(R)=0.5, P(W)=0.2, P(B)=0.3. What is P(same color)?	P(same color) = P(RR)+P(WW)+P(BB) = (0.5)²+(0.2)²+(0.3)² = 0.25+0.04+0.09 = 0.38. Key: treat each candy as an independent draw.
Mid1 A Q3 SB-c — Choose 3 M&Ms, P(R)=0.5, P(W)=0.2, P(B)=0.3. What is P(one of each color)?	P(one specific order like RWB) = 0.5×0.2×0.3 = 0.03. There are 3! = 6 permutations of 3 colors. So P = 6×0.03 = 0.18. Key: multiply by number of orderings!
CONCEPT — How do you recognize a problem needing permutations of arrangements?	When outcome has multiple distinguishable items and ORDER doesn't matter for the event, count the number of orderings (k! for k distinct items) and multiply by the probability of one specific order.
Mid1 A Q4 SB-a — 75% parts from supplier A (1% defective), 25% from B (2.5% defective). What is P(defective)?	Law of Total Probability: P(D) = P(D|A)·P(A) + P(D|B)·P(B) = 0.01·0.75 + 0.025·0.25 = 0.0075 + 0.00625 = 0.01375.
Mid1 A Q4 SB-b — Same setup. Given a part is defective, what is P(it came from supplier B)?	Bayes' Theorem: P(B|D) = P(D|B)·P(B)/P(D) = (0.025·0.25)/0.01375 = 0.00625/0.01375 ≈ 0.4545.
Mid1 A Q4 SB-c — Same setup. What is P(part from supplier A | part is NOT defective)?	Bayes' with complement: P(A|D^c) = P(D^c|A)·P(A)/P(D^c) = (0.99·0.75)/(1-0.01375) = 0.7425/0.98625 ≈ 0.7529.
CONCEPT — How do you identify a Law of Total Probability problem?	Look for a partition (mutually exclusive, exhaustive groups like supplier A vs B, or disease vs no disease) and you want the overall probability of an event. Formula: P(event) = sum of P(event|group_i)·P(group_i).
CONCEPT — How do you identify a Bayes' Theorem problem?	You're asked to REVERSE the conditioning: you know P(B|A) but want P(A|B). Key phrase: "given that it IS defective/positive/outcome, what is the probability it CAME FROM group X?" Formula: P(A|B) = P(B|A)·P(A)/P(B).
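Mid1 A Q5 SB-a — Baseball throw, p=0.1, max 3 throws, stop if you win. Find the PMF of X (number of throws made).	PX(1) = 0.1 (win on throw 1). PX(2) = 0.9·0.1 = 0.09 (miss, then win). PX(3) = 0.9² = 0.81 (two misses, so the third throw is made whether it wins or not). Check: 0.1 + 0.09 + 0.81 = 1.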
Mid1 A Q5 SB-b — Same baseball game. What is P(you win the prize)?	P(win) = 1 - P(NNN) = 1 - (0.9)^3 = 1 - 0.729 = 0.271. Always check: can you use the complement to avoid summing multiple cases?
Mid1 A Q5 SB-c — Same baseball game. What is P(X=2 | you win)?	Conditional probability: P(X=2|win) = P(X=2 AND win)/P(win). If X=2 you must have won on throw 2 (NY), so P(X=2 AND win) = P(X=2) = 0.09. P(X=2|win) = 0.09/0.271 ≈ 0.332.
CONCEPT — How do you set up a non-standard sequential probability problem?	Draw a tree diagram. At each branch: success (probability p, stops) or failure (probability 1-p, continues). Multiply along branches for each outcome. Check probabilities sum to 1. Only use Geometric when there is NO cap on trials.
Mid1 A Q6 SB-a — Pull 3 tiles from 26-letter bag. What is P(at least one of {x,y,z,p,d,q})?	Complement method: P(at least one) = 1 - P(none of them). P(none) = (20/26)·(19/25)·(18/24) ≈ 0.438. Answer = 1 - 0.438 = 0.562. Key: sampling WITHOUT replacement so probabilities change each draw.
Mid1 A Q6 SB-b — Best-of-7 series, C wins each game with p=0.6. What is P(series ends in exactly 5 games)?	Pascal distribution: series ends in 5 games means one team wins game 5 (their 4th win) with 3 wins in first 4 games. P(C wins in 5) = C(4,3)·(0.6)^4·(0.4)^1. P(D wins in 5) = C(4,3)·(0.4)^4·(0.6)^1. Total = 0.2074+0.06144 = 0.2688.
CONCEPT — How do you identify a Pascal (Negative Binomial) distribution problem?	Look for: "series ends when one side reaches k wins," "mth success occurs on the nth trial." The last trial MUST be the mth success. Formula: P(X=n) = C(n-1,m-1)·p^m·(1-p)^(n-m). The key: fix the last event as the success, permute the earlier ones.
CONCEPT — How do you identify sampling WITHOUT replacement vs WITH replacement?	Without replacement: each draw changes the pool (like drawing tiles, cards, people). Probabilities change each draw. Use products of fractions: (n/N)·((n-1)/(N-1))... With replacement: pool stays the same, use p^k. "Randomly select from a bag/deck/group"
Mid1 B Q1 SB-a — A={X is odd}={1,3,5}, B={X<2}={1}, die roll. Find (A∩B)^c.	A∩B = {1}. Complement = everything not in {1} = {2,3,4,5,6}.
Mid1 B Q1 SB-e — A={1,3,5}, B={1,2}, fair die. Verify A and B are independent.	P(A)=3/6=1/2, P(B)=2/6=1/3, A∩B={1}, P(A∩B)=1/6. Check: P(A)·P(B) = 1/2·1/3 = 1/6 = P(A∩B). ✓ Independent.
Mid1 B Q2 SB-a — 80% parts from A (1% defective), 20% from B (2.5% defective). What is P(defective)?	Law of Total Probability: P(D) = P(D|A)·P(A) + P(D|B)·P(B) = 0.01·0.8 + 0.025·0.2 = 0.008 + 0.005 = 0.0130.
Mid1 B Q2 SB-b — Same setup (80/20 split). Given defective, what is P(from supplier B)?	Bayes': P(B|D) = P(D|B)·P(B)/P(D) = (0.025·0.2)/0.0130 = 0.005/0.0130 ≈ 0.3846.
Mid1 B Q2 SB-c — Same setup (80/20 split). Given NOT defective, what is P(from supplier A)?	Bayes' with complement: P(A|D^c) = P(D^c|A)·P(A)/P(D^c) = (0.99·0.8)/(1-0.0130) = 0.792/0.9870 ≈ 0.8024.
Mid1 B Q3 SB-b — Best-of-7 series, C wins with p=0.70. What is P(series ends in exactly 5 games)?	Pascal: P(C wins in 5) = C(4,3)·(0.7)^4·(0.3)^1 = 4·0.2401·0.3 = 0.2881. P(D wins in 5) = C(4,3)·(0.3)^4·(0.7)^1 = 4·0.0081·0.7 = 0.0227. Total ≈ 0.3108.
CONCEPT — What is the key difference between Exam A Q4 and Exam B Q2 (supply chain)?	Same problem structure (Bayes' + Total Probability) but different split percentages: Version A uses 75%/25% while Version B uses 80%/20%. The METHOD is identical — always set up P(D) first with Total Probability, then use Bayes' for the reverse conditiona
CONCEPT — General strategy: given a probability word problem, how do you pick the right tool?	Step 1: Is there a condition ("given")? → Conditional probability or Bayes'. Step 2: Fixed n trials, count successes? → Binomial. Step 3: Trials until first success, no cap? → Geometric. Step 4: Trials until mth success / series ends? → Pascal. Step 5: Mu
Mid2 A Q1 SB-a — Bears: µ=450 kg, σ=90 kg, Normal. What fraction weigh MORE than 550 kg?	Standardize: z = (550−450)/90 = 1.11. P(X>550) = 1 − Φ(1.11) = 1 − 0.8665 = 0.1335. Key step: always standardize first with z=(x−µ)/σ, then use the complement 1−Φ(z).
Mid2 A Q1 SB-b — Same bear distribution. Given a bear is below-average weight (X<450), what is P(X>300)?	Conditional on X<450: P(X>300|X<450) = P(300<X<450)/P(X<450). P(X<450)=0.5. Standardize 300: z=(300−450)/90=−1.67. Φ(−1.67)=1−0.9525=0.0475. Numerator = 0.5−0.0475 = 0.4525. Answer = 0.4525/0.5 = 0.905.
Mid2 A Q1 SB-c — Same bear distribution. What minimum weight puts a bear in the heaviest 5%?	Set P(X>x)=0.05, so P(X≤x)=0.95. Inverse normal: (x−µ)/σ = Φ⁻¹(0.95) = 1.65. Solve: x = 450 + 1.65×90 = 598.5 kg. Key: "heaviest 5%" means top 5%, so set CDF = 0.95 and solve for x.
Mid2 B Q1 SB-b — Bears µ=450, σ=90. Given below-average weight (X<450), what is P(X>400)?	P(X>400|X<450) = P(400<X<450)/P(X<450). P(X<450)=0.5. z=(400−450)/90=−0.56. Φ(−0.56)=1−0.7123=0.2877. Numerator=0.5−0.2877=0.2123. Answer=0.2123/0.5≈0.425.
Mid2 B Q1 SB-c — Same bear distribution. What minimum weight is in the heaviest 10%?	P(X>x)=0.10, so FX(x)=0.90. Φ⁻¹(0.90)=1.28. x = 450 + 1.28×90 = 565 kg. Compare to Version A (5% → 1.65 → 598.5 kg): higher percentile cutoff means higher weight.
CONCEPT — How do you identify a Normal Distribution / standardization problem?	Key phrases: "normally distributed," "mean µ and std dev σ," "what fraction," "what probability," "what weight/score/value." Steps: 1) Standardize z=(x−µ)/σ. 2) Look up Φ(z). 3) Use complement (1−Φ) for upper tail. 4) For inverse ("what value"), set Φ(z)=
CONCEPT — How do you find a percentile / inverse Normal value?	"Top k%" → P(X>x)=k/100 → FX(x)=1−k/100 → z=Φ⁻¹(1−k/100) → x=µ+z·σ. Key z-values: top 5%: z=1.645. Top 10%: z=1.28. Top 1%: z=2.33. Top 2.5%: z=1.96.
CONCEPT — How do you handle a conditional Normal probability?	P(a<X<b \| X<c) = P(a<X<b AND X<c) / P(X<c). Simplify the numerator by finding the overlap of the two conditions. Then standardize both boundaries and subtract CDF values. P(X<µ)=0.5 always for symmetric Normal.
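The three Normal steps on the cards above (upper tail, conditional probability, percentile) can be checked with a short Python sketch. It assumes the bear parameters µ=450, σ=90; the variable names are illustrative only. Results differ slightly from the table-based answers because z is not rounded to two decimals.

```python
from statistics import NormalDist

# Bear weights from the cards above: mean 450 kg, std dev 90 kg.
weights = NormalDist(mu=450, sigma=90)

# Upper tail: P(X > 550) = 1 - F(550).
p_over_550 = 1 - weights.cdf(550)

# Conditional: P(X > 300 | X < 450) = P(300 < X < 450) / P(X < 450).
p_cond = (weights.cdf(450) - weights.cdf(300)) / weights.cdf(450)

# Percentile: the lightest weight in the heaviest 5% solves F(x) = 0.95.
cutoff_top5 = weights.inv_cdf(0.95)

print(round(p_over_550, 4), round(p_cond, 4), round(cutoff_top5, 1))
```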
Mid2 A Q3 — fY(y)=4y(1−y²) for 0≤y≤1. What is the mean E[Y]?	E[Y] = ∫[0 to 1] y·4y(1−y²)dy = ∫[0 to 1] (4y²−4y⁴)dy = [4y³/3 − 4y⁵/5] from 0 to 1 = 4/3 − 4/5 = 8/15 ≈ 0.533.
Mid2 A Q3 — fY(y)=4y(1−y²). What is Var(Y)?	First find E[Y²] = ∫[0 to 1] y²·4y(1−y²)dy = ∫(4y³−4y⁵)dy = [y⁴−4y⁶/6] from 0 to 1 = 1−2/3 = 1/3. Var(Y) = E[Y²]−(E[Y])² = 1/3−(8/15)² = 1/3−64/225 ≈ 0.0489. σY = 0.221.
Mid2 B Q6 — fX(x)=4x(1−x²) for 0≤x≤1. Calculate mean, variance, and std dev of X.	E[X]=8/15 (same integral as Version A Q3). E[X²]=1/3. Var(X)=1/3−(8/15)²=0.0489. σX=0.221. Both versions have the SAME PDF, just different variable names Y vs X.
CONCEPT — How do you calculate E[X] and Var(X) from a custom PDF?	Step 1: E[X] = ∫ x·fX(x)dx. Multiply x by the PDF and integrate over the support. Step 2: E[X²] = ∫ x²·fX(x)dx. Step 3: Var(X) = E[X²] − (E[X])². Step 4: σ = √Var(X). Always check normalization: ∫fX(x)dx = 1 first.
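The four steps above can be verified numerically. The sketch below assumes the PDF 4x(1−x²) on [0, 1] from the earlier cards and uses a simple midpoint rule; the helper `integrate` is an illustrative name, not something from the course.

```python
def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def pdf(x):
    # PDF from the cards above: 4x(1 - x^2) on [0, 1].
    return 4 * x * (1 - x**2)

total = integrate(pdf, 0, 1)                       # normalization check, close to 1
mean = integrate(lambda x: x * pdf(x), 0, 1)       # E[X]
second = integrate(lambda x: x**2 * pdf(x), 0, 1)  # E[X^2]
variance = second - mean**2                        # Var(X) = E[X^2] - (E[X])^2
std_dev = variance ** 0.5

print(round(mean, 4), round(variance, 4), round(std_dev, 3))
```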
Mid2 A Q4 SB-a — Customers arrive at 20/hr. What is P(next customer arrives within 5 minutes)?	Exponential distribution. Convert rate: λ = 20/60 = 1/3 per minute. P(X≤5) = 1 − e^(−λ·5) = 1 − e^(−5/3) ≈ 0.811. Alternatively via Poisson: λ for 5 min = 5/3, P(at least 1) = 1 − e^(−5/3) = 0.811.
Mid2 A Q4 SB-b — Same arrival rate. What is P(at least 2 customers arrive in 10 minutes)?	Use Poisson with λ = 20/60 × 10 = 10/3 per 10-min window. P(X≥2) = 1−P(X=0)−P(X=1) = 1−e^(−λ)−λe^(−λ) = 1−[0.0357+0.1189] = 0.845.
Mid2 A Q4 SB-c — Same arrival rate, 10 minutes since last customer. What are mean and std dev of wait for next arrival?	Exponential is memoryless — past waiting time doesn't matter! λ = 1/3 min⁻¹. Mean = 1/λ = 3 min. Std dev = 1/λ = 3 min. Key: for Exponential, mean = std dev = 1/λ.
CONCEPT — How do you identify an Exponential vs Poisson problem?	Both involve random arrivals/events. Exponential: models the TIME between events (continuous). Poisson: models the COUNT of events in a fixed time window (discrete). Same λ, different questions. "Time until next" → Exponential. "How many in t minutes" → P
CONCEPT — What is the memoryless property of the Exponential distribution?	P(X > s+t \| X > s) = P(X > t). If you've already waited s minutes with no arrival, the distribution of remaining wait time is the same as if you just started. "It's been 10 minutes" is irrelevant — mean and std dev are still 1/λ.
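A small Python sketch of the arrival cards above, assuming the 20 customers/hour rate. It computes the Exponential and Poisson answers and checks the memoryless identity for one arbitrary pair (s, t); the function names are illustrative.

```python
import math

rate_per_min = 20 / 60  # 20 customers per hour, converted to per minute

def exponential_cdf(t, lam):
    """P(wait <= t) for an Exponential wait with rate lam."""
    return 1 - math.exp(-lam * t)

def poisson_pmf(k, mean):
    """P(exactly k events) for a Poisson count with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

# P(next customer within 5 minutes).
p_within_5 = exponential_cdf(5, rate_per_min)

# P(at least 2 customers in 10 minutes): Poisson mean is rate * window length.
mean_10 = rate_per_min * 10
p_at_least_2 = 1 - poisson_pmf(0, mean_10) - poisson_pmf(1, mean_10)

# Memoryless check: P(X > s + t | X > s) should equal P(X > t).
s, t = 10, 4
lhs = (1 - exponential_cdf(s + t, rate_per_min)) / (1 - exponential_cdf(s, rate_per_min))
rhs = 1 - exponential_cdf(t, rate_per_min)

print(round(p_within_5, 3), round(p_at_least_2, 3))
```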
CONCEPT — How do you convert arrival rates between time units?	If rate is given per hour (e.g., 20/hr), convert to per-minute by dividing by 60: λ=20/60=1/3 min⁻¹. For a Poisson count over t minutes: λ_window = λ·t. For Exponential (time to next): use λ in consistent units. Always match units before plugging into for
Mid2 A Q2 SB-a — Boilers A and B: P(A=1)=0.4, P(B=1)=0.7, P(A=1,B=1)=0.2. Fill in the joint PMF table.	P(A=0,B=0)=0.1, P(A=1,B=0)=0.2 (row sum for B=0 is 0.3; column sum for A=1 is 0.4). P(A=0,B=1)=0.5, P(A=1,B=1)=0.2. Row sums: P(B=0)=0.3, P(B=1)=0.7. Column sums: P(A=0)=0.6, P(A=1)=0.4. Use the constraints: each row and column must sum to its marginal.
Mid2 A Q2 SB-b — Same boiler joint PMF. What is P(exactly one boiler running)?	P(exactly one) = P(A=0,B=1) + P(A=1,B=0) = 0.5 + 0.2 = 0.7. Read directly from off-diagonal cells of the joint PMF table.
Mid2 A Q2 SB-c — Same boilers. Find Cov(A,B) and ρ(A,B).	E[AB] = ΣΣ a·b·P(A=a,B=b) = 1·1·0.2 = 0.2. µA=0.4, µB=0.7. Cov(A,B) = E[AB]−µA·µB = 0.2−0.28 = −0.08. σA=√(0.4·0.6)=0.490, σB=√(0.7·0.3)=0.458. ρ = −0.08/(0.490·0.458) = −0.356.
CONCEPT — How do you build a joint PMF table from marginals and one joint value?	You're given P(A=1), P(B=1), and P(A=1,B=1). Fill table using: joint cell = given. Other cells = marginal minus joint. Example: P(A=1,B=0) = P(A=1)−P(A=1,B=1). P(A=0,B=1) = P(B=1)−P(A=1,B=1). P(A=0,B=0) = 1−P(A=1)−P(B=1)+P(A=1,B=1).
CONCEPT — How do you calculate Covariance from a joint PMF?	Cov(X,Y) = E[XY]−µX·µY. Find E[XY] = ΣΣ x·y·PX,Y(x,y). Only nonzero when both x≠0 AND y≠0. Get µX and µY from marginals. Negative covariance means when one goes up the other tends to go down.
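The table-filling and covariance rules above can be sketched in Python. This assumes the boiler numbers from the earlier cards and stores the joint PMF in a dictionary keyed by (a, b); that layout is a choice made here for illustration.

```python
import math

p_a1, p_b1, p_both = 0.4, 0.7, 0.2  # P(A=1), P(B=1), P(A=1,B=1)

# Fill the 2x2 table: every other cell is a marginal minus the known joint cell.
joint = {
    (1, 1): p_both,
    (1, 0): p_a1 - p_both,
    (0, 1): p_b1 - p_both,
    (0, 0): 1 - p_a1 - p_b1 + p_both,
}

e_ab = sum(a * b * p for (a, b), p in joint.items())  # E[AB]
mu_a = sum(a * p for (a, b), p in joint.items())      # E[A] from the marginal
mu_b = sum(b * p for (a, b), p in joint.items())      # E[B] from the marginal
cov = e_ab - mu_a * mu_b

# A and B are 0/1 variables, so each variance is p(1 - p).
sigma_a = math.sqrt(mu_a * (1 - mu_a))
sigma_b = math.sqrt(mu_b * (1 - mu_b))
rho = cov / (sigma_a * sigma_b)

p_exactly_one = joint[(1, 0)] + joint[(0, 1)]
print(round(cov, 3), round(rho, 3), round(p_exactly_one, 3))
```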
Mid2 A Q5 SB-a — Z~Geometric(0.4), B={Z<3}={1,2}. Find the conditional PMF PZ\|B(z).	P(B)=P(Z=1)+P(Z=2)=0.4+0.24=0.64. PZ\|B(1) = 0.4/0.64 = 0.625. PZ\|B(2) = 0.24/0.64 = 0.375. PZ\|B(z) = 0 for z≥3. Key: divide each in-event PMF value by P(B).
Mid2 B Q5 SB-a — Z~Geometric(0.5), B={Z<3}={1,2}. Find the conditional PMF PZ\|B(z).	P(Z=1)=0.5, P(Z=2)=0.5×0.5=0.25. P(B)=0.75. PZ\|B(1)=0.5/0.75=2/3≈0.667. PZ\|B(2)=0.25/0.75=1/3≈0.333. Same method as Version A, different p.
Mid2 A Q5 SB-b — Y~Binomial(2, 0.5). Calculate E[Y³].	Range: RY={0,1,2}. P(Y=0)=0.25, P(Y=1)=0.50, P(Y=2)=0.25. E[Y³] = 0³·0.25 + 1³·0.50 + 2³·0.25 = 0 + 0.5 + 2 = 2.5. Key: for E[g(Y)], compute g(y) for each value, multiply by probability, sum up.
Mid2 A Q5 SB-c — X~Poisson(λ=2.0). Calculate E[4X+2].	Linearity of expectation: E[4X+2] = 4·E[X]+2. For Poisson, E[X]=λ=2.0. Answer = 4·2+2 = 10. Never expand the Poisson PMF for linear functions — always use E[aX+b]=aµ+b.
Mid2 B Q5 SB-c — X~Poisson(λ=2.0). Calculate E[3X−1].	E[3X−1] = 3·E[X]−1 = 3·λ−1 = 3·2−1 = 5. Same linearity rule as Version A.
CONCEPT — How do you calculate E[g(X)] for a non-linear function like g(X)=X³?	You CANNOT use E[X³] = (E[X])³. Instead use E[g(X)] = Σ g(x)·PX(x) for discrete, or ∫g(x)fX(x)dx for continuous. List all values in the range, compute g at each, multiply by probability, sum. Linearity only works for g(X)=aX+b.
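To make the warning above concrete, this Python sketch computes E[Y³] for Y~Binomial(2, 0.5) by summing g(y)·P(Y=y) and compares it with (E[Y])³. The helper names `binomial_pmf` and `expectation` are illustrative.

```python
from math import comb

def binomial_pmf(n, p):
    """Return {k: P(Y = k)} for Y ~ Binomial(n, p)."""
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(x) * P(X = x) over the range of X."""
    return sum(g(x) * prob for x, prob in pmf.items())

pmf_y = binomial_pmf(2, 0.5)
e_y_cubed = expectation(pmf_y, lambda y: y**3)  # correct: weight y^3 by each probability
cube_of_mean = expectation(pmf_y) ** 3          # wrong shortcut: (E[Y])^3

print(e_y_cubed, cube_of_mean)
```

The two printed values differ, which is the point of the card: only linear functions let you push the expectation inside.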
Mid2 A Q6 SB-a — X~Uniform(0,2). Write the PDF and CDF of X.	PDF: fX(x) = 1/2 for 0≤x≤2, 0 otherwise. CDF: FX(x) = 0 for x<0; x/2 for 0≤x≤2; 1 for x>2. To get CDF, integrate PDF from −∞ to x: ∫[0 to x] (1/2)du = x/2.
Mid2 B Q2 SB-a — X~Uniform(0,3). Write the PDF and CDF of X.	PDF: fX(x) = 1/3 for 0≤x≤3, 0 otherwise. CDF: FX(x) = 0 for x<0; x/3 for 0≤x≤3; 1 for x≥3. General pattern for Uniform(0,b): PDF=1/b, CDF=x/b.
Mid2 A Q6 SB-c — X~Uniform(0,2), Y=e^X. Derive the PDF of Y.	CDF method: FY(y) = P(Y≤y) = P(e^X≤y) = P(X≤ln y) = FX(ln y) = (ln y)/2. Differentiate: fY(y) = d/dy[(ln y)/2] = fX(ln y)·(1/y) = (1/2)·(1/y) = 1/(2y) for 1≤y≤e². Key: Y=e^X ranges from e^0=1 to e^2 as X ranges from 0 to 2.
Mid2 B Q2 SB-c — X~Uniform(0,3), Y=e^X. Derive the PDF of Y.	Same CDF method: FY(y) = P(X≤ln y) = FX(ln y) = (ln y)/3. fY(y) = 1/(3y) for 1≤y≤e³. The only difference from Version A: divide by 3 instead of 2, and range extends to e³.
CONCEPT — How do you derive the PDF of a transformed random variable Y=g(X)?	CDF method (always works; steps shown for increasing g, for decreasing g the inequality in step 2 flips): 1) Write FY(y)=P(Y≤y)=P(g(X)≤y). 2) Solve for X: P(X≤g⁻¹(y))=FX(g⁻¹(y)). 3) Differentiate: fY(y) = fX(g⁻¹(y))·\|d/dy[g⁻¹(y)]\|. For Y=e^X: g⁻¹(y)=ln y, derivative=1/y. New support: y ranges from g(a) to g(b).
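One way to sanity-check a derived density is to integrate it numerically and compare against the CDF from step 2. The sketch below does this for Y=e^X with X~Uniform(0,2); the midpoint-rule helper and the check point y=3 are arbitrary choices made for illustration.

```python
import math

def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def pdf_y(y):
    # Derived density for Y = e^X with X ~ Uniform(0, 2): 1/(2y) on [1, e^2].
    return 1 / (2 * y) if 1 <= y <= math.exp(2) else 0.0

def cdf_y(y):
    # CDF method on the support: F_Y(y) = F_X(ln y) = ln(y) / 2.
    return math.log(y) / 2

total = integrate(pdf_y, 1, math.exp(2))  # should be close to 1
area_to_3 = integrate(pdf_y, 1, 3.0)      # should match cdf_y(3.0)

print(round(total, 6), round(area_to_3, 6), round(cdf_y(3.0), 6))
```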
CONCEPT — How do you find the CDF from a PDF by integration?	FX(x) = ∫[−∞ to x] fX(u)du. For piecewise PDFs: in each region, integrate the PDF from the lower boundary to x and add any accumulated probability from previous regions. Always check: FX(−∞)=0 and FX(+∞)=1.
CONCEPT — Midterm 2 master topic map: what concept does each question test?	Q1 (both versions): Normal distribution, standardizing, inverse normal (percentiles). Q2/Q3: Joint PMF table OR continuous PDF mean/variance/std dev. Q4: Exponential/Poisson (arrivals). Q5: Conditional PMF, E[g(X)], linearity of expectation. Q6 (A) / Q2 (
Mid3 A Q1 SB-a — Containers: µ=21 tons, σ=2.3 tons. What are the mean and std dev of the TOTAL weight of 100 containers?	CLT for sums: E[W100] = n·µ = 100·21 = 2100 tons. σ(W100) = σ·√n = 2.3·√100 = 23.0 tons. Key: mean scales by n, std dev scales by √n — NOT by n.
Mid3 A Q1 SB-b — Same containers. What is P(total weight of 100 > 2130 tons)?	Standardize the sum: z = (2130−2100)/23 = 30/23 = 1.304. P(W>2130) = 1−Φ(1.304) = 1−0.9032 = 0.0968. Key: use the CLT sum's mean=2100 and σ=23 as the Normal parameters.
Mid3 A Q1 SB-c — 50 existing containers weigh 1070 tons total. 50 new ones are added. What is P(grand total > 2130)?	The NEW 50 containers have mean=50·21=1050, σ=√50·2.3=16.26. Need the new 50 to weigh more than 2130−1070=1060 tons. z=(1060−1050)/16.26=0.615. P(W_new>1060)=1−Φ(0.615)=1−0.7291=0.27.
CONCEPT — What is the CLT formula for a SUM of n i.i.d. r.v.s?	Wn = X1+...+Xn. By CLT: Wn is approximately Normal with mean n·µ and std dev σ·√n. So P(Wn > w) = 1−Φ((w−nµ)/(σ√n)). Mean grows as n, std dev grows as √n. Use this when asked about a total/sum of many independent items.
CONCEPT — How do you handle a "fixed amount already accumulated + new arrivals" CLT problem?	Subtract what's already there from the target. Only apply the CLT to the NEW random portion. New total target = overall target − fixed amount. Then use the CLT on just the new n items with their own mean and σ√n.
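Both container questions can be sketched with the same helper. The code below assumes µ=21 and σ=2.3 tons per container from the cards above; `sum_distribution` is an illustrative name. Answers differ slightly from the table-based ones because z is not rounded.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 21.0, 2.3  # per-container mean and std dev, in tons

def sum_distribution(n):
    """Normal approximation (CLT) for the total weight of n i.i.d. containers."""
    return NormalDist(mu=n * mu, sigma=sigma * sqrt(n))

# All 100 containers are random: P(total > 2130).
total_100 = sum_distribution(100)
p_over_2130 = 1 - total_100.cdf(2130)

# 1070 tons are already fixed, so only the 50 new containers are random.
already_loaded = 1070.0
new_50 = sum_distribution(50)
p_grand_over_2130 = 1 - new_50.cdf(2130 - already_loaded)

print(round(p_over_2130, 4), round(p_grand_over_2130, 4))
```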
Mid3 A Q2 — Joint PDF fX,Y(x,y)=x+y for 0≤x≤1, 0≤y≤1. Calculate P(X+Y>1).	Sketch: region is upper-right triangle above the line y=1−x. Set up iterated integral: ∫[0 to 1]dx ∫[1−x to 1] (x+y)dy. Inner integral = [xy+y²/2] from 1−x to 1 = (x+1/2)−(x(1−x)+(1−x)²/2) = x+x²/2. Outer integral: ∫[0 to 1](x+x²/2)dx = 1/2+1/6 = 2/3.
CONCEPT — How do you set up a double integral over a non-rectangular region?	Step 1: Sketch the region. Find the boundary line (e.g., y=1−x means X+Y=1). Step 2: Determine limits — for region above y=1−x: x from 0 to 1, y from (1−x) to 1. Step 3: Integrate inner integral first (over y), then outer (over x). Always sketch first.
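The limits described above translate directly into a nested numerical integral. This sketch assumes the joint PDF x+y on the unit square and uses a midpoint rule for both the inner and outer integrals; the helper names are illustrative.

```python
def integrate(f, a, b, n=400):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def joint_pdf(x, y):
    return x + y

def inner(x):
    # For a fixed x, y runs from the boundary line y = 1 - x up to 1.
    return integrate(lambda y: joint_pdf(x, y), 1 - x, 1)

# Outer integral over x from 0 to 1 gives P(X + Y > 1).
p_sum_over_1 = integrate(inner, 0, 1)

print(round(p_sum_over_1, 4))
```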
Mid3 A/B Q3 SB-a — n=9, X̄=12.05, S²=1.87. Calculate 95% CI using CLT (z).	S=√1.87=1.367. For 95%, α=0.05, zα/2=z0.025=1.96. δ = 1.96·1.367/√9 = 1.96·1.367/3 = 0.893. CI: µ ∈ [12.05−0.89, 12.05+0.89] = [11.16, 12.94].
Mid3 A/B Q3 SB-b — Same data. Calculate 95% CI using the t-distribution.	n=9 → df = n−1 = 8 → t(0.025, 8) = 2.306. δ = 2.306·1.367/√9 = 2.306·1.367/3 = 1.051. CI: µ ∈ [12.05±1.05] = [11.00, 13.10]. Wider than z-interval because t has heavier tails for small n.
Mid3 A/B Q3 SB-c — Same data. Calculate 95% CI for the variance σ².	Use χ² with df=n−1=8. χ²(0.025,8)=17.53 (upper), χ²(0.975,8)=2.18 (lower). xl=(n−1)S²/χ²upper = 8·1.87/17.53 = 0.85. xh=(n−1)S²/χ²lower = 8·1.87/2.18 = 6.86. σ² ∈ [0.85, 6.86]; σ ∈ [0.92, 2.62].
CONCEPT — When do you use z vs t for a confidence interval?	Use z (1.96 for 95%): when n is large (≥30) OR σ is known. Use t (tα/2,n−1): when n is small AND σ is unknown. Same formula: δ=critical value · S/√n. The t-value is always larger than z for the same α, giving a wider interval — reflects extra uncertainty
CONCEPT — How do you calculate a CI for variance vs a CI for the mean?	CI for mean uses z or t. CI for variance uses χ² distribution with df=n−1. xl=(n−1)S²/χ²(α/2,n−1) and xh=(n−1)S²/χ²(1−α/2,n−1). Then CI for σ is [√xl, √xh]. The variance CI is NOT symmetric around S².
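The three intervals from the Q3 cards can be reproduced in a few lines. Python's standard library has no t or χ² quantile function, so this sketch hard-codes the table values quoted on the cards (2.306, 17.53, 2.18 for df=8); only the z critical value is computed.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, s2 = 9, 12.05, 1.87  # sample size, sample mean, sample variance
s = sqrt(s2)
alpha = 0.05

z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
t_crit = 2.306                                # t(0.025, 8), table value from the card
chi2_upper, chi2_lower = 17.53, 2.18          # chi-square table values from the card

def mean_ci(critical):
    """Interval xbar +/- critical * S / sqrt(n)."""
    delta = critical * s / sqrt(n)
    return (xbar - delta, xbar + delta)

z_ci = mean_ci(z_crit)
t_ci = mean_ci(t_crit)

# Variance interval: divide (n-1)S^2 by the upper, then the lower, chi-square value.
var_ci = ((n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower)
sd_ci = (sqrt(var_ci[0]), sqrt(var_ci[1]))

print([round(v, 2) for v in z_ci], [round(v, 2) for v in t_ci])
print([round(v, 2) for v in var_ci], [round(v, 2) for v in sd_ci])
```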
Mid3 A Q4 SB-a — Y~Poisson, estimated mean=6.3, so σ≈√6.3≈2.51. How many measurements n needed for CI width ±1% of mean (α=0.05)?	Want δ=1% of 6.3=0.063. Formula: δ=zα/2·σ/√n → 0.063=(1.96·2.51)/√n → √n=1.96·2.51/0.063=78.09 → n=(78.09)²≈6098≈6100.
Mid3 B Q3 SB-a — Y~Poisson, estimated mean=6.1, σ≈√6.1≈2.47. How many measurements for CI width ±1% (α=0.05)?	Want δ=1% of 6.1=0.061. 0.061=(1.96·2.47)/√n → √n=1.96·2.47/0.061=79.36 → n≈6299≈6300. Same method as Version A — for Poisson σ=√λ.
CONCEPT — How do you find the sample size n needed for a target CI width?	Set δ = zα/2·σ/√n and solve for n: n = (zα/2·σ/δ)². For ±δ CI (half-width), use this directly. For Poisson with unknown λ, approximate σ≈√λ̂. Round UP to the next integer. Larger target precision (smaller δ) → much larger n needed.
Mid3 A Q4 SB-b — Existing CI: 5.05±0.23 at α=0.05, n=1000. What is the new δ at α=0.01 using same data?	Back out S: 0.23=z0.025·S/√1000=1.96·S/31.62 → S=3.71. New δ at α=0.01: z0.005=2.576. δ=2.576·3.71/√1000=0.302. Shortcut: new δ = old δ·(z_new/z_old) = 0.23·(2.576/1.96) = 0.302.
Mid3 B Q3 SB-b — Existing CI: 5.05±0.28 at α=0.05, n=1000. Find new δ at α=0.01.	Back out S: 0.28=1.96·S/√1000 → S=4.52. New δ=2.576·4.52/√1000=0.368. Or shortcut: 0.28·(2.576/1.96)=0.368. Wider CI because α=0.01 means a higher confidence level (99%) than α=0.05 (95%).
CONCEPT — How does changing α affect the confidence interval width?	Smaller α (more confidence) → larger critical value z → WIDER interval. δ scales proportionally with z. Shortcut: new δ = old δ × (z_new / z_old). Example: going from 95% to 99% CI multiplies width by 2.576/1.96 ≈ 1.314.
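The sample-size formula and the rescaling shortcut from the Q4 cards can be sketched as two small functions. The code assumes the table critical values 1.96 and 2.576 and the Version A numbers; `required_n` and `rescale_half_width` are illustrative names.

```python
from math import ceil, sqrt

Z_95, Z_99 = 1.96, 2.576  # table critical values for alpha = 0.05 and 0.01

def required_n(sigma, delta, z=Z_95):
    """Smallest n whose half-width z * sigma / sqrt(n) is at most delta."""
    return ceil((z * sigma / delta) ** 2)

def rescale_half_width(old_delta, z_old, z_new):
    """Same data, new confidence level: delta scales with the critical value."""
    return old_delta * z_new / z_old

# Poisson with estimated mean 6.3: sigma is about sqrt(6.3), target is 1% of the mean.
lam_hat = 6.3
n_needed = required_n(sqrt(lam_hat), 0.01 * lam_hat)

# Existing interval 5.05 +/- 0.23 at 95%, moved to 99%.
new_delta = rescale_half_width(0.23, Z_95, Z_99)

print(n_needed, round(new_delta, 3))
```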
Mid3 A Q5 SB-a — fX(x)=2x⁻³ for x≥1. Calculate P(X>2).	P(X>2) = ∫[2 to ∞] 2x⁻³ dx = [2·x⁻²/(−2)] from 2 to ∞ = [−x⁻²] from 2 to ∞ = 0−(−1/4) = 1/4.
Mid3 A Q5 SB-b — Same fX(x)=2x⁻³, B={X>2}. Find conditional PDF fX\|B(x) and calculate E[X\|B].	fX\|B(x) = fX(x)/P(B) = (2x⁻³)/(1/4) = 8x⁻³ for x≥2. E[X\|B] = ∫[2 to ∞] x·8x⁻³dx = ∫[2 to ∞] 8x⁻²dx = [8·x⁻¹/(−1)] from 2 to ∞ = [−8/x] from 2 to ∞ = 0−(−4) = 4.
CONCEPT — How do you find a conditional PDF given an event B?	fX\|B(x) = fX(x)/P(B) for x in B, and 0 otherwise. Step 1: Find P(B) by integrating fX over B. Step 2: Divide fX by P(B) — this re-normalizes the PDF to the restricted region. Step 3: Use this new PDF to compute conditional expectations.
Mid3 A Q6 SB-a — MLE for Uniform(a,b): â=min{xi}. Is this estimator positively biased, negatively biased, or unbiased?	Positively biased. Every observation xi > a (since P(X=a)=0 for continuous), so min{xi} > a always. Therefore E[min{xi}] > a, meaning E[â]−a > 0. Positive bias: estimator overshoots the true value on average.
Mid3 B Q6 SB-a — MLE for Uniform(a,b): b̂=max{xi}. Is this estimator positively biased, negatively biased, or unbiased?	Negatively biased. Every observation xi < b (since P(X=b)=0), so max{xi} < b always. Therefore E[max{xi}] < b, meaning E[b̂]−b < 0. Negative bias: estimator undershoots the true value on average.
CONCEPT — What is bias and how do you determine its sign?	Bias B(θ̂) = E[θ̂]−θ. Positive bias: estimator tends to OVERSHOOT (be larger than) the true value. Negative bias: estimator tends to UNDERSHOOT. Unbiased: E[θ̂]=θ exactly. For min/max estimators of boundaries, ask: is it physically possible for the estima
Mid3 A Q6 SB-b — Z~Exponential(λ), observations Z=2 and Z=3. Find the MLE of λ.	L(λ) = fZ(2)·fZ(3) = λe^(−2λ)·λe^(−3λ) = λ²e^(−5λ). Set dL/dλ=0: 2λe^(−5λ)−5λ²e^(−5λ)=0 → divide by λe^(−5λ): 2−5λ=0 → λ̂=2/5=0.4. Note: MLE of λ for Exponential = 1/x̄ = 2/(2+3) = 2/5. ✓
Mid3 B Q6 SB-b — Z~Exponential(λ), observations Z=2 and Z=4. Find the MLE of λ.	L(λ) = λe^(−2λ)·λe^(−4λ) = λ²e^(−6λ). dL/dλ = 2λe^(−6λ)−6λ²e^(−6λ) = 0 → 2−6λ=0 → λ̂=1/3. Check: 1/x̄ = 2/(2+4) = 2/6 = 1/3. ✓ Same method as Version A, different observations.
CONCEPT — How do you find the MLE for the Exponential rate parameter λ?	For n observations {x1,...,xn}: L(λ) = Π λe^(−λxi) = λⁿ·e^(−λΣxi). Take dL/dλ=0 (or d/dλ of log L = 0): nλⁿ⁻¹e^(−λΣxi) − Σxi·λⁿe^(−λΣxi)=0 → n = λ·Σxi → λ̂ = n/Σxi = 1/x̄. MLE for Exponential rate = reciprocal of the sample mean.
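The closed form λ̂ = n/Σxi can be cross-checked against a brute-force search over the log-likelihood. This sketch assumes the two observations from Version A (2 and 3); the grid spacing of 0.001 is an arbitrary choice for illustration.

```python
from math import log

def exp_log_likelihood(lam, data):
    """Log-likelihood of an Exponential(lam) sample: n*log(lam) - lam*sum(x)."""
    return len(data) * log(lam) - lam * sum(data)

def exp_mle(data):
    """Closed-form MLE: reciprocal of the sample mean."""
    return len(data) / sum(data)

data = [2, 3]
lam_hat = exp_mle(data)

# Brute-force check: the best rate on a fine grid should match the closed form.
grid = [i / 1000 for i in range(1, 2001)]
best_on_grid = max(grid, key=lambda lam: exp_log_likelihood(lam, data))

print(lam_hat, best_on_grid)
```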
CONCEPT — What is the general MLE procedure?	Step 1: Write the likelihood L(θ) = Π fX(xi;θ) (product of PDFs/PMFs at each observation). Step 2: Often easier to maximize log-likelihood: ℓ(θ) = Σ log fX(xi;θ). Step 3: Differentiate with respect to θ, set equal to 0. Step 4: Solve for θ̂. Step 5: Verif
CONCEPT — Midterm 3 master topic map: what concept does each question test?	A: Q1=CLT for sums/totals. Q2=double integral over joint PDF. Q3=CIs (z, t, chi-squared). Q4=sample size calculation + adjusting CI width. Q5=conditional PDF and E[X\|B]. Q6=MLE bias and deriving MLE. B: same topics in different order (Q1↔Q4, Q2↔Q5, Q3↔Q3,

Created by: desto