BIOS480 Exam 1

Definitions and info for exam 1 of BIOS480

Term / Definition
Sample a random collection of observations from a population
Population all possible observations
Frequentist statistics defines the probability of an outcome as how often it is expected to occur in a very long run of events
Random variable a variable whose values are not known for certain before a sample is taken
Sample space the set of all possible outcomes of a random variable
Probability distribution The distribution of the probability of observing each outcome in the sample space (tells you how high, or low, the likelihood of seeing an outcome is)
Probability mass function a mathematical function, used for discrete variables, that assigns a probability to each potential outcome
Probability density function used for continuous variables, vertical axis is the probability density of the variable (f(x))
For a continuous variable, P(X = exact value) = ? 0; it is impossible to get the probability of a very specific outcome if the variable is continuous
Expected value (E(x)) the mean of the probability distribution of a random variable ("the average"); the probability-weighted average of all outcomes in the sample space
Mean the arithmetic average value
Standard deviation the dispersion of the data about the mean
Kurtosis The "fatness" of the tails of the distribution; degree of outliers in the distribution
Skewness Refers to deviations in the distribution's symmetry
Right-skewed mean > median > mode
Left-skewed mode > median > mean
Normal (Gaussian) distribution Symmetric distribution where most observations are around the mean (bell curve)
Normal distribution features Symmetric bell shape (curve); mean = median at center; parameters = mean and standard deviation; 68% of data w/in 1 SD of mean, 95% of data w/in 2 SD of mean, 99.7% of data w/in 3 SD of mean
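
The 68/95/99.7 percentages above can be checked against the standard normal distribution. A minimal Python sketch (assuming SciPy is installed; not part of the original card set):

```python
# Check the 68-95-99.7 rule against the standard normal CDF.
from scipy.stats import norm

for k in (1, 2, 3):
    coverage = norm.cdf(k) - norm.cdf(-k)  # probability within k SDs of the mean
    print(f"within {k} SD: {coverage:.3f}")
# Prints roughly 0.683, 0.954, and 0.997
```
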
Lognormal distribution Right-skewed data whose logarithm is normally distributed
Lognormal distribution features Continuous, skewed distribution; low mean values; large variance; all-positive values; parameters = mean and standard deviation. If you take the log, the distribution shifts to normal (data transformation)
Exponential distribution Models elapsed time between two events; continuous probability distribution describing the waiting time until the next event in a Poisson process
Exponential distribution features Single parameter distribution Parameter = rate
Beta distribution Used to represent percentages, proportions, or probability outcomes
Beta distribution features Defined on the interval [0, 1]; parameters = alpha and beta (two positive shape parameters that appear as exponents of the random variable and control the shape of the distribution)
Bernoulli distribution Single trial with two possible outcomes (e.g., a coin toss); outcomes are "success" or "failure"
Binomial distribution A sequence of Bernoulli events; the probability distribution of the number of successes in a set number of independent trials
Binomial distribution features Parameters = number of trials (n) and probability of success in a single trial (p); the expected value of a binomial variable X is the number of times a success is expected to occur in n total trials
Multinomial distribution Generalization of the binomial distribution to more than two possible outcomes; the joint probability distribution of multiple outcomes from n fixed trials
Poisson distribution The probability that a given number of events may occur; describes variables representing the # of occurrences of a particular event in an interval of time/space
Poisson distribution features Parameter (lambda): expected value = variance; generally right-skewed
Uniform distribution All outcomes are equally likely
Sample size # of observations in the sample (n)
Statistics Measured characteristics of the SAMPLE (Ex., sample mean)
Parameters Characteristics of the POPULATION (Ex., population mean)
Random (simple) sampling Basic method of collecting observations in a sample; any observation has the same probability of being collected; aim is to sample in a manner that doesn't create bias/favor any observation being selected
Random sample = ? Independent and identically distributed (IID)
Independent (IID) Sample items are all independent events Knowledge of the value of one variable gives no information about the other and vice versa
Identically distributed (IID) No overall trends The distribution doesn't fluctuate, and all items in the sample are taken from the same probability distribution
Random sampling is usually ? Haphazard; populations must be defined at the start of a study, therefore there are spatial and temporal limits
More than 1 in 20 US teens have diagnosed anxiety or depression (Parameter or statistic) Statistic; the # describes the whole population of US teens, but it is impossible to collect info from every member, so it must be estimated from a sample
Latvian women are the tallest on the planet w/ a mean height of 170 cm (Parameter or statistic) Statistic; describes the whole population, but it is not feasible to measure the height of every Latvian woman, so the mean must come from a sample
The median annual income of all 37 employees at Company Y is $42,000 (Parameter or statistic) Parameter; it describes ALL employees at Company Y, not just a portion of them
The avg final math exam scores of all seniors from high school A have increased from 70% to 78% in the past decade (Parameter or statistic) Parameter; the % change refers to the entire population of seniors at high school A
A good estimator (i.e., statistic) of a population parameter should have the following characteristics: Unbiased - expected value of the sample statistic should = the parameter; Consistent - as sample size increases, the statistic gets closer to the population parameter; Efficient - it has the lowest variance among all competing statistics
The two broad types of estimation are ? and ? Point estimate and interval estimate
Point estimate provides a single value which estimates a population parameter
Parts of a point estimate Mean: estimator of the population mean, each observation weighted by 1/n; Median: middle measurement of the data set; Trimmed mean; Winsorized mean
Trimmed mean Mean calculated after omitting a proportion (usually 5%) of highest and lowest observations
Winsorized mean Same as the trimmed mean except the omitted observations are replaced by the nearest remaining value
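
A minimal Python sketch of the point estimates above (mean, median, trimmed mean, Winsorized mean), assuming NumPy and SciPy are available; the sample values are made up for illustration:

```python
import numpy as np
from scipy.stats import trim_mean
from scipy.stats.mstats import winsorize

rng = np.random.default_rng(1)
x = np.append(rng.normal(10, 1, 19), 50.0)        # 20 values with one extreme outlier

print(np.mean(x))                                 # ordinary mean, pulled up by the outlier
print(np.median(x))                               # middle measurement
print(trim_mean(x, 0.05))                         # omit 5% of highest and lowest observations
print(winsorize(x, limits=(0.05, 0.05)).mean())   # replace them with the nearest remaining value
```
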
Interval estimate Provides a range of values that might include the parameter with a known probability (Ex., confidence intervals)
Range The difference between the largest and smallest observation; simplest measure of spread, but there is no clear link between sample range and population range; generally increases as sample size increases
Sample variance Estimate of the population variance
Sample variance steps 1. Calculate the mean 2. For each observation, subtract the mean and square the result 3. Work out the average of those squared differences (dividing by n - 1). The result is s^2; if you need the spread in the original units, take the square root
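
The three steps translate directly into code. A minimal NumPy sketch (not from the deck; the data values are arbitrary), using the usual n - 1 denominator for the sample variance:

```python
import numpy as np

x = np.array([4.0, 7.0, 6.0, 5.0, 8.0])

mean = x.mean()                        # 1. calculate the mean
sq_diffs = (x - mean) ** 2             # 2. subtract the mean from each value and square the result
s2 = sq_diffs.sum() / (len(x) - 1)     # 3. average the squared differences (n - 1 denominator)
s = np.sqrt(s2)                        # square root to get back to the original units

print(s2, np.var(x, ddof=1))           # both values match
```
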
Standard deviation Square root of sample variance
Coefficient of variation Ratio of standard deviation to the mean and shows the extent of variability in relation to the mean of the population
Interquartile range difference between the first quartile (the observation which has 25% of the observations below it) and the third quartile (the observation which has 25% of the observations above it); used in the construction of box plots
Median absolute deviation (MAD) less sensitive to outliers than the other measures and is the sensible measure of spread to present in association with medians
Confidence intervals A range of values where you can be relatively confident the true value will be
Central limit theorem The sampling distribution of a sample mean is approximately normal if the sample size is large enough (n > 30), even if the population distribution is not normal; it is the distribution of averages (means)
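
A small simulation illustrates the idea: even when the population is skewed, the means of repeated samples pile up in a roughly normal shape. A NumPy sketch (illustrative only; the exponential population and sample size are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
# 10,000 samples of size n = 40 from a skewed (exponential) population with mean 2
samples = rng.exponential(scale=2.0, size=(10_000, 40))

sample_means = samples.mean(axis=1)
print(sample_means.mean())   # close to the population mean, 2.0
print(sample_means.std())    # close to sigma / sqrt(n) = 2 / sqrt(40)
```
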
Formula for confidence interval estimate for a population mean Sample mean +/- critical value * standard error
Standard error The standard deviation of the sample means (s)/(square root(n))
Critical value A cutoff value, obtained from the t-distribution, that sets the width of the confidence interval around the sample mean
How to obtain a "t" critical value Obtained from the t-distribution based on the degrees of freedom (df) and the confidence level you are using
"t" critical value Used during statistical tests to assess the statistical significance of the difference between wo sample means, the constriction of confidence intervals and in linear regression analysis
Degrees of freedom (df) The number of independent values that a statistical analysis can estimate; df = n - 1; you want as many degrees of freedom as possible
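
Putting the last few cards together (standard error, t critical value, df = n - 1), a minimal SciPy/NumPy sketch of a 95% confidence interval; the data values are made up:

```python
import numpy as np
from scipy import stats

x = np.array([5.1, 4.9, 6.0, 5.5, 5.8, 4.7, 5.2, 5.9])
n = len(x)

se = x.std(ddof=1) / np.sqrt(n)          # standard error = s / sqrt(n)
t_crit = stats.t.ppf(0.975, df=n - 1)    # t critical value for a 95% confidence level
ci = (x.mean() - t_crit * se, x.mean() + t_crit * se)
print(ci)                                # sample mean +/- critical value * standard error
```
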
What if you knew the population standard deviation and want to calculate the confidence interval? Sample mean +/- z critical value * standard error (the z-distribution is used instead of the t-distribution)
z-distribution The Standard Normal distribution; form of the normal distribution in which the mean is zero and the standard deviation is 1
t- vs z-distribution z-distribution assumes that you know the population standard deviation. t-distribution is based on the sample standard deviation
Methods for estimating parameters Method of moments Maximum likelihood Ordinary least squares Bootstrapping
Moments of a function (Method of moments) Every distribution has moments: mean, standard deviation, skewness, kurtosis
Maximum likelihood Method that determines values for the parameters of a model "Which curve was most likely responsible for creating the data points that we observed?"
Maximum likelihood estimates (MLE) Find the parameter values that give the distribution that maximizes the probability of observing the data; calculated from the total probability of observing all of the data; for a normal model, the mean and standard deviation are adjusted until the likelihood is maximized
With MLE you assume what if it follows a normal probability distribution? Assuming the data follow a normal probability distribution, the probability density of each point is multiplied by that of the others to give the likelihood of the whole sample
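
A minimal sketch of maximum likelihood for a normal model, assuming NumPy/SciPy: the log-densities of all points are summed (equivalent to multiplying the densities) and the mean and standard deviation are adjusted until that total is as large as possible. The simulated data and starting values are illustrative:

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

x = np.random.default_rng(42).normal(loc=10.0, scale=2.0, size=200)

def neg_log_likelihood(params):
    mu, sigma = params
    if sigma <= 0:
        return np.inf                      # keep the scale parameter positive
    # Sum of log densities; minimising the negative sum maximises the likelihood
    return -np.sum(stats.norm.logpdf(x, loc=mu, scale=sigma))

result = minimize(neg_log_likelihood, x0=[1.0, 1.0], method="Nelder-Mead")
print(result.x)   # MLE of (mu, sigma), close to the sample mean and SD
```
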
Ordinary least squares (OLS) The least squares estimator for a given parameter is the one that minimizes the sum of the squared differences between each value in a sample and the parameter
Major application of OLS estimation Estimating parameters of linear models, where the quantity minimized is the sum of squared differences between observed values and those predicted by the model
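
A minimal NumPy sketch of least squares for a straight line (the simulated data are illustrative): the fitted slope and intercept are the values that minimize the sum of squared differences between observed and predicted y:

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0, 10, 50)
y = 3.0 + 1.5 * x + rng.normal(0, 1, size=x.size)    # true line plus noise

slope, intercept = np.polyfit(x, y, deg=1)           # np.polyfit solves the least-squares problem
print(slope, intercept)                              # close to 1.5 and 3.0

residuals = y - (intercept + slope * x)
print(np.sum(residuals ** 2))                        # the minimised sum of squared differences
```
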
ML and OLS are most commonly used for... Estimating population parameters for the analyses we will discuss in this class
Bootstrapping Stat. technique for estimating quantities about a population by avg. estimates from multiple small data samples (resampling w/ replacement)
Bootstrapping applications Estimate confidence intervals for parameters (mean, median, variance, etc.) Estimate p-values when traditional parametric methods are challenging/not applicable Assess stability and reliability of models
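
A minimal NumPy sketch of a bootstrap confidence interval for the mean (sample data and number of resamples are illustrative): resample with replacement many times and take percentiles of the resampled means:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.exponential(scale=2.0, size=50)              # a small, skewed sample

boot_means = np.array([
    rng.choice(x, size=x.size, replace=True).mean()  # resample with replacement
    for _ in range(5000)
])
ci_95 = np.percentile(boot_means, [2.5, 97.5])       # bootstrap 95% CI for the mean
print(ci_95)
```
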
Hypothesis testing Statistical methods used to determine whether a pattern in the sample data is likely to hold true in the population from which the sample was drawn
Hypothesis testing steps 1. State null and alternative hypotheses 2. Choose a significance level 3. Collect data and calculate the test statistic 4. Calculate p-value 5. Make a decision about the null hypothesis
Reject H0 if... p-value < significance level = reject H0
Fail to reject H0 if... p-value > significance level = fail to reject H0
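
The decision rule in the two cards above, shown with a one-sample t-test in SciPy (the data, null value of 5, and alpha of 0.05 are illustrative):

```python
from scipy import stats

x = [5.6, 6.1, 5.9, 6.4, 5.8, 6.0, 5.7, 6.2]
alpha = 0.05                                           # chosen significance level

t_stat, p_value = stats.ttest_1samp(x, popmean=5.0)    # H0: population mean = 5
if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject H0")
else:
    print(f"p = {p_value:.4f} >= {alpha}: fail to reject H0")
```
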
Two-tailed In most cases in biology, the H0 is that there is no effect and the HA can be in either direction; distributes uncertainty to two sides, whereas one-tailed uses only one; H0 is rejected if either mean is bigger than the other; tests for an effect in both directions (+/-)
One-tailed Distributes uncertainty to only one side; only allows for testing an effect in one direction
Parametric tests Statistical tests most commonly used by biologists Ex., z-test, t-test, ANOVA
4 assumptions of parametric tests Normality Homogeneity of variances Linearity Independence
How to check for normality To test the assumption of a normal distribution, skewness should be w/in the +/- 2 range and kurtosis values should be w/in the +/- 7 range (LOOK FOR BELL CURVE); Shapiro-Wilk's W test; Kolmogorov-Smirnov test; histograms, boxplots, and Q-Q plots
Q-Q (quantile-quantile) plots Observed quantiles and expected quantiles are plotted against each other; the more the plotted values deviate from a straight line, the less normally distributed the data are
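
Two of the checks above in SciPy (Shapiro-Wilk and a quantile-quantile comparison); the skewed lognormal data are simulated for illustration:

```python
import numpy as np
from scipy import stats

x = np.random.default_rng(11).lognormal(mean=0.0, sigma=0.8, size=100)   # right-skewed data

w_stat, p_value = stats.shapiro(x)
print(p_value)                         # small p-value: data unlikely to be normal

# Q-Q comparison without plotting: expected normal quantiles vs observed quantiles
(osm, osr), _ = stats.probplot(x, dist="norm")
print(np.corrcoef(osm, osr)[0, 1])     # noticeably below 1 suggests departure from a straight line
```
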
Most common asymmetry in biological data Positive skewness, often b/c variables have a lognormal or Poisson distribution; transformations of skewed variables can often improve normality
Homogeneity of variances More important than normality (tests are more robust if sample sizes are equal); linear model hypotheses assume that variance in the response variable is the same at each level, or combination of levels, of the predictor variables
How to check for homogeneity of variances Bartlett's test Levene's test Fligner-Killeen test
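
The three tests listed above are all available in SciPy; a minimal sketch with two made-up groups of unequal spread:

```python
from scipy import stats

group_a = [4.1, 3.9, 4.3, 4.0, 4.2]
group_b = [3.0, 5.1, 2.4, 6.0, 4.6]      # visibly more spread out

print(stats.bartlett(group_a, group_b))  # Bartlett's test (sensitive to non-normality)
print(stats.levene(group_a, group_b))    # Levene's test (more robust)
print(stats.fligner(group_a, group_b))   # Fligner-Killeen test (non-parametric)
```
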
Linearity Parametric correlation and linear regression analyses are based on straight-line relationships between variables
Independence Implies that all observations should be independent of each other, both w/in and between groups; the most common situation where this assumption is not met is when data are recorded in a time sequence
Graphical exploration of data is used to ... Assess assumption violations; detect errors in data entry; detect patterns in data that may not be revealed by the statistical analysis you will use; detect unusual values (i.e., outliers)
Histograms Graph used to represent the distribution of data points of one variable; often classifies data into various "bins" or "range groups" and counts how many data points belong to each bin
Boxplots A plot showing the median in the center, the spread, the quartiles (25% and 75%), and potential outliers; because boxplots are based on medians and quartiles, they are very resistant to extreme values; good for displaying single-variable samples of 8+ observations
Scatter plots Vertical axis represents one variable, the horizontal axis represents the other variable, and the points on the plot are individual observations; often used b/c we are interested in the relationship b/t variables
Four possible outcomes of testing a null hypothesis (H0) Correctly reject H0 ("significant" result), implies HA is true; correctly retain H0, implies HA is false; Type I error, rejecting H0 when it is true; Type II error, not rejecting H0 when it is false
Type I error Rejecting the null hypothesis (H0) when it is true; concluding results are stat. significant when they actually occurred by chance or from unrelated factors; the risk is set by the significance level you choose
Type II error Not rejecting H0 when it is false; failing to conclude there is an effect when there was one; the study may have needed more statistical power to detect it
Risk of type II error is inversely related to the ... statistical power of the study; higher stat. power = lower probability of type II error
To reduce the risk of type II error you can ... increase sample size or significance level
Statistical power is determined by the ... Size of the effect (larger effects are more easily detected); measurement error (systematic/random errors in data reduce power); sample size (larger sample = reduced error + increased power); significance level (inc. significance level = inc. power)
Standard error Standard deviation of the sample means, (s)/(square root of n); s = sample estimate of SD; n = sample size
Type I vs Type II error trade-offs Type I and II influence one another; sig. level (type I) affects stat. power, which is inversely related to the type II error rate; setting a lower sig. level decreases type I risk, but inc. type II; inc. the power of a test decreases type II risk, but inc. type I
Rank-based non-parametric test examples Wilcoxon signed rank test Mann-Whitney U test Kruskal-Wallis H test Spearman’s coefficient
Non-parametric tests Statistical tests that do not assume anything about the distribution followed by the data (aka distribution-free tests) Based on ranks held by different data points
Non-parametric test steps (1-4) 1. Rank all observations, ignoring groups 2. Calculate the sum of ranks for both samples 3. Compare the smaller rank sum to the probability distribution of rank sums and test in the usual manner 4. For larger samples, the rank sum is approx. normally distributed, so use a z-stat.
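
A rank-sum procedure like the one outlined above is what the Mann-Whitney U test implements; a minimal SciPy sketch with made-up samples:

```python
from scipy import stats

control = [12, 15, 11, 14, 13, 16]
treated = [18, 22, 17, 21, 19, 20]

# Ranks are computed internally; the test compares the rank sums of the two samples
u_stat, p_value = stats.mannwhitneyu(control, treated, alternative="two-sided")
print(u_stat, p_value)
```
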
Non-parametric test advantages More stat. power when assump. of parametric tests are violated; assump. of normality does not apply; small sample sizes are ok; can be used for all data types (ordinal, nominal, interval)
Non-parametric test disadvantages Less powerful than parametric tests if assump. haven't been violated
Randomization/permutation tests Resample/reshuffle original data many times to generate the sampling distribution of the test stat. directly; generates simulated data like those we would expect under H0
Randomization/permutation tests advantages Useful when analyzing data for which the distribution is unknown, or when sampling from populations is not possible (e.g., museum specimens)
Randomization/permutation tests disadvantages Computationally intensive; difficult to compare p-values across different analyses/studies; may have lower power when sample sizes are small
Randomization/permutation tests main application Often used to double check more traditional hypothesis test methods. If both tests are significant, then you can be pretty confident about your results.
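
A minimal NumPy sketch of a permutation test on the difference in group means (data and number of reshuffles are illustrative): group labels are reshuffled many times to build the null distribution of the test statistic directly:

```python
import numpy as np

rng = np.random.default_rng(5)
a = np.array([6.2, 5.9, 6.8, 6.4, 6.1])
b = np.array([5.1, 5.4, 5.0, 5.6, 5.2])

observed = a.mean() - b.mean()
pooled = np.concatenate([a, b])

null_diffs = []
for _ in range(10_000):
    shuffled = rng.permutation(pooled)                # reshuffle group labels under H0
    null_diffs.append(shuffled[:a.size].mean() - shuffled[a.size:].mean())

p_value = np.mean(np.abs(null_diffs) >= abs(observed))   # two-tailed p-value
print(p_value)
```
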
Correcting a violation can be done by ... Transforming the data
Transformations can ... Make data closer to a normal distribution; reduce the relationship b/t the mean and variance; reduce outlier influence; improve linearity in regression analyses
Types of transformations Power Log Arc-sine Box-Cox
Power transformation + application Transforms Y to Y^p, where p is greater than 0; for data w/ right skew. p = 0.5 (square root) for data that are counts (Poisson) and where the variance is related to the mean; cube roots (p = 0.33), fourth roots (p = 0.25), etc., are effective for data that are increasingly skewed
Log transformation + application Transforming data to logarithms; makes positively skewed distributions more symmetrical, especially when the mean is related to the SD; lognormal data are so named b/c they become normal after log transforming the values
Arc-Sine transformation + application Taking the arcsine of the square root of a number; the result is given in radians and can range from −π/2 to π/2; numbers must be in the range 0 to 1; commonly used for proportions and probabilities
Box-Cox transformation + application Can be used to find the best transformation in terms of homogeneity of variance and normality; transformation based on an exponent (lambda, λ), which varies from -5 to 5; all values of λ are considered and the optimal value is selected
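
The four transformations above in NumPy/SciPy; the simulated count and proportion data are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
counts = rng.poisson(lam=4, size=100).astype(float) + 1   # shifted to keep all values positive
proportions = rng.uniform(0.05, 0.95, size=100)

sqrt_t   = counts ** 0.5                        # power transform, p = 0.5 for count data
log_t    = np.log(counts)                       # log transform for right-skewed data
arcsin_t = np.arcsin(np.sqrt(proportions))      # arc-sine transform for proportions
boxcox_t, best_lambda = stats.boxcox(counts)    # Box-Cox searches for the optimal lambda
print(best_lambda)
```
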
Created by: sdelo
 

 


