Question 1

Cases

Accepted Answer

Objects described by a set of data. Cases may be customers, companies, subjects in a study, or other objects

Question 2

Label

Accepted Answer

A special variable used in some data sets to distinguish the different cases

Question 3

Variable

Accepted Answer

A variable is any characteristic of a case

Question 4

Categorical variable

Accepted Answer

A categorical variable places a case into one of several groups or categories

Question 5

Quantitative variable

Accepted Answer

Takes numerical values for which arithmetic operations such as adding and averaging make sense

Question 6

Examining a distribution

Accepted Answer

Look for the overall pattern and for striking deviations from that pattern; e.g. the overall pattern of a histogram can be described by its shape, center and spread

Question 7

Symmetric and Skewed distributions

Accepted Answer

A distribution is symmetric if the right and left sides of the histogram are approximately mirror images of each other; a distribution is skewed to the right (left) if the right (left) side of the histogram extends farther out than the left (right) side

Question 8

Mean

Accepted Answer

The mean is the average of a set of observations: a calculated "central" value of a set of numbers.

Question 9

Median

Accepted Answer

The median is the midpoint of a distribution, the number such that half the observations are smaller and the other half are larger

Question 10

Mode

Accepted Answer

The mode of a set of observations is the observation that is repeated more often than any other
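
As a rough illustration of the three measures of center defined above, the sketch below applies Python's standard `statistics` module to a small made-up data set (the numbers are an assumption chosen for the example, not taken from the cards).

```python
from statistics import mean, median, mode

# Made-up observations, for illustration only.
observations = [2, 3, 3, 5, 7, 10]

center_mean = mean(observations)      # sum of the values divided by their count
center_median = median(observations)  # midpoint; averages the two middle values here
center_mode = mode(observations)      # the value that occurs most often

print(center_mean, center_median, center_mode)
```

With an even number of observations there is no single middle value, so the median is taken as the average of the two middle ones.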

Question 11

Quartiles

Accepted Answer

Each of four equal groups into which a population can be divided according to the distribution of values of a particular variable; also the three cut points (first quartile, median, third quartile) that separate those groups

Question 12

Five Number Summary

Accepted Answer

The five number summary of a distribution consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation, written in order from smallest to largest.
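
One way to compute a five number summary is sketched below. It assumes the common textbook convention that the first and third quartiles are the medians of the lower and upper halves of the sorted data (the overall median is left out of both halves when the count is odd). Other quartile conventions exist and can give slightly different values. The function name `five_number_summary` and the data are hypothetical.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for at least two observations."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # observations below the median position
    upper = xs[half + n % 2:]    # observations above it (skips the median if n is odd)
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])

summary = five_number_summary([7, 1, 3, 9, 5, 11, 13])
print(summary)
```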

Question 13

Standard Deviation

Accepted Answer

The variance of a set of observations is essentially the average of the squares of the deviations of the observations from their mean. The standard deviation is the square root of the variance.
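
The calculation can be sketched step by step as follows. This assumes the sample version of the variance, which divides by n - 1 rather than n (one reading of "essentially the average" in the definition); the data are made up for the example.

```python
from math import sqrt

# Made-up observations, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
xbar = sum(data) / n                                  # the mean
sum_sq_dev = sum((x - xbar) ** 2 for x in data)       # squared deviations from the mean
sample_variance = sum_sq_dev / (n - 1)                # assumption: divide by n - 1
sample_sd = sqrt(sample_variance)                     # standard deviation

print(sample_variance, sample_sd)
```

Dividing by n instead of n - 1 gives the population variance, which for these numbers is exactly 4 (standard deviation 2).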

Question 14

Density curves

Accepted Answer

A density curve is a curve that is always on or above the horizontal axis and has area exactly 1 underneath it; describes the overall pattern of a distribution; the area under the curve and above any range of values is the proportion of all observations that fall in that range

Question 15

Empirical Rule

Accepted Answer

In the Normal distribution with mean mu (μ) and standard deviation sigma (σ): approximately 68% of the observations fall within σ of μ; 95% of the observations fall within 2σ of μ; 99.7% of the observations fall within 3σ of μ
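
The 68/95/99.7 figures are rounded values and can be checked numerically. The sketch below uses `statistics.NormalDist` from Python's standard library; the helper name `proportion_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Area under the Normal curve within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```

Because any Normal distribution can be standardized, the same proportions hold for every μ and σ.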

Question 16

Standard Normal distribution

Accepted Answer

The standard normal distribution is the Normal distribution N(0,1) with mean 0 and standard deviation 1

Question 17

Histogram

Accepted Answer

A diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval

Question 18

Bar Chart

Accepted Answer

A diagram in which the numerical values of variables are represented by the height or length of lines or rectangles of equal width.

Question 19

Pie Chart

Accepted Answer

A type of graph in which a circle is divided into sectors that each represent a proportion of the whole

Question 20

Response variable

Accepted Answer

A response variable measures an outcome of a study

Question 21

Explanatory variable

Accepted Answer

An explanatory variable explains or influences changes in a response variable

Question 22

Scatterplot

Accepted Answer

A scatterplot displays the relationship between two quantitative variables measured on the same individuals

Question 23

Scatterplot Association

Accepted Answer

Two variables are positively associated when above average values of one tend to accompany above average values of the other; two variables are negatively associated when above average values of one tend to accompany below average values of the other

Question 24

Scatterplot Strength

Accepted Answer

The strength of a relationship is determined by how close the points in the scatterplot lie to a simple form such as a line

Question 25

Scatterplot Direction

Accepted Answer

If the relationship has a clear direction, we speak of either positive association (high values of the two variables tend to occur together) or negative association (high values of one variable tend to occur with low values of the other variable)

Question 26

Outlier

Accepted Answer

An individual value that falls outside the overall pattern of a distribution

Question 27

Correlation

Accepted Answer

The correlation measures the direction and strength of the linear relationship between two quantitative variables (usually written as r).

Question 28

Important facts about correlation

Accepted Answer

r itself has no unit of measurement, it is just a number; r is always between -1 and 1; r near 0 implies a very weak linear relationship; r = -1 or r = 1 occurs only in the case of a perfect linear relationship; r measures the strength of only a linear relationship
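
Several of these facts can be seen in a small calculation. The sketch below computes r as the average product of standardized values (dividing by n - 1), which is one standard way to write the formula; the function name `correlation` and the data are made up for the example.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Average product of the standardized x and y values.
    return sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
               for x, y in zip(xs, ys)) / (n - 1)

x = [1, 2, 3, 4, 5]
r_perfect_up = correlation(x, [2, 4, 6, 8, 10])     # points lie exactly on a rising line
r_perfect_down = correlation(x, [10, 8, 6, 4, 2])   # points lie exactly on a falling line
r_rescaled = correlation([100 * v for v in x], [2, 4, 6, 8, 10])  # change of units in x

print(r_perfect_up, r_perfect_down, r_rescaled)
```

Rescaling x (for example, metres to centimetres) leaves r unchanged, which is what "no unit of measurement" means in practice.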

Question 29

Probability

Accepted Answer

The probability of any outcome of a random phenomenon is the proportion of times the outcome would occur in a very long series of repetitions
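
The "very long series of repetitions" idea can be illustrated with a simulation. The sketch below assumes a fair coin (probability of heads 0.5) and uses a fixed random seed so the run is repeatable; both choices are assumptions made for the example.

```python
import random

random.seed(0)  # fixed seed so the simulation is repeatable

flips = 100_000
# Each flip counts as heads with probability 0.5 (assumed fair coin).
heads = sum(random.random() < 0.5 for _ in range(flips))
proportion = heads / flips

print(proportion)
```

The observed proportion settles close to 0.5 as the number of flips grows, although any single finite run will differ from it slightly.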

Question 30

ANOVA

Accepted Answer

ANOVA, or analysis of variance, is a statistical method in which the variation in a set of observations is divided into distinct components.

Question 31

Linear Regression

Accepted Answer

Simple linear regression is a statistical method that allows us to summarize and study relationships between two continuous (quantitative) variables, the explanatory variable and the response variable
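
A minimal sketch of fitting such a line is given below, assuming the usual least-squares criterion (the card itself does not name a fitting method). The function name `least_squares_line` and the data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x   # the line passes through (mean_x, mean_y)
    return intercept, slope

explanatory = [1, 2, 3, 4]   # made-up explanatory values
response = [3, 5, 7, 9]      # made-up response values, exactly 1 + 2x
intercept, slope = least_squares_line(explanatory, response)

print(intercept, slope)
```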

Question 32

Hypothesis testing

Accepted Answer

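The theory, methods, and practice of testing a hypothesis by comparing it with the null hypothesis. The null hypothesis is rejected only if the p-value (the probability, computed assuming the null hypothesis is true, of data at least as extreme as those observed) falls below a predetermined significance level α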

Question 33

Confidence Interval

Accepted Answer

A confidence interval gives an estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data
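
As one concrete case, the sketch below computes a 95% confidence interval for a population mean. It assumes a Normal population with known standard deviation, so the interval is the sample mean plus or minus z* times σ/√n; all numbers are made up for the example.

```python
from math import sqrt
from statistics import NormalDist

# Made-up inputs, for illustration only.
sample_mean = 106.0
sigma = 15.0        # assumption: population standard deviation is known
n = 36
confidence = 0.95

# Critical value z* leaving (1 - confidence) / 2 in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_star * sigma / sqrt(n)
interval = (sample_mean - margin, sample_mean + margin)

print(interval)
```

When σ is not known, the usual practice is to replace it with the sample standard deviation and use a t critical value instead of z*.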

Question 34

P-value

Accepted Answer

The p-value is the level of marginal significance within a statistical hypothesis test: the probability, computed assuming the null hypothesis is true, of obtaining a result at least as extreme as the one actually observed.

Question 35

Critical value

Accepted Answer

A critical value is the point (or points) on the scale of the test statistic beyond which we reject the null hypothesis, and is derived from the level of significance α of the test

Question 36

Level of Significance

Accepted Answer

The significance level, also denoted as alpha or α, is the probability of rejecting the null hypothesis when it is true.
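
The p-value, critical value, and significance level fit together as in the sketch below. It assumes a two-sided z test for a mean with known population standard deviation; the hypothesized mean, sample values, and α = 0.05 are all made up for the example.

```python
from math import sqrt
from statistics import NormalDist

# Made-up inputs, for illustration only.
mu_0 = 100.0        # mean claimed by the null hypothesis
sigma = 15.0        # assumption: population standard deviation is known
n = 36
sample_mean = 106.0
alpha = 0.05        # significance level chosen before looking at the data

z = (sample_mean - mu_0) / (sigma / sqrt(n))          # test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))          # two-sided p-value
critical_value = NormalDist().inv_cdf(1 - alpha / 2)  # reject when |z| exceeds this

reject_by_p_value = p_value < alpha
reject_by_critical_value = abs(z) > critical_value

print(z, p_value, critical_value, reject_by_p_value)
```

The two decision rules always agree: the test statistic lies beyond the critical value exactly when the p-value is below α.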


Statistics