# QM Exam 1
| Term | Definition |
|---|---|
| data warehouses | vast digital repositories that record and store data electronically |
| Big Data | describes data sets so large that traditional methods of storage and analysis are inadequate |
| transactional data | data collected for recording a company's transactions |
| data mining or predictive analytics | the process of using data, especially transactional data, to make decisions and predictions |
| business analytics | describes any use of data and statistical analysis to drive business decisions from data whether the purpose is predictive or simply descriptive |
| data | numerical, alphabetic, or alphanumerical; useless unless we know what it represents |
| context | answering the questions who, what, when, where, why, and how can make data values meaningful |
| data table | clearly shows whom the data are about and what was measured |
| cases | rows of a data table correspond to individual __________ |
| variables | the recorded characteristics of the cases; the columns of a data table |
| respondents | individuals who answer a survey |
| subjects/participants | people on whom we experiment |
| experimental units | animals, plants, websites, and other inanimate subjects |
| records | rows in a database |
| metadata | typically contains information about how, when, and where (and possibly why) the data were collected; who each case represents; and the definitions of all variables |
| spreadsheet | a name that comes from bookkeeping ledgers of financial information |
| relational database | two or more separate data tables are linked together so that information can be merged across them |
| categorical/qualitative variable | when the values of a variable are simply the names of categories |
| quantitative variable | when the values of a variable are measured numerical quantities |
| identifier variables | categorical variables whose only purpose is to assign a unique identifier code to each individual in the data set |
| ordinal | the variable is ______________ when the values of a categorical variable have an intrinsic order |
| nominal | categorical variable with unordered categories |
| cross-sectional data | several variables are measured at the same time point |
| frequency table | records the counts for each of the categories of the variable |
| area principle | says that the area occupied by a part of the graph should correspond to the magnitude of the value it represents |
| bar chart | displays the distribution of a categorical variable, showing the counts for each category next to each other for easy comparison |
| relative frequency bar chart | replaces the counts with percentages in order to draw attention to the relative proportion of cases in each category |
| pie chart | shows how a whole group breaks into several categories |
| contingency tables | they show how individuals are distributed along each variable depending on, or contingent on, the value of the other variable |
| marginal distribution | when presented like this, at the margins of a contingency table, the frequency distribution of either one of the variables is called __________ |
| cell | any intersection of a row and column of the table; gives the count for a combination of values of the two variables |
| total percent, row percent, or column percent | the three percentage choices most statistics programs offer for contingency tables |
| conditional distribution | shows the distribution of one variable for just those cases that satisfy a condition on another |
| independent | in a contingency table, when the distribution of one variable is the same for all categories of another variable, we say that the two variables are ________ |
| segmented (or stacked) bar chart | treats each bar as the "whole" and divides it proportionally into segments corresponding to the percentage in each group |
| mosaic plot | looks like a segmented bar chart, but obeys the area principle better by making the bars proportional to the sizes of the groups |
| Simpson's Paradox | a reversal or change in a relationship that can occur when data from several groups are combined; the moral: only combine compatible measurements for comparable individuals |
| bins | intervals that slice up all the values of a quantitative variable; their counts give the distribution and provide the building blocks for the display called a histogram |
| histogram | plots the bin counts as the heights of bars |
| gaps | indicate a region where there are no values |
| relative frequency histogram | alternative is to report the percentage of cases in each bin |
| stem-and-leaf displays | like histograms, but they also show the individual values |
| quantitative data condition | the data must be values of a quantitative variable whose units are known |
| shape, center, and spread | when you describe a distribution, you should pay attention to these three things |
| shape | we describe the shape of a distribution in terms of its modes, its symmetry, and whether it has any gaps or outlying values |
| modes | humps of a histogram |
| unimodal | a distribution whose histogram has one main hump |
| bimodal | distributions whose histograms have two humps |
| multimodal | histograms with three or more humps |
| uniform | a distribution whose histogram doesn't appear to have any mode and in which all the bars are approximately the same height |
| symmetric | the halves of a distribution on either side of the center look, at least approximately, like mirror images |
| tails | the (usually) thinner ends of a distribution |
| skewed | if one tail stretches out farther than the other, the distribution is said to be ________ to the side of the longer tail |
| outliers | any stragglers that stand off away from the body of the distribution |
| mean (average) | add up all the values of the variable, x, and divide that sum by the number of data values |
| median | the value that splits the histogram into two equal areas |
| range | the difference between the extremes: max-min |
| lower quartile (Q1) | value for which one quarter of the data lie below it |
| upper quartile (Q3) | value for which one quarter of the data lie above it |
| interquartile range (IQR) | summarizes the spread by focusing on the middle half of the data; it's defined as the difference between the two quartiles: Q3-Q1 |
| variance | the average of the squared deviations |
| standard deviation | we want measures of spread to have the same units as the data, so we usually take the square root of the variance, giving the __________ |
| standardized value | a value found by subtracting the mean and dividing by the standard deviation |
| z-score | tells us how many standard deviations a value is from its mean |
| five-number summary | reports a distribution's median, quartiles, and extremes (max and min) |
| boxplot | displays the information from a five-number summary |
| stationary | when a time series has no strong trend or change in variability |
| time series plot | a display of values against time |
| re-express/transform | one way to make a skewed distribution more symmetric is to ___________ the data by applying a simple function to all the data values |
| scatterplot | plots one quantitative variable against another |
| direction | pattern that can either be negative, positive, or neither |
| form | straight, curved, exotic, no pattern? |
| straight line relationship/linear form | will appear as a cloud or swarm of points stretched out in a generally consistent, straight form |
| strength | tightly clustered in a single stream or so variable and spread out that we can barely discern a trend or pattern? |
| explanatory or predictor variable | variable on the x-axis |
| response variable | variable on the y-axis |
| independent and dependent variables | the idea is that the y-variable depends on the x-variable and the x-variable acts independently to make y respond |
| correlation coefficient | a numerical measure of the direction and strength of a linear association |
| correlation | measures the strength of the linear association between two quantitative variables |
| quantitative variables condition | correlation applies only to quantitative variables |
| linearity condition | correlation measures the strength only of the linear association and will be misleading if the relationship is not straight enough |
| outlier condition | unusual observations can distort the correlation and can make an otherwise small correlation look big or, on the other hand, hide a large correlation |
| lurking variable | some third variable that affects both of the variables you have observed |
| linear model | just an equation of a straight line through the data |
| predicted value | the prediction for y found for each x-value in the data; found by substituting the x-value in the regression equation; values on the fitted line |
| residual | the difference between the observed value and the predicted value |
| line of best fit/least squares line | the line for which the sum of the squared residuals is smallest |
| slope | b1 is given in y-units per x-unit. differences of one unit in x are associated with differences of b1 units in predicted values of y |
| intercept | the value of the line when the x-variable is zero |
| regression lines | common name for least squares lines |
| regression to the mean | because the correlation is always less than 1.0 in magnitude, each predicted y tends to be fewer standard deviations from its mean than its corresponding x is from its mean |
| quantitative data condition | pretty easy to check, but don't be fooled by categorical data recorded as numbers |
| linearity assumption | the regression model assumes that the relationship between the variables is, in fact, linear |
| linearity condition | the two variables must have a linear association, or the model won't mean a thing and decisions you base on the model may be wrong |
| outlier condition | make sure that no points need special attention |
| independence assumption | assumption that the residuals are independent of each other |
| equal spread condition | the condition, arising from the assumption that the standard deviation of the residuals is the same everywhere along the line, that the scatter of the data around the line is roughly uniform |
| R-squared | the fraction of the variation in y accounted for by the model; all regression analyses include this statistic, written by tradition with a capital letter and often given as a percentage |
| Spearman rank correlation | works with the ranks of the data rather than their values |
| random phenomena | we can't predict the individual outcomes, but we can hope to understand characteristics of their long-run behavior |
| trial | each attempt of a random phenomenon |
| outcome | the value generated by each trial of a random phenomenon |
| event | more general term to refer to outcomes or combinations of outcomes |
| sample space | a special event; the collection of all possible outcomes |
| probability | the long-run relative frequency of an event's occurrence |
| independence | the outcome of one trial doesn't influence or change the outcome of another |
| Law of Large Numbers (LLN) | states that if the events are independent, then as the number of trials increases, the long-run relative frequency of any outcome gets closer and closer to a single value |
| empirical probability | because it is based on repeatedly observing the event's outcome, this definition of probability is often called ____________ |
| theoretical probability | when we have equally likely outcomes |
| personal probability | we call this kind of probability subjective |
| probability | a number between 0 and 1 |
| probability assignment rule | the probability of the set of all possible outcomes must be 1. P(S) =1 |
| complement rule | the probability of an event occurring is 1 minus the probability that it doesn't occur: P(A) = 1 - P(A^c) |
| multiplication rule | to find the probability that two independent events occur, we multiply the probabilities; P(A and B)=P(A) x P(B), provided that A and B are independent |
| disjoint or mutually exclusive | two events are _________ if they have no outcome in common |
| addition rule | allows us to add the probabilities of disjoint events to get the probability that either event occurs: P(A or B) = P(A) + P(B), provided that A and B are disjoint |
| general addition rule | does not require disjoint events: P(A or B) = P(A) + P(B) - P(A and B) for any two events A and B |
| marginal probability | uses a marginal frequency (from either the total row or total column) to compute the probability |
| joint probabilities | probability that two events occur together |
| conditional probability | a probability that takes into account a given condition |
| general multiplication rule | for compound events that does not require the events to be independent: P(A and B)=P(A) x P(B|A) for any two events A and B |
| independent | events A and B are __________ whenever P(B|A)=P(B) |
| tree diagram | probability tree used to help think through the decision-making process |
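The center-and-spread terms above (mean, median, quartiles, IQR, variance, standard deviation, z-score, five-number summary) can be sketched with Python's standard library. The data values here are made up purely for illustration:

```python
import statistics

# hypothetical sample data (not from the cards), chosen to illustrate the terms
data = [4, 7, 7, 8, 10, 12, 15, 18, 21, 28]

mean = statistics.mean(data)                  # sum of all values / number of values
median = statistics.median(data)              # splits the histogram into two equal areas
q1, q2, q3 = statistics.quantiles(data, n=4)  # lower quartile, median, upper quartile
iqr = q3 - q1                                 # spread of the middle half of the data
var = statistics.variance(data)               # average of the squared deviations
sd = statistics.stdev(data)                   # square root of the variance, in data units
z = (data[0] - mean) / sd                     # z-score: standard deviations from the mean

# five-number summary: min, Q1, median, Q3, max
five_number = (min(data), q1, median, q3, max(data))
```

Note that `statistics.quantiles` has several quartile-computing methods; textbooks and software packages often disagree slightly on quartile values for small samples.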
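The correlation and regression terms (correlation coefficient, least squares line, slope, intercept, predicted values, residuals, R-squared) can likewise be illustrated. The (x, y) pairs below are hypothetical, and the slope uses the standard identity b1 = r * s_y / s_x:

```python
import statistics

# hypothetical (x, y) pairs, invented for illustration
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(x)
x_bar, y_bar = statistics.mean(x), statistics.mean(y)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)

# correlation coefficient r: direction and strength of the linear association
r = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)

# least squares slope and intercept: b1 = r * s_y / s_x, b0 = y_bar - b1 * x_bar
b1 = r * s_y / s_x
b0 = y_bar - b1 * x_bar

predicted = [b0 + b1 * xi for xi in x]                     # values on the fitted line
residuals = [yi - yh for yi, yh in zip(y, predicted)]      # observed - predicted

# R-squared: fraction of y's variation accounted for by the linear model
ss_res = sum(e**2 for e in residuals)
ss_tot = sum((yi - y_bar)**2 for yi in y)
r_squared = 1 - ss_res / ss_tot
```

For simple (one-predictor) regression, `r_squared` equals the square of the correlation coefficient, which is why it carries the name R-squared.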
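The probability rules (marginal, joint, and conditional probability, the complement rule, the general addition rule, and the independence check) can be verified against a small contingency table. The counts below are invented for illustration:

```python
# hypothetical contingency table counts (invented for illustration):
# rows = customer type, columns = response to an offer
table = {
    ("new", "responded"): 20, ("new", "no response"): 80,
    ("existing", "responded"): 60, ("existing", "no response"): 140,
}
total = sum(table.values())

def p(event):
    """Empirical probability: counts of cells matching the event / total count."""
    return sum(count for cell, count in table.items() if event(cell)) / total

def is_new(cell):
    return cell[0] == "new"

def responded(cell):
    return cell[1] == "responded"

p_resp = p(responded)                               # marginal probability
p_joint = p(lambda c: is_new(c) and responded(c))   # joint probability
p_resp_given_new = p_joint / p(is_new)              # conditional: P(B|A) = P(A and B) / P(A)

# general addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_either = p(is_new) + p_resp - p_joint
assert abs(p_either - p(lambda c: is_new(c) or responded(c))) < 1e-12

# complement rule: P(A) = 1 - P(A^c)
assert abs(p(lambda c: not responded(c)) - (1 - p_resp)) < 1e-12

# independence check: A and B are independent whenever P(B|A) = P(B);
# here P(responded | new) differs from P(responded), so they are not independent
not_independent = abs(p_resp_given_new - p_resp) > 1e-12
```

Rearranging the conditional-probability line gives the general multiplication rule from the cards: P(A and B) = P(A) * P(B|A).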