chapter 1-3 statz
| Question | Answer |
|---|---|
| consists of methods of organizing and summarizing information; includes the construction of graphs, charts, and tables and the calculation of various descriptive measures such as averages, measures of variation, and percentiles | descriptive statistics |
| the collection of all individuals or items under consideration in a statistical study | population |
| that part of the population from which information is obtained | sample |
| consists of methods for drawing and measuring the reliability of conclusions about a population based on information obtained from a sample of the population | inferential statistics |
| in this study, researchers simply observe characteristics and take measurements, as in a sample survey | observational study |
| in this study, researchers impose treatments and controls and then observe characteristics and take measurements | designed experiment |
| a sampling procedure for which each possible sample of a given size is equally likely to be the one obtained | simple random sampling |
| a sample obtained by simple random sampling | simple random sample |
| what are the two types of simple random sampling? | with replacement: a member of the population can be selected more than once; without replacement: a member of the population can be selected at most once |
| a table of randomly chosen digits | table of random numbers |
| built in programs for obtaining simple random samples | random number generators |
| when using random number generators, be aware of whether they provide samples ______ replacement or samples ________ replacement | with, without |
| LOOK AT RANDOM NUMBERS WITH TI-83 IN 1.2 | |
| what are the steps to systematic random sampling | 1) divide the population size by the sample size and round the result DOWN to the nearest whole number, m 2) use a random number table or similar device to obtain a number, k, between 1 and m 3) select for the sample those members of the population that are numbered k, k+m, k+2m, ... |
| what are the steps to cluster sampling | 1)divide the population into groups (clusters) 2) obtain a simple random sample of the clusters 3) use all the members of the clusters obtained in step 2 as the sample |
| obtain a sample of approximately 300 houses in a city of 947 blocks, each containing approximately 20 houses | treat one block as a cluster and number the blocks on the city map from 1 to 947. then use a table of random numbers to obtain a simple random sample of 15 blocks |
| a city council wants to build a town swimming pool. a town planner needs to sample voter opinion about using public funds to build a swimming pool | divide voters into three strata: upper income, middle income, low income. take a simple random sample from each of the three strata and combine them to get a single sample. |
| what are the steps to stratified random sampling with proportional allocation | 1) divide population into subpopulations (strata) 2) from each stratum, obtain a simple random sample of size proportional to the size of the stratum 3) use all members obtained in step 2 as the sample |
| in stratified random sampling with proportional allocation, what does it mean by a simple random sample of size proportional to the size of the stratum | the sample size for a stratum equals the total sample size times the stratum size divided by the population size |
| a combination of one or more of simple random sampling, systematic random sampling, cluster sampling, and stratified sampling | multistage sampling |
| in a designed experiment, the individuals or items on which the experiment is performed | experimental units |
| when experimental units are humans, what are they called? | subjects |
| two or more treatments should be compared | control |
| the experimental units should be randomly divided into groups to avoid unintentional selection bias in constituting the groups | randomization |
| a sufficient number of experimental units should be used to ensure that randomization creates groups that resemble each other closely and to increase the chances of detecting any differences among the treatments | replication |
| in the placebo and specified treatment experiments, what are the treatments | both placebo and specified treatment |
| the group receiving the specified treatment is the _______ _______ and the group receiving the placebo is the _______ _______ | treatment group, control group |
| the characteristic of the experimental outcome that is to be measured or observed | response variable |
| a variable whose effect on the response variable is of interest in the experiment | factor |
| the possible values of a factor | levels |
| each experimental condition. for one-factor experiments, the treatments are the levels of the single factor. for multifactor experiments, each treatment is a combination of levels of the factors | treatment |
| all the experimental units are assigned randomly among all treatments | completely randomized design |
| the experimental units are assigned randomly among all the treatments separately within each block | randomized block design |
| a characteristic that varies from one person or thing to another | variable |
| a non-numerically valued variable | qualitative variable |
| a numerically valued variable | quantitative variable |
| a quantitative variable whose possible values can be listed | discrete variable |
| a quantitative variable whose possible values form some interval of numbers | continuous variable |
| values of a variable | data |
| values of a qualitative variable | qualitative data |
| values of a quantitative variable | quantitative data |
| values of a discrete variable | discrete data |
| values of a continuous variable | continuous data |
| a _______ ________ of qualitative data is a listing of the distinct values and their frequencies | frequency distribution |
| a __________ __________ of qualitative data is a listing of the distinct values and their relative frequencies | relative-frequency distribution |
| a disk divided into wedge-shaped pieces proportional to the relative frequencies of the qualitative data | pie chart |
| displays the distinct values of the qualitative data on a horizontal axis and the relative frequencies of those values on the vertical axis. | bar chart |
| the smallest value that could go in a class | lower class limit |
| the largest value that could go in a class | upper class limit |
| the difference between the lower limit of a class and the lower limit of the next higher class | class width |
| the average of the two class limits of a class | class mark |
| use this type of grouping on data you would consider continuous | cutpoint grouping |
| the smallest value that could go in a cut point class | lower class cut point |
| the smallest value that could go in the next higher class (equivalent to the lower cut point of the next higher class) | upper class cut point |
| the difference between two cut points of a class | class width |
| the average of the two cut points of a class | class midpoint |
| displays the classes of the quantitative data on a horizontal axis and the frequencies on the vertical axis. the bars touch | histogram |
| for _________ ________, we use the distinct values of the observations to label the histogram's bars, with each such value centered under its bar | single value grouping |
| for _________ or _______ __________, we use the lower class limits (or equivalently lower class cut points) to label the bars | limit or cutpoint grouping |
| a table, graph, or formula that provides the values of the observations and how often they occur | distribution of a data set |
| LOOK AT THE COMMON DISTRIBUTION SHAPES | |
| the distribution of population data | population distribution, or the distribution of the variable |
| the distribution of sample data | sample distribution |
| for simple random sampling, the sample distribution approximates the population distribution. what does this mean | the larger the sample size, the better the approximation |
| the sum of the observations divided by the number of observations | mean |
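| arrange the data in increasing order and find the one in the middle; if the number of observations is even, take the mean of the two middle observations | median |
| the value that occurs with the highest frequency in a data set | mode |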
| in a right skewed distribution the mean is _______ than the median | greater |
| in a symmetric distribution, the mean is _______ to the median | equal |
| in a left skewed distribution, the mean is ________ than the median | less |
| not sensitive to the influence of a few extreme observations | resistant |
| the ________ is a resistant measure of center, the _______ is not. | median, mean |
| the _______ is usually preferred for data sets that have extreme observations | median |
| the resistance of the mean can be improved by using _______ ______ where a specified percentage of the smallest and largest observations are removed before computing the mean | trimmed means |
| the mean of observations for a sample is called a _______ _______ and is denoted ______ | sample mean, "x-bar" |
| n = ? | sample size |
| Max - Min | range |
| the standard deviation of the observations for a sample | sample standard deviation |
| GO OVER STANDARD DEVIATION ON CALCULATOR | |
| the more variation, the larger ________ ________ | standard deviation |
| almost all of the observations in any data set lie within ______ standard deviations to either side of the mean | 3 |
| Chebychev's rule | for any data set and any real number k>1, at least 100(1-1/k^2)% of the observations lie within k standard deviations to either side of the mean |
| the empirical rule applies to what kind of data sets | data sets having a roughly bell shaped distribution |
| what is the empirical rule | 1) approximately 68% of the observations lie within one standard deviation to either side of the mean 2) approximately 95% lie within two standard deviations to either side of the mean 3) approximately 99.7% lie within three standard deviations to either side of the mean |
| median of the part of the entire data set that lies at or below the median of the entire data set | first quartile |
| the median of entire data set | second quartile |
| the median of the part of the entire data set that lies at or above the median of the entire data set | third quartile |
| the difference between the first and third quartiles: IQR = Q3 - Q1 | interquartile range |
| min, Q1, Q2, Q3, max | five number summary |
| lower limit of a data set | Q1- 1.5(IQR) |
| upper limit of a data set | Q3+ 1.5(IQR) |
| observations that fall well outside the overall pattern of the data | outliers |
| observations that lie below the lower limit or above the upper limit | potential outliers |
| most extreme values that are not potential outliers. they still lie within the lower and upper limits | adjacent values |
| what makes the box in a box plot | the 3 quartiles connected |
| what makes the whiskers in a box plot | lines extending from the box out to the adjacent values (the most extreme observations that still lie within the lower and upper limits) |
| LOOK AT SHAPES OF BOX PLOTS | |
| the mean of all possible observations for the entire population | population mean or mean of the variable x |
| N=? | population size |
| the standard deviation of all possible observations for the entire population | population standard deviation or standard deviation of the variable x |
| a descriptive measure for a population | parameter |
| a descriptive measure for a sample | statistic |
| what is the mean of a z score | 0 |
| what is a frequency distribution of qualitative data and why is it useful? | it is a listing of the distinct values and their frequencies. it is useful because it provides a table of the values of the observations and how often they occur |
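
The cards on quartiles, the interquartile range, the lower and upper limits, potential outliers, and adjacent values chain together, so a small worked example may help when reviewing them. The Python sketch below is only one way to follow those definitions. It assumes the "part at or below the median" means the bottom half of the sorted data (with the middle observation counted in both halves when the number of observations is odd). The function names `five_number_summary` and `outlier_report` and the sample data are made up for illustration, and a calculator or statistics package may use a slightly different quartile rule.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, Q2, Q3, max) using the 'median of each half' definition."""
    s = sorted(data)
    n = len(s)
    half = n // 2
    # Assumption: for an odd number of observations, the middle
    # observation is included in both the lower and the upper half.
    lower = s[: half + (n % 2)]
    upper = s[half:]
    return min(s), median(lower), median(s), median(upper), max(s)


def outlier_report(data):
    """Compute the IQR, the limits, potential outliers, and adjacent values."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1                      # IQR = Q3 - Q1
    lower_limit = q1 - 1.5 * iqr       # lower limit = Q1 - 1.5(IQR)
    upper_limit = q3 + 1.5 * iqr       # upper limit = Q3 + 1.5(IQR)
    # Observations that are not potential outliers
    inside = [x for x in data if lower_limit <= x <= upper_limit]
    return {
        "iqr": iqr,
        "lower_limit": lower_limit,
        "upper_limit": upper_limit,
        "potential_outliers": sorted(
            x for x in data if x < lower_limit or x > upper_limit
        ),
        # Most extreme observations that still lie within the limits
        "adjacent_values": (min(inside), max(inside)),
    }


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # made-up data
    print(five_number_summary(sample))
    print(outlier_report(sample))
```

For the made-up sample, the whiskers of a box plot would stop at the adjacent values 1 and 9, and 30 would be plotted separately as a potential outlier.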