Research Methods
Research Methods, Statistical Analysis, AICP Nov 2022 Test
| Term | Definition |
|---|---|
| Three steps of the statistical process | 1) collect data, 2) describe and summarize the distribution of values in the data set, 3) interpret by means of inferential statistics and statistical modeling |
| Nominal Data | Classified into mutually exclusive groups or categories and lack intrinsic order. A zoning classification, social security number, and sex are examples of nominal data. (qualitative variable) |
| Ordinal Data | Ordered categories implying a ranking of the observations. Examples of ordinal data are letter grades, suitability for development, and response scales on a survey (e.g., 1 through 5). (qualitative data) |
| Interval Data | Ordered data where the difference between two values has a meaningful interpretation, but there is no true zero point, so ratios are not meaningful. The typical example of interval data is temperature (in degrees Celsius or Fahrenheit). |
| Ratio Data | Gold standard of measurement, where both absolute and relative differences have meaning because the scale has a true zero. The classic example of ratio data is a distance measure. |
| Quantitative Variables | Household income, level of pollution in a river. Represent interval (temperature) and ratio (distance) data. |
| Qualitative variables | Zoning classification. Represent nominal (zoning) or ordinal (letter grade) data |
| Continuous variables | Can take an infinite number of values, both positive and negative, and with as fine a degree of precision as desired. Most measurements in the physical sciences yield continuous variables. |
| Discrete variables | Can only take on a finite or countable number of distinct values. An example is a count of events, such as the number of accidents per month, which cannot be negative. |
| Binary/dichotomous variables | Can only take on two values, typically coded as 0 and 1. |
| Population | Totality of some entity, e.g., all planners preparing for the AICP exam. |
| Sample | Subset of the population |
| Descriptive Statistics | Describe the characteristics of the distribution of values in a population or in a sample |
| Inferential Statistics | Use probability theory to determine characteristics of a population based on observations made on a sample from that population. We infer things about the population based on what is observed in the sample. |
| Distribution | Overall shape of all observed data. It can be listed as an ordered table, or graphically represented by a histogram or density plot. |
| Central tendency | A typical or representative value for the distribution of observed values. (mean, median, mode) |
| Dispersion | How distribution values are spread around the central tendency |
| Symmetry | Describes the shape of a data distribution: a distribution is symmetric when its values are spread evenly on both sides of the central tendency. |
| Skewness | A measure of the asymmetry of a distribution. A perfectly symmetric distribution has a skewness of zero. If the skewness is negative, the distribution is skewed to the left (longer left tail); if the skewness is positive, the distribution is skewed to the right (longer right tail). |
| Kurtosis | Provides a measurement about the extremities (i.e. tails) of the distribution of data, and therefore provides an indication of the presence of outliers. |
| Normal/Gaussian Distribution (Bell Curve) | Symmetric distribution with the additional property that the spread around the mean can be related to the proportion of observations. About 95% of the observations that follow a normal distribution are within 2 standard deviations of the mean. |
| Variance (stats) | A measure of how spread out a distribution is. It is computed as the average squared deviation of each number from the mean. A larger variance means the values lie further from the mean, giving a flatter, wider distribution; a smaller variance gives a steeper, narrower one. |
| Standard Deviation | Square root of the variance. |
| Coefficient of Variation | Measures the relative dispersion around the mean by dividing the standard deviation by the mean, usually expressed as a percentage. As a rule of thumb, a survey estimate with a coefficient of variation above 15% should not be used. |
| Z-Score | The number of standard deviations a data point is from the mean. More technically, it measures how many standard deviations below or above the population mean a raw score is. By a common rule of thumb, a value 3 or more standard deviations from the mean is treated as an outlier. |
| Interquartile Range (IQR) | A measure of variability based on dividing a data set into quartiles. Quartiles divide a rank-ordered data set into four equal parts; the interquartile range is the difference between the third and first quartiles. |
| Hypothesis test | Distinguishes between the null hypothesis (H0), i.e., the point of departure or reference, and the alternative hypothesis (H1), the research hypothesis one wants to find support for by rejecting the null hypothesis. |
| Areas under the normal distribution curve | • About 68% is within one standard deviation of the mean. • About 95% is within two standard deviations. • About 99.7% is within three standard deviations. |
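
The dispersion measures defined in the table (variance, standard deviation, coefficient of variation, z-score, interquartile range) and the areas under the normal curve can be checked with a short Python sketch. This is only an illustration using the standard library `statistics` module: the function names `describe`, `z_score`, and `normal_coverage` and the sample data are made up for this example, and the variance is the population form (average squared deviation), matching the definition above. Quartile conventions differ between textbooks and tools, so the interquartile range may vary slightly from hand calculations.

```python
import statistics


def describe(values):
    """Summarize dispersion using the definitions from the table."""
    mean = statistics.fmean(values)
    # Population variance: the average squared deviation from the mean.
    variance = statistics.pvariance(values, mu=mean)
    sd = variance ** 0.5  # standard deviation = square root of variance
    # Quartiles split the rank-ordered data into four equal parts.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": mean,
        "variance": variance,
        "sd": sd,
        "cv": sd / mean,   # coefficient of variation (relative dispersion)
        "iqr": q3 - q1,    # interquartile range
    }


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def normal_coverage(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    standard_normal = statistics.NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


if __name__ == "__main__":
    data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample values
    summary = describe(data)
    print(summary)
    print("z-score of 9:", z_score(9, summary["mean"], summary["sd"]))
    for k in (1, 2, 3):
        print(f"within {k} SD: {normal_coverage(k):.4f}")
```

Running the loop at the end shows where the rounded 68%, 95%, and 99.7% figures come from.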