Data Analysis Word Scramble
| Term | Definition |
| Trend | Is present when there is a long-term upward or downward movement in a time series. |
| Cycles | are present when there is a periodic movement in a time series. The period is the time it takes for one complete up and down movement in the time series plot. This term is generally reserved for periodic movements with a period greater than one year. |
| Seasonality | is present when there is a periodic movement in a time series that has a calendar related period – for example, a year, a month, a week. |
| Irregular (random) fluctuations | are always present in any real-world time series plot. They include all of the variations in a time series that we cannot reasonably attribute to systematic changes like trend, cycles, seasonality, structural change or the presence of outliers. |
| Smoothing | is a technique used to eliminate some of the irregular fluctuations in a time series plot so that features such as trend are more easily seen. |
| Seasonal indices | are used to quantify the seasonal variation in a time series. |
| Deseasonalise | The process of removing the effects of seasonality from a time series so that underlying features such as trend are easier to see. |
| Reseasonalise | The process of converting deseasonalised data back into its original seasonal form. |
| Bivariate Data | are data in which each observation involves recording information about two variables for the same person or thing. An example would be the heights and weights of the children in a preschool. |
| Residuals | The vertical distances from the data points to the regression line (actual value minus predicted value). |
| Interpolation | Predicting within the range of data |
| Extrapolation | Predicting outside the range of data |
| Slope | Gradient on a linear graph |
| Coefficient of determination | gives a measure of the predictive power of a regression line |
| Residual plot | can be used to test the linearity assumption by plotting the residuals against the explanatory variable (EV). |
| Correlation coefficient | gives a measure of the strength of a linear association |
| Scatterplot | is used to help identify and describe an association between two numerical variables |
| Parallel box plots | can be used to display, identify and describe the association between a numerical and a categorical variable |
| Segmented bar charts | can be used to graphically display the information contained in a two-way frequency table. It is a useful tool for identifying relationships between two categorical variables |
| Two-way frequency tables | are used as the starting point for investigating the association between two categorical variables |
| z-score | also known as a standardised score. Its value gives the distance and direction of a data value from the mean in terms of standard deviations. |
| 68-95-99.7% rule | the rule for the normal distribution: about 68% of values lie within one standard deviation of the mean, 95% within two, and 99.7% within three. |
| The normal distribution | is used to model data distributions that have a bell shape. |
| outliers | data points that lie well away from the majority of the data. |
| Box plots | a graphical representation of a five-number summary |
| Five number summary | A listing of the median, M, the quartiles Q1 and Q3, and the smallest and largest data values of a distribution, written in the order - minimum, Q1, M, Q3, maximum |
| Interquartile range | gives the spread of the middle 50% of data values |
| Median | It is the midpoint of a distribution dividing an ordered dataset into two equal parts. |
| Univariate Data | are generated when each observation involves recording information about a single variable, for example a dataset containing the heights of the children in a preschool |
| Categorical Variable | is used to represent characteristics of individuals |
| Nominal Variable | generates data values that can be used only to name |
| Ordinal Variable | generates data values that can be used to both name and order |
| Numerical Variables | used to represent quantities. |
| Discrete Variables | represent quantities that are counted – for example, the number of cars in a car park. |
| Continuous Variables | represent quantities that are measured rather than counted – for example, weights in kg. |
| Bar Charts | are used to display the frequency distribution of categorical data |
| Histograms | are used to display the frequency distribution of a numerical variable. They are suitable for medium- to large-sized datasets. |
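Several of the definitions above (five number summary, interquartile range, z-score, slope, residuals, correlation coefficient, coefficient of determination) can be checked with a few lines of Python. What follows is only a minimal sketch using the standard library. The function names are made up for illustration, the quartiles are assumed to be the medians of the lower and upper halves of the ordered data (other conventions exist), and the z-score is assumed to use the sample standard deviation.

```python
from math import sqrt
from statistics import mean, median, stdev


def five_number_summary(values):
    """Return (minimum, Q1, M, Q3, maximum).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    leaving out the median itself when the number of values is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + n % 2:]
    return data[0], median(lower), median(data), median(upper), data[-1]


def interquartile_range(values):
    """Spread of the middle 50% of data values: Q3 - Q1."""
    _, q1, _, q3, _ = five_number_summary(values)
    return q3 - q1


def z_score(x, values):
    """Distance and direction of x from the mean, in standard deviations.

    Assumption: the sample standard deviation is used.
    """
    return (x - mean(values)) / stdev(values)


def least_squares(xs, ys):
    """Return (slope, intercept, r) for the least squares line y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)  # correlation coefficient
    return slope, intercept, r


def residuals(xs, ys):
    """Residual = actual value - value predicted by the least squares line."""
    slope, intercept, _ = least_squares(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


print(five_number_summary([1, 2, 3, 4, 5, 6, 7]))  # (1, 2, 4, 6, 7)
```

The coefficient of determination is the square of the correlation coefficient, so it can be obtained as `r ** 2` from the value returned by `least_squares`.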
Created by:
TammyKnippel