Data Analyst
Interview Prep
| Question | Answer |
|---|---|
| What is Data Mining? | The process of discovering relevant patterns and information in data that were not previously known. |
| What is Data Profiling? | An assessment of a dataset for its uniqueness, consistency, and logic. |
| What is Data Wrangling? | The process of cleaning, structuring, enriching, validating, and analyzing the data. |
| What percent of data analytics time is spent on Data Wrangling? | Roughly 80%, by a commonly cited rule of thumb. |
| What is step 1 of an analytics project? | Understand the problem |
| What is step 2 of an analytics project? | Data collection |
| What is step 3 of an analytics project? | Data cleaning |
| What is step 4 of an analytics project? | Data exploration and analysis |
| What is step 5 of an analytics project? | Interpret the results |
| What are the 2 most important steps of an analytics project? | Understand the problem & Interpret the results |
| What is the best practice for data cleaning? | Make a plan by understanding where the common errors take place and keep communications open. |
| How can you handle missing values in a dataset? | Listwise deletion, average imputation, regression substitution, or multiple imputation. |
| What is Listwise deletion? | A method where an entire record is excluded from analysis if a single value is missing. |
| What is Average Imputation? | Using the average (mean) of the other participants' responses to fill in the missing value. |
| How is Average Imputation useful? | When a few values of a variable are missing, filling them with the mean of the observed values lets those records stay in the analysis. |
| What is Regression substitution? | A method that estimates a missing value by predicting it from the other variables with a regression model. |
| What is Multiple imputation? | It creates several plausible values for each missing entry based on correlations in the data, incorporates random error into those predictions, and then averages the results across the simulated datasets. |
| What is Normal Distribution? | A continuous probability distribution that is symmetric about the mean and appears as a bell curve when graphed. |
| What is a Time Series analysis? | A statistical method that deals with an ordered sequence of values of a variable at equally spaced time intervals. |
| What is Data Joining in Tableau? | Combining data that comes from the same source and shares a common set of Dimensions and Measures. |
| What is Data Blending in Tableau? | Combining data that comes from two or more different sources, each with its own set of Dimensions and Measures. |
| What is Overfitting? | The model learns the noise and random fluctuations in the training dataset in detail, so its performance drops significantly on the test set. |
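| What is Underfitting? | The model is too simple to capture the underlying pattern in the data, so it performs poorly on both the training set and the test set. |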
| In MS Excel, a numeric value is treated as text if it is preceded by what? | An apostrophe (') |
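
The two simplest missing-value strategies from the cards above, listwise deletion and average imputation, can be sketched in a few lines of plain Python. This is only an illustration: the `records` data, the field names, and the helper functions `listwise_deletion` and `mean_imputation` are all hypothetical, and a real project would more likely use a data-frame library.

```python
from statistics import mean

# Hypothetical survey records (made up for illustration); None marks a missing value.
records = [
    {"id": 1, "age": 30, "income": 52000},
    {"id": 2, "age": None, "income": 61000},
    {"id": 3, "age": 40, "income": None},
    {"id": 4, "age": 50, "income": 58000},
]


def listwise_deletion(rows):
    """Exclude an entire record if any single value in it is missing."""
    return [r for r in rows if all(v is not None for v in r.values())]


def mean_imputation(rows, field):
    """Fill missing values of `field` with the mean of the observed values."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = mean(observed)
    # Build new dicts so the original records are left unchanged.
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


complete = listwise_deletion(records)   # keeps only the records with ids 1 and 4
imputed = mean_imputation(records, "age")  # record 2 gets the mean of 30, 40, 50

print([r["id"] for r in complete])
print(imputed[1]["age"])
```

Listwise deletion shrinks the dataset (here from four records to two), while mean imputation keeps every record at the cost of understating the variable's spread, which is one reason regression substitution and multiple imputation exist.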