MIS 665 Final2SS
Terms from Summary Sheets
| Term | Definition |
|---|---|
| Do _more_ or _fewer_ features "win" when R^2 is similar? | Fewer |
| What do we fit data on? | Train data (NEVER test) |
| What do we check before modeling to catch multicollinearity? | Correlation heatmap |
| Is a lower or higher MSE better? | Lower |
| Higher R^2 means | more variance explained |
| StandardScaler changes ___, NOT ___ | coef scale, model accuracy |
| Parsimony | fewer features with similar R^2 is preferred |
| Lasso (alpha=1) | automatic feature selection via L1 penalty |
| DATA LEAKAGE | never fit_transform on full X --> Split first! |
| KNN requires ___ | StandardScaler (uses distance) |
| Logistic Regression gives ____, not just class labels | probabilities |
| SelectKBest: fit on training data only | leakage rule |
| An AUC of 0.5 is a ___ guess, 1.0 is a ___ guess | random, perfect |
| Use _____ for reliable accuracy | cross_val_score(cv=10) |
| What avoids the dummy variable trap in OneHotEncoder? | drop='first' |
| ALWAYS do what before clustering | standardize |
| ______ features dominate distance | high-variance |
| Is clustering supervised or unsupervised? | unsupervised (NO Y) |
| Silhouette score of ___+ is strong, ____ is reasonable | 0.71+, 0.51-0.70 |
| K-Means++ init reduces | sensitivity to random start |
| Profile clusters on _________ data for meaning | original (unscaled) |
| PCA is ____-based, MUST scale first | variance |
| Fit PCA on ___ data only, transform both separately | training |
| n_components=0.90 | auto-selects fewest PCs for 90% variance |
| PCA replaces feature names | use loadings to interpret |
| Embeddings understand ___ | synonyms |
| Pipeline: | represent text > reduce dims > build classifier |
| A silhouette score of 0.50 generally indicates | moderately well-separated clusters |
| In healthcare, which metric is most important? | Recall (measure of how many actual positives were caught) |
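
The leakage rule in the cards above (split first, fit on training data only, then transform both sets separately) can be sketched in a few lines. This is a minimal standard-library illustration, not scikit-learn itself: `fit_scaler` and `transform` are hypothetical helper names, and the data values are made up. scikit-learn's `StandardScaler` follows the same fit/transform split.

```python
from statistics import fmean, pstdev

def fit_scaler(values):
    """Learn the mean and population std from the given values (hypothetical helper)."""
    mu = fmean(values)
    sigma = pstdev(values)
    # Guard against a zero std so a constant column does not divide by zero.
    return mu, (sigma if sigma > 0 else 1.0)

def transform(values, params):
    """Standardize values with previously learned parameters (hypothetical helper)."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]

# Made-up data; the last value plays the role of unseen test data.
data = [2.0, 4.0, 6.0, 8.0, 100.0]

# Split FIRST, then fit on the training part only.
train, test = data[:4], data[4:]
params = fit_scaler(train)

# Transform both sets with the training parameters.
train_scaled = transform(train, params)
test_scaled = transform(test, params)

# The leaky version: fitting on the full data lets the test value
# shift the mean and std that the training data is scaled with.
leaky_params = fit_scaler(data)

print("train-only mean:", params[0])
print("leaky mean:", leaky_params[0])
```

The two printed means differ because the leaky fit has already "seen" the test value, which is exactly what splitting first prevents.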