ML
| Question | Answer |
|---|---|
| OLS Linear Regression Weight Vector Formula | w = (X^T * X)^-1 * X^T * t |
| Ridge Regression (L2 Regularization) Analytical Weights Formula | w = (X^T * X + lambda * I)^-1 * X^T * t |
| Binary Logistic Regression Hypothesis Function (Sigmoid) | y = sigma(w^T * phi) = 1 / (1 + exp(-w^T * phi)) |
| Binary Logistic Regression Cross-Entropy Loss Function | E(w) = - sum_{n=1}^N [ t_n * ln(y_n) + (1 - t_n) * ln(1 - y_n) ] |
| Multiclass Logistic Regression Hypothesis Function (Softmax) | p(C_k \| phi) = y_k(phi) = exp(w_k^T * phi) / sum_j exp(w_j^T * phi) |
| Multiclass Logistic Regression Negative Log-Likelihood Loss | L(w_1, ..., w_K) = - sum_{n=1}^N sum_{k=1}^K t_{nk} * ln(y_{nk}) |
| Perceptron Error Function (Loss over misclassified patterns) | E_P(w) = - sum_{n in M} w^T * (phi_n * t_n) |
| Mallows's C_p Statistic Formula | C_p = (1 / N) * (RSS + 2 * d * sigma_tilde^2) |
| Akaike Information Criterion (AIC) Formula | AIC = -2 * ln(L) + 2 * d |
| Bayesian Information Criterion (BIC) Formula | BIC = -2 * ln(L) + d * ln(N) |
| Generalization Error Bound for Finite Hypothesis Spaces (Agnostic) | L_true(h) <= L_train(h) + sqrt( (ln\|H\| + ln(2/delta)) / (2*N) ) |
| Generalization Error Bound for Infinite Hypothesis Spaces (VC Bound) | L_true(h) <= L_train(h) + sqrt( (VC(H) * (ln(2*N / VC(H)) + 1) + ln(4/delta)) / N ) |
| PAC Learning Sample Complexity Bound (Finite Space, Agnostic) | N >= (1 / (2 * epsilon^2)) * ( ln\|H\| + ln(2 / delta) ) |
| VC Dimension Sample Complexity Bound | N >= (1 / epsilon) * ( 4 * log2(2/delta) + 8 * VC(H) * log2(13/epsilon) ) |
| Dual Representation of Linear Regression (Dual Weight Vector) | w = X^T * a where a = (X * X^T + lambda * I)^-1 * t |
| Gram Matrix (Kernel Matrix) Element Definition | K_nm = k(x_n, x_m) = phi(x_n)^T * phi(x_m) |
| Soft-Margin Support Vector Machine (SVM) Primal Objective | min_{w, b, xi} (1/2) * \|\|w\|\|^2 + C * sum_{i=1}^N xi_i |
| Soft-Margin SVM Constraints | t_i * (w^T * x_i + b) >= 1 - xi_i and xi_i >= 0 |
| Soft-Margin SVM Dual Maximization Objective | max_alpha sum(alpha_n) - (1/2) * sum_n sum_m alpha_n * alpha_m * t_n * t_m * k(x_n, x_m) |
| Soft-Margin SVM Dual Constraints | 0 <= alpha_n <= C and sum(alpha_n * t_n) = 0 |
| Gaussian Process Predictive Mean Function | m(x_{N+1}) = k^T * C_N^-1 * t |
| Gaussian Process Predictive Variance Function | sigma^2(x_{N+1}) = k(x_{N+1}, x_{N+1}) + sigma^2 - k^T * C_N^-1 * k |
| State-Value Function V^pi(s) Bellman Expectation Equation | V^pi(s) = sum_a pi(a\|s) [ R(s,a) + gamma * sum_{s'} P(s'\|s,a) * V^pi(s') ] |
| Action-Value Function Q^pi(s,a) Bellman Expectation Equation | Q^pi(s,a) = R(s,a) + gamma * sum_{s'} P(s'\|s,a) * sum_{a'} pi(a'\|s') * Q^pi(s',a') |
| Optimal State-Value Function V*(s) Bellman Optimality Equation | V*(s) = max_a [ R(s,a) + gamma * sum_{s'} P(s'\|s,a) * V*(s') ] |
| Optimal Action-Value Function Q*(s,a) Bellman Optimality Equation | Q*(s,a) = R(s,a) + gamma * sum_{s'} P(s'\|s,a) * max_{a'} Q*(s', a') |
| Value Iteration Value Update Rule | V_{k+1}(s) <- max_a [ R(s,a) + gamma * sum_{s'} P(s'\|s,a) * V_k(s') ] |
| Policy Iteration (Greedy Improvement Rule) | pi_{k+1}(s) <- argmax_a [ R(s,a) + gamma * sum_{s'} P(s'\|s,a) * V^{pi_k}(s') ] |
| Bellman Optimality Operator T* acting on V | (T*V)(s) = max_a [ R(s,a) + gamma * sum_{s'} P(s'\|s,a) * V(s') ] |
| Max-Norm Contraction Property of Bellman Operators | \|\|T f_1 - T f_2\|\|_infinity <= gamma * \|\|f_1 - f_2\|\|_infinity |
| Temporal Difference Error (TD Error) delta_t | delta_t = r_{t+1} + gamma * V(s_{t+1}) - V(s_t) |
| Temporal Difference TD(0) State-Value Update Rule | V(s_t) <- V(s_t) + alpha * (r_{t+1} + gamma * V(s_{t+1}) - V(s_t)) |
| SARSA (On-Policy Control) Action-Value Update Rule | Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a)) |
| Q-Learning (Off-Policy Control) Action-Value Update Rule | Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_{a'} Q(s',a') - Q(s,a)) |
| Thompson Sampling Parameter Update for Bernoulli Success | alpha_i <- alpha_i + 1 |
| Thompson Sampling Parameter Update for Bernoulli Failure | beta_i <- beta_i + 1 |
| Incremental Target/Reward Mean Formula | Q_{k}(a) <- Q_{k-1}(a) + (1 / k) * (r_k - Q_{k-1}(a)) |
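
The logistic regression cards above (sigmoid, cross-entropy, softmax) can be checked numerically with a minimal sketch like the one below. It uses only the Python standard library; the helper names (`sigmoid`, `dot`, `binary_cross_entropy`, `softmax`) and the list-of-lists data layout are assumptions made for illustration and are not part of the deck. The max-subtraction in `softmax` is a common numerical-stability trick that leaves the result unchanged.

```python
import math

def sigmoid(a):
    """Logistic sigmoid 1 / (1 + exp(-a)), split by sign to avoid overflow."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    z = math.exp(a)
    return z / (1.0 + z)

def dot(u, v):
    """Inner product w^T * phi for two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def binary_cross_entropy(w, features, targets):
    """E(w) = -sum_n [ t_n ln(y_n) + (1 - t_n) ln(1 - y_n) ]."""
    total = 0.0
    for phi, t in zip(features, targets):
        y = sigmoid(dot(w, phi))
        total -= t * math.log(y) + (1 - t) * math.log(1 - y)
    return total

def softmax(activations):
    """y_k = exp(a_k) / sum_j exp(a_j), where a_k stands for w_k^T * phi."""
    m = max(activations)  # subtracting the max does not change the ratios
    exps = [math.exp(a - m) for a in activations]
    s = sum(exps)
    return [e / s for e in exps]
```

With an all-zero weight vector every prediction is 0.5, so the loss over N points is N * ln 2, which makes a convenient sanity check.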
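
The value iteration update and the Bellman optimality operator from the cards above can be sketched on a toy problem. The two-state, two-action MDP below (its rewards `R`, transitions `P`, and `gamma = 0.9`) is entirely made up for illustration, as are the function names; `R[s][a]` holds R(s,a) and `P[s][a][s2]` holds P(s2|s,a).

```python
def bellman_optimality_backup(V, R, P, gamma):
    """(T*V)(s) = max_a [ R(s,a) + gamma * sum_{s'} P(s'|s,a) * V(s') ]."""
    return [
        max(
            R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
            for a in range(len(R[s]))
        )
        for s in range(len(V))
    ]

def value_iteration(R, P, gamma, tol=1e-10, max_iters=10_000):
    """Repeat V_{k+1} <- T* V_k until the max-norm change drops below tol."""
    V = [0.0] * len(R)
    for _ in range(max_iters):
        V_next = bellman_optimality_backup(V, R, P, gamma)
        if max(abs(a - b) for a, b in zip(V, V_next)) < tol:
            return V_next
        V = V_next
    return V

def greedy_policy(V, R, P, gamma):
    """pi(s) = argmax_a [ R(s,a) + gamma * sum_{s'} P(s'|s,a) * V(s') ]."""
    policy = []
    for s in range(len(V)):
        q = [
            R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
            for a in range(len(R[s]))
        ]
        policy.append(max(range(len(q)), key=q.__getitem__))
    return policy

# Hypothetical toy MDP: action 0 stays in place, action 1 switches state.
R = [[0.0, 1.0], [2.0, 0.0]]
P = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
gamma = 0.9
V_star = value_iteration(R, P, gamma)
pi_star = greedy_policy(V_star, R, P, gamma)
```

In this toy problem, staying in state 1 earns reward 2 forever, so V*(1) = 2 / (1 - 0.9) = 20 and V*(0) = 1 + 0.9 * 20 = 19. Convergence from any starting point follows from the max-norm contraction property listed in the table.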
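
The TD(0), SARSA, and Q-learning cards differ only in the bootstrap target, which a side-by-side sketch makes visible. The function names and the tabular layout (`V` as a list indexed by state, `Q` as a list of per-state action lists) are assumptions for illustration.

```python
def td0_update(V, s, r, s_next, alpha, gamma):
    """V(s) <- V(s) + alpha * delta, with delta = r + gamma * V(s') - V(s)."""
    delta = r + gamma * V[s_next] - V[s]
    V[s] += alpha * delta
    return delta  # the TD error

def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy: bootstrap from the action a' actually taken in s'."""
    Q[s][a] += alpha * (r + gamma * Q[s_next][a_next] - Q[s][a])

def q_learning_update(Q, s, a, r, s_next, alpha, gamma):
    """Off-policy: bootstrap from the greedy action max_{a'} Q(s', a')."""
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
```

For example, with `Q = [[0.0, 0.0], [1.0, 2.0]]`, reward 1, `alpha = 0.5` and `gamma = 0.9`, SARSA with `a_next = 0` targets 1 + 0.9 * 1 while Q-learning targets 1 + 0.9 * 2, regardless of which action the behaviour policy takes next.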
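
The Thompson sampling and incremental mean cards can be combined into a short bandit sketch. The uniform Beta(1, 1) prior, the simulated arm probabilities, and the function names are assumptions for illustration; `random.betavariate` is the standard-library Beta sampler.

```python
import random

def incremental_mean(rewards):
    """Q_k <- Q_{k-1} + (1 / k) * (r_k - Q_{k-1}), starting from Q_0 = 0."""
    q = 0.0
    for k, r in enumerate(rewards, start=1):
        q += (r - q) / k
    return q

def thompson_sampling(success_probs, steps, seed=0):
    """Bernoulli Thompson sampling on simulated arms (hypothetical setup)."""
    rng = random.Random(seed)
    n = len(success_probs)
    alpha = [1] * n  # assumed Beta(1, 1) prior for every arm
    beta = [1] * n
    for _ in range(steps):
        # Sample a plausible success rate per arm and play the best sample.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n)]
        i = max(range(n), key=samples.__getitem__)
        if rng.random() < success_probs[i]:
            alpha[i] += 1  # success: alpha_i <- alpha_i + 1
        else:
            beta[i] += 1   # failure: beta_i <- beta_i + 1
    return alpha, beta
```

Each pull increments exactly one of the two counters, so after `steps` pulls the counters sum to `2 * n + steps`; the posterior mean for arm i is `alpha[i] / (alpha[i] + beta[i])`.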