383 Final
A: That a human cannot distinguish between that machine and another human over a conversation.
Q: What is the primary advantage of the rational agent approach for scientific research?
A: A machine that can pick up objects using eating utensils.
Q: What is Moravec's Paradox?
A: Graph search algorithms keep track of explored nodes to prevent cycles.
A: False.
Q: In what year was the field of AI officially founded?
Q: Which of the following best describes how leading textbooks define the study of AI?
Q: What are the inputs to a search problem? a. Initial state b. Goal state c. Action cost function d. Transition model e. All of the above
Q: T or F: Adversarial search is exactly the same as performing regular search in a multi-agent environment.
A: Deep Blue.
Q: What are the space and time complexities of BFS?
Q: What is the main way that graph search algorithms differ?
A: Heuristic function.
A: A sequence of actions to reach the goal state.
Q: What is the key difference between a goal-based agent and a utility-based agent?
Q: In what way does online search demonstrate an example of machine learning?
A: Alpha-beta.
A: The goal test checks whether all states in the belief state are goal states.
Q: Which of the following tasks would always involve non-deterministic actions?
Q: Which of the following best describes the Physical Symbol System Hypothesis?
Q: Which best describes the argument in Rodney Brooks's influential 1990 paper "Elephants Don't Play Chess"?
Q: In the field of AI, what are the two necessary components of an agent?
Q: Is the heuristic function h(n) = 0 admissible, given non-negative edge weights?
Q: What are the four environment assumptions needed for proper execution of "pure" search algorithms?
A: A reflex agent is rational if it selects actions that maximize its performance measure based on the current percept and its knowledge, given its environment and available actions. Ex. a thermostat controlling a heating system is rational if it turns the heat on when the temperature is below the set point.
Q: What is the critical assumption of the minimax algorithm? If that assumption were not true, would the algorithm still yield the optimal action to take?
A: To find incremental adjustments to make to all the weights in a neural network.
Q: In general, how do we optimize an ML model?
A: Find the minimum of a function.
A: It's not differentiable.
A: False.
A: A measure of the imperfection of our prediction.
A: A partition of data used to tune the hyper-parameters of a model before testing.
A: The learning rate.
Q: In the context of machine learning, which statement best describes the purpose of a hypothesis function h(x)?
A: Each nearby datapoint within a certain distance votes for the classification of a novel datapoint; that is, the set of datapoints closest to the novel datapoint vote for its classification (see the sketch below).
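A minimal k-nearest-neighbors sketch of that voting scheme (assuming numpy, Euclidean distance, and illustrative data; one way to realize the idea, not the course's reference code):

```python
import numpy as np

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest neighbors."""
    dists = np.linalg.norm(X_train - x_new, axis=1)   # distance to every training point
    nearest = np.argsort(dists)[:k]                   # indices of the k closest points
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                  # majority label wins

X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.8]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([4.9, 5.1])))  # -> 1
```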
Q: Which of the following best describes the process of selecting a model for a machine learning problem?
Q: What is the backpropagation algorithm used to compute?
A: A binary linear classifier.
A: A hyperplane through a set of datapoints, fit using the residuals between estimated and actual output.
Q: Which of the following is not a category of ML? a. Unsupervised learning b. Reinforcement learning c. Supervised learning d. Harmonic learning
A: True.
A: Multi-layer perceptron.
A: Given equal performance, the least complex model is often preferred.
Q: Which describes a likely observation you could make on an overfit model?
A: A hyperplane divides the data in such a way that it maximizes the margin between the categories.
A: It's not differentiable and thus incompatible with gradient-based optimization techniques.
Q: T or F: When approaching an ML problem, there's only one correct model to use to create accurate predictions.
A: A linear sum of weighted inputs is taken; if that sum exceeds a set threshold, the perceptron activates, sending a fixed signal to its downstream connections (see the sketch below).
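A minimal sketch of that activation rule (numpy assumed; the inputs, weights, and threshold are illustrative):

```python
import numpy as np

def perceptron(x, w, threshold):
    """Fire (output 1) iff the weighted sum of inputs exceeds the threshold."""
    return 1 if np.dot(w, x) > threshold else 0

x = np.array([1.0, 0.0, 1.0])   # inputs
w = np.array([0.5, 0.5, 0.5])   # weights
print(perceptron(x, w, threshold=0.8))  # weighted sum 1.0 > 0.8, so it fires: 1
```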
A: A trick to find a decision boundary in a different coordinate space.
Q: T or F: Multilayer neural networks can in theory approximate any continuous function.
Q: T or F: An ML model that is overfit to a dataset will generalize well.
Q: Physical symbol system hypothesis:
A: Physical symbol system hypothesis.
A: Adjusting neural-network parameters from data.
A: A bottom-up, inductive approach: systems learn rules and patterns directly from data (observations) rather than being explicitly programmed.
A: A. Symbolic AI.
Q: Natural Language Processing (NLP) is primarily concerned with: A. Visual scene understanding B. Physical robot control C. Understanding and generating human language D. Optimizing search algorithms
A: Enabling agents to "see" and interpret visual information.
A: Joint probability as a product of conditional probabilities.
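Written out, that decomposition is the probability chain rule, in the same w₁…wₘ notation the n-gram cards below use:

```latex
P(w_1, w_2, \ldots, w_m) = \prod_{i=1}^{m} P(w_i \mid w_1, \ldots, w_{i-1})
```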
Q: A bigram model relies on which assumption?
Q: Which n-gram model ignores word order entirely?
Q: Corpus
Q: Why is the full joint distribution P(w₁…wₘ) infeasible to compute directly?
A: Comparing sequence probability under each language model.
Q: In LLMs, the "context window" is: A. Number of GPUs used B. Maximum length of prefix tokens the model can attend to C. Batch size during training D. Size of the model vocabulary
Q: Which stage of LLM training uses human-labelled prompt/output pairs to teach formatting? A. Pre-training B. Tokenization C. Instruction tuning D. Inference
A: Reinforcement Learning from Human Feedback.
A: Have the model articulate its step-by-step reasoning.
A: Extract specific pieces of structured information from unstructured or semi-structured text. An example is extracting product names and their prices from websites.
A: A sequence of characters defining a search pattern (ex. the format of a price or phone number), as in the sketch below.
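A minimal sketch of such a pattern (Python's re module; the price pattern and sample text are illustrative):

```python
import re

# A dollar sign, digits, and an optional two-digit cents part, ex. "$19.99"
price_pattern = re.compile(r"\$\d+(?:\.\d{2})?")

text = "The widget costs $19.99 and the gadget costs $5."
print(price_pattern.findall(text))  # ['$19.99', '$5']
```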
Q: Risk of LLMs being prompted to perform information extraction:
Q: Information retrieval
A: - Document collection: the large set of documents to search within
- Query: the user's expression of their information need
- Retrieval system: the algorithm/system that processes the query and returns a ranked subset of documents deemed relevant
Q: In TF-IDF scoring, a term gets high weight if it is: A. Frequent in all documents B. Rare in the corpus but frequent in the current document C. Absent from the current document D. Only appears in the stop-word list
A: - TF: how often a term appears in the document. High scores suggest relevance.
- IDF: how rare a term is across the entire corpus. Rarer terms are considered more informative.
- IDF(t) = log(N / df(t)), where N = total number of docs and df(t) = number of docs containing term t. (See the sketch below.)
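A minimal sketch of that scoring (pure Python; the tiny corpus and raw-count TF are illustrative choices):

```python
import math

docs = [
    "the cat sat on the mat",
    "the dog sat",
    "cats and dogs",
]

def tf(term, doc):
    """Term frequency: raw count of the term in the document."""
    return doc.split().count(term)

def idf(term, docs):
    """Inverse document frequency: log(N / df(t))."""
    df = sum(1 for d in docs if term in d.split())
    return math.log(len(docs) / df)

def tf_idf(term, doc, docs):
    return tf(term, doc) * idf(term, docs)

# "cat" is rare across the corpus but present in doc 0, so it scores high;
# "the" is frequent everywhere, so IDF pulls its weight down.
print(tf_idf("cat", docs[0], docs))  # 1 * log(3/1) ~ 1.10
print(tf_idf("the", docs[0], docs))  # 2 * log(3/2) ~ 0.81
```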
Q: PageRank ranks web pages based on: A. Term frequency B. Link structure ("importance" via incoming links) C. Document length D. Keyword density
Q: In a TF-IDF scheme, what role does the IDF component play?
Q: NLP Task: Syntactic Parsing
A: A type of grammar where rules apply regardless of surrounding context.
In a plain CFG, if you have two ways to expand NP, say
1. NP → Det Noun (ex. "the cat")
2. NP → Name (ex. "Sam")
you don't say which one is preferred; both parses are just "allowed."
A: Assigning probabilities to each production rule. Instead of treating every grammar rule as equally "possible," a PCFG lets you say "Rule X is twice as likely as Rule Y."
Q: Probabilistic CFG (PCFG)
Q: NP → Det Noun (tree)
A: Probability: 0.6 × 1.0 × 0.5 = 0.3
NP
├─ Det → "the"
└─ Noun → "cat"
Q: In a parse tree, "terminals" are: A. Non-terminal symbols like NP or VP B. Actual words of the sentence C. Probability values D. Grammar rules
A: - Word embeddings: dense and low-dimensional, learned to capture similarity
- One-hot: very sparse and high-dimensional, treats words as independent symbols
Q: Word embedding example
A: "cat" → [1,0,0]
"dog" → [0,1,0]
"mouse" → [0,0,1]
Every word is turned into a vector of length |V| = 3, with a single 1 at its index.
A: One-hot vectors are sparse and don't capture meaning. Word embeddings are better: each word is mapped to a dense, relatively low-dimensional vector whose values are learned during training. These embeddings often capture semantic relationships between words.
Q: Softmax activation is used to: A. Normalize outputs into a probability distribution (sum = 1) B. Compute the maximum activation only C. Introduce non-linearity by thresholding at 0 D. Pool features spatially
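For reference, the normalization described in option A, as a minimal numpy sketch (the logits are illustrative):

```python
import numpy as np

def softmax(z):
    """Exponentiate and normalize so the outputs are positive and sum to 1."""
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs, probs.sum())  # ~[0.659 0.242 0.099], sums to 1.0
```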
A: The previous hidden state hₜ₋₁ and the current input embedding.
Q: RNNs often struggle to learn long-range dependencies due to: A. Exploding gradients only B. Vanishing gradients only C. Both vanishing and exploding gradients D. Lack of embeddings
Q: Greedy decoding always picks: A. A random next word B. The highest-probability next word C. The least probable next word D. A word based on TF-IDF
A: - RNNs were developed to handle sequential data more effectively.
- Added: a hidden state hₜ that is updated at each time step t based on the current input Eₜ and the previous hidden state hₜ₋₁.
- This allows the network to maintain a summary of the sequence seen so far (see the sketch below).
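A minimal sketch of that update (a vanilla RNN cell in numpy; the weight shapes and tanh non-linearity are conventional but illustrative here):

```python
import numpy as np

rng = np.random.default_rng(0)
H, E = 4, 3                       # hidden size, embedding size
W_h = rng.normal(size=(H, H))     # hidden-to-hidden weights
W_e = rng.normal(size=(H, E))     # input-to-hidden weights
b = np.zeros(H)

def rnn_step(h_prev, e_t):
    """h_t = tanh(W_h @ h_{t-1} + W_e @ E_t + b)"""
    return np.tanh(W_h @ h_prev + W_e @ e_t + b)

h = np.zeros(H)                       # initial hidden state
for e_t in rng.normal(size=(5, E)):   # a sequence of 5 input embeddings
    h = rnn_step(h, e_t)              # h summarizes everything seen so far
print(h)
```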
Q: Search algorithm components
Q: Temperature
Q: Nucleus sampling (top-p)
A: Predefined patterns and templates (often regex).
A: Agents interacting within a conversational environment using natural language.
A: - Chatbots: designed for open-ended conversation
- Task-oriented: designed to help the user accomplish specific goals (e.g., Siri/Alexa, automated phone systems)
A: Retrieves responses from a large database of existing conversations (e.g., movie scripts, Twitter data).
Q: Rule-based chatbot strength and weakness
A: - Handle more variety
- Lack control and can inherit biases or undesirable content from the training data
A: - Convert the user's spoken audio into text (an "utterance")
- Modern ASR uses deep learning (e.g., transformers)
A: Analyze the utterance text to determine the user's goal. Steps:
1. Domain classification
2. Intent determination
3. Slot filling
A: Identify the general topic (e.g., Weather, Music).
Q: NLU: Intent Determination
Q: NLU: Slot filling
Q: Computer vision
A: Computer vision provides the "perception" component, allowing agents to sense and interpret their visual environment to inform state representation and action selection.
A: Grayscale: one value per pixel (a single intensity value).
Color (RGB): typically three values per pixel.
A: - An algorithm for object detection, particularly face detection
- Uses rectangular features called Haar-like features, which capture basic patterns of intensity differences in faces (e.g., the eye region is typically darker than the upper cheeks)
Q: Convolutional Neural Networks (CNNs)
Q: What key operations are part of CNNs?
A: - Apply small, learnable filters across the input image
- Each filter slides over the image, computing a dot product to detect patterns
- Outputs a feature map showing where each pattern appears
- Uses parameter sharing (the same filter everywhere) to keep the model compact
(See the convolution sketch below.)
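A minimal sketch of that sliding dot product (a single-channel 2D convolution in numpy; the image and filter are illustrative):

```python
import numpy as np

def conv2d(img, kernel):
    """Slide the kernel over the image, taking a dot product at each position."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1  # "valid" output size
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

img = np.arange(16.0).reshape(4, 4)     # toy 4x4 grayscale image
edge = np.array([[1.0, -1.0]])          # responds to horizontal intensity change
print(conv2d(img, edge))                # feature map of shape (4, 3)
```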
A: - Downsample each feature map by summarizing small regions (e.g., taking the maximum in each 2×2 block)
- Reduces spatial dimensions and computation
- Introduces slight invariance to shifts or distortions in the input
(See the pooling sketch below.)
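And a matching 2×2 max-pooling sketch (numpy assumed; non-overlapping blocks, i.e. stride 2):

```python
import numpy as np

def max_pool_2x2(fmap):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h, w = fmap.shape
    blocks = fmap[:h//2*2, :w//2*2].reshape(h//2, 2, w//2, 2)
    return blocks.max(axis=(1, 3))      # max within each 2x2 block

fmap = np.array([[1., 3., 2., 0.],
                 [4., 2., 1., 1.],
                 [0., 1., 5., 6.],
                 [2., 2., 7., 8.]])
print(max_pool_2x2(fmap))  # [[4. 2.] [2. 8.]] -- half the size, peaks kept
```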
A: Applying the same small filter (kernel) across the spatial dimensions (parameter sharing).
Q: Pooling layers (e.g., max-pooling) serve to: A. Increase feature map size B. Reduce spatial dimensions and add invariance to small shifts C. Normalize pixel intensities D. Learn filter weights
Q: AlexNet (2012):
Q: Why is AlexNet so important?
A: High-level abstract reasoning (e.g., chess).
A: The robot's current state (position/orientation/joint angles).
A: Estimate the true state by filtering noisy sensor measurements.
Q: In an MDP for reinforcement learning, the agent aims to learn a policy that maximizes: A. Immediate reward only B. Cumulative (discounted) future rewards C. Total number of states visited D. Size of the action space
Q: PID Controller
A: Used to estimate the true underlying state by statistically averaging noisy measurements over time, producing a smoother, more reliable signal (see the sketch below).
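A minimal sketch of one such filter (an exponential moving average, a simple stand-in for the statistical averaging described; alpha is an illustrative smoothing factor):

```python
def smooth(measurements, alpha=0.2):
    """Blend each new noisy measurement with the running estimate."""
    estimate = measurements[0]
    history = [estimate]
    for z in measurements[1:]:
        estimate = alpha * z + (1 - alpha) * estimate
        history.append(estimate)
    return history

noisy = [10.2, 9.7, 10.5, 9.9, 10.1, 10.4]   # readings around a true value of 10
print(smooth(noisy))                          # settles near 10 with less jitter
```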
Q: Control Theory:
Q: Reinforcement Learning (RL)
A: The standard mathematical framework for RL problems, defined by: states (S), actions (A), transition probabilities, rewards, and a discount factor.
A: A function mapping states to actions. RL aims to find the optimal policy π∗ that maximizes expected discounted future rewards.
A: In SL, a model learns to map inputs to known outputs by minimizing prediction error on labeled training data. In RL, an agent learns through trial and error by interacting with an environment and optimizing its actions to maximize cumulative reward.
A: Combines reinforcement learning with deep neural networks, using networks to approximate policies or value functions so agents can handle very large or continuous state/action spaces.
A: The agent interacts with the environment using its current neural-network policy. The collected experiences (state, action, reward, next state) are used to update the network parameters (θ) via gradient-based optimization, aiming to improve expected rewards.
Q: Adversarial Search
A: Both exponential.
A: Both exponential.
A: A search algorithm that finds the path with the lowest cumulative cost (see the sketch below).
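A minimal sketch of such a lowest-cumulative-cost search (uniform-cost search with a priority queue; the toy graph is illustrative):

```python
import heapq

def uniform_cost_search(graph, start, goal):
    """Expand nodes in order of cumulative path cost; return (cost, path)."""
    frontier = [(0, start, [start])]    # (cost so far, node, path)
    explored = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in explored:            # graph search: skip explored nodes
            continue
        explored.add(node)
        for neighbor, step_cost in graph.get(node, []):
            heapq.heappush(frontier, (cost + step_cost, neighbor, path + [neighbor]))
    return None

graph = {"A": [("B", 1), ("C", 5)], "B": [("C", 1)], "C": []}
print(uniform_cost_search(graph, "A", "C"))  # (2, ['A', 'B', 'C'])
```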
A: Both linear.
A: BFS: explores level by level.
DFS: explores as deep as possible, branch by branch.
(See the sketch below.)
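A minimal sketch contrasting the two (the only difference is popping from the front of a queue vs. the back of a stack; the toy graph is illustrative):

```python
from collections import deque

graph = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "D": [], "E": [], "F": []}

def traverse(start, breadth_first=True):
    frontier, visited, order = deque([start]), {start}, []
    while frontier:
        # BFS pops from the front (queue); DFS pops from the back (stack)
        node = frontier.popleft() if breadth_first else frontier.pop()
        order.append(node)
        for child in graph[node]:
            if child not in visited:
                visited.add(child)
                frontier.append(child)
    return order

print(traverse("A", breadth_first=True))   # level by level: A B C D E F
print(traverse("A", breadth_first=False))  # branch by branch: A C F B E D
```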
Q: Utility-based agents
Q: Rational agent
A: A mathematical model of a biological neuron, a foundational element of artificial neural networks. Developed in 1958.
A: - The output of one layer of perceptrons serves as the input to the next layer.
- This interconnected structure allows ANNs to represent much more complex functions than a single perceptron.