383 Final


Term
Definition
show That a human cannot distinguish between the machine and another human over the course of a conversation
What is the primary advantage of the rational agent approach for the purposes of scientific research?   show
show A machine that can pick up objects using eating utensils  
What is Moravec’s Paradox?   show
show Graph Search algorithms keep track of explored nodes to prevent cycles  
show False  
In what year was the field of AI officially founded?   show
Which of the following best describes how leading textbooks define the study of AI?   show
What are the inputs to a search problem? a. Initial State b. Goal State c. Action cost function d. Transition Model e. All of the above   show
T or F: Adversarial Search is exactly the same as performing regular search in a multi-agent environment?   show
show Deep Blue  
What are the space and time complexities of BFS?   show
What is the main way that graph search algorithms differ?   show
show Heuristic function  
show A sequence of actions to reach the goal state  
What is the key difference between a goal-based agent and a utility-based agent?   show
In what way does online search demonstrate an example of machine learning?   show
show Alpha-beta  
show The goal test checks whether every state in the belief state is a goal state
Which of the following tasks would always involve non-deterministic actions?   show
Which of the following best describes the Physical Symbol System Hypothesis?   show
Which best describes the argument in Rodney Brooks's influential 1990 paper "Elephants Don't Play Chess"?   show
In the field of AI, what are the two necessary components of an agent?   show
Is the heuristic function h(n) = 0 admissible, given non-negative edge weights?   show
What four environment assumptions are needed for "pure" search algorithms to execute properly?   show
show A reflex agent is rational if it selects actions that maximize its performance measure based on its current percept and knowledge, given its environment and actions. Ex.: a thermostat controlling a heating system is rational if it turns on when the temp is below
What is the critical assumption of the minimax algorithm? If that assumption were not true, would the algorithm still yield the optimal action to take?   show
show to find incremental adjustments to make to all the weights in a neural network  
in general, how do we optimize an ML model?   show
show Find the minimum of a function
show It's not differentiable
show false  
show a measure of the imperfection of our prediction  
show A partition of data used to tune hyper-parameters of a model before testing  
show the learning rate  
in the context of machine learning, which statement best describes the purpose of a hypothesis function h(x)?   show
show - each nearby datapoint within a certain distance radius votes for the classification of a novel datapoint --> set of datapoints closest to a novel datapoint vote for its classification  
Which of the following best describes the process of selecting a model for a machine learning problem?   show
What is the backpropagation algorithm used to compute?   show
show a binary linear classifier  
show a hyperplane through a set of datapoints using the residuals between estimate and actual output  
which of the following is not a category of ML? a. unsupervised learning b. reinforcement learning c. supervised learning d. harmonic learning   show
show true  
show multi-layer perceptron  
show given equal performance the least complex model is often preferred  
which describes a likely observation you could make on an overfit model?   show
show a hyperplane divides the data in a way that maximizes the margin between the categories
show it's not differentiable and thus incompatible with differential optimization techniques
T or F: When approaching a ML problem, there's only one correct model to use to create accurate predictions   show
show a linear sum of weighted inputs is taken. if that sum exceeds a set value then the perceptron activates sending a fixed signal to its downstream connections  
show A trick to find a decision boundary in a different coordinate space  
T or F: Multilayer neural networks can in theory predict any continuous function   show
T or F: an ML model that is overfit to a dataset will generalize well   show
physical symbol system hypothesis:   show
show Physical symbol system hypothesis  
show Adjusting neural-network parameters from data  
show - bottom-up, inductive approach. - Systems learn rules and patterns directly from data (observations) rather than being explicitly programmed.  
show A. Symbolic AI  
Natural Language Processing (NLP) is primarily concerned with: A. Visual scene understanding B. Physical robot control C. Understanding and generating human language D. Optimizing search algorithms   show
show Enabling agents to "see" and interpret visual information.  
show joint probability as a product of conditional probabilities  
A bigram model relies on which assumption?   show
Which n-gram model ignores word order entirely?   show
Corpus   show
Why is the full joint distribution P(w₁…wₘ) infeasible to compute directly?   show
show Comparing sequence probability under each language model  
In LLMs, the “context window” is: A. Number of GPUs used B. Maximum length of prefix tokens the model can attend to C. Batch size during training D. Size of the model vocabulary   show
Which stage of LLM training uses human-labelled prompt/output pairs to teach formatting? A. Pre-training B. Tokenization C. Instruction tuning D. Inference   show
show Reinforcement Learning from Human Feedback  
show Have the model articulate its step-by-step reasoning  
show - Extract specific pieces of structured information from unstructured or semi-structured text - An example is extracting product names and their prices from websites  
show A sequence of characters defining a search pattern (ex. format of a price or phone number)  
Risk of LLMs being prompted to perform information extraction:   show
Information retrieval   show
show - document collection: The large set of documents to search within - query: The user's expression of their information need - retrieval system: The algorithm/system that processes the query and returns a ranked subset of documents deemed relevant
In TF-IDF scoring, a term gets high weight if it is: A. Frequent in all documents B. Rare in the corpus but frequent in the current document C. Absent from the current document D. Only appears in stop-word list   show
show - TF: How often a term appears in the document. A high score suggests relevance - IDF: How rare a term is across the entire corpus. Rarer terms are considered more informative. IDF(t) = log(N/df(t)), [N = total docs, df(t) = # docs containing term t]
PageRank ranks web pages based on: A. Term frequency B. Link structure (“importance” via incoming links) C. Document length D. Keyword density   show
In a TF-IDF scheme, what role does the IDF component play?   show
NLP Task: Syntactic Parsing   show
show A type of grammar where rules apply regardless of surrounding context. In a plain CFG, if you have two ways to expand NP, say 1. NP → Det Noun (ex. "the cat") 2. NP → Name (ex. "Sam"), you don’t say which one is preferred; both parses are just “allowed.”
show Assigning probabilities to each production rule - instead of treating every grammar rule as equally “possible,” a PCFG lets you say “Rule X is twice as likely as Rule Y.”  
Probabilistic CFG (PCFG)   show
NP → Det Noun (tree)   show
show Probability: 0.6 x 1.0 x 0.5 = 0.3 NP ├─ Det → “the” └─ Noun → “cat”  
In a parse tree, “terminals” are: A. Non-terminal symbols like NP or VP B. Actual words of the sentence C. Probability values D. Grammar rules   show
show - word embed: Dense and low-dimensional, learned to capture similarity - one-hot: very sparse and high dimensional, treats words as independent symbols  
Word embedding example   show
show "cat" → [1,0,0] "dog" → [0,1,0] "mouse" → [0,0,1] every word is turned into a vector of length |V| = 3, with a single 1 at its index  
show One-hot are sparse and don't capture meaning. Word embeddings are better: each word is mapped to a dense, relatively low-dimensional vector whose values are learned during training. These embeddings often capture semantic relationships between words.  
Softmax activation is used to: A. Normalize outputs into a probability distribution (sum=1) B. Compute the maximum activation only C. Introduce non-linearity by thresholding at 0 D. Pool features spatially   show
show Previous hidden state ht₋₁ and current input embedding  
RNNs often struggle to learn long-range dependencies due to: A. Exploding gradients only B. Vanishing gradients only C. Both vanishing and exploding gradients D. Lack of embeddings   show
Greedy decoding always picks: A. A random next word B. The highest-probability next word C. The least probable next word D. A word based on TF-IDF   show
show - RNNs were developed to handle sequential data more effectively - Added: a hidden state (ht) that is updated at each time step (t) based on the current input (Et) and the previous hidden state (ht-1) - Allows the network to maintain a summary of the sequence seen so far
Search algorithm components   show
temperature   show
nucleus sampling (top-p)   show
show Predefined patterns and templates (often regex)  
show Agents interacting within a conversational environment using natural language  
show -chatbots: designed for open-ended convo -task-oriented: designed to help user accomplish specific goals (e.g. Siri/Alexa, automated phone system)  
show retrieves responses from a large database of existing conversations (e.g. movie scripts, twitter data)  
Rule-based chatbot strength and weakness   show
show -handle more variety -lack control and can inherit biases or undesirable content from the training data  
show -Convert user's spoken audio into text ("utterance") -modern ASR uses deep learning (e.g. transformers)  
show Analyze the utterance text to determine the user's goal. steps: 1. Domain classification 2. Intent Determination 3. Slot filling  
show Identify the general topic (e.g. Weather, Music)  
NLU: Intent Determination   show
NLU: Slot filling   show
Computer vision   show
show Computer vision provides the "perception" component, allowing agents to sense and interpret their visual environment to inform state representation and action selection
show Grayscale: One value per pixel --> single intensity value. Color (RGB): typically 3 values per pixel
show - an algorithm for object detection, particularly face detection - uses rectangular features called Haar-like features which capture basic patterns of intensity differences in faces (e.g. the eye region is typically darker than the upper cheeks)
Convolutional Neural Networks (CNNs)   show
What key operations are a part of CNNs?   show
show Apply small, learnable filters across input img Each filter slides over img, computing dot-product to detect patterns Outputs a feature map showing where each pattern appears Uses parameter sharing (same filter everywhere) to keep the model compact  
show – Downsample each feature map by summarizing small regions (e.g., taking the maximum in each 2×2 block). – Reduces spatial dimensions and computation. – Introduces slight invariance to shifts or distortions in the input.  
show Applying the same small filter (kernel) across the spatial dimensions (parameter sharing)  
Pooling layers (e.g., max-pooling) serve to: A. Increase feature map size B. Reduce spatial dimensions and add invariance to small shifts C. Normalize pixel intensities D. Learn filter weights   show
AlexNet (2012):   show
Why is AlexNet so important?   show
show High-level abstract reasoning (e.g., chess)  
show The robot’s current state (position/orientation/joint angles)  
show Estimate the true state by filtering noisy sensor measurements  
In an MDP for reinforcement learning, the agent aims to learn a policy that maximizes: A. Immediate reward only B. Cumulative (discounted) future rewards C. Total number of states visited D. Size of the action space   show
PID Controller   show
show used to estimate the true underlying state by statistically averaging noisy measurements over time, producing a smoother, more reliable signal.  
Control Theory:   show
Reinforcement Learning (RL)   show
show The standard mathematical framework for RL problems -defined by: States (S), Action (A), Transition Probabilities, Reward, Discount Factor  
show A function mapping states to actions. RL aims to find the optimal policy π∗ that maximizes expected discounted future rewards.  
show In SL, a model learns to map inputs to known outputs by minimizing prediction error on labeled training data. In RL, an agent learns through trial and error by interacting with an environment and optimizing its actions to maximize cumulative reward.  
show combines reinforcement learning with deep neural networks, using networks to approximate policies or value functions so agents can handle very large or continuous state/action spaces.  
show agent interacts with the environment using its current neural network policy. The collected experiences (state, action, reward, next state) are used to update the network parameters (θ) via gradient-based optimization, aiming to improve expected rewards.  
Adversarial Search   show
show Both exponential  
show both exponential  
show search algorithm that finds the path with the lowest cumulative cost
show both linear  
show BFS: explores level by level DFS: explores as deep as possible branch by branch  
utility-based agents   show
rational agent   show
show mathematical model of a biological neuron, a foundational element of artificial neural networks. -developed in 1958  
show -The output of one layer of Perceptrons serves as the input to the next layer. -This interconnected structure allows ANNs to represent much more complex functions than a single Perceptron.  
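Worked example for the perceptron card in the table above (a weighted sum of inputs is compared against a set value, and the unit fires if the sum exceeds it). This is only a minimal sketch: the function name, the weights, and the threshold are hypothetical values chosen so that the unit behaves like a logical AND.

```python
def perceptron_output(inputs, weights, threshold):
    """Return 1 (fire) when the weighted sum of the inputs exceeds the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0


# Hypothetical weights and threshold (not from the deck) that make the unit act as AND.
AND_WEIGHTS = [1.0, 1.0]
AND_THRESHOLD = 1.5

# Evaluate the unit on all four binary input pairs.
and_table = {
    (a, b): perceptron_output([a, b], AND_WEIGHTS, AND_THRESHOLD)
    for a in (0, 1)
    for b in (0, 1)
}
```

Only the input pair (1, 1) produces a weighted sum above 1.5, so it is the only case where the unit fires. The hard threshold is also why the step activation is not differentiable, as other cards in the deck note.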

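Worked example for the TF-IDF cards in the table above, using the deck's formula IDF(t) = log(N/df(t)). This is a rough sketch under stated assumptions: TF is taken as the raw term count, tokenization is a plain whitespace split, and the function name and the three sample documents are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document, with weight = TF * log(N / df)."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # df(t): number of documents that contain term t at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)  # assumed TF definition: raw count in this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Made-up sample corpus for illustration only.
docs = ["the cat sat on the mat", "the dog sat", "the cat chased the dog"]
scores = tf_idf(docs)
```

In this toy corpus "the" appears in every document, so its IDF is log(3/3) = 0 and its weight vanishes, while "mat" appears in only one document and gets the largest weight in the first document. That matches the card stating that a term scores high when it is rare in the corpus but frequent in the current document.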

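Worked example for the PCFG cards in the table above, where the parse of "the cat" has probability 0.6 x 1.0 x 0.5 = 0.3. The sketch multiplies the probabilities of every rule used in a tree. Which of the three factors belongs to which rule is an assumption here (the card only gives the product), and the tuple-based tree encoding is hypothetical.

```python
# Assumed assignment of the card's three factors to rules (the card only gives the product).
RULE_PROB = {
    ("NP", ("Det", "Noun")): 0.6,
    ("Det", ("the",)): 1.0,
    ("Noun", ("cat",)): 0.5,
}


def tree_prob(tree):
    """Probability of a parse tree: the product of the probabilities of its rules.

    A tree is a tuple (label, child, ...); a child that is a plain string is a terminal word.
    """
    label, *children = tree
    # Right-hand side of the rule applied at this node.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROB[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            p *= tree_prob(child)
    return p


np_tree = ("NP", ("Det", "the"), ("Noun", "cat"))
prob = tree_prob(np_tree)
```

With these assumed rule probabilities, `prob` works out to the 0.3 shown on the card. When a sentence has several possible parses, a PCFG parser would prefer the tree with the highest such product.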
   

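Worked example for the pooling cards in the table above (taking the maximum in each 2×2 block to downsample a feature map). A minimal pure-Python sketch, assuming non-overlapping blocks and a feature map whose height and width are both even; the function name and the sample values are made up.

```python
def max_pool_2x2(feature_map):
    """Downsample a 2-D feature map by taking the max of each non-overlapping 2x2 block.

    Assumes the number of rows and the number of columns are both even.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Made-up 4x4 feature map for illustration.
fm = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool_2x2(fm)  # 4x4 input becomes a 2x2 output
```

Each output cell keeps only the strongest activation in its block, so shifting a pattern by one pixel inside a block leaves the output unchanged. This is the slight shift invariance the card mentions.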
Created by: user-1948709