383 Final
Quiz yourself: try to recall what belongs in each hidden card side (marked with …) before checking.

Q: What would it mean for a machine to pass the Turing Test?
A: …

Q: What is the primary advantage of the rational-agent approach for scientific research?
A: …

Q: …
A: A machine that can pick up objects using eating utensils

Q: …
A: Moravec's Paradox highlights that AI systems easily outperform humans on abstract, logic-based tasks (like chess or arithmetic) yet struggle with "simple" skills (like recognizing objects) that evolution has optimized over millions of years.

Q: What is the difference between the Tree-Search and Graph-Search algorithms?
A: …

Q: T or F: A* search is always optimal.
A: …

Q: In what year was the field of AI officially founded?
A: …

Q: …
A: The study of Intelligent Agents

Q: What are the inputs to a search problem?
a. Initial State
b. Goal State
c. Action cost function
d. Transition Model
e. All of the above
A: …

Q: T or F: Adversarial search is exactly the same as performing regular search in a multi-agent environment.
A: …

Q: What is the name of the computer that famously defeated world champion Garry Kasparov in chess in 1997?
A: …

Q: …
A: Space: exponential; Time: exponential

Q: What is the main way that graph-search algorithms differ?
A: …

Q: …
A: Heuristic function

Q: …
A: A sequence of actions to reach the goal state

Q: …
A: A goal-based agent selects actions to achieve a specific goal, while a utility-based agent selects actions based on how desirable they are.

Q: In what way does online search demonstrate an example of machine learning?
A: …

Q: …
A: Alpha-beta

Q: …
A: The goal test checks whether all states in the belief state are goal states.

Q: Which of the following tasks would always involve non-deterministic actions?
A: …

Q: …
A: Any system capable of general intelligence must operate on symbols and symbolic manipulation.

Q: Which best describes the argument in Rodney Brooks's influential 1990 paper "Elephants Don't Play Chess"?
A: …

Q: In the field of AI, what are the two necessary components of an agent?
A: …

Q: Is the heuristic h(n) = 0 admissible, given non-negative edge weights?
A: …

Q: What are the four environment assumptions needed for proper execution of "pure" search algorithms?
A: …

Q: …
A: A reflex agent is rational if it selects actions that maximize its performance measure based on its current percept and knowledge, given its environment and actions. Ex.: a thermostat controlling a heating system is rational if it turns the heat on when the temperature is below the set point.

Q: …
A: The critical assumption is that your opponent (MIN) will take the optimal move at every step to minimize MAX's score. If this is not true, the algorithm may not yield optimal actions.

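The card above describes minimax's optimal-opponent assumption. A minimal sketch of that idea in Python, assuming a hypothetical game interface with is_terminal, utility, successors, and to_move helpers (none of these names come from the cards):

```python
def minimax(state, game):
    """Return the value of `state`, assuming MAX and MIN both play optimally.

    `game` is a hypothetical interface (an assumption) providing:
      is_terminal(s), utility(s), successors(s) -> iterable of (action, s'),
      to_move(s) -> "MAX" or "MIN".
    """
    if game.is_terminal(state):
        return game.utility(state)
    values = [minimax(s, game) for _, s in game.successors(state)]
    # MAX picks the largest value; the optimal opponent (MIN) picks the smallest.
    return max(values) if game.to_move(state) == "MAX" else min(values)
```
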
Q: …
A: To find incremental adjustments to make to all the weights in a neural network

Q: In general, how do we optimize an ML model?
A: …

Q: Gradient descent is a general algorithm to do what?
A: …

Q: Why is a step function not an ideal activation function for a neural network?
A: …

Q: …
A: False

Q: Which best describes a loss function?
A: …

Q: In the context of machine learning, which best describes a validation set?
A: …

Q: What does alpha describe in the gradient descent update equation?
A: …

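Several of the cards above concern gradient descent and the learning rate alpha. A minimal sketch of the update rule w ← w − α·∇L(w); the quadratic loss in the demo is an illustrative assumption:

```python
def gradient_descent(grad, w, alpha=0.1, steps=100):
    """Repeatedly step opposite the gradient: w <- w - alpha * grad(w).

    alpha is the learning rate (the step size referred to on the card above).
    """
    for _ in range(steps):
        w = w - alpha * grad(w)
    return w

# Illustrative example (assumption): minimize L(w) = (w - 3)^2, gradient 2(w - 3).
w_star = gradient_descent(lambda w: 2 * (w - 3), w=0.0)
print(w_star)  # converges toward 3.0
```
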
Q: …
A: h(x) is an estimate of the underlying true function f(x), which relates features x to labels y.

Q: Which best describes the intuition for k-nearest-neighbors classification?
A: …

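The k-NN answer cell is hidden; as a generic illustration of the usual intuition (label a point by majority vote among its k closest training points), here is a minimal sketch. The toy data and helper names are assumptions:

```python
from collections import Counter

def knn_classify(x, train, k=3):
    """Label `x` by majority vote among the k closest training examples.

    `train` is a list of (point, label) pairs; points are tuples of floats.
    """
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
    nearest = sorted(train, key=lambda pl: dist(x, pl[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy data (assumption): two "cat" points near (1,1), one "dog" point far away.
print(knn_classify((0.9, 1.1), [((1, 1), "cat"), ((1, 2), "cat"), ((5, 5), "dog")]))
```
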
Q: …
A: Model selection involves experimentation influenced by the problem, data, and evaluation.

Q: What is the backpropagation algorithm used to compute?
A: …

Q: …
A: A binary linear classifier

Q: Which of the following best describes the intuition behind linear regression?
A: …

Q: …
A: Harmonic learning

Q: …
A: True

Q: …
A: Multi-layer perceptron

Q: In the context of machine learning, which best describes the concept of Ockham's razor?
A: …

Q: Which describes a likely observation you could make on an overfit model?
A: …

Q: …
A: A hyperplane divides the data in such a way that maximizes the margin between the categories.

Q: …
A: It's not differentiable and is thus incompatible with gradient-based optimization techniques.

Q: …
A: False

Q: …
A: A linear sum of weighted inputs is taken; if that sum exceeds a set value, the perceptron activates, sending a fixed signal to its downstream connections.

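The answer above states the perceptron's threshold rule. Below is a minimal sketch of that rule together with the classic perceptron weight update; the update rule and the AND-function demo are standard background assumptions, not taken from the cards:

```python
def perceptron_output(weights, bias, x):
    """Fire (1) if the weighted sum of inputs exceeds the threshold, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def perceptron_train(data, n_inputs, alpha=0.1, epochs=20):
    """Classic perceptron learning rule: nudge weights toward misclassified points."""
    w, b = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        for x, y in data:  # y is 0 or 1
            err = y - perceptron_output(w, b, x)
            w = [wi + alpha * err * xi for wi, xi in zip(w, x)]
            b += alpha * err
    return w, b

# Learns the (linearly separable) logical AND function.
w, b = perceptron_train([((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)], n_inputs=2)
print([perceptron_output(w, b, x) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 0, 0, 1]
```
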
Q: …
A: A trick to find a decision boundary in a different coordinate space

Q: T or F: Multilayer neural networks can in theory approximate any continuous function.
A: …

Q: …
A: False

Q: Physical symbol system hypothesis:
A: …

Q: …
A: Physical symbol system hypothesis

Q: …
A: Adjusting neural-network parameters from data

Q: Connectionist AI:
A: …

Q: …
A: A. Symbolic AI

Q: …
A: C. Understanding and generating human language

Q: …
A: Enabling agents to "see" and interpret visual information.

Q: The chain rule in language modeling expresses …
A: …

Q: …
A: P(wᵢ | w₁…wᵢ₋₁) ≈ P(wᵢ | wᵢ₋₁)
- A bigram is two units, e.g. "th" or "the cat".
- There is an edge only from wᵢ₋₁ to wᵢ.

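The bigram card approximates the chain rule P(w₁…wₙ) = ∏ᵢ P(wᵢ | w₁…wᵢ₋₁) by conditioning only on the previous word. A minimal count-based sketch; the toy corpus is an assumption and no smoothing is applied:

```python
from collections import Counter

corpus = "the cat sat on the mat . the cat ate .".split()  # toy corpus (assumption)
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def p_bigram(w_prev, w):
    """Maximum-likelihood estimate P(w | w_prev) = count(w_prev, w) / count(w_prev)."""
    return bigrams[(w_prev, w)] / unigrams[w_prev]

# P("the cat sat") ≈ P(cat | the) * P(sat | cat), ignoring the start-of-sentence term.
print(p_bigram("the", "cat") * p_bigram("cat", "sat"))
```
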
Q: …
A: Unigram

Q: …
A: Corpus: a collection of text or speech data used for analysis or training

Q: …
A: Table size grows as |V|ᵐ (explodes)

Q: Language identification via n-grams works by:
A. Counting parts of speech
B. Comparing sequence probability under each language model
C. Parsing with a CFG
D. Measuring sentence length
A: …

Q: …
A: Maximum length of prefix tokens the model can attend to

Q: …
A: Instruction tuning

Q: RLHF stands for:
A. Reinforcement Learning from Human Feedback
B. Recurrent Language Hierarchical Framework
C. Randomized Learning Hyperparameter Fitting
D. Rule-based Language Heuristic Fusion
A: A. Reinforcement Learning from Human Feedback

Q: …
A: Have the model articulate its step-by-step reasoning

Q: …
A: - Extract specific pieces of structured information from unstructured or semi-structured text.
- An example is extracting product names and their prices from websites.

Q: …
A: A sequence of characters defining a search pattern (e.g., the format of a price or phone number)

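A minimal regex sketch of the price example mentioned in the answer above; the pattern and sample text are illustrative assumptions:

```python
import re

text = "Widget A costs $19.99, Widget B is on sale for $4.50."
# Match a dollar sign followed by digits, with an optional two-digit cent part.
prices = re.findall(r"\$\d+(?:\.\d{2})?", text)
print(prices)  # ['$19.99', '$4.50']
```
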
Q: Risk of LLMs being prompted to perform information extraction:
A: …

Q: Information retrieval
A: …

Q: Components of information retrieval
A: …

Q: In TF-IDF scoring, a term gets high weight if it is:
A. Frequent in all documents
B. Rare in the corpus but frequent in the current document
C. Absent from the current document
D. Only appears in the stop-word list
A: …

Q: …
A: - TF: how often a term appears in the document. A high score suggests relevance.
- IDF: how rare a term is across the entire corpus. Rarer terms are considered more informative.
IDF(t) = log(N / df(t)), where N = total number of docs and df(t) = number of docs containing term t.

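A minimal sketch of TF-IDF scoring using the IDF formula from the card; the three-document corpus is an illustrative assumption:

```python
import math

docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "stocks rallied on earnings".split(),
]

def tf_idf(term, doc, docs):
    """TF-IDF = (count of term in doc) * log(N / df(t)), per the card's IDF formula."""
    tf = doc.count(term)
    df = sum(1 for d in docs if term in d)
    return tf * math.log(len(docs) / df) if df else 0.0

# "cat" appears in 2 of 3 docs; "stocks" in only 1, so "stocks" gets more weight.
print(tf_idf("cat", docs[0], docs), tf_idf("stocks", docs[2], docs))
```
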
Q: PageRank ranks web pages based on:
A. Term frequency
B. Link structure ("importance" via incoming links)
C. Document length
D. Keyword density
A: …

Q: In a TF-IDF scheme, what role does the IDF component play?
A: …

Q: NLP Task: Syntactic Parsing
A: …

Q: …
A: A type of grammar where rules apply regardless of surrounding context. In a plain CFG, if you have two ways to expand NP, say
1. NP → Det Noun (e.g., "the cat")
2. NP → Name (e.g., "Sam")
you don't say which one is preferred; both parses are just "allowed."

Q: A Probabilistic CFG (PCFG) extends a CFG by …
A: …

Q: …
A: Assigns probabilities to each grammar rule based on its observed frequency in a corpus

Q: …
A: NP
   ├── Det → "the"
   └── Noun → "dog"

Q: Calculate the PCFG probability of "the cat", given these rules:
NP → Det Noun (0.6)
NP → "dog" (0.5)
Det → "the" (1.0)
Det → "cat" (0.5)
A: …

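The answer cell is hidden, but a PCFG parse probability is the product of the probabilities of the rules used in the parse tree. Assuming the intended parse of "the cat" is NP → Det Noun with Det → "the", and the listed 0.5 rule yielding "cat" (it reads as if it were meant to be a Noun rule), the worked calculation from the listed numbers is:

```latex
P(\text{``the cat''})
  = P(\mathrm{NP} \to \mathrm{Det}\ \mathrm{Noun})
    \cdot P(\mathrm{Det} \to \text{``the''})
    \cdot P(\cdot \to \text{``cat''})
  = 0.6 \times 1.0 \times 0.5
  = 0.3
```
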
Q: In a parse tree, "terminals" are:
A. Non-terminal symbols like NP or VP
B. Actual words of the sentence
C. Probability values
D. Grammar rules
A: …

Q: Word embeddings differ from one-hot vectors because they are:
A. Sparse and high-dimensional
B. Dense and low-dimensional, learned to capture similarity
C. Randomly assigned
D. Always binary
A: …

Q: Word embedding example
A: …

Q: …
A: "cat" → [1, 0, 0]
"dog" → [0, 1, 0]
"mouse" → [0, 0, 1]
Every word is turned into a vector of length |V| = 3, with a single 1 at its index.

Q: …
A: One-hot vectors are sparse and don't capture meaning. Word embeddings are better: each word is mapped to a dense, relatively low-dimensional vector whose values are learned during training. These embeddings often capture semantic relationships between words.

Q: …
A: Normalize outputs into a probability distribution (sum = 1)

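A minimal sketch of that normalization, softmax(zᵢ) = exp(zᵢ) / Σⱼ exp(zⱼ); subtracting the max before exponentiating is a standard numerical-stability detail not stated on the card:

```python
import math

def softmax(z):
    """Exponentiate and normalize so the outputs are positive and sum to 1."""
    m = max(z)                         # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax([2.0, 1.0, 0.1]))  # roughly [0.659, 0.242, 0.099]; sums to 1
```
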
Q: An RNN's hidden state hₜ is updated by combining:
A. Previous output only
B. Previous hidden state hₜ₋₁ and current input embedding
C. Unrelated random noise
D. Future target tokens
A: …

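A minimal sketch of the vanilla recurrence hₜ = tanh(W_h·hₜ₋₁ + W_x·xₜ + b), combining the previous hidden state with the current input embedding; the tanh nonlinearity, weight shapes, and toy sizes are standard background assumptions, not from the card:

```python
import numpy as np

def rnn_step(h_prev, x, W_h, W_x, b):
    """One recurrence: combine the previous hidden state with the current input."""
    return np.tanh(W_h @ h_prev + W_x @ x + b)

rng = np.random.default_rng(0)
H, E = 4, 3                                   # hidden size, embedding size (toy values)
W_h, W_x, b = rng.normal(size=(H, H)), rng.normal(size=(H, E)), np.zeros(H)
h = np.zeros(H)
for x in rng.normal(size=(5, E)):             # run over a 5-token toy sequence
    h = rnn_step(h, x, W_h, W_x, b)
print(h)
```
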
Q: …
A: Both vanishing and exploding gradients

Q: Greedy decoding always picks:
A. A random next word
B. The highest-probability next word
C. The least probable next word
D. A word based on TF-IDF
A: …

Q: How do RNNs build off of fixed-window models?
A: …

Q: Search algorithm components
A: …

Q: …
A: The parameter controlling randomness.
- Higher temperature = more randomness (flattens the distribution)
- Lower temperature = more greedy (sharpens the distribution)

Q: …
A: - A common advanced sampling method that considers only the most probable words whose cumulative probability exceeds a threshold p.
- Keeps all tokens up to cumulative probability ≥ p.
- Balances temperature by sampling over a core set (the "nucleus") of the most probable next words.

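A minimal sketch combining the two cards above: temperature rescales the logits before the softmax, and nucleus (top-p) sampling keeps the smallest set of most probable words whose cumulative mass reaches p. The toy logits are an assumption:

```python
import math, random

def sample_top_p(logits, p=0.9, temperature=1.0):
    """Temperature-scaled softmax, then sample from the nucleus of mass >= p."""
    scaled = {w: l / temperature for w, l in logits.items()}  # temp<1 sharpens, >1 flattens
    m = max(scaled.values())
    exps = {w: math.exp(s - m) for w, s in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((e / total, w) for w, e in exps.items()), reverse=True)
    nucleus, mass = [], 0.0
    for prob, w in ranked:               # keep the most probable words until mass >= p
        nucleus.append((w, prob))
        mass += prob
        if mass >= p:
            break
    return random.choices([w for w, _ in nucleus], [pr for _, pr in nucleus])[0]

logits = {"cat": 2.0, "dog": 1.5, "mat": 0.2}  # toy next-word scores (assumption)
print(sample_top_p(logits, p=0.9, temperature=0.8))
```
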
Q: …
A: Predefined patterns and templates (often regex)

Q: Conversational Agents
A: …

Q: …
A: - Chatbots: designed for open-ended conversation
- Task-oriented: designed to help the user accomplish specific goals (e.g., Siri/Alexa, automated phone systems)

Q: Corpus-based chatbot
A: …

Q: Rule-based chatbot strength and weakness
A: …

Q: …
A: - Handle more variety
- Lack control and can inherit biases or undesirable content from the training data

Q: Automatic Speech Recognition (ASR)
A: …

Q: Natural Language Understanding (NLU)
A: …

Q: NLU: Domain Classification
A: …

Q: …
A: Identify the specific action requested within the domain (e.g., GetWeather, PlayMusic, etc.)

Q: NLU: Slot filling
A: …

Q: …
A: - Extract high-level knowledge and understanding from visual data
- Focuses on enabling machines to interpret and understand information from images and videos

Q: How does computer vision relate to the agent paradigm?
A: …

Q: Digital pixel colors: Grayscale and Color
A: …

Q: …
A: - An algorithm for object detection, particularly face detection
- Uses rectangular features called Haar-like features, which capture basic patterns of intensity differences in faces (e.g., the eye region is typically darker than the upper cheeks)

Q: Convolutional Neural Networks (CNNs)
A: …

Q: …
A: - Convolution
- Pooling

Q: Convolution
A: …

Q: …
A: - Downsample each feature map by summarizing small regions (e.g., taking the maximum in each 2×2 block).
- Reduces spatial dimensions and computation.
- Introduces slight invariance to shifts or distortions in the input.

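A minimal NumPy sketch of the two CNN operations named above: sliding a small filter over an image (convolution) and taking the max of each 2×2 block (pooling). The 4×4 image and the edge filter are illustrative assumptions:

```python
import numpy as np

def convolve2d(img, kernel):
    """Slide `kernel` over `img` (no padding, stride 1), summing elementwise products."""
    kh, kw = kernel.shape
    out_h, out_w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2x2(fmap):
    """Downsample by taking the maximum of each non-overlapping 2x2 block."""
    h, w = fmap.shape[0] // 2 * 2, fmap.shape[1] // 2 * 2
    return fmap[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

img = np.array([[0, 0, 1, 1]] * 4, dtype=float)   # toy image: dark left, bright right
edge = np.array([[-1.0, 1.0]])                    # 1x2 vertical-edge filter (assumption)
fmap = convolve2d(img, edge)                      # responds at the dark/bright boundary
print(max_pool2x2(fmap))
```
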
Q: A convolutional layer differs from a fully-connected layer by:
A. Using hand-designed filters only
B. Applying the same small filter (kernel) across the spatial dimensions (parameter sharing)
C. Operating on one pixel at a time
A: …

Q: …
A: Reduce spatial dimensions and add invariance to small shifts

Q: …
A: A deep CNN architecture (8 layers: 5 convolutional, 3 fully connected) developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton.

Q: Why is AlexNet so important?
A: …

Q: Moravec's paradox states that tasks easy for humans (walking, grasping) are often harder for AI than:
A. Simple lookup tables
B. High-level abstract reasoning (e.g., chess)
C. Calculating arithmetic
D. Sorting numbers
A: …

Q: …
A: The robot's current state (position/orientation/joint angles)

Q: …
A: Estimate the true state by filtering noisy sensor measurements

Q: …
A: Cumulative (discounted) future rewards

Q: …
A: Tracks the error between the desired and actual states and adjusts its output using proportional, integral, and derivative terms to counter disturbances and accurately reach and maintain the target state.

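A minimal sketch of that loop, u = Kp·e + Ki·∫e dt + Kd·de/dt, driving a toy first-order system toward a setpoint; the gains and the plant model are illustrative assumptions:

```python
def pid_controller(kp, ki, kd, dt):
    """Return a stateful step(error) function implementing the P, I, and D terms."""
    integral, prev_err = 0.0, 0.0
    def step(err):
        nonlocal integral, prev_err
        integral += err * dt                    # I: accumulated error
        deriv = (err - prev_err) / dt           # D: rate of change of the error
        prev_err = err
        return kp * err + ki * integral + kd * deriv
    return step

setpoint, state, dt = 1.0, 0.0, 0.1
control = pid_controller(kp=2.0, ki=0.5, kd=0.1, dt=dt)
for _ in range(50):
    state += control(setpoint - state) * dt     # toy plant: state follows the control input
print(round(state, 3))                          # approaches the setpoint, 1.0
```
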
Q: Kalman filter algorithm
A: …

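The answer cell is hidden; as a generic illustration of "estimating the true state by filtering noisy sensor measurements" (an earlier card's phrasing), here is a minimal one-dimensional Kalman filter sketch. The static-state setup and noise variances are illustrative assumptions:

```python
def kalman_1d(measurements, meas_var=1.0, process_var=0.01):
    """Estimate a scalar state: predict (inflate uncertainty), then update on each z."""
    x, p = 0.0, 1e6                  # initial guess with very large uncertainty
    for z in measurements:
        p += process_var             # predict: uncertainty grows between steps
        k = p / (p + meas_var)       # Kalman gain: how much to trust the measurement
        x += k * (z - x)             # update: move the estimate toward the measurement
        p *= 1 - k
    return x

print(kalman_1d([5.2, 4.8, 5.1, 4.9, 5.0]))  # converges near the true value, ~5
```
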
Q: …
A: A field dedicated to designing control systems that handle uncertainty and ensure desired behavior despite noise and disturbances. Ex.: PID controllers, Reinforcement Learning (RL)

Q: …
A: The agent learns through trial-and-error interaction with an environment, receiving feedback in the form of rewards or punishments, without the explicit data/label pairs used in supervised learning.

Q: Markov Decision Process (MDP)
A: …

Q: …
A: A function mapping states to actions. RL aims to find the optimal policy π* that maximizes expected discounted future rewards.

Q: Reinforcement Learning (RL) vs Supervised Learning (SL)
A: …

Q: Deep Reinforcement Learning (Deep RL)
A: …

Q: …
A: The agent interacts with the environment using its current neural-network policy. The collected experiences (state, action, reward, next state) are used to update the network parameters (θ) via gradient-based optimization, aiming to improve expected rewards.

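As a concrete instance of learning from (state, action, reward, next state) experiences, here is a minimal tabular Q-learning sketch; it is tabular rather than deep, and the tiny chain environment is an illustrative assumption:

```python
import random
from collections import defaultdict

# Toy chain environment (assumption): states 0..3, actions -1/+1, reward 1 at state 3.
def env_step(s, a):
    s2 = max(0, min(3, s + a))
    return s2, (1.0 if s2 == 3 else 0.0), s2 == 3

Q = defaultdict(float)
alpha, gamma, eps = 0.5, 0.9, 0.1
for _ in range(300):
    s = 0
    for _ in range(50):                                   # cap episode length
        if random.random() < eps:
            a = random.choice([-1, 1])                    # explore
        else:
            a = max([-1, 1], key=lambda act: Q[(s, act)]) # exploit
        s2, r, done = env_step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        best_next = 0.0 if done else max(Q[(s2, -1)], Q[(s2, 1)])
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2
        if done:
            break
print([round(max(Q[(s, -1)], Q[(s, 1)]), 2) for s in range(4)])  # values rise toward the goal
```
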
Q: …
A: Deals with multi-agent environments where other agents are actively trying to prevent the agent from reaching its goal. Ex.: board games

Q: A* search time and space complexity
A: …

Q: Uniform cost search time and space complexity
A: …

Q: Uniform Cost Search
A: …

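The UCS answer cells are hidden; as a generic illustration of the algorithm itself (always expand the frontier node with the lowest path cost), here is a minimal sketch using a priority queue. The example graph is an assumption:

```python
import heapq

def uniform_cost_search(graph, start, goal):
    """Expand the cheapest frontier node first; return (cost, path) or None."""
    frontier = [(0, start, [start])]          # entries are (path cost g, state, path)
    explored = set()
    while frontier:
        g, s, path = heapq.heappop(frontier)
        if s == goal:
            return g, path
        if s in explored:                     # graph search: skip already-expanded states
            continue
        explored.add(s)
        for nbr, cost in graph.get(s, []):
            heapq.heappush(frontier, (g + cost, nbr, path + [nbr]))
    return None

# Toy weighted graph (assumption): the cheapest A->D route is A->B->D with cost 3.
graph = {"A": [("B", 1), ("C", 4)], "B": [("D", 2)], "C": [("D", 1)]}
print(uniform_cost_search(graph, "A", "D"))   # (3, ['A', 'B', 'D'])
```
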
Q: DFS time and space complexity
A: …

Q: BFS vs DFS
A: …

Q: Utility-based agents
A: …

Q: …
A: Defined as one that selects actions expected to maximize its performance measure, given its perceptions and built-in knowledge.

Q: The perceptron model
A: …

Q: …
A: - The output of one layer of perceptrons serves as the input to the next layer.
- This interconnected structure allows ANNs to represent much more complex functions than a single perceptron.