Save
Busy. Please wait.
Log in with Clever
or

show password
Forgot Password?

Don't have an account?  Sign up 
Sign up using Clever
or

Username is available taken
show password


Make sure to remember your password. If you forget it there is no way for StudyStack to send you a reset link. You would need to create a new account.
Your email address is only used to allow you to reset your password. See our Privacy Policy and Terms of Service.


Already a StudyStack user? Log In

Reset Password
Enter the associated with your account, and we'll email you a link to reset your password.
focusNode
Didn't know it?
click below
 
Knew it?
click below
Don't Know
Remaining cards (0)
Know
0:00
Embed Code - If you would like this activity on your web page, copy the script below and paste it into your web page.

  Normal Size     Small Size show me how

CS 3375

QuestionAnswer
The performance of a machine is determined by instruction count (IC), clock cycles per instruction (CPI), and clock cycle time. Here, IC is determined by compiler and [ ? ]. Instruction Set Architecture
In a single cycle data path, every instruction begins on one clock edge and completes execution on the next clock edge. So the CPI is [ ? ] CPI = 1
In 6 by 1 multiplexer, how many selector bits do we need? 3 bits
How to determine the length of single clock cycle in a single-cycle datapath? The clock cycle is determined by the longest possible path (or the slowest operation, lw) in the machine
Pipeline rate is limited by [slowest? or fastest?] pipeline stage. Slowest
List five pipeline stages in a single-cycle datapath. IF(Instruction Fetch), ID(Instruction Decode), EX(Execute / Address Calculation), MEM(Memory Access (read / write), and WB (Write Back (results into register file)
Pipelining improves performance by increasing instruction throughput and decreasing the execution time of an individual instruction. False
Two's complement operation, Subtraction using addition of negative numbers
Arithmetic function: - Addition & subtraction - Need to know number representations - Integer and floating point arithmetic
Logical operations: - and - or - not
How do you know there is no overflow? - When adding a positive and a negative number - When signs are the same for subtraction
How do we know when there is an overflow Overflow occurs when the value affects the sign:
Examples of overflow: -When adding two positives yields a negative -Adding two negatives gives a positive, or -Subtract a negative from a positive and get a negative, or -Subtract a positive from a negative and get a positive,
MIPS commands that don't detect overflow addu, addiu, subu
lb command Loads a byte and stores the sign-extended version in a word
lbu command Loads a byte and stores it in a word
Drawback of integer binary multiplication - Half of 64-bit multiplicand register are zeros - Half of 64 bit adder is adding zeros
Solution to the drawback of integer binary multiplication - Shift product right instead of multiplicand left - Only left half of product register added to multiplicand
Steps in hardware integer division: - Shift the dividend left one position - Subtract the divisor from the left half of the dividend - if (result positive), Shift left a 1 into the quotient; - else, Shift left a 0 into the quotient; - Once the result is positive, Repeat the process for
Signed magnitude representation (-1)^S x F x B^E, S = sign (either 0 (pos) or 1 (neg)) F = fraction (also called the mantissa) B = base (implied) (Ex, 2 or 10) E = exponent
Overflow in Signed magnitude representation The exponent is too large to be represented in the exponent field
Underflow in Signed magnitude representation The negative exponent is too large to fit in the exponent field
IEEE 754 floating-point standard: normalized numbers in binary format (-1)^s x (1 + Fraction) x 2^E
normalized numbers in binary format: single precision The significand is 24 bits long (1 + 23-bit fraction)
normalized numbers in binary format: double precision The significand is 53 bits long (1 + 52-bit fraction)
normalized numbers in binary format: single precision exponential bias Exponential Bias = 127
normalized numbers in binary format: double precision exponential bias Exponential Bias = 1023
How is the Instruction count determined? Determined by ISA and compiler
Instruction formats: R-type Instructions like registers and words of data, are 32 bits long Ex: add $t0, $s1, $s2
Instruction formats: I-type Load/Store instruction format Ex: lw $t1, 24($s2)
Instruction formats: J-type Jump formats Ex: jal 0x00400030
Instruction implementation: PC goes to instruction memory, fetch instruction
Instruction implementation:Register numbers goes to register file, read registers
In R - type instructions format, use ALU to calculate Arithmetic result
In I - type instructions format, use ALU to calculate Memory address for load/store
In J- type instructions format, use ALU to calculate Branch target address
The functional units in the MIPS: Combinational - Elements that operate on data values - Output is a function of input
The functional units in the MIPS: Sequential - Elements that contain state - Output depends on both inputs and the contents of the internal state
Logic Design Conventions: Sequential Elements: Register - stores data in a circuit - Uses a clock signal to determine when to update the stored value - Edge-triggered: update when Clk changes from 0 to 1
Logic Design Conventions: Sequential Elements: Register with write control - Only update on clock edge when write control input is 1 - Used when stored value is required later
Clocking Methodology - When signals can be read & when they can be written - Edge-triggered clocking – update only on a clock edge
instruction memory stores the instructions of a program
program counter (PC) keeps the address of the instruction
adder increment the PC or jump to the address of the next instruction
To read one word from a register? Need an input (register number) and one output (data)
To write one word from a register? Need two inputs (register number & data)
Single cycle datapath: The implementation attempts to execute all instructions in 1 clock cycle CPI is always 1
Any element needed more than once must be duplicated
 No datapath resource can be used more than once per instruction? A separate instruction memory is required
Multiplexor (Data Selector): - Allow multiple connections to the input of an element - A control signal selects among the inputs
The key differences between R-type and memory-reference (lw, sw) are: - The second input to the ALU is either a register (R-type) or the sign-extended half of the instruction (memory) - The value stored into a destination register comes from either the ALU (R-type) or the memory (lw? or sw?)
Single Cycle Datapath & Control Unit equation CPU = Datapath + Control
Single-cycle design in a Single Cycle Datapath & Control Unit: - Instruction takes exactly one clock cycle (CPI = 1) - Datapath units used only once per cycle - Writable state updated at end of cycle
What must be “controlled” in a Single Cycle Datapath & Control Unit? - Multiplexors - Writable state elements: * Register File, Data Memory - ALU (which operation?) – 4 bits
Control unit” implementation steps: - Identify control inputs and control outputs - Make a control signal table for each cycle - Derive control logic from the control table
Control inputs: Opcode (6 bits)
Control outputs: - RegDst - ALUSrc - MemtoReg - RegWrite - MemRead - MemWrite - Branch - Jump - ALUctr
PCSrc Should be set (1) ... if the instruction is branch on equal (beq), and the Zero output of the ALU is true
What is pipelining? - Multiple instructions are overlapped in execution - Key to making processors fast is to make everything work in parallel
Speedup due to pipelining the number of steps in the pipeline
Classic MIPS Instructions: Fetch Fetch instruction from memory
Classic MIPS Instructions: Read Read registers while decoding the instruction
Classic MIPS Instructions: Execute Execute the operation or calculate an address
Classic MIPS Instructions: Access Access an operand in data memory
Classic MIPS Instructions: Write Write the result into a register
Pipeline Hazards: Structural hazards attempt to use same resource twice
Pipeline Hazards: Data hazards attempt to use data before it is ready
Pipeline Hazards: Control hazards attempt to make decision before condition is evaluated
How to we prevent Pipeline Hazards? Can always resolve hazards by waiting
Data Hazards: Forwarding: - connect new value directly to next stage - getting the missing item early from the internal resources
Forwarding paths are valid, Only if the destination stage is later in time than the source stage.
Reordering instructions - Change the order of instructions to make sure the data is ready when the instruction is used - Use compiler to avoid hazards
Control Hazards: Delayed Branch - Always execute the next sequential instruction, with the branch taking place after that one instruction delay - Place an instruction immediately that is NOT affected by the branch
Instruction Fetch: Pipelined Datapath: Instructions and data Move generally from left to right through the 5 stages
Instruction Fetch: Pipeline registers - To retain the values of each stage for the next stage - Must be wide enough to store all the data corresponding to the lines that go through them
Instruction Fetch: The PC address is incremented by.... 4
Instruction Fetch: The instruction is read from... Memory using the address in the PC and then placed in the IF/ID pipeline register
Instruction Fetch: Then written back into the PC to... Be ready for the next clock cycle
Instruction fetch: The computer cannot know... What type of instruction is being fetched
The IF/ID pipeline register supplies... The 16-bit immediate field and the register numbers to read
All instructions advance during...from pipeline step to the next each clock cycle
Pipelined Datapath: 2 exceptions - Place the result back into the register file - Selection of the next value of the PC
Execution: The contents of register 1 and the sign-extended immediate from the ... ID/EX pipeline register and adds them using the ALU
Execution: The sum is placed in the EX/MEM pipeline register
Memory: The instruction loads the data into the... MEM/WB pipeline register
lw instruction: Any information needed in a later pipe stage must be passed to that stage via a... pipeline register
sw instruction: Steps to implement Instruction fetch -> Instruction decode and register file read -> Execute and address calculation -> Memory access -> Write back
What happens in the Write back stage during the sw instruction? Nothing
Why does the sw stage passes the writeback stage when nothing happens? Because later instructions are already progressing at the maximum rate
What a bug happens when the load word instruction is run? IF/ID register pipeline register will be overwritten by following instructions, the destination register number is not preserved
What is the solution to the bug in the load word instruction? Pass the register number, From the ID/EX through EX/MEM to the MEM/WB pipeline register for use in the WB stage
Why is pipelining difficult to understand? Many instructions are simultaneously executing in a single datapath in every clock cycle
Graphical representation of pipelining? - Multiple-clock-cycle pipeline diagrams - Single-clock-cycle pipeline diagrams
Single-clock-cycle pipeline diagrams: Show the details of what is happening
MIPS architecture was... Designed to be pipelined
Pipelining in MIPS - Simple instruction format (makes IF, ID easy) - Memory operations only in lw, sw instructions (simplifies EX) - Memory operands aligned in memory (simplifies MEM) - Single value for write-back (limits forwarding)
Pipelining in MIPS: Simple instruction format (makes IF, ID easy) - Single-word instructions - Small number of instruction formats - Common fields in same place (e.g., rs, rt) in different formats
There are NO separate write signals for, The pipeline register
If the PC is written on each clock cycle? There are NO separate write signals
Adding control: Basic approach - Build on single-cycle control - Place control unit in ID stage - Pass control signals to following stages
Adding control: Later approaches - Extra features to deal with - Data forwarding; Stalls; Exceptions
Pipelined Control: IF Nothing special
Pipelined Control: ID No optional control lines to set
Pipelined Control: EX RegDst, ALUOp, ALUSrc
Pipelined Control: MEM Branch, MemRead, MemWrite
Pipelined Control: WB MemtoReg, RegWrite
Implementing control Setting the nine control lines to the values in each stage for each instruction
Control lines start with the... EX stage
Forwarding - The write is in the first half of the clock cycle - The read is in the second half, so the read delivers what is written -The result is available at the end of EX stage in first instruction (or clock cycle 3)
Controlling Forwarding First, detect a hazard and then, forward the proper value to resolve the hazard
Data Hazard Detection: "EX" hazard - EX/MEM - test whether instruction writes register file and examine rd register - ID/EX - test whether instruction reads rs or rt register and matches rd register in EX/MEM
"EX" hazard: Two pairs of data hazard conditions are 1a. EX/MEM.RegisterRd == ID/EX.RegisterRs 1b. EX/MEM.RegisterRd == ID/EX.RegisterRt
Data Hazard Detection: "MEM" hazard - MEM/WB - test whether instruction writes register file and examine rd (rt) register - ID/EX - test whether instruction reads rs or rt register and matches rd (rt) register in EX/MEM
"MEM" hazard: Two pairs of data hazard conditions are 2a. MEM/WB.RegisterRd == ID/EX.RegisterRs 2b. MEM/WB.RegisterRd == ID/EX.RegisterRt
Add hardware to feed back... ALU and MEM results to both ALU inputs
If we can take the inputs to the ALU from any pipeline register rather than just ID/EX, Forward the proper data
The forwarding control will be in the... EX Stage
Why is the forwarding control will be in the EX stage? Because the ALU forwarding multiplexers are found in that stage
We can take the inputs to the ALU from...rather than just... any pipeline register, ID/EX
What if... Double Hazards? The result is forwarded from the MEM stage
Why is the result is forwarded from the MEM stage? Because the result in the MEM stage is the more recent result
When an instruction tries to read a register following a load instruction that writes the same register even with forwarding... Cannot avoid a pipeline stall
Hazard Detection Unit Operate during the ID stage: It can insert the stall between the load and its use.
Hazard Detection Unit: Test to see if, Destination register of the load in the EX stage == source register of the instruction of ID stage
To stall the pipeline... To stall the pipeline
if (we move the branch execution earlier in the pipeline) then fewer instructions need be flushed.
Two schemes for resolving control hazards: - Assume branch NOT taken - Reducing the delay of branch
If the branch is taken after assuming it is not... - The instruction that are being fetched and decoded must be discarded. - Execution continues at the branch target.
Many MIPS implementations move the branch execution to the ID stage. 1 pipeline stall
For branch equal, Compare the two registers during the ID stage.
Principle of Locality Programs access a small proportion of their address space at any time
Temporal locality - If a data item is accessed, it is likely to be accessed again soon - e.g., instructions in a loop, induction variables
Spatial locality - If a data item is accessed, data items whose addresses are close by are likely to be accessed soon - e.g., sequential instruction access, array data
Implement the memory of a computer as a memory hierarchy,
What is attached to the CPU? Cache memory
A level closer to the processor is a... subset of any level further away
All the data are stored at the... lowest level
Data is copied between... only two adjacent levels at a time
Upper Level Memory Faster and smaller
Lower Level Memory lower and bigger
Block the minimum unit of information
Hit if the data requested is found in the upper level
Miss if the data is not found
Hit rate (hit ratio) the fraction of memory accesses found in the upper level
Miss rate (1 – hit rate)
Hit time Time to access the upper level of the memory hierarchy
Miss penalty (Time to replace a block in the upper level with the corresponding block from the lower level) + (the time to deliver this block to the processor)
Cache - Located between the CPU and main memory - Contains a collection of recent references: - Takes advantage of locality of access
Principle of locality Programs access a relatively small portion of their address space at any instant of time
Direct Mapped - When assign the cache location, - Based on the address of the word in the memory - Simplest way to assign a location in the cache.
Number of cache entries power of two
Variations of Set Associative: Direct mapped: one-way set-associative
Variations of Set Associative: Fully associative with m entries: m-way set-associative
Variations of Set Associative: n: degree of associativity
Variations of Set Associative: Advantage of increasing the degree Decrease the miss rate
Variations of Set Associative: Disadvantage of increasing the degree: Increase the hit time
Read-stall clock cycles (Reads / Program) x Read-stall x clock cycles
Write-stall clock cycles (for write-through) ((Write/Program) x Write miss rate x Write miss penalty ) + Write buffer stall
Larger blocks exploit spatial locality to... lower miss rates
But increasing the block size results in what problem? Increase the miss penalty
Cache inconsistency - Write the data into only the data cache (without changing main memory) - Then, memory would have a different value from that in the cache.
Reduce the miss penalty? Increase the bandwidth from the memory to cache.
Increase the bandwidth to memory - Widen the memory and the buses between the processor and memory - Parallel access to all the words of the block
The memory chips are organized in? Banks to read or write multiple words in one access time.
CPU time (CPU execution clock cycles + Memory-stall clock cycles) x Clock cycle time
Memory-stall cycles (Memory access/ program) x Miss rate x Miss penalty or (Instructions/ program) x (Misses/ Instructions) x Miss penalty
Instruction Set Architecture (ISA) The one “true” language of a machine, Boundary between hardware and software, The hardware’s specification, and Defines “what” a machine does
Machine Organization The “guts” of the machine, “how” the hardware works, The implementation, and Must obey the ISA abstraction
Application software Written in high-level language
System software - Compiler: translates HLL code to machine code - Operating System: service code
Hardware Processor, memory, I/O controllers
RISC philosophy: - Fixed instruction lengths - Limited addressing modes Program is longer but takes shorter to interpret - Limited operations
CISC philosophy: - Increased capability of each instruction - Lead to more addressing modes - Variable length instructions - Instructions execution time variable Program is shorter but takes longer to interpret
Hardware performance - Time, Time, & Time - Key to the effectiveness of an entire system of hardware and software - Critical to purchasers, and to designers
Response time (or execution time) - Time between the start and completion of a task - Important to individual users
Throughput - Total amount of work done in a given time - Important to data center managers
Decreasing response time almost always, Improve throughput
Elapse time - counts everything (disk and memory accesses, I/O , etc.) - a useful number, but often not good for comparison purposes
CPU time (CPU execution time), - doesn't count I/O or time spent running other programs - can be broken up into system time, and user time
Created by: Valitine
Popular Miscellaneous sets

 

 



Voices

Use these flashcards to help memorize information. Look at the large card and try to recall what is on the other side. Then click the card to flip it. If you knew the answer, click the green Know box. Otherwise, click the red Don't know box.

When you've placed seven or more cards in the Don't know box, click "retry" to try those cards again.

If you've accidentally put the card in the wrong box, just click on the card to take it out of the box.

You can also use your keyboard to move the cards as follows:

If you are logged in to your account, this website will remember which cards you know and don't know so that they are in the same box the next time you log in.

When you need a break, try one of the other activities listed below the flashcards like Matching, Snowman, or Hungry Bug. Although it may feel like you're playing a game, your brain is still making more connections with the information to help you out.

To see how well you know the information, try the Quiz or Test activity.

Pass complete!
"Know" box contains:
Time elapsed:
Retries:
restart all cards