4.5 Pipelined Y86
|
|
||||
|---|---|---|---|---|---|
| Understand the need for pipeline registers between stages. | - Save state so you can do data forwarding.
- Send data and control values to later stages.
|
||||
| Understand forwarding and how it works. | Hardware notices that an instruction needs to read a register
that has a pending write from an earlier instruction still in the pipeline. So
instead of stalling, it passes the value directly from
one pipeline stage to another.
|
||||
| Know why control hazards arise, and specifically which arise in the Y86 pipeline, and how they're dealt with. | They occur when the processor cannot reliably determine the address of the next instruction based on the current instruction in the fetch stage. In Y86 they arise for ret and for mispredicted conditional jumps.
|
||||
| Control hazards can only occur for ... | ret and jump instructions (for jumps, only when a conditional jump is mispredicted)
|
||||
| Control hazards: for rets, you simply ... | stall the fetch stage and inject bubbles until the ret reaches the write-back stage (three bubbles)
|
||||
| Control hazards: for jumps, you ... | predict that the branch is taken; if the prediction is wrong, inject bubbles to cancel the wrongly fetched instructions
|
||||
| what is branch prediction | branch prediction is when a guess is made as to whether or not to do the jump.
|
||||
| branch prediction strategies in Y86? | - Call and unconditional jumps: predicted next PC is valC (the destination), always reliable.
- Conditional jumps: predicted next PC is valC (the destination), only correct if the branch is taken, typically about 60% of the time.
- Return: don't try to predict.
|
||||
| How to recover from a branch misprediction | - Throw away the instructions fetched between the prediction and its resolution, usually by replacing them with bubbles.
- The branch condition flag is visible once the jump reaches the memory stage; the fall-through PC can be taken from valA.
|
||||
| Know the relationships and differences between a NOP (program statement), a stall, and a bubble. | - NOP: an instruction written explicitly in the program.
- Stall: the processor holds back one or more instructions in the pipeline until the hazard condition no longer holds.
- Bubble: a dynamically generated nop instruction. It doesn't cause any changes to the registers, memory, condition codes, or program status.
|
||||
| What is a load/use Hazard | A load-use hazard requires delaying the execution of the using instruction until the result from the loading instruction can be made available to the using instruction.
|
||||
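| Understand data forwarding as applied to the Y86 pipeline | - Without forwarding, a register isn't written until the write-back stage completes.
- Source operands are read from the register file in the decode stage.
- So the value would need to be in the register file at the start of the decode stage.
- Forwarding instead passes the value directly from the stage that generates it to the decode stage.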
|
||||
| Understand the limitation of forwarding and how it can't always supply the value fast enough; what do you do then (stall / bubble) | Memory reads occur late in the pipeline. If an mrmovq loads into %rax and the next instruction is an addq that uses %rax, the addq needs the value one cycle before the memory read produces it. This is fixed by stalling the addq for one cycle (injecting a bubble) until the loaded value can be forwarded.
|
||||
| Understand the CPI computation with pipelining. | CPU time = seconds/program = (instructions/program) * (cycles/instruction) * (seconds/cycle)
CPI = (C_i + C_b) / C_i = 1.0 + (C_b / C_i)
CPI = 1.0 + lp + mp + rp
|
||||
| Introducing pipelining into a system with feedback can lead to problems when there are dependencies between successive instructions. What are the two forms of these dependencies? | data and control
|
||||
| data dependencies, | where the results computed by one instruction are used as the data for a following instruction
|
||||
| control dependencies, | where one instruction determines the location of the following instruction, such as when executing a jump, call, or return.
|
||||
| When such dependencies have the potential to cause an erroneous computation by the pipeline, they are called | hazards.
|
||||
| One very general technique for avoiding hazards involves stalling, where the processor ... | holds back one or more instructions in the pipeline until the hazard condition no longer holds.
|
||||
| It injects a bubble into execute stage and repeats the decoding of the addl instruction in cycle 7. In effect, the machine has ... pg 414 | dynamically inserted a nop instruction, giving a flow similar to that shown for prog1 (Figure 4.43).
|
||||
| A bubble is like ___—it does not cause any changes to the registers, the memory, the condition codes, or the program status. | a dynamically generated nop instruction
|
||||
| Stalling can be implemented fairly easily, BUT | the resulting performance is sometimes not very good, because stalls reduce the overall throughput.
|
||||
| avoiding data hazards by forwarding | Rather than stalling until the write has completed, it can simply pass the value that is about to be written to pipeline register E as the source operand.
|
||||
| This technique of passing a result value directly from one pipeline stage to an earlier one is commonly known as | data forwarding (or simply forwarding, and sometimes bypassing).
|
||||
| limitations of data forwarding | one class of data hazards that cannot be handled purely by forwarding, because memory reads occur late in the pipeline. called load/use hazard.
|
||||
| we can avoid a load/use data hazard with a combination of | stalling and forwarding.
|
||||
| This use of a stall to handle a load/use hazard is called | a load interlock.
|
||||
| Load/use hazards: The pipeline must stall for one cycle between an instruction that reads a value from | memory and an instruction that uses this value.
|
||||
| Mispredicted Branches: By the time the branch logic detects that a jump should not have been taken, several instructions at the branch target will have started down the pipeline. These instructions must... | be removed from the pipeline
|
||||
| example of load/use hazard | one instruction (the mrmovl at address 0x018) reads a value from memory for register %eax while the next instruction (the addl at address 0x01e) needs this value as a source operand.
|
||||
| A return instruction generates ___ bubbles, a load/use hazard generates ___, and a mispredicted branch generates ___. | three / one / two
|
||||
| CPI | Cycles per instruction: the reciprocal of the average throughput of the pipeline, but with time measured in clock cycles rather than picoseconds. It is a useful measure of the architectural efficiency of a design.
|
||||
| CPI = (C_i + C_b) / C_i means what | If the stage processes a total of C_i instructions and C_b bubbles, then the processor has required around C_i + C_b total clock cycles to execute C_i instructions.
|
||||
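| CPI = 1.0 + lp + mp + rp. What does each term represent? | Three different instruction types can cause a bubble to be injected, so the penalty term splits into three parts:
lp = load penalty (average frequency of bubbles injected while stalling for load/use hazards)
mp = mispredicted branch penalty (average frequency of bubbles injected when canceling instructions after a mispredicted branch)
rp = return penalty (average frequency of bubbles injected while stalling for ret instructions)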
|
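As a rough worked example of the CPI cards above, the sketch below computes lp, mp, and rp as (instruction frequency) * (condition frequency) * (bubbles per occurrence) and sums them. The bubble counts (one, two, three) come from the table; the instruction and condition frequencies are illustrative assumptions, not measurements, and the function names are made up for this example.

```python
# Bubbles injected per occurrence, as stated in the table:
# load/use hazard = 1, mispredicted branch = 2, ret = 3.
BUBBLES = {"load_use": 1, "mispredict": 2, "ret": 3}


def penalty(instr_freq, cond_freq, bubbles):
    """Average bubbles per instruction contributed by one hazard type."""
    return instr_freq * cond_freq * bubbles


def cpi(lp, mp, rp):
    """CPI = 1.0 + lp + mp + rp."""
    return 1.0 + lp + mp + rp


def cpi_from_counts(c_i, c_b):
    """CPI = (C_i + C_b) / C_i for C_i instructions and C_b bubbles."""
    return (c_i + c_b) / c_i


# Illustrative (assumed) frequencies:
#   25% of instructions are loads, 20% of those cause a load/use hazard
#   20% are conditional jumps, 40% of those are mispredicted
#   2% are ret, and every ret stalls
lp = penalty(0.25, 0.20, BUBBLES["load_use"])
mp = penalty(0.20, 0.40, BUBBLES["mispredict"])
rp = penalty(0.02, 1.00, BUBBLES["ret"])

print(f"lp={lp:.2f} mp={mp:.2f} rp={rp:.2f} CPI={cpi(lp, mp, rp):.2f}")
```

With these assumed frequencies the mispredicted-branch term dominates the penalty, which suggests why better branch prediction is usually the first place to look when trying to bring CPI closer to 1.0.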
Created by:
aengstlich