4.5 Pipelined Y86
Question | Answer |
---|---|
Understand the need for pipeline registers between stages. | - Save state so you can do data forwarding. - Send data and control values to later stages. |
Understand forwarding and how it works. | The hardware notices that an instruction needs a register as a source operand and that there is a pending write to that register. Instead of stalling, it passes the value directly from one pipeline stage to another. |
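Know why control hazards arise, and specifically which arise in the Y86 pipeline, and how they're dealt with. | They occur when the processor cannot reliably determine the address of the next instruction from the current instruction in the fetch stage. In the Y86 pipeline they arise for ret and for mispredicted conditional jumps; ret is handled by stalling, and a mispredicted jump by cancelling the wrongly fetched instructions with bubbles. |
Control hazards can only occur in ?? | ret and jump instructions (for jumps, only when a conditional jump is mispredicted) |
Control hazards: for rets, you simply | stall, injecting bubbles until the return address is known |
Control hazards: in jumps, you ... | predict that the branch is taken and, if the prediction is wrong, inject bubbles to cancel the wrongly fetched instructions |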
what is branch prediction | branch prediction is when a guess is made as to whether or not to do the jump. |
branch prediction strategies in Y86? | - Call and unconditional jumps: next PC is valC (the destination), always reliable. - Conditional jumps: next PC is predicted to be valC (the destination), which is only correct if the branch is taken; typically right about 60% of the time. - Return: you don't try to predict. |
How to recover from a branch misprediction | - The instructions fetched between prediction and resolution must be thrown away, usually by replacing them with bubbles. - The branch condition flag is visible once the jump reaches the memory stage. - The fall-through PC can be taken from valA. |
Know the relationships and differences between a NOP (program statement), a stall, and a bubble. | - NOP: an instruction hardcoded into the program. - Stall: the processor holds back one or more instructions in the pipeline until the hazard condition no longer holds. - Bubble: a dynamically generated nop. It doesn't cause any changes to the registers, memory, condition codes, or program status. |
What is a load/use Hazard | A load-use hazard requires delaying the execution of the using instruction until the result from the loading instruction can be made available to the using instruction. |
Understand data forwarding as applied to the Y86 pipeline | - Without forwarding, a register isn't written until completion of the write-back stage. - Source operands are read from the register file in the decode stage. - So the value needs to be in the register file at the start of that stage, unless it is forwarded directly from a later stage. |
Understand the limitation of forwarding and how it can't always supply the value fast enough; what do you do then (stall / bubble) | Memory reads occur late in the pipeline. If mrmovq loads into %rax and the next instruction is an addq that uses %rax, there is a problem: the addq needs the value of %rax in one cycle, but the value isn't read from memory until the cycle after. This is fixed by stalling until the data can be forwarded. |
Understand the CPI computation with pipelining. | CPU time = seconds/program = (instructions/program) * (cycles/instruction) * (seconds/cycle). CPI = (C_i + C_b) / C_i = 1.0 + C_b/C_i, where C_i is the number of instructions and C_b the number of bubbles. Broken down by cause: CPI = 1.0 + lp + mp + rp. |
Introducing pipelining into a system with feedback can lead to problems when there are dependencies between successive instructions. What are the two forms of these dependencies? | data and control |
data dependencies | where the results computed by one instruction are used as the data for a following instruction |
control dependencies | where one instruction determines the location of the following instruction, such as when executing a jump, call, or return. |
When such dependencies have the potential to cause an erroneous computation by the pipeline, they are called | hazards |
One very general technique for avoiding hazards involves stalling, where the processor ... | holds back one or more instructions in the pipeline until the hazard condition no longer holds. |
It injects a bubble into execute stage and repeats the decoding of the addl instruction in cycle 7. In effect, the machine has ... pg 414 | dynamically inserted a nop instruction, giving a flow similar to that shown for prog1 (Figure 4.43). |
A bubble is like ___—it does not cause any changes to the registers, the memory, the condition codes, or the program status. | a dynamically generated nop instruction |
Stalling can be implemented fairly easily, BUT | the resulting performance is sometimes not very good, because stalling can reduce the overall throughput. |
avoiding data hazards by forwarding | Rather than stalling until the write has completed, it can simply pass the value that is about to be written to pipeline register E as the source operand. |
This technique of passing a result value directly from one pipeline stage to an earlier one is commonly known as | data forwarding (or simply forwarding, and sometimes bypassing). |
limitations of data forwarding | One class of data hazards cannot be handled purely by forwarding, because memory reads occur late in the pipeline. This is called a load/use hazard. |
we can avoid a load/use data hazard with a combination of | stalling and forwarding. |
This use of a stall to handle a load/use hazard is called | a load interlock. |
Load/use hazards: The pipeline must stall for one cycle between an instruction that reads a value from | memory and an instruction that uses this value. |
Mispredicted Branches: By the time the branch logic detects that a jump should not have been taken, several instructions at the branch target will have started down the pipeline. These instructions must... | be removed from the pipeline |
example of load/use hazard | one instruction (the mrmovl at address 0x018) reads a value from memory for register %eax while the next instruction (the addl at address 0x01e) needs this value as a source operand. |
A return instruction generates ___ bubbles, a load/use hazard generates ___, and a mispredicted branch generates ___. | three / one / two |
CPI | Cycles per instruction: the reciprocal of the average throughput of the pipeline, but with time measured in clock cycles rather than picoseconds. It is a useful measure of the architectural efficiency of a design. |
What does CPI = (C_i + C_b) / C_i mean? | If the stage processes a total of C_i instructions and C_b bubbles, then the processor has required around C_i + C_b total clock cycles to execute C_i instructions. |
CPI = 1.0 + lp + mp + rp. What does each term represent? | Three kinds of instruction can cause a bubble to be injected: - lp = load penalty, the average frequency of bubbles injected while stalling for load/use hazards. - mp = misprediction penalty, the average frequency of bubbles injected when cancelling instructions after a mispredicted branch. - rp = return penalty, the average frequency of bubbles injected while stalling for ret. |
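
A worked example may help with the CPI cards above. The sketch below computes each penalty term as (instruction frequency) × (frequency of the hazard condition) × (bubbles per occurrence), using the bubble counts from the card above (one for load/use, two for a mispredicted branch, three for ret). The instruction and condition frequencies are illustrative assumptions, not measurements, and the helper names `penalty` and `cpi` are made up for this example.

```python
def penalty(instr_freq, condition_freq, bubbles):
    """Average bubbles per instruction contributed by one kind of hazard."""
    return instr_freq * condition_freq * bubbles


def cpi(lp, mp, rp):
    """CPI = 1.0 + lp + mp + rp, as on the cards above."""
    return 1.0 + lp + mp + rp


# Assumed instruction mix (hypothetical numbers for illustration only):
#   25% of instructions are loads, 20% of those cause a load/use hazard
#   20% are conditional jumps, 40% of those are mispredicted
#    2% are ret, and every ret stalls
lp = penalty(0.25, 0.20, 1)  # load/use hazard: 1 bubble
mp = penalty(0.20, 0.40, 2)  # mispredicted branch: 2 bubbles
rp = penalty(0.02, 1.00, 3)  # ret: 3 bubbles

total_cpi = cpi(lp, mp, rp)
print(f"lp={lp:.2f} mp={mp:.2f} rp={rp:.2f} CPI={total_cpi:.2f}")
# prints: lp=0.05 mp=0.16 rp=0.06 CPI=1.27
```

With this assumed mix, mispredicted branches contribute the largest share of the penalty, so better branch prediction would be the most effective way to bring the CPI closer to 1.0.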