The Bypass Unit (BPU) is a FIFO-like temporary storage area that holds data waiting to be written into the Register File (RF).
If an instruction computes a value that must be written into the RF (an ALU instruction, for example), it first writes that value into the BPU, and then (or at the same time) actually writes the RF.
If a following instruction needs an operand from the RF address where that earlier result is still to be written, it reads the operand from the BPU rather than from the RF.
This way, `read before write' pipeline hazards (commonly called read-after-write, or RAW, hazards) are avoided.
The specific situations where the BPU is needed are:
- when reading Register File operand(s): Register File operands are read through the BPU.
- when reading pointer registers: pointer registers are read through the BPU.
The algorithm for using the BPU:
- an instruction that wants to write a result into the RF first writes 3 data fields into the BPU:
  - the result itself
  - the result's address in the RF
  - a flag that marks this BPU entry as holding valid data (a so-called `active' flag)
- the next instruction(s) that need an operand from the RF read it through a dedicated function (combinational logic) that does the following:
  - checks all BPU entries to see which ones are active (hold meaningful data).
  - compares the operand's address against the addresses in all active BPU entries.
  - if a single address matches, gets the data from that BPU entry rather than from the RF.
  - if multiple addresses match, gets the data from the most recent BPU entry. Two matches in BPU entries written at the same time should never occur, even though the structure makes them possible; they would indicate a design bug, and this illegal situation asserts an error during simulation.
  - if no address matches, gets the data from the RF (as if the BPU did not exist).
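The read path described above can be sketched in software. This is only a behavioural illustration in Python, not the design's own logic: the names `BpuEntry` and `read_operand`, and the choice to model the BPU as a list of same-clock groups ordered newest first, are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BpuEntry:
    data: int = 0         # the result itself
    addr: int = 0         # the result's address in the RF
    active: bool = False  # True when the entry holds valid data


def read_operand(rf: List[int], bpu: List[List[BpuEntry]], addr: int) -> int:
    """Read an operand through the bypass.

    Assumption: bpu[0] holds the entries written in the most recent clock,
    bpu[1] those written one clock earlier, and so on.
    """
    for same_clock in bpu:  # scan from newest to oldest
        hits = [e for e in same_clock if e.active and e.addr == addr]
        if len(hits) > 1:
            # Two matches written at the same time: treated as a design bug.
            raise AssertionError("simultaneous BPU entries match address %d" % addr)
        if hits:
            return hits[0].data  # most recent match wins
    return rf[addr]  # no match: behave as if the BPU did not exist


# Usage: address 5 is pending twice in the BPU; the newer value is returned.
rf = [0] * 32
rf[5] = 0x11
rf[6] = 0x44
bpu = [[BpuEntry() for _ in range(3)] for _ in range(4)]
bpu[2][0] = BpuEntry(data=0x22, addr=5, active=True)  # older pending write
bpu[0][1] = BpuEntry(data=0x33, addr=5, active=True)  # newest pending write
bypassed = read_operand(rf, bpu, 5)     # taken from the BPU
from_rf = read_operand(rf, bpu, 6)      # taken from the RF
```

In hardware all comparisons happen in parallel; the loop here only expresses the priority of newer entries over older ones.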
The maximum delay between a write and a read from the RF is 4 clocks. Thus, the BPU FIFO-like structure has a depth of 4.
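To illustrate why a depth of 4 is enough, the following Python sketch models a single chain as a 4-stage shift structure. It assumes that entries advance by one stage per clock and are dropped after the fourth stage; the text only says the structure is FIFO-like, so this behaviour, the class name `BpuChain` and its methods are hypothetical.

```python
from collections import deque
from typing import Optional, Tuple

DEPTH = 4  # BPU depth given in the text


class BpuChain:
    """One BPU chain, modelled as a fixed-length shift structure."""

    def __init__(self) -> None:
        # None marks an inactive stage; index 0 is the newest stage.
        self.stages = deque([None] * DEPTH, maxlen=DEPTH)

    def clock(self, write: Optional[Tuple[int, int]] = None) -> None:
        """Advance one clock; `write` is an optional (addr, data) pair."""
        # With maxlen set, appendleft discards the oldest (rightmost) stage.
        self.stages.appendleft(write)

    def lookup(self, addr: int) -> Optional[int]:
        """Return the newest pending data for addr, or None if absent."""
        for entry in self.stages:  # newest first
            if entry is not None and entry[0] == addr:
                return entry[1]
        return None


# Usage: a pending write stays visible for 4 clocks, then leaves the chain.
chain = BpuChain()
chain.clock((5, 0xAB))            # clock 1: result for address 5 enters
seen_at_once = chain.lookup(5)
for _ in range(3):                # clocks 2..4: entry moves to the last stage
    chain.clock()
seen_in_last_stage = chain.lookup(5)
chain.clock()                     # clock 5: entry is shifted out
seen_after_expiry = chain.lookup(5)
```

Once an entry has left the chain, the value is expected to be in the RF itself, so a plain RF read returns it.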
In addition, the BPU must accept up to 3 one-byte operands at a time (it must have 3 write ports). The most BPU-demanding instructions are stores with pre-decrement or post-increment. Both the one-byte data and a 2-byte pointer register must be written into the BPU, as well as into the RF. The 3 bytes are simultaneously written into so-called `BPU chains' or `BPU registers' (BPU chains 0, 1, 2; or BPU registers 0, 1, 2; or BPR0, BPR1, BPR2).
The BPU has 3x4 entries (3 chains, each 4 entries deep), each consisting of:
- an 8-bit data field
- a 5-bit address field
- a flag that marks the entry as active or inactive
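As a rough illustration of this layout, the Python sketch below packs one entry into 14 bits (8 + 5 + 1) and writes three bytes into the first stage of the three chains in a single step. The bit ordering inside an entry, the function names, and the RF addresses used in the example are assumptions; the text does not specify them.

```python
from typing import List, Optional, Tuple

CHAINS = 3      # BPR0, BPR1, BPR2
DEPTH = 4       # entries per chain
DATA_BITS = 8
ADDR_BITS = 5
ENTRY_BITS = DATA_BITS + ADDR_BITS + 1          # data + address + active flag
TOTAL_BITS = CHAINS * DEPTH * ENTRY_BITS        # storage for the whole BPU


def pack_entry(data: int, addr: int, active: bool) -> int:
    """Pack an entry as [active | addr | data] (assumed bit order)."""
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data does not fit in 8 bits")
    if not 0 <= addr < (1 << ADDR_BITS):
        raise ValueError("address does not fit in 5 bits")
    return (int(active) << (DATA_BITS + ADDR_BITS)) | (addr << DATA_BITS) | data


def unpack_entry(word: int) -> Tuple[int, int, bool]:
    """Inverse of pack_entry: return (data, addr, active)."""
    data = word & ((1 << DATA_BITS) - 1)
    addr = (word >> DATA_BITS) & ((1 << ADDR_BITS) - 1)
    active = bool((word >> (DATA_BITS + ADDR_BITS)) & 1)
    return data, addr, active


def write_ports(bpu: List[List[int]],
                writes: List[Optional[Tuple[int, int]]]) -> None:
    """Write up to 3 (data, addr) pairs, one per chain, into stage 0."""
    if len(writes) != CHAINS:
        raise ValueError("exactly one slot per write port is expected")
    for chain, w in zip(bpu, writes):
        chain[0] = pack_entry(w[0], w[1], True) if w is not None else 0


# Usage: one data byte plus a 2-byte pointer written in the same clock.
# The addresses 16, 26 and 27 are hypothetical.
bpu = [[0] * DEPTH for _ in range(CHAINS)]
write_ports(bpu, [(0xAB, 16), (0x01, 26), (0x20, 27)])
first = unpack_entry(bpu[0][0])
```

With these field widths each entry takes 14 bits, so the 12 entries need 168 bits of storage in total.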
Generated on Tue Dec 31 20:26:30 2002 for Pipelined AVR microcontroller by