Harvard architecture, with separate program/data buses
2 level pipeline: fetch and execute
Most instructions execute in 1 clock.
Variable instruction word width: 16 or 32 bits. Most instructions are 16 bits wide
Register File (RF) with 32 registers
IO File (IOF) with 64 registers
Loads and stores operate in the Unified Memory space.
The Unified Memory (UM) is the space formed by concatenating the RF, IOF and the Data Memory (DM), in this order. Thus, the RF begins at address 0 in the UM, the IOF at address 32 and the DM at address 96.
Pointer registers X, Y, Z, 16 bits each, mapped into the Register File, for indirectly addressing the Data Memory and the Program Memory (PM).
Pointer registers have pre-decrement and post-increment capabilities.
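The Unified Memory layout and the pointer update modes listed above can be sketched as a small address decoder. This is only an illustrative model, not pAVR code: the names um_decode, load_post_increment and load_pre_decrement are hypothetical, and DM_SIZE is an assumed example value, since the Data Memory size depends on the device.

```python
# Sizes of RF and IOF come from the feature list; DM_SIZE is an assumption.
RF_SIZE = 32
IOF_SIZE = 64
DM_SIZE = 4096  # hypothetical example value

RF_BASE = 0
IOF_BASE = RF_BASE + RF_SIZE   # 32
DM_BASE = IOF_BASE + IOF_SIZE  # 96


def um_decode(addr):
    """Map a Unified Memory address to (region, offset inside that region)."""
    if addr < 0 or addr >= DM_BASE + DM_SIZE:
        raise ValueError("address outside the Unified Memory: %d" % addr)
    if addr < IOF_BASE:
        return ("RF", addr - RF_BASE)
    if addr < DM_BASE:
        return ("IOF", addr - IOF_BASE)
    return ("DM", addr - DM_BASE)


def load_post_increment(pointer):
    """Access at the current pointer value, then increment the 16 bit pointer."""
    region, offset = um_decode(pointer)
    return region, offset, (pointer + 1) & 0xFFFF


def load_pre_decrement(pointer):
    """Decrement the 16 bit pointer first, then access at the new value."""
    pointer = (pointer - 1) & 0xFFFF
    region, offset = um_decode(pointer)
    return region, offset, pointer
```

Note that the region reached by an indirect access is known only once the pointer value is known, for example um_decode(95) lands in the IOF while um_decode(96) lands in the DM.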
Add some AVR kernel schematics.
Add some AVR general considerations.
Notes on AVR downsides
Compared with other 8 bit microcontrollers, the AVR architecture is relatively clean and fast. Of course, it is not perfect. The notes below expand on some of its drawbacks.
From the point of view of the AVR instruction set, the Register File, the IO File and the Data Memory are very different entities. The obvious decision is to implement them physically as separate memory-like entities. Pipelining such a structure is straightforward, and a simple, fast pipeline follows naturally: every memory-like entity can be assigned a fixed pipe stage during which it is read or written, with no more than one such elementary operation needed during any instruction.
However, the AVR architecture also has a unified address space covering the Register File, the IO File and the Data Memory. This Unified Memory space is accessed through indirect loads and stores, via dedicated pointer registers. Depending on the contents of a pointer register, an access goes to the Register File, the IO File or the Data Memory. This completely breaks the simple pipeline structure above, because instruction execution becomes data-driven. As a result, the Register File, for example, must now be accessed (say, for reading) in more than one pipe stage. This is highly destructive for the pipeline, because different instructions compete for the same hardware resources.
Arbitration/stall schemes are required. Also, new data hazards must be dealt with. All these are pretty complex, and come with a cost, in terms of both power consumption and speed.
The unified address space does bring new addressing capabilities. However, they are unnatural and basically useless. Who would ever place the stack in the Register File or in the IO File? That would make some sense for low-end controllers that have no Data Memory at all and rely on a stack mapped into the Register File. However, the price paid for that is high.
As a result, pAVR's loads and stores take 2 cycles. Had the pointer registers been able to point only into the Data Memory space, loads and stores would naturally have taken a single clock.
That would have allowed reducing the number of pipe stages from 6 to 5. A lower CPI would have resulted, because instructions that modify the instruction flow (branches, jumps, calls etc.) would incur a smaller cycle penalty. It would also have meant lower power consumption, because fewer registers and less combinational logic would be needed.
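The CPI argument above can be illustrated with a first-order model: effective CPI is the base CPI plus the fraction of flow-changing instructions times the extra cycles each one costs. Every number below is an assumption chosen for illustration, not a pAVR measurement, and the function name effective_cpi is hypothetical. In particular, the penalties of 3 and 2 cycles only stand in for "one pipe stage fewer means one penalty cycle fewer".

```python
def effective_cpi(base_cpi, flow_fraction, flow_penalty):
    """First-order CPI model: base CPI plus the average flow-change penalty."""
    return base_cpi + flow_fraction * flow_penalty


# Illustrative assumptions only: 20% of executed instructions change the
# instruction flow, and the flush penalty shrinks by one cycle when the
# pipeline loses one stage.
FLOW_FRACTION = 0.2
cpi_6_stage = effective_cpi(1.0, FLOW_FRACTION, flow_penalty=3)
cpi_5_stage = effective_cpi(1.0, FLOW_FRACTION, flow_penalty=2)

print("6 stages: CPI = %.2f" % cpi_6_stage)
print("5 stages: CPI = %.2f" % cpi_5_stage)
```

The model ignores the extra cycle spent by the 2-cycle loads and stores themselves, which would widen the gap further.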
The variable instruction width (16 or 32 bits) is not pipeline-friendly.
Each 32 bit instruction could have easily been replaced by two 16 bit instructions.
Instruction set orthogonality issues:
Pointer registers X, Y, Z have addressing capabilities that are different from each other.
Register File locations 0...15 have different addressing capabilities than RF locations 16...31.
IO locations 0 to 31 support more addressing modes than IO locations 32 to 63.
There are instructions that work on 16 bit words (for example, 16 bit register-to-register moves).
The existence of such instructions on an 8 bit RISC controller is questionable. That is not because such operations are not needed, but because the increase in complexity and irregularity is not justifiable.
The cost/performance balance is negative for these instructions (we're still talking about a controller claimed to be RISC).
Opcodes 0x95C8 and 0x9004 do exactly the same thing (LPM).
Other such examples might exist.
The instruction bits could have been used more carefully.
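The opcode redundancy noted above can be checked with a tiny decoder. This is a sketch under assumptions: the bit pattern used for the "LPM Rd, Z" form (1001 000d dddd 0100) is quoted from memory of the AVR instruction set manual and should be verified against it, and the function name decode_lpm is hypothetical. Only the two LPM forms mentioned in the text are handled.

```python
def decode_lpm(opcode):
    """Return (destination register, pointer) for the two LPM forms, else None."""
    if opcode == 0x95C8:
        # Plain LPM: the destination is implicitly R0, the pointer is Z.
        return (0, "Z")
    if (opcode & 0xFE0F) == 0x9004:
        # LPM Rd, Z: bits 8..4 hold the destination register number.
        return ((opcode >> 4) & 0x1F, "Z")
    return None


# Both opcodes decode to "load R0 from program memory at Z".
print(decode_lpm(0x95C8), decode_lpm(0x9004))
```

With the destination field set to 0, the second form duplicates the first, so one of the two encodings could have been spent on something else.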
CLR affects flags, while SER does not, even though they seem to be complementary instructions.
This might be a design flaw in the original core, or it might have been done on (hidden) purpose by whoever designed the AVR core. By the way, if I remember some old news correctly, AVR was designed not by Atmel, but by a Scandinavian company that was later acquired by Atmel.
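One way to see where the CLR/SER asymmetry comes from is to model the two instructions as the operations they are encoded as: to my knowledge, CLR Rd is an alias of EOR Rd, Rd (a logic operation that updates flags), while SER Rd is an alias of LDI Rd, 0xFF (a load immediate that updates none and works only on registers 16...31). The sketch below is a behavioral model under that assumption, with hypothetical function names, not pAVR code.

```python
def clr(regs, sreg, d):
    """CLR Rd, modeled as EOR Rd, Rd: clears the register and updates flags."""
    regs[d] ^= regs[d]
    # Flag results of XOR-ing a register with itself: S=0, V=0, N=0, Z=1.
    sreg.update(S=0, V=0, N=0, Z=1)


def ser(regs, sreg, d):
    """SER Rd, modeled as LDI Rd, 0xFF: sets the register, touches no flag."""
    if not 16 <= d <= 31:
        raise ValueError("load immediate only reaches registers 16...31")
    regs[d] = 0xFF


regs = [0x5A] * 32
sreg = dict(I=0, T=0, H=0, S=1, V=1, N=1, Z=0, C=1)

clr(regs, sreg, 16)
print(regs[16], sreg)   # register cleared, S/V/N/Z rewritten, C untouched

ser(regs, sreg, 16)
print(regs[16], sreg)   # register set to 0xFF, flags exactly as before
```

The same model also shows the Register File irregularity mentioned earlier: ser() is rejected for registers 0...15, while clr() accepts any register.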
Generated on Tue Dec 31 20:26:30 2002 for Pipelined AVR microcontroller by