Branch prediction with hashed branch prediction table and 2 bit predictor.
Super-RAM interfacing to Program Memory.
A super-RAM is a classic RAM with two supplemental lines: a mem_rq input and a mem_ack output. The device that writes/reads the super-RAM knows that it can place an access request when the memory signalizes it's ready via mem_ack. Only then, it can place an access request via mem_rq.
A super-RAM is a super-class for classic RAM. That is, a super-RAM becomes classic RAM if the RAM ignores mem_rq and keeps continousely mem_ack to 1.
The super-RAM protocol is so flexible that, as an extreme example, it can serially (!) interface the Program Memory to the controller. That is, about 2-3 wires instead of 38 wires, without needing to modify anything in the controller. Of course, that would come with a very large speed penalty, but it allows choosing the most advantageous compromise between the number of wires and speed. The only thing to be done is to add a serial to parallel converter, that complies to the super-RAM protocol.
After pAVR is made super-RAM compatible, it can run anyway from a regular RAM, as it runs now, by ignoring the two extra lines. Thus, nothing is removed, it's only added. No speed penalty should be payed.
A simple way to add the super-RAM interface is to force nops into the pipeline as long as the serial-to-parallel converter works on an instruction word.
Modify stall handling so that no nops are required after the instruction wavefront. The instructions could take care of themselves. The idea is that a request to a hardware resource that is already in use by an older instruction, could automatically generate a stall.
generally simplify instruction handling
make average instruction execution slightly faster.