Re: [openrisc] Re: OR1000 16 bit instruction set



On Wednesday 20 February 2002 07:29, Andreas Bombe wrote:
> List ate my email - this is a resend.
>
> On Thu, Feb 14, 2002 at 08:43:44PM +0100, Damjan Lampret wrote:
> > Hi Jeff,
> >
> > it is all about demand. Right now it looks like 32-bit insn
> > length is most wanted, especially because of the coming 64-bit
> > superscalar version of the OR1K. It is also true that prices of
> > Flash and RAM are dropping some 30% or more each year. But if
> > there will be enough interest for 16-bit insn length, it could
> > become a priority.
>
> "Whatever, RAM is cheap these days" is the usual answer, but
> usually cost per MB is not the issue.  It is the wrong answer on
> desktop systems (because RAM is slow) and usually wrong in low cost
> embedded systems.

I have worked on several CPUs that support "duality" of opcode
lengths (i.e. supporting both 16-bit and 32-bit ISAs), on both ARM and
MIPS.  In theory 16-bit instructions are great; in practice they
usually turn out to be a big fat pain in the rear.  To me it's
another "me too" feature that all the embedded CPUs had to support to
get customer attention.

That being said, when the datapath is constricted it does have an
advantage.  Keep in mind that "constricted" covers more than just bus
width differences; it also includes wait states on 32-bit wide buses.
The greater advantage is actually instruction cache expansion.
Statistically, using 16-bit instructions reduces code size by about
30%.  That results in a large increase in "effective" instruction
cache size.
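
As a rough illustration of the "effective" cache size argument, here is
a minimal sketch.  The 8 KB cache size and the function name are
assumptions made up for the example; only the 30% figure comes from the
text above.

```python
def effective_icache_bytes(cache_bytes: int, code_size_reduction: float) -> float:
    """How many bytes of original (32-bit ISA) code fit in the cache
    once the same code has been shrunk by `code_size_reduction`."""
    if not 0.0 <= code_size_reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return cache_bytes / (1.0 - code_size_reduction)


CACHE_BYTES = 8 * 1024   # assumed cache size, purely illustrative
REDUCTION = 0.30         # the 30% code size reduction quoted above

effective = effective_icache_bytes(CACHE_BYTES, REDUCTION)
gain = effective / CACHE_BYTES   # 1 / 0.7, i.e. a bit over 1.4x
print(f"effective size: {effective:.0f} bytes, gain: {gain:.2f}x")
```

The gain depends only on the reduction ratio, not on the cache size
chosen.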

However, the tool issues with 16-bit instructions have never
been properly addressed.  The idea I had while considering this some
time ago is to add a "multi-op" instruction that has a 4-bit
preamble and can do 3 register-to-register arithmetic operations.
Clearly it couldn't access all the registers, but the way it would
work is that you use the "normal" instructions to load the
arithmetic registers (say 0-3), and then use the "multi-op"
instruction to do the heavy lifting of those calculations.  By my
calculations, this gets rid of all the issues involved with having
multiple ISAs and gives you most of the code size advantages of
16-bit instructions.
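
One way such a word could be packed, as a sketch only.  The post gives
just the 4-bit preamble, the three operations, and the restriction to a
few registers; everything else here is an assumption: the preamble
value 0xF, the 9-bit slot layout (3-bit opcode, 2-bit destination
register, 4-bit operand that is either a register number or a small
immediate), the opcode numbering, and the unused low bit.

```python
PREAMBLE = 0xF    # hypothetical 4-bit tag marking a multi-op word
SLOT_BITS = 9     # assumed: (32 - 4) // 3 = 9 bits per slot, 1 bit spare

# Hypothetical 3-bit opcode numbering, not taken from any real ISA.
OPCODES = {"nop": 0, "add": 1, "sub": 2, "and": 3, "or": 4, "xor": 5, "ori": 6}
NAMES = {value: name for name, value in OPCODES.items()}


def encode_slot(op: str, rd: int, operand: int) -> int:
    """Pack one sub-operation: 3-bit opcode | 2-bit rd | 4-bit operand."""
    if not 0 <= rd <= 3:
        raise ValueError("multi-op slots can only address r0-r3")
    if not 0 <= operand <= 0xF:
        raise ValueError("operand must fit in 4 bits")
    return (OPCODES[op] << 6) | (rd << 4) | operand


def encode_multiop(slots):
    """Pack the preamble and exactly three slots into one 32-bit word."""
    if len(slots) != 3:
        raise ValueError("a multi-op word carries exactly three slots")
    word = PREAMBLE << 28
    for i, (op, rd, operand) in enumerate(slots):
        word |= encode_slot(op, rd, operand) << (28 - SLOT_BITS * (i + 1))
    return word


def decode_multiop(word: int):
    """Inverse of encode_multiop; returns a list of (op, rd, operand)."""
    slots = []
    for i in range(3):
        slot = (word >> (28 - SLOT_BITS * (i + 1))) & 0x1FF
        slots.append((NAMES[slot >> 6], (slot >> 4) & 0x3, slot & 0xF))
    return slots


# Three sub-operations: ori r0 #0x7, ori r1 #0x7, add r0, r1
example = [("ori", 0, 0x7), ("ori", 1, 0x7), ("add", 0, 1)]
word = encode_multiop(example)
print(f"{word:#010x}", decode_multiop(word))
```

The point of the sketch is only that three small operations on four
registers fit in a single 32-bit word; a real encoding would have to
settle the opcode set and immediate handling.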

Consider the difference between ARM and this idea:

lui r0, #0x0123456 @lui has 28 bits of expressiveness in
lui r1, #0x9123456 @Shane's Make Believe ISA ;-)
{ ori r0 #0x7, ori r1 #0x7, add r0, r1}

 /* The add op places the result in the first-mentioned register, so
the result would be r0=0x92468ACE, r1=0x91234567 */
(A full register-to-register add with a 32-bit insn size and 32-bit
register size: total code and data is 3 * 32 bits = 96 bits.)
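
The register values above can be checked with a small simulation.
This is a sketch under two assumptions not spelled out in the post:
that lui places its 28-bit immediate in the upper 28 bits of the
register and clears the low 4 bits, and that the three sub-operations
inside the braces execute in the order written.

```python
MASK32 = 0xFFFFFFFF


def lui(imm28: int) -> int:
    """Assumed semantics: 28-bit immediate into bits 31..4, low 4 bits zero."""
    return (imm28 << 4) & MASK32


regs = [0, 0, 0, 0]          # r0-r3, the registers a multi-op can reach

# Two ordinary 32-bit instructions.
regs[0] = lui(0x0123456)     # r0 = 0x01234560
regs[1] = lui(0x9123456)     # r1 = 0x91234560

# One multi-op word: { ori r0 #0x7, ori r1 #0x7, add r0, r1 }
regs[0] |= 0x7                           # r0 = 0x01234567
regs[1] |= 0x7                           # r1 = 0x91234567
regs[0] = (regs[0] + regs[1]) & MASK32   # result lands in the first register

print(f"r0={regs[0]:#010x} r1={regs[1]:#010x}")
```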

In ARM this would be 3 instructions and 2 32-bit data words, so
160 bits, not to mention the cache penalty normally involved with ldr.

ldr r0, =0x01234567
ldr r1, =0x91234567
add  r0, r0, r1
[0x01234567]
[0x91234567]
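
The size comparison, written out as a short sketch.  The counts are the
ones given in the two listings; the variable names are made up for the
example.

```python
WORD_BITS = 32

# Multi-op version: two lui instructions plus one multi-op word,
# with no separate data words.
multiop_bits = 3 * WORD_BITS

# ARM version: two ldr instructions plus one add, and two 32-bit
# literal pool entries holding the constants.
arm_bits = (3 + 2) * WORD_BITS

saving = 1 - multiop_bits / arm_bits
print(multiop_bits, arm_bits, f"{saving:.0%}")
```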

Thanks,
Shane Nay.
(I have the entire ISA written out for both a 32-bit and a 64-bit
version of a multi-op processor; one of these days I'll finish
learning Verilog :).  Using a multi-op strategy you can actually end
up with a 64-bit instruction length processor that has more compact
code than its 32-bit brothers.)