[fpu] Re: status



Jamil said:

> I asked in a previous mail about
> some instructions and why they exist in some FPUs, like ylog(x+1) and
> 2^(x+1): what are the advantages of such instructions?

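Near zero, ln(1+x) = x - 0.5*x^2 + ...
If you implement this function by first adding one,
you lose all your precision for small x, because the
low-order bits of x are rounded away in the sum.  The same
goes for exp(x)-1, where the final subtraction cancels
the leading digits.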
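
To make the cancellation concrete, here is a small Python sketch.  It
uses math.log1p and math.expm1 from the standard library as stand-ins
for the dedicated FPU instructions; the test value 1e-17 is only an
illustration, picked so that adding it to 1.0 has no effect in double
precision.

```python
import math

x = 1e-17  # small enough that 1.0 + x rounds to exactly 1.0 in double precision

naive_log = math.log(1.0 + x)   # the add discards x, so this is log(1.0) = 0.0
fused_log = math.log1p(x)       # evaluates ln(1+x) without ever forming 1+x

naive_exp = math.exp(x) - 1.0   # exp(x) rounds to 1.0, so the subtract leaves 0.0
fused_exp = math.expm1(x)       # evaluates exp(x)-1 directly

print("log:", naive_log, "vs", fused_log)
print("exp:", naive_exp, "vs", fused_exp)
```

The naive forms return zero, while the combined forms return a value
close to x itself, which is what the series says they should.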

Applications tend to want log/exp in base e or 10.
The hardware is normally considered easier to build
for base 2.  The development chain (compiler or
assembler) should be able to add the extra "multiply
by constant" step that is needed to map between bases.
That's especially nice at the compiler step, so that
if someone codes
   0.75*log(x)
the compiler translates that to
   0.75*0.69315*log2(x)
and the two constants can be folded into one.  This approach makes
mathematical and computational sense, but someone would have to
check that the cascaded roundoff error is still within
the limits prescribed by IEEE math rules.
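
As a rough illustration of that roundoff check, the Python sketch below
compares the direct base-e form against the folded base-2 form.  The
names LN2, FOLDED, direct and via_log2 are made up for this example, and
math.log2 stands in for a hardware base-2 log.  It is an empirical spot
check over one range of inputs, not a proof of any error bound.

```python
import math

LN2 = math.log(2.0)    # 0.69314718..., converts a base-2 log to base e
FOLDED = 0.75 * LN2    # the single constant a compiler could precompute

def direct(x):
    """0.75*log(x) as the programmer wrote it."""
    return 0.75 * math.log(x)

def via_log2(x):
    """The same value routed through a base-2 log primitive."""
    return FOLDED * math.log2(x)

# Largest relative disagreement over a sweep of sample points.
# The sweep starts at 1.5 to stay clear of x = 1, where both results are zero.
worst = 0.0
for i in range(2000):
    x = 1.5 + 0.01 * i
    a, b = direct(x), via_log2(x)
    worst = max(worst, abs(a - b) / abs(a))

print("worst relative difference:", worst)
```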

Look into the CORDIC hardware architecture.  Careful
use of that structure might give you some interesting
capabilities, although it tends to require a lot of cycles.
I believe that structure was used on the 8087, but Intel
probably moved away from that as they shifted to the
bigger/faster end of the gate/speed tradeoff.  A good
reference on CORDIC is on the 'net at
   http://users.ids.net/~randraka/cordic.htm
Ray Andraka has lots of other good FPGA ideas on his
pages, too.
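
For a feel of what CORDIC does, here is a minimal rotation-mode sketch
in Python that produces sine and cosine.  It is a floating-point model
only: real hardware would use fixed-point registers, with the multiply
by 2^-i done as a right shift.  The iteration count N and the function
name cordic_sin_cos are choices made for this example, not taken from
any particular FPU.

```python
import math

N = 32  # iterations; each one contributes roughly one more bit of accuracy

# Table of atan(2^-i), which hardware would hold in a small ROM.
ANGLES = [math.atan(2.0 ** -i) for i in range(N)]

# Each micro-rotation stretches the vector by sqrt(1 + 2^-2i).
# K is the reciprocal of the total stretch, applied once up front.
K = 1.0
for i in range(N):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_sin_cos(theta):
    """Return (sin, cos) of theta; converges for |theta| up to about 1.74 rad."""
    x, y, z = K, 0.0, theta
    for i in range(N):
        d = 1.0 if z >= 0.0 else -1.0   # rotate toward zero residual angle
        shift = 2.0 ** -i               # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * ANGLES[i]
    return y, x

s, c = cordic_sin_cos(0.5)
print(s, math.sin(0.5))
print(c, math.cos(0.5))
```

The loop body needs only shifts, adds and a table lookup, which is the
attraction for small hardware; the cost is one pass per result bit.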

> Another thing: we have to take each instruction and check how it can be
> implemented from both the mathematical point of view and hw implementation.
> For example, SQroot is too hard to be implemented in HW; maybe we can
> make some basic operations and leave the rest for the software.

Beyond the basic add/subtract/multiply operations, which
should be as fast as possible and therefore nearly pure
hardware, I think it would help to look more at how to
_accelerate_ floating point computations (1/x, trig, sqrt,
log, exp, Bessel, gamma, erf, ...) rather than provide a
complete hardware implementation.  Think of the associated
RISC CPU as a microsequencer :-) .
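
One way to picture the "accelerate" idea, taking 1/x as the example: the
sketch below builds a reciprocal from nothing but multiplies and
subtracts, using Newton's iteration from a cheap linear seed.  The
function name reciprocal, the seed constants and the iteration count are
assumptions for this illustration; math.frexp and math.ldexp stand in
for the exponent handling a real FPU would do on the bit fields.  It
handles positive inputs only.

```python
import math

def reciprocal(d, iterations=4):
    """Approximate 1/d for d > 0 using only multiply and add/subtract."""
    m, e = math.frexp(d)   # d = m * 2**e with 0.5 <= m < 1

    # Linear seed for 1/m on [0.5, 1); relative error is at most 1/17.
    # The two constants would be precomputed, not divided at run time.
    x = 48.0 / 17.0 - (32.0 / 17.0) * m

    # Newton step for f(x) = 1/x - m.  The relative error is squared on
    # every pass, so four passes take 1/17 down below double precision.
    for _ in range(iterations):
        x = x * (2.0 - m * x)

    return math.ldexp(x, -e)   # 1/d = (1/m) * 2**-e

print(reciprocal(3.0), 1.0 / 3.0)
print(reciprocal(0.1), 1.0 / 0.1)
```

The hardware only has to supply a fast multiplier and adder, plus
perhaps a better seed to save an iteration or two; the sequencing of the
steps can live in software.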

       - Larry Doolittle   <LRDoolittle@lbl.gov>