
Re: [openrisc] or1200 code size.



* Christian Melki (christian.melki@axis.com) wrote:
> Hi Matjaz.
> 
> I could try the -Os flag, but I have been avoiding it
> on purpose. Since I try to even out all the oddities
> between architectures (think compilers that target
> different OSes, with different libraries and different
> versions), I have considered the -O2 flag the standard,
> since all compilers should be able to handle -O2.
> I don't know which compiler the OR port of gcc is derived
> from (MIPS, perhaps?) or if it was written from scratch.
> Anyway, here are my results. But _please_ regard them as
> very preliminary, since I would like to verify
> how to build the test before regarding them as viable results.
> 
> For hypercube:
> chrisme@speed2: testcode/hypercube-0.4/size> find . -name total.o -exec ls
> -1sh {} \;
>  60k ./x86-3.0.4/total.o
>  76k ./or32uclinux-3.1/total.o
>  64k ./cris-3.2.1/total.o
>  64k ./x86-2.95.4/total.o
>  84k ./sparcleon-3.2.2/total.o
> chrisme@speed2: testcode/hypercube-0.4/size> find . -name total.o -exec
> gzip -9 {} \;
> chrisme@speed2: testcode/hypercube-0.4/size> find . -name total.o.gz -exec
> ls -1sh {} \;
>  20k ./x86-3.0.4/total.o.gz
>  28k ./or32uclinux-3.1/total.o.gz
>  24k ./cris-3.2.1/total.o.gz
>  24k ./x86-2.95.4/total.o.gz
>  24k ./sparcleon-3.2.2/total.o.gz
> 
> For boa:
> chrisme@speed2: compiled-toolchaintest-code/boa-0.94.13/src> find . -name
> TOTAL.o -exec ls -1sh {} \;
> 130k ./x86/strip/TOTAL.o
> 134k ./cris/strip/TOTAL.o
> 166k ./or32/strip/TOTAL.o
> 174k ./sparc/strip/TOTAL.o
> chrisme@speed2: compiled-toolchaintest-code/boa-0.94.13/src> find . -name
> TOTAL.o -exec gzip -9 {} \;
> chrisme@speed2: compiled-toolchaintest-code/boa-0.94.13/src> find . -name
> TOTAL.o.gz -exec ls -1sh {} \;
>  44k ./x86/strip/TOTAL.o.gz
>  46k ./cris/strip/TOTAL.o.gz
>  54k ./or32/strip/TOTAL.o.gz
>  50k ./sparc/strip/TOTAL.o.gz
> 
> For secure:
> chrisme@speed2: compiled-toolchaintest-code/secure/total_O2> ls -1sh
> total 236k
>  38k cris-total.o
>  66k mmix-total.o
>  44k or32-total.o
>  46k sparc-total.o
>  42k x86-total.o
> chrisme@speed2: compiled-toolchaintest-code/secure/total_O2> gzip -9 *
> chrisme@speed2: compiled-toolchaintest-code/secure/total_O2> ls -1sh
> total 68k
>  12k cris-total.o.gz
>  14k mmix-total.o.gz
>  16k or32-total.o.gz
>  14k sparc-total.o.gz
>  12k x86-total.o.gz
> 
> As you can see, the or1k generates OK binary sizes uncompressed,
> but it regularly ends up at the tail of the chain once compressed.
> I know these are ugly, big-blocksize listings,
> but the difference becomes even more obvious with a harder-hitting
> compressor like bzip2.
> 
> Anyway,
> I have a few problems left.
> Here they are:
> 
> How should I generate code with so many different versions
> of compilers? uclibc/glibc/linux/uclinux, etc. etc.
> This makes it hard, if not impossible...
> 
> OK. Choose the most common target to compile for.
> * That would be *arch*-elf (no OS dependencies).
> 
> OK. Find programs that do not carry too many
> OS dependencies (sys/whatever.h).
> This was actually quite hard. Does anyone have a
> program that is based on non-OS-dependent code
> and is large enough?
> * Boa (web server - obligatory at Axis)
> * hypercube (web server, written by a friend of mine)
> * a generic hello world as a test case
> * secure (crypto program that is quite OS-independent)
> 
> (Here comes the part I'm not sure of.)
> OK. Choose how to compile:
> * -O2 should be the standard.
> And:
> Which path? One, or a combination?
> * Do not link... or link against null functions / without
> libraries (is that even possible?)
> * Strip all binary symbol information: --strip-all.
> * Merely --strip-unneeded (is the info vital for comparison?)
> And:
> * Measure the entire object file and sum over
> all objects generated?
> * Measure the code segment only? (Read out each ELF code
> segment's size and sum them.)
> 
> Any suggestions regarding measurement are welcome!
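
One way to get at that last, code-segment-only measurement - a minimal
sketch, assuming GNU binutils built for each target (the or32-elf-
tool prefix here is illustrative):

  # pull out just the code bytes, then compare raw and gzip -9 sizes
  or32-elf-objcopy -O binary --only-section=.text total.o text.bin
  ls -1s text.bin
  gzip -9 -c text.bin | wc -c

(or32-elf-size -A total.o also prints per-section sizes, if the raw
numbers are all that is needed.)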

I see your dilemma... I'm not sure what you are trying to achieve, though.

If you are solving an actual problem and you already have an application
that should compile into the smallest possible code size, then you just
compile *that* application in whatever way produces the smallest binary
(with or without compression - whatever you are going to use). In that
case it doesn't matter what code size other applications would produce,
because you'd have no use for them. There are also some interesting
tricks you can play (this one is for ia32, but many of the ideas are
general):

http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html
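
For instance, a typical size-oriented recipe might look like this
sketch (exact gains vary per port and binutils version, and
--gc-sections needs linker support):

  # favour size over speed, drop unreferenced sections, strip symbols
  gcc -Os -fomit-frame-pointer -ffunction-sections -fdata-sections \
      -Wl,--gc-sections -o app app.c
  strip --strip-all app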

WARNING: from here on, everything is on very soft ground.

If this is an 'academic' question or research, then it becomes a much
more elusive question. I guess you could study (maybe with some
statistical methods) the encoding of different instruction sets - in
particular, two things:
 1. the encoding of a higher-level language (like C) into
assembler: the 'typical' number of instructions needed to do something
(multiplied by 'typical' instruction + operand sizes) -
whatever 'typical' might be.
 2. the encoding of assembler into binary code - it could be that the
or32 encoding is 'typically' (by plain coincidence of how instructions
were encoded into binary) not as compressible as the others.

The biggest problem here is defining 'typical'. Abstractly, I'd say
it's some kind of weighted average, but I'm guessing you could also get
any result you wanted just by carefully picking 'typical'.
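
One crude way to put numbers on 'typical' is a mnemonic histogram taken
from a disassembly - a sketch using each target's objdump (the
or32-elf- prefix is again illustrative; the weighting itself remains
the open question):

  # count instruction mnemonics in the disassembly, most frequent first
  or32-elf-objdump -d total.o \
      | awk -F'\t' 'NF>=3 { split($3,m," "); print m[1] }' \
      | sort | uniq -c | sort -rn | head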

I think a very detailed analysis could be done using statistics and
information theory, but the usefulness of the end result (in the 'real
world') could be very limited by widely differing compiler maturity,
specific applications...

---

I'd just concentrate on one OS, pick one library to link with (the
same for all) and compile a set of programs with a wide array of
options/compilers. For comparison I'd take *the best* result for
each platform. This way the benefits of your research can be immediate
(you'll find out the optimal settings for each platform) and have
tangible results in the 'real world'. I think it's important that you
compare results for a complete program that you can actually run, or
you'll neglect a lot of 'hidden' costs that can't be quantified in
advance.
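
A sketch of that per-platform sweep (the or32-elf- tool names, the
option sets and total.c are all illustrative):

  # try a few option sets and keep whichever gives the smallest object
  for opts in "-O2" "-Os" "-Os -fomit-frame-pointer"; do
      or32-elf-gcc $opts -c total.c -o total.o
      or32-elf-strip --strip-unneeded total.o
      echo "$opts: `ls -s total.o`"
  done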

If you want to be really ambitious, you could always also run the
benchmarks across different library/OS/... combinations and get a more
complete measurement.

regards,
m. 