How shasm works						docs/design

shasm is a collection of shell routines that append arbitrary binary data
to a file named ./output. There are numerous routines for making that data
x86 machine code. shasm itself is written entirely in GNU Bash. shasm
makes each individual opcode assembler an actual shell subroutine
(function), and thus the file to be assembled (if you use an input file)
is actually a shell script, as opposed to something parsed by one. This is
a result of the pathologically un-orthogonal x86 instruction set, and the
desire to avoid extraneous machinery. You basically have to give a lot of
brains to each instruction to assemble x86, so you might as well make them
routines.
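
To make the "opcode is a shell function" idea concrete, here is a minimal
sketch. The names emit, no_operation and software_interrupt are made up for
illustration and are not taken from the shasm source; printf is used in
place of echo -e only to keep the sketch portable. Bytes are passed around
as 3-digit octal strings.

```shell
#!/bin/bash
# Hypothetical sketch, not shasm's own code.
out=./output            # the file name comes from this document
: > "$out"              # start with an empty output file

# Append each argument, a 3-digit octal string, as one raw byte.
emit() {
    local byte
    for byte in "$@"; do
        printf "\\$byte" >> "$out"
    done
}

# One-byte instruction: x86 NOP is 0x90, which is 220 octal.
no_operation() { emit 220; }

# Two-byte instruction: INT imm8 is 0xCD followed by the immediate.
software_interrupt() {
    emit 315 "$(printf '%03o' "$1")"
}

no_operation
software_interrupt 0x80
od -An -tx1 "$out"      # dump the three bytes just written
```

Each "instruction" is just a command that appends to ./output, so a source
file made of such commands is an ordinary shell script.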

This means we want the syntax of a shasm assembly source file to map to
the syntax of shell commands without too many contortions. For example, we
don't want to implement macros, which would require parsing the whole file
in ways I am not sure the shell can do, and which would be pathologically
slow anyway. More pathologically slow, that is. Not to mention, shell
"macro expansion" is available anyway, sort of. Here's the syntax of a
shell command or subroutine...

						(one shell command)	
	[assignments] command [ arguments ]

One shasm instruction has this syntax...
						(one shasm instruction)
	opcode [ arguments ]

If the opcode is a command, which it is, we're almost done. We can juggle
the arguments to our heart's content. The only limitation is we can't use
shell meta-ops like < & and so on in shasm tokens. Prefixes are the issue.
We make them commands also.  CS, SS and similar exist as commands and as
defined argument tokens representing register operands. Shell syntax
disambiguates them by context for us. Keep them on separate lines, or use
a semicolon. That's right, the semicolon works as usual for the shell, and
so do # comments and everything else native to your friendly neighborhood
Unix command interpreter.
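
A small sketch of that disambiguation, using stand-in functions that only
record what was called. SS is a real x86 segment name, but some_op and the
"to" word order here are assumptions for illustration, not shasm's actual
opcode names.

```shell
#!/bin/bash
# Hypothetical stand-ins: they log the call instead of assembling anything.
trace=""
SS()      { trace="${trace}prefix:SS "; }          # SS in command position
some_op() { trace="${trace}op:some_op($*) "; }     # SS as a plain argument

SS; some_op SS to AX      # prefix and opcode on one line, semicolon between
SS                        # or the prefix on a line of its own
some_op AX to SS          # "#" comments are ordinary shell comments
echo "$trace"
```

The shell decides which SS is which purely by position: first word of a
command runs the function, any later word is just an argument string.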

If shasm didn't have a branch resolver it wouldn't even be worthy of the
term assembler. (Not that it is anyway, but...) A forward branch can't be
assembled until its target is known, which means a second pass. Issue: how
do we take two passes
over the same shell script, with different actions on pass 1 and pass 2?
We call the actual assembler script by sourcing it from main(), twice, and
pass-sensitive actions are in a few pass-sensitive routines. The
pass-sensitive state, the list of branches to resolve, is global to main()
and its progeny. All opcodes and so on can use the same few output
routines, which can do one thing on pass 1 and finish up on pass 2.
If you are using shasm interactively, without an assembly script, you
yourself have to maintain the pass variable by hand. Normally that means
setting pass=2 if you want to see what you're doing in ./listing.
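
The two-pass trick can be sketched roughly as follows. Everything except
the variable name pass and the idea of sourcing the script twice from
main() is an assumption: jump_to, filler, label, the label_NAME variables
and the two-byte branch size are invented for the example.

```shell
#!/bin/bash
# Hypothetical sketch: two passes over one sourced script.
src=$(mktemp)
cat > "$src" <<'EOF'
jump_to finish   # forward reference: "finish" is unknown on pass 1
filler
filler
label finish
EOF

here=0           # current output offset
listing=""       # what pass 2 would report

filler() { here=$((here + 1)); }
label() {
    # Pass 1 records the offset in a variable named after the label.
    if [ "$pass" = 1 ]; then eval "label_$1=\$here"; fi
}
jump_to() {
    # Pass 1 only reserves two bytes; pass 2 can look the target up.
    if [ "$pass" = 2 ]; then
        eval "target=\$label_$1"
        listing="${listing}jump@$here->$target "
    fi
    here=$((here + 2))
}

main() {
    for pass in 1 2; do
        here=0
        . "$src"          # same script, sourced once per pass
    done
}
main
rm -f "$src"
echo "$listing"
```

Because the script is sourced rather than executed, its commands see and
modify the same global state on both passes.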

Opcodes do the syntax-checking they need. The very minimum they need.
Error checking is what ./listing is for anyway. Prefixes just get assembled
without checking. There is no inter-op checking, and we've arbitrarily
made prefixes distinct ops.

Values that are created as octal strings are just output to ./listing in
octal. A clump of 3 characters in ./listing is an octal byte.
Going from octal or integer back to a string is tricky, and lo and behold,
likely not worth it. The values in question are composed octally anyway,
(the 386 modR/M and SIB bytes and many oper bytes) so they might as well
be viewed in octal. Other bytes are 2 hexadecimal characters. If you look
at the code for the modR/M and SIB stuff, you'll notice that octal values
are actually built up as text strings of ASCII representations of the octal
digits. This is the form in which all bytes are presented to the various
binary outputters based on echo -e.
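
Why octal fits: the modR/M byte is split 2-3-3 into mod, reg and r/m, so
each field is exactly one octal digit and the byte can be "computed" by
gluing three characters together. A minimal sketch, with a hypothetical
function name modrm and printf standing in for echo -e:

```shell
#!/bin/bash
# Hypothetical sketch: compose a modR/M byte as a 3-character octal string.
modrm() { echo "$1$2$3"; }      # mod, reg, r/m: plain concatenation

byte=$(modrm 3 0 1)             # mod=3 (register), reg=0, r/m=1 -> "301"
echo "$byte"                    # the form that would go to ./listing
printf "\\$byte" | od -An -tx1  # the same byte written as binary (0xc1)
```

No arithmetic or base conversion is needed until the very last step, where
the string is handed to the outputter.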

On a P166 one fancy-mode instruction invoked as a command takes .03
seconds or more. At that rate it would take, oh, 3 hours to assemble Linux
from gcc-produced shasm source, if there were such a bizarre thing.  Just
to assemble it, not compile the C. gas is designed to serve gcc. shasm
isn't, so for what shasm is intended for that's plenty good enough. It
should take about 2 minutes to assemble Ha4sm when it's re-written in
shasm.  I looked at using a "string" as an IO buffer, but at a glance it
didn't seem to make a noticeable difference. Anyway, figure 100 times
slower than gas, 10 times slower than gcc. Well-optimized machine code is
often worth that. The interactivity of shasm is worth much more than that,
if you're hand-coding something.


The address mode parser builds globbing pattern match strings for each
instruction encountered that calls parse(). Instructions that have several
addressing modes all call parse(). The modestring is used in a case switch
in the instruction which implements the specifics of that instruction. The
cases map roughly to the various main opcode bytes that a particular
instruction can start with. The case thing isn't particularly efficient or
clear, but it does map to how the 386 decodes things reasonably well.
Another method might be better for another CPU, or for x86, but the
modestrings thing is OK. Shell case constructs are very flexible because
each case test string is a globbing pattern.
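
A rough sketch of the modestring idea. The real parse() and its modestring
alphabet are not shown in this document, so the letters r, m and i, the
"@" notation for a memory operand, and the names classify and some_op are
all assumptions made for the example.

```shell
#!/bin/bash
# Hypothetical sketch: one letter per operand, then a case on the result.
classify() {
    case $1 in
        AX|BX|CX|DX) echo r ;;     # a register name
        @*)          echo m ;;     # made-up notation for a memory operand
        *)           echo i ;;     # anything else counts as an immediate
    esac
}

parse() {
    local arg
    modestring=""
    for arg in "$@"; do
        modestring="$modestring$(classify "$arg")"
    done
}

some_op() {
    parse "$@"
    case $modestring in            # each label is a globbing pattern
        rr)  echo "register/register opcode" ;;
        ri)  echo "register/immediate opcode" ;;
        *m*) echo "memory form opcode" ;;
        *)   echo "unsupported mode: $modestring" ;;
    esac
}

some_op AX BX
some_op AX 5
some_op @100 AX
```

The *m* pattern shows the flexibility mentioned above: one case label
covers every operand combination that involves memory.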

The wholesale renaming of 386 opcodes is helpful, in my experience.
Assembly language mnemonics have remained more cryptic than need be over
the last 20 years. What a verbose name does is move the comment into the
name, so to speak, which eliminates some mental indirection. The names
used in x86 shasm arose while writing the Janet_Reno bootsector for x86 as
m4 macros called "asmacs". (Janet_Reno by the way is a bootsector, just
the bootsector, that gets into pmode and can call AT BIOS routines in
pmode interrupt handlers.) I suspect that looking at generic-ized names
for x86 instructions may suggest a subset of the x86 instruction set that
is somewhat portable. Assembly language directives, as opposed to opcodes,
(the stuff in shasm main basically) are already portable. shasm main is
little-endian, but the generalization of that is trivial.

Shasm's general from/to syntax wasn't too tough. It involves some short
arrays. Instructions' operands come in with a leftness or rightness, and
then "to" or "from" assigns source and dest to the values of left and
right, as the case may be. Thereafter things can refer to $source or $dest
and not care about left and right, but rather care about what they need
to.
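
The from/to idea, reduced to a sketch. The variable names source and dest
come from this document; the function name orient and the exact argument
order are assumptions, and the short arrays mentioned above are left out
to keep it small.

```shell
#!/bin/bash
# Hypothetical sketch: map left/right operands onto source/dest.
orient() {
    local left=$1 direction=$2 right=$3
    case $direction in
        to)   source=$left;  dest=$right ;;
        from) source=$right; dest=$left  ;;
        *)    echo "expected to/from, got: $direction" >&2; return 1 ;;
    esac
}

orient AX to BX;   echo "$source -> $dest"
orient AX from BX; echo "$source -> $dest"
```

After orient runs, the rest of an instruction only looks at $source and
$dest and never needs to know which side of the keyword an operand was on.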

./listing resembles GNU gas -anl output. (Highly recommended, BTW.) The
implementation oddities there are the aforementioned octal cheat, and
left-justifying the text column, the rightmost column. That uses a
per-line character counter and a Boolean mask.

There's some extra added lameness in the source/dest indirection
"pointers" vis-a-vis integers and array subscripts. I do some "math" with
if/then and strings. That can and should probably be cured with
	declare -ia
which declares an array of integers.

feb 20
There's actually a bit of optimizing in fillthru. It will fill 256-byte
pages as such when possible. Big snooze otherwise. Since writing the bulk
of shasm I've found that a=$a`echo -e "\076"`  appends a byte to a
"string". Have fun.

The branch resolver traverses the list of labels doing name matching. I
suspect that can be improved quite a bit. H3sm doesn't do that, IIRC.
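
For illustration, a linear name-matching lookup of the kind described
might look like the sketch below. The storage format (a space-separated
list of name=address entries) and the name resolve are assumptions, not
the layout shasm actually uses.

```shell
#!/bin/bash
# Hypothetical sketch: walk a label list, comparing names one by one.
labels="start=0 loop=16 finish=42"

resolve() {
    local entry
    for entry in $labels; do
        case $entry in
            "$1"=*) echo "${entry#*=}"; return 0 ;;   # name matched
        esac
    done
    return 1                                          # unknown label
}

resolve loop
```

Every lookup costs time proportional to the number of labels, which is
where the room for improvement comes from.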

feb 24

As far as a linker is concerned, there is a minimal one in shasm. It does
static executables. The "ELF" command does that. All you need for that is
the ELF header and "program header table". shasm doesn't do a symbol
table. For a bootsector, don't use the ELF command. "heap" implements
.bss-like allocation of uninitialized data, but not in a separate segment.



Rick Hohensee
                http://linux01.gwdg.de/~rhohen
                                                rickh@capaccess.org

jan/nov 2001


