Re: [mu TECH] byte #1767 or piping /usr/bin/hexd

From: Michele Andreoli (m.andreoli@tin.it)
Date: Fri Mar 31 2000 - 23:45:52 CEST


On Fri, Mar 31, 2000 at 01:50:48PM -0500, Alfie Costa nicely wrote:
> For any code detectives and programmer gumshoes out there, a strange and
> unsolved mystery concerning pipes and redirects...
>
> Wandering around the mu file tree, I discovered '/usr/bin/hexd'. 'hexd' reads
> in binary files and outputs hex dumps; it also converts its own hex dumps back
> to binary files. It's used by the 'muhex' editor.

This is my masterpiece!

> An obvious test was to feed the new hexd's output back into its own input, like this:
>
> myhexd -c < /bin/sed | myhexd -d > sed.hexcode
>
> ...and then see if the files were the same, which I wasn't sure how to do in mu
> so I used 'ls', which isn't really much of a test, but it does at least tell if
> files aren't the same length which is all that matters to appreciate the
> following problem. So...
>
> ls -l /bin/sed sed.hexcode
>
> ...displays...
>
> -rwxr-xr-x 1 root 88 21512 Jun 6 1999 /bin/sed*
> -rw-rw-r-- 1 root root 21401 Mar 30 18:52 sed.hexcode
>
> No, these two files aren't the same length. Then I tried the same test using
> the original mu 'hexd', and surprisingly, that did the same thing. It seemed
> odd this hadn't already been discovered, more about which below...

Uhmm .. yes, you are right. Tested. hexd seems to behave differently when
used through a pipe than with temporary files. The new file is always shorter!

In theory, in UNIX, these two operations:

1) X < a > tmp; Y < tmp > b
2) X < a | Y > b

should be equivalent (the second simply does not use the 'tmp' file).
So what is happening?
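
A quick way to check that equivalence for any pair of filters is sketched
below. The X and Y here are stand-ins built from 'tr' (an assumption for
illustration, not hexd itself), and same_result is a hypothetical helper
name. It assumes 'mktemp' and 'cmp' are available, which may not be true
on a minimal mu install.

```shell
#!/bin/sh
# Stand-in filters (assumptions, not hexd): X upper-cases, Y lower-cases.
X() { tr 'a-z' 'A-Z'; }
Y() { tr 'A-Z' 'a-z'; }

# same_result FILE: run both forms on FILE and print "same" or "different".
same_result() {
    dir=$(mktemp -d) || return 2
    # Form 1: through a temporary file.
    X < "$1" > "$dir/tmp"; Y < "$dir/tmp" > "$dir/b1"
    # Form 2: through a pipe.
    X < "$1" | Y > "$dir/b2"
    if cmp -s "$dir/b1" "$dir/b2"; then
        echo same
    else
        echo different
    fi
    rm -rf "$dir"
}
```

Replacing the bodies of X and Y with 'hexd -c' and 'hexd -d' and running
"same_result /bin/sed" would repeat the experiment from this thread.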

>
> Comparing hex dumps of both files showed that 'sed.hexcode' began to deviate at
> byte #1767. One byte was missing at that address.
>
> Testing other files, it seems that if the data file being tested was shorter
> than 1767 bytes, the second file would be same length as the first. Whether
> the file is text or binary doesn't affect this; the two files always start to
> differ at byte #1767, or 0x6E7, or octal 03347.
>
> So, I closely inspected the source code, as the 'decode()' routine in 'hexd.c'
> did look tricky. Tried many experiments, but got no improvements. And why
> always byte #1767?

Some buffer length embedded in libc?
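
To see whether the threshold really is a fixed size, one could scan input
sizes and report the first one that fails the round trip. This is only a
sketch: first_bad_size is a hypothetical helper, the encoder and decoder are
passed in as command strings, and it assumes 'mktemp', 'head -c' and 'cmp'
exist on the system.

```shell
#!/bin/sh
# first_bad_size ENCODE DECODE MAX
# Prints the smallest input size (in bytes, 1..MAX) for which
# "ENCODE | DECODE" does not reproduce its input, or "none".
first_bad_size() {
    enc=$1
    dec=$2
    max=$3
    in=$(mktemp) || return 2
    out=$(mktemp) || return 2
    n=1
    result=none
    while [ "$n" -le "$max" ]; do
        # Test input: n bytes of the letter "a".
        head -c "$n" /dev/zero | tr '\0' 'a' > "$in"
        # Round trip through the pipe, as in the failing case.
        eval "$enc" < "$in" | eval "$dec" > "$out"
        if ! cmp -s "$in" "$out"; then
            result=$n
            break
        fi
        n=$((n + 1))
    done
    rm -f "$in" "$out"
    echo "$result"
}
```

For the case discussed here the call would be something like
"first_bad_size 'hexd -c' 'hexd -d' 2000". The scan starts one process
pipeline per size, so it is slow but simple.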

>
> Later on I tried the same experiment one piece at a time, like this...
>
> hexd -c < /bin/sed > sed.hexdump
> hexd -d < sed.hexdump > sed.hexcode
> ls -l /bin/sed sed.hexcode
>
> ...and so...
>
> -rwxr-xr-x 1 root 88 21512 Jun 6 1999 /bin/sed*
> -rw-rw-r-- 1 root root 21512 Mar 30 19:09 sed.hexcode
>
> Now both files are the same, or at least they're the same length. Assuming
> they both really are the same, this explains why the problem hadn't been
> discovered before, and why 'muhex' hasn't ruined any files.

Tip: if you use "ls" instead of "sed" as the test file, you can check the
result simply by running the decoded command.
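
Comparing contents is a stronger test than comparing lengths with 'ls -l'.
A minimal sketch, assuming 'cmp' and 'awk' are installed (they may be
missing on mu); first_diff is a hypothetical helper name:

```shell
#!/bin/sh
# first_diff A B
# Prints "identical" if the files match, the 1-based offset of the first
# differing byte if they differ within their common length, or "eof" if
# one file is only a truncated copy of the other.
first_diff() {
    if cmp -s "$1" "$2"; then
        echo identical
        return 0
    fi
    # "cmp -l" lists each differing byte: decimal offset, then both
    # byte values in octal. Only the first offset is kept.
    off=$(cmp -l "$1" "$2" 2>/dev/null | awk 'NR == 1 { print $1 }')
    echo "${off:-eof}"
}
```

For example, "first_diff /bin/sed sed.hexcode" would show where the piped
copy starts to deviate from the original.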

>
> I thought it might be a problem with mu, and so tried it with Debian. Debian
> was no different.
>
> Summing up: apparently, it makes a difference if something is piped to 'hexd'
> with "|", or written to a file first. Why does this happen?
>

Did you recompile hexd.c on your Debian system? Which kernel and which libc?
This should be interesting.

Michele

-- 
I'd like to conclude with a positive statement, but I can't 
remember any. Would two negative ones do?       -- Woody Allen