D.12 Data Record

Data records must follow all other records in the data file. There must be at least one data record in every system file.

The format of data records varies depending on whether the data is compressed. Regardless, the data is arranged in a series of 8-byte elements.

When data is not compressed, every case is composed of case_size of these 8-byte elements, where case_size comes from the file header record (see File Header Record). Each element corresponds to the variable declared in the respective variable record (see Variable Record). Numeric values are given in flt64 format; string values are literal character strings, padded on the right when necessary.
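The uncompressed layout can be illustrated with a minimal Python sketch. It is not taken from PSPP: the function name, the `is_numeric` list (one flag per 8-byte element, which a reader would derive from the variable records), and the `byte_order` parameter are all hypothetical, and flt64 is assumed here to be an IEEE 754 double.

```python
import struct

ELEMENT_SIZE = 8  # every data element is 8 bytes wide


def read_uncompressed_case(data, offset, is_numeric, byte_order="<"):
    """Decode one uncompressed case that starts at `offset` in `data`.

    `is_numeric` is a hypothetical input: one boolean per 8-byte element,
    so its length plays the role of case_size.
    `byte_order` is an assumption: "<" little-endian, ">" big-endian.
    Returns (values, offset_of_next_case).
    """
    values = []
    for i, numeric in enumerate(is_numeric):
        start = offset + i * ELEMENT_SIZE
        element = data[start:start + ELEMENT_SIZE]
        if len(element) < ELEMENT_SIZE:
            raise ValueError("truncated case")
        if numeric:
            # Numeric element: interpret the 8 bytes as a double.
            values.append(struct.unpack(byte_order + "d", element)[0])
        else:
            # String element: keep the raw, right-padded bytes.
            values.append(element)
    return values, offset + len(is_numeric) * ELEMENT_SIZE
```

Calling the function repeatedly with the returned offset would walk through the cases one by one.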

Compressed data is arranged in the following manner: the first 8-byte element in the data section is divided into a series of 1-byte command codes. These codes have meanings as described below:

0
Ignored. If the program writing the system file accumulates compressed data in blocks of fixed length, 0 bytes can be used to pad out extra bytes remaining at the end of a fixed-size block.
1 through 251
These values indicate that the corresponding numeric variable has the value (code - bias) for the case being read, where code is the value of the compression code and bias is the variable compression_bias from the file header. For example, code 105 with bias 100.0 (the normal value) indicates a numeric variable of value 5.
252
End of file. This code may or may not appear at the end of the data stream. PSPP always outputs this code but its use is not required.
253
This value indicates that the numeric or string value is not compressible. The value is stored in the 8-byte element following the current block of command bytes. If this code appears twice in a block of command bytes, the second occurrence refers to the second element following the command bytes, and so on.
254
Used to indicate a string value that is all spaces.
255
Used to indicate the system-missing value.
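Read from the writer's side, the code table suggests how a command code might be chosen for a single value. The sketch below is only an illustration under stated assumptions: `choose_command` is a hypothetical helper, `None` is used as a stand-in for the system-missing value, string input is assumed to be one 8-byte segment, and the raw payload is packed as a little-endian double.

```python
import struct


def choose_command(value, bias=100.0):
    """Pick a command code for one 8-byte element (hypothetical helper).

    `value` is a number, None (stand-in for system-missing), or an
    8-byte `bytes` string segment.
    Returns (code, raw), where `raw` is the 8-byte payload that must be
    stored after the command block, or None if no payload is needed.
    """
    if value is None:
        return 255, None                      # system-missing
    if isinstance(value, bytes):
        if value == b" " * 8:
            return 254, None                  # string that is all spaces
        return 253, value                     # non-compressible string
    code = float(value) + bias
    if 1 <= code <= 251 and code.is_integer():
        return int(code), None                # value == code - bias
    # Anything else is stored verbatim (little-endian is an assumption).
    return 253, struct.pack("<d", value)
```

With the normal bias of 100.0, only integer values from -99 through 151 fit into a single command byte; everything else falls back to code 253.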

When the end of an 8-byte element of command bytes is reached, any non-compressible values that follow it are skipped, and the next element of command bytes is read and interpreted. This repeats until the end of the file is reached.
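The reading procedure described above can be sketched as a small loop. This is an illustrative decoder, not PSPP's implementation: the function name and the tagged tuples it returns are hypothetical, and it leaves code 253 payloads as raw bytes because telling numeric from string elements requires the variable records.

```python
def decode_compressed(data, bias=100.0):
    """Decode a compressed data section into a list of tagged elements.

    Each result is a (kind, payload) tuple; the tags are hypothetical:
      ("number", float)  codes 1..251, value is code - bias
      ("raw", bytes)     code 253, the 8-byte element stored verbatim
      ("spaces", bytes)  code 254, assumed to stand for 8 spaces
      ("sysmis", None)   code 255, system-missing
    """
    out = []
    pos = 0
    while pos + 8 <= len(data):
        commands = data[pos:pos + 8]   # one element of 8 command bytes
        pos += 8                       # raw elements, if any, start here
        for code in commands:
            if code == 0:
                continue               # padding, ignored
            elif code <= 251:
                out.append(("number", code - bias))
            elif code == 252:
                return out             # explicit end of file
            elif code == 253:
                raw = data[pos:pos + 8]
                if len(raw) < 8:
                    raise ValueError("truncated raw element")
                out.append(("raw", raw))
                pos += 8               # step past the consumed element
            elif code == 254:
                out.append(("spaces", b" " * 8))
            else:                      # code == 255
                out.append(("sysmis", None))
    return out
```

Because `pos` advances once for every code 253, it already points past all non-compressible elements when the command bytes run out, so the next iteration reads the following block of command bytes directly.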