Specifications for Storing Compressed Images in FITS Binary Tables
Richard L. White, STScI
Perry Greenfield, STScI
William Pence, NASA/GSFC
Doug Tody, NOAO
October 21, 1999
This document describes a convention for compressing n-dimensional images and storing the resulting byte stream in a variable-length column in a FITS binary table. The general file structure outlined here is independent of the specific data compression algorithm that is used. The implementation details for several commonly used compression algorithms are described in the appendixes of this document.
The general principle used in this convention is to first divide the n-dimensional image into a rectangular grid of subimages or 'tiles'. Each tile is then compressed as a continuous block of data, and the resulting compressed byte stream is stored in a row of a variable-length column in a FITS binary table. Dividing the image into tiles generally makes it possible to extract and uncompress subsections of the image without having to uncompress the whole image. The default tiling pattern treats each row of a 2-dimensional image (or higher dimensional cube) as a tile, such that each tile contains NAXIS1 pixels. Any other rectangular tiling pattern may be defined using the ZTILEn keywords that are described below. In the case of relatively small images it may be sufficient to compress the entire image as a single tile, resulting in an output binary table with 1 row. In the case of 3-dimensional data cubes, it may be advantageous to treat each plane of the cube as a separate tile if application software typically needs to access the cube on a plane-by-plane basis.
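The tiling described above determines how many rows the binary table will contain. The following Python sketch counts tiles for a given image shape and tile shape. The function names are hypothetical, and it assumes that when a tile size does not divide the image size evenly, the tiles at the upper edge are simply smaller; the text above does not state this.

```python
import math


def tile_grid(naxis, ztile=None):
    """Return the number of tiles along each axis.

    naxis -- image dimensions (NAXIS1, NAXIS2, ...)
    ztile -- tile dimensions (ZTILE1, ZTILE2, ...); None selects the
             default tiling in which each image row is one tile.
    """
    if ztile is None:
        # Default: ZTILE1 = NAXIS1 and every other tile dimension is 1.
        ztile = [naxis[0]] + [1] * (len(naxis) - 1)
    if len(ztile) != len(naxis):
        raise ValueError("ztile and naxis must have the same length")
    # Ceiling division; partial edge tiles are an assumption of this sketch.
    return [(n + t - 1) // t for n, t in zip(naxis, ztile)]


def table_rows(naxis, ztile=None):
    """Return the number of table rows, i.e. the total number of tiles."""
    return math.prod(tile_grid(naxis, ztile))
```

For example, `table_rows([512, 512])` gives one row per image row, `table_rows([512, 512], [512, 512])` gives a single-row table, and `table_rows([100, 100, 10], [100, 100, 1])` gives one row per plane of the cube.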
The following keywords are defined by this convention for use in the header of the FITS binary table extension to describe the structure of the compressed image.
The compressed image tiles are stored in the binary table in the same order that the first pixel in each tile appears in the FITS image; the tile containing the first pixel in the image appears in the first row of the table, and the tile containing the last pixel in the image appears in the last row of the binary table.
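Because FITS stores pixels with the first axis varying fastest, ordering tiles by their first pixel amounts to numbering them with the first tile axis varying fastest. The sketch below finds the table row that holds a given pixel. The function name and the 0-based pixel coordinates are choices made for this illustration, and partial edge tiles are assumed as before.

```python
def tile_row(pixel, naxis, ztile):
    """Return the 1-based table row of the tile containing `pixel`.

    pixel -- 0-based pixel coordinates (x1, x2, ...)
    naxis -- image dimensions (NAXIS1, NAXIS2, ...)
    ztile -- tile dimensions (ZTILE1, ZTILE2, ...)
    """
    row = 0
    stride = 1  # number of tiles spanned by one step along the current axis
    for x, n, t in zip(pixel, naxis, ztile):
        if not 0 <= x < n:
            raise ValueError("pixel coordinate outside the image")
        tiles_on_axis = (n + t - 1) // t
        row += (x // t) * stride
        stride *= tiles_on_axis
    return row + 1  # table rows are counted from 1
```

With a 100 x 100 image and 50 x 50 tiles, the pixel at (60, 10) falls in row 2 and the pixel at (10, 60) falls in row 3; the last pixel, (99, 99), falls in the last row, 4.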
The following columns in the FITS binary table are defined by this convention. The order of the columns in the table is not significant. The column names (given by the TTYPEn keyword) are shown here in upper case letters, but the case is not significant.
Datatype | BITPIX | TFORMn |
byte | 8 | '1PB' |
short int | 16 | '1PI' |
long int | 32 | '1PJ' |
float | -32 | '1PE' |
double | -64 | '1PD' |
If all the tiles in an image are compressed, then the UNCOMPRESSED_DATA column is not required.
ZSCALE and ZZERO generally have double precision values and have default values of 1.0 and 0.0, respectively. If the same values of ZSCALE and ZZERO apply to every tile in the image, then they may be given as header keywords rather than as table columns.
ZSCALE and ZZERO are typically used to scale floating point images (with BITPIX = -32 or -64) into integers before compression, since most compression algorithms are not very efficient with floating point data. See appendix A for a description of one particularly effective scaling algorithm.
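As an illustration of this kind of scaling, the sketch below assumes the linear relation value = ZSCALE * integer + ZZERO, which is consistent with the default values of 1.0 and 0.0 leaving the data unchanged. The function names are hypothetical, and the choice of ZSCALE and ZZERO for a real image is the subject of appendix A and is not addressed here.

```python
def quantize(values, zscale=1.0, zzero=0.0):
    """Scale floating point pixel values to integers before compression.

    Assumes the linear relation value = zscale * integer + zzero.
    """
    return [round((v - zzero) / zscale) for v in values]


def restore(integers, zscale=1.0, zzero=0.0):
    """Recover approximate floating point values after decompression."""
    return [zscale * i + zzero for i in integers]
```

Under this assumption the restored values differ from the originals by at most half of ZSCALE, so a smaller ZSCALE preserves more precision at the cost of a larger range of integers to compress.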
These two parameters should not be confused with the reserved BSCALE and BZERO keywords, which may be present in integer FITS images (those with BITPIX = 8, 16, or 32). Any such integer images should normally be compressed without any further scaling, and the BSCALE and BZERO keywords should be copied verbatim into the header of the binary table containing the compressed image.
[description of the noise estimation and quantization algorithm goes here]. This algorithm is specifically used to quantize floating point images prior to compressing them with the Rice algorithm (see below); however, the same quantization algorithm could be used equally well with other integer compression algorithms.
[description of the Rice decoding algorithm goes here. ]
The IRAF PLIO (Pixel List I/O) algorithm was developed to store image masks in a compressed form. The performance of this encoding is very good for typical masks consisting of isolated high or low values or extended regions at the same level. The worst case performance occurs when successive pixels have different values. Even in this case the encoding will only require one word (16 bits) per mask pixel, provided either the delta intensity change between pixels is usually less than 12 bits, or the mask represents a zero floored step function of constant height. The worst case cannot exceed npix*2 words provided the mask depth is 24 bits or less.
A good compromise between storage efficiency and efficiency of runtime access, while keeping things simple, is achieved if we maintain the compressed line lists as variable-length arrays of type short integer (16 bits per list element), regardless of the mask depth. A line list consists of a series of simple instructions which are executed in sequence to reconstruct a line of the mask. Each 16-bit instruction consists of the sign bit (not used at present), a three-bit opcode, and twelve bits of data, i.e.:
+--+-----------+-----------------------------+
|16|15       13|12                          1|
+--+-----------+-----------------------------+
|  |   opcode  |            data             |
+--+-----------+-----------------------------+
The significance of the data depends upon the instruction. The instructions currently implemented are summarized in the table below.
Instruction | Opcode | Description |
ZN | 00 | Output N zeros |
HN | 04 | Output N high values |
PN | 05 | Output N-1 zeros plus one high value |
SH | 01 | Set high value, absolute |
IH,DH | 02,03 | Increment or decrement high value |
IS,DS | 06,07 | Like IH-DH, plus output one high value |
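The instruction layout can be illustrated with a pair of helper functions that pack and unpack a 16-bit word. This is only a sketch: the function names are hypothetical, and it assumes that bit 1 in the diagram is the least significant bit, so that the data occupies the low twelve bits and the opcode the next three.

```python
# Opcode values taken from the instruction table.
OPCODES = {"ZN": 0, "SH": 1, "IH": 2, "DH": 3,
           "HN": 4, "PN": 5, "IS": 6, "DS": 7}
NAMES = {code: name for name, code in OPCODES.items()}


def pack(name, data):
    """Build one 16-bit instruction word from a mnemonic and a data value."""
    if not 0 <= data <= 0o7777:
        raise ValueError("data must fit in twelve bits")
    return (OPCODES[name] << 12) | data


def unpack(word):
    """Split a 16-bit instruction word into (mnemonic, data)."""
    opcode = (word >> 12) & 0o7   # bits 13-15
    data = word & 0o7777          # bits 1-12
    return NAMES[opcode], data
```

For example, `pack("HN", 5)` yields the word 0x4005, and `unpack(0x4005)` returns `("HN", 5)`.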
In order to reconstruct a mask line, the application executing these instructions is required to keep track of two values, the current high value and the current position in the output line. The detailed operation of each instruction is as follows:
The high value is assumed to be set to 1 at the beginning of a line, hence the IH,DH and IS,DS instructions are not normally needed for boolean masks. If the length of a line segment of constant value or the difference between two successive high values exceeds 4095 (the largest value that fits in 12 bits), then multiple instructions are required to describe the segment or intensity change.
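The way a decoder tracks the current high value and the current output position can be sketched as follows. This is a minimal illustration rather than the IRAF implementation. It operates on a bare sequence of instruction words with no list header, it assumes the data field is the low twelve bits of each word, and it assumes that SH takes its upper bits from the following word (as a 24-bit mask depth would require); none of these details are stated in the text above.

```python
def decode_line(instructions, npix):
    """Expand a sequence of 16-bit instruction words into npix mask values."""
    out = [0] * npix
    pos = 0    # current position in the output line
    high = 1   # current high value; starts at 1 on every line
    i = 0

    def put_high():
        nonlocal pos
        if pos >= npix:
            raise ValueError("instructions run past the end of the line")
        out[pos] = high
        pos += 1

    while i < len(instructions):
        word = instructions[i] & 0xFFFF
        opcode = (word >> 12) & 0o7
        data = word & 0o7777
        i += 1
        if opcode == 0:                 # ZN: output N zeros
            pos += data
        elif opcode == 4:               # HN: output N high values
            for _ in range(data):
                put_high()
        elif opcode == 5:               # PN: N-1 zeros, then one high value
            if data < 1:
                raise ValueError("PN requires N >= 1")
            pos += data - 1
            put_high()
        elif opcode == 1:               # SH: set high value, absolute
            if i >= len(instructions):
                raise ValueError("SH is missing its second word")
            # Assumption: the next word supplies the bits above the low twelve.
            high = data | ((instructions[i] & 0xFFFF) << 12)
            i += 1
        elif opcode == 2:               # IH: increment high value
            high += data
        elif opcode == 3:               # DH: decrement high value
            high -= data
        elif opcode == 6:               # IS: increment, then output one value
            high += data
            put_high()
        else:                           # DS: decrement, then output one value
            high -= data
            put_high()
    if pos > npix:
        raise ValueError("instructions run past the end of the line")
    return out
```

For a boolean mask, `decode_line([0x0003, 0x4002, 0x5004], 10)` skips three zeros, writes two high values, then writes three zeros and one high value, giving `[0, 0, 0, 1, 1, 0, 0, 0, 1, 0]`.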
[description of the HCompress decoding algorithm goes here. ]