NAME
fitsmd5 - Compute/update the DATAMD5 keyword/value
SYNOPSIS
fitsmd5 [-u] [-s] [-a] <FITS files...>
DESCRIPTION
fitsmd5 computes the MD5 signature of all data sections in a FITS file,
and prints out the results on stdout. This command can optionally
update the main FITS header by modifying the value of the DATAMD5 key.
This command is useful to give a unique ID to a FITS file. The
algorithm simply browses through all data sections in the input file
and passes the data blocks to an MD5 hash function. The final result is
a 128-bit signature that can be used to uniquely identify the file.
This approach is meant to provide a tool to tag FITS files with unique
IDs; it is not meant to be used as a checksum for file integrity (the
CKSUM key is the solution for that), although it could be used in that
spirit. The main point is that only data sections are taken into
account, leaving the possibility of changing the headers without
affecting the data signature.
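The block-by-block scan described above can be sketched in Python as
follows. This is a minimal illustration, not the fitsmd5 source: the
function name data_md5 is hypothetical, and the sketch assumes that whole
2880-byte data blocks (padding included) are hashed, that a header ends
with the block containing the END card, and that any later block starting
with XTENSION= opens an extension header.

```python
import hashlib

BLOCK = 2880  # FITS logical record size in bytes
CARD = 80     # size of one header card in bytes


def data_md5(stream):
    """Return the hex MD5 of all data blocks read from a binary stream.

    Hypothetical helper: header blocks are skipped, data blocks are hashed.
    """
    md5 = hashlib.md5()
    in_header = True  # a FITS file always starts with the primary header
    while True:
        block = stream.read(BLOCK)
        if len(block) < BLOCK:
            break  # end of file (a trailing partial block is ignored)
        if not in_header and block.startswith(b"XTENSION="):
            in_header = True  # assumption: this block opens an extension header
        if in_header:
            # The header ends with the block that holds the END card.
            cards = (block[i:i + CARD] for i in range(0, BLOCK, CARD))
            if any(card[:8] == b"END     " for card in cards):
                in_header = False
        else:
            md5.update(block)
    return md5.hexdigest()
```

Because header blocks never reach the hash, editing a keyword value leaves
the result unchanged as long as the header keeps the same number of blocks
and the data are untouched.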
MD5 spreads its inputs over 128-bit values, which means the probability of
two different FITS files accidentally getting the same ID is almost zero
(collisions can be crafted deliberately, so this is no defence against
tampering). It
should be good enough to assign a unique ID to several tens of
thousands of frames. Since there is still a tiny but non-zero
possibility that two different files will get an identical key, this
approach is not recommended to tag very large numbers of files
(typically: millions of them). If you do have a large database of FITS
files, using a timestamp is usually a better approach.
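To put rough numbers on "tiny but non-zero", the chance of an accidental
collision can be estimated with the birthday bound. The sketch below
assumes MD5 outputs behave like uniformly random 128-bit values; the
function name collision_probability is hypothetical, and the result is an
approximation that only holds while the probability is small.

```python
def collision_probability(n_files, bits=128):
    """Birthday-bound approximation: p ~ n(n-1)/2 / 2**bits."""
    pairs = n_files * (n_files - 1) // 2  # number of distinct file pairs
    return pairs / 2 ** bits


# Compare a database of tens of thousands of frames with one of a million.
for n in (50_000, 1_000_000):
    print(f"{n:>9} files: p ~ {collision_probability(n):.1e}")
```

The estimate grows roughly with the square of the number of files, which is
why the risk rises faster than the size of the database.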
The MD5 signature is a good solution to tag a list of FITS files which
might have originated from various sources on which the database
maintainer has no control. Typically, calibration databases holding
calibration frames for a given instrument receive data from different
parties who might not follow the same unique file naming conventions.
This command makes sure it is always possible to assign a unique ID to
each frame.
Note that if the input FITS file has no data section, the returned
MD5 key is not zero: it is the MD5 of empty input, exactly
d41d8cd98f00b204e9800998ecf8427e. This signature also offers the
interesting property that two files with exactly the same pixels
(bit-wise comparison) get the same ID, which is useful e.g. for
regression tests.
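Both properties can be checked with Python's standard hashlib module. This
is only an illustration of MD5 behaviour, not of fitsmd5 itself; the names
EMPTY_MD5 and pixels are made up for the example.

```python
import hashlib

EMPTY_MD5 = "d41d8cd98f00b204e9800998ecf8427e"

# MD5 of zero bytes: the value obtained when no data block is hashed.
print(hashlib.md5(b"").hexdigest() == EMPTY_MD5)

# Identical pixel bytes give identical digests, whatever the headers say.
pixels = bytes(2880)       # hypothetical data block filled with zeros
same_pixels = bytes(2880)  # a second, separately built copy
print(hashlib.md5(pixels).hexdigest() == hashlib.md5(same_pixels).hexdigest())
```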
If you want to produce files containing the DATAMD5 key in their main
headers, you should use the qfits library, which always inserts this
key. If you are working with other FITS-processing software, you should
allocate an empty DATAMD5 placeholder and apply this command with the
-u option to update the value.
Note that this command can also compute the MD5 sum of a complete
file, not just its data sections (see the -a option). In this mode, the
command computes the same MD5 sum as the GNU md5sum command, which is
used to compute checksums on files. Input files in that case need not
be FITS, though they still need to be regular files.
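What the whole-file mode computes can be sketched in a few lines of Python.
This is an illustrative equivalent, not the fitsmd5 implementation; the
function name file_md5 and the chunk size are arbitrary choices.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 of every byte in a regular file."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()
```

The returned string should match the digest that md5sum prints for the same
file, since both hash the complete byte stream.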
OPTIONS
-u Try to update the DATAMD5 keyword in the main header if present.
-s Silent mode: run without printing any message.
-a Compute the MD5 sum on all bits in the file. In this mode, the
command behaves like the GNU md5sum command, to be used e.g. as
a checksum. This option excludes all others.
FILES
Input files to fitsmd5 shall comply with the FITS format, except when
used with the -a option.
01 Aug 2001 fitsmd5(1)