codegroup - encode / decode binary file as five letter codegroups
codegroup -d|-e [ -u ] [ infile [ outfile ] ]
For decades, spies have written their encoded messages in groups of
codegroup encodes any binary file into this form, allowing it to be
transmitted through any medium, and decodes files containing codegroups
into the original input. Encoded files contain a 16-bit cyclical
redundancy check (CRC) and file size to verify, when decoded, that the
message is complete and correct. Files being decoded may contain other
information before and after the codegroups, allowing in-the-clear
annotations to be included.
codegroup makes no attempt, on its own, to prevent your message from
being read. Cryptographic security should be delegated to a package
intended for that purpose, such as pgp. codegroup can then be applied
to the encrypted binary output, transforming it into easily transmitted
text. Text created by codegroup uses only upper case ASCII letters and
spaces. Unlike files encoded with uuencode or pgp’s ‘‘ASCII armour’’
facility, the output of codegroup can be easily (albeit tediously) read
over the telephone, broadcast on shortwave radio to agents in the
field, or sent by telegram, telex, or Morse code.
To illustrate the difference, here are the first few lines of a binary
file encoded by:
begin 644 data.bin
M’XL("&7._R\ VUO;V\ /9U+FN2XSF3G6H5OA1(?HOB<=/<7__X7TN<PJ[L&
M=?-&1;I+) B8 0;P?_Z’?WY_-=7Q"T_JSZ_6)X9?&"$\OU9[N’\A[A%^L^6=
M?^M[OOV+:9=UM9J^] MAS_ ;X0O]U];(Z?<WWE9_\^[/]ZMM\OO[CG’^2M\M
-----BEGIN PGP MESSAGE-----
ZZZZZ YBPIL AIAIG FMOPP CPAAA DGNGP GPGPA ADNJN ELJKO ELIMO
GEOHF KIFGP IFBCB PKCPI YJMHE PHBHP PPOBH NCOHD AKLLL AGHFP
DEGEF LKELC EAIJI ABAGP AHPPO IHHPH OHPDF YNFPB ALEPO KMPKP
NGCHI GFPBI CBDML PFGHL LIHPC BOOBB HOLDO FJNHP OLHLL OPNIL
Only codegroup conforms to the telegraphic convention of all upper case
letters, and passes the ‘‘telephone test’’ of being readable without
any modifiers such as ‘‘capital’’ and ‘‘lower-case’’. Avoiding
punctuation marks and lower case letters makes the output of codegroup
much easier to transmit over a voice or traditional telegraphic link.
-decode Decodes the input, previously created by codegroup, to
recover the original input file, and verifies it to detect
truncation or corruption of the contents.
-encode Encodes the input into an output text file containing five
letter code groups (default).
-usage Print how-to-call information.
All options may be abbreviated to a single letter.
Encoding a binary file as ASCII characters inevitably increases its
size. When used in conjunction with existing compression and
encryption tools, the resulting growth in file size is usually
acceptable. For example, a random extract of electronic mail 32768
bytes in length was chosen as a test sample. Compression with gzip
compacted the file to 15062 bytes. It was then encrypted for
transmission to a single recipient with pgp, which resulted in a 15233
byte file. (Even though pgp has its own compression, smaller files
usually result from initial compression with gzip. In this case, pgp
alone would have produced a file of 15420 bytes.)
codegroup transforms the encrypted file into a 37296 byte text file.
Thus, due to compression, the code groups for the encrypted file are
only a little larger than the original cleartext.
Restricting the character set and including spaces between groups
results in substantially larger output files than those produced by
uuencode and pgp. Files encoded with codegroup are about 2.5 times the
size of the input file, while uuencode and pgp expand the file only
about 35%. codegroup is thus preferable only for applications where
its limited character set is an advantage.
If no infile is specified or infile is a single ‘‘-’’, codegroup reads
from standard input; if no outfile is given, or outfile is a single
‘‘-’’, output is sent to standard output. The input and output are
processed strictly serially; consequently codegroup may be used in
When a CRC error is detected, no indication is given of the location in
the file where the error(s) occurred. When sending large files, you
may want to break them into pieces with the splits utility (available
from the Web page cited below) so, in case of error, only the erroneous
pieces have to be re-sent.
It might be nice to embed the original file name and modes in the
encoded output, but this opens the door to all kinds of system-
dependent problems. You can always include this information as text
before the first codegroup, or send an archive created with tar or zip.
base64(1), gzip(1), pgp(1), splits(1), tar(1), uuencode(1), zip(1)
codegroup returns status 0 if processing was completed without errors,
1 if errors were detected in decoding a file which indicate the output
is incorrect or incomplete, and 2 if processing could not be performed
at all due, for example, to a nonexistent input file or no codegroups
found in the input.
This software is in the public domain. Permission to use, copy,
modify, and distribute this software and its documentation for any
purpose and without fee is hereby granted, without any conditions or
restrictions. This software is provided ‘‘as is’’ without express or