

       boxshade - Pretty-printing of multiple sequence alignments




       BOXSHADE is a program for pretty-printing multiple alignment output.
       The program itself does not perform any alignment; you have to run a
       multiple alignment program such as ClustalW or Pileup and use its
       output as input for BOXSHADE.

       This manual page was written for the Debian(TM) distribution because
       the original program does not have a manual page. The presented
       information comes from the documentation of the Web Service of the 3.21
       version that is not available as a Debian package.

       BOXSHADE is a program for creating good-looking printouts from
       multiple-aligned protein or DNA sequences. The program does no
       alignment by itself; it takes as input a file preprocessed by a
       multiple alignment program or a multiple sequence editor. See below for
       a list of supported input formats and output devices. In the standard
       BOXSHADE output, identical and similar residues in the
       multiple-alignment chart are represented by different colors or
       shadings. Further options control the kind of shading to be applied,
       sequence numbering, consensus output and so on. The user interface is
       a bit clumsy at the moment: one has to answer a lot of questions in
       order to get the desired output. It is, however, possible to use
       default parameters from a standard parameter file or to supply the
       program with parameters from the command line. At the
       moment, the VMS and DOS versions of BOXSHADE have identical user

   Input formats
       BOXSHADE 3.2 knows about the following input file formats (some of
       them are generally used only on MSDOS or VMS systems):

       +   CLUSTAL and CLUSTALV, multiple alignment program, DOS/VMS/MAC
           default extension .ALN

       +   ESEE, multiple sequence editor, DOS default extension .ESE

       +   PHYLIP, phylogenetic analysis package, DOS, VMS, UNIX default
           extension .PHY

       +   PILEUP and PRETTY of the GCG sequence analysis package, VMS/UNIX
           default extensions .MSF and .PRE

           NB!! You are strongly encouraged NOT to use the PRETTY format as
           input; it may be incompatible with the revised version of .MSF
           input. We can't actually think why anyone would use this format
           now; .MSF files are more useful generally.

       +   MALIGNED, multiple sequence editor, VMS only, default extension
           .MAL

       BOXSHADE tries to determine the file type from the extension but will
       also work if different extensions are used.
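
       The extension lookup described above can be pictured with a small
       Python sketch. It is only an illustration, not BOXSHADE's actual
       code: the names DEFAULT_EXTENSIONS and guess_input_format are
       hypothetical, and the case-insensitive comparison is an assumption.

```python
import os

# Default extensions from the list above, mapped to format names.
# The dictionary and the function name are illustrative only.
DEFAULT_EXTENSIONS = {
    ".aln": "CLUSTAL",
    ".ese": "ESEE",
    ".phy": "PHYLIP",
    ".msf": "PILEUP",
    ".pre": "PRETTY",
    ".mal": "MALIGNED",
}


def guess_input_format(filename):
    """Return the format suggested by the extension, or None if unknown."""
    # Assumption: extensions are compared without regard to case.
    extension = os.path.splitext(filename)[1].lower()
    return DEFAULT_EXTENSIONS.get(extension)


print(guess_input_format("GLOBINS.MSF"))
```

       A result of None corresponds to the case where a different extension
       is used and the file type has to be given explicitly.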

   Output devices
       +   POSTSCRIPT/EPS creates POSTSCRIPT(TM) files for printing on a
           laser printer or for further conversion with a POSTSCRIPT
           interpreter (like GHOSTSCRIPT)

       +   HPGL for export to various graphics programs or for
           conversion/printing with the shareware program PRINTGL. Plotting
           BOXSHADE output on a plotter is generally not recommended

       +   RTF for export to various word-processing and graphics programs

       +   CRT, uses direct screen writes to the PC monitor. Possible options
           depend on the graphics adapter used. This output device is
           supported only in the MSDOS version.

       +   ANSI. On a PC, this option uses an ANSI device driver (ANSI.SYS)
           that has to be loaded in CONFIG.SYS beforehand. Possible character
           renditions are reverse, bold, underlined, blinking etc. On non-DOS
           systems, this option behaves more or less like the VT100 output
           mode.

       +   VT100 for display on a VT100 compatible terminal or emulator.

       +   ReGISterm for display on a ReGIS compatible graphics terminal or
           emulator.

       +   ReGISfile for later conversion by the program RETOS (copyright DEC)
           in order to print on DIGITAL's printer series.

       +   LJ250 for printing on DIGITAL's LJ250 color printer.

       +   ASCII output showing either the conserved residues or the varying
           ones (others as '-').

       +   FIG file for xfig 2.1.

       +   PICT files for import to Mac and PC graphics programs.

       Some of the formats above offer the possibility of scaling the
       characters and of rotating the plot. Character size has to be entered
       in 'point' units. Normal output orientation is portrait
       (PS/EPS/HPGL/PICT only); to obtain output in landscape orientation,
       'rotate plot = y' has to be chosen.

       When creating multi-page output, all pages are contained in a single
       output file. If one page per file is desired, one has to use the
       command line parameter /SPLIT. This is enforced when requesting EPSF or
       PICT file output, as multi-page EPSFs contradict the purpose of an
       EPSF and large PICT files would probably be too big for most personal
       computers. While using the terminal as output device, the 'RETURN' key
       has to be pressed to obtain the next page of output.

   Sequence numbering
       Starting with version 2.2, it is possible to add numbering to the
       output files. The numbers are printed between the sequence names and
       the sequences themselves. Since most input files either use no
       numbering or always number the first position in the alignment "1"
       (which does not necessarily reflect the numbering within the original
       sequence), the user is asked to enter the starting position for each
       sequence. The command line flag /NUMDEF suppresses that question; a
       starting position of 1 is then assumed for all sequences. Boxshade
       starts with the value entered for the leftmost position and continues
       numbering every valid symbol, skipping blanks, '-', '.' and the like.
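
       As a rough illustration of this numbering rule, the following Python
       sketch numbers one aligned row. It is not taken from the BOXSHADE
       sources: the function name number_residues is hypothetical, and the
       set of skipped symbols is assumed to be just blank, '-' and '.'.

```python
# Symbols that are skipped when numbering (assumed set: blank, '-' and '.').
GAP_SYMBOLS = set(" -.")


def number_residues(aligned_row, start=1):
    """Return one entry per column: the residue number, or None for a gap."""
    numbers = []
    position = start
    for symbol in aligned_row:
        if symbol in GAP_SYMBOLS:
            numbers.append(None)       # gaps do not consume a number
        else:
            numbers.append(position)   # valid symbol: assign and advance
            position += 1
    return numbers


print(number_residues("MK-.LV", start=10))
# prints [10, 11, None, None, 12, 13]
```

       The start argument plays the role of the starting position entered by
       the user for that sequence.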

   Default parameters
       Several people using previous releases of BOXSHADE pointed out the
       need for default parameters for the various questions asked by the
       program. They argued that most sites use only one type of input file,
       one output device and one choice of colors for the output. I therefore
       added a management of default parameters allowing two levels of
       assistance to the user.

       1) All default parameters are contained in an ASCII file that can be
       modified easily to accommodate the user's taste. The format is roughly
       documented within the file header; it resembles the keyboard input one
       has to make when using the program interactively. Two such files are
       supplied with this release of BOXSHADE, BOX_DNA.PAR and BOX_PEP.PAR,
       holding some example parameters for peptide and DNA comparisons. There
       are no big differences between the two; the major one is that when
       shading DNA comparisons one does not care about "similar" residues.

       2) To run the program with minimal user interaction, I have added the
       possibility to use command line parameters. At the moment, you can
       use:

       /check      : lists all allowed command line parameters (this list) and
                     allows parameters to be added.
       /def        : program runs without questions, BOX_PEP.PAR is used as
                     default
       /dna        : makes the program use BOX_DNA.PAR as parameter file
       /pep        : makes the program use BOX_PEP.PAR as parameter file
       /in=xxx     : makes the program take xxx as input file
       /out=yyy    : makes the program take yyy as output file (note1)
       /par=zzz    : makes the program use zzz as a default parameter file
       /type=1     : makes the program assume an input file of type 1
                     (PRETTY/MSF)
       /dev=1      : makes the program assume an output device of type 1 (CRT)
       /numdef     : use default numbering (all sequences starting with "1")
       /thr        : threshold fraction of residues that must agree for a
                     consensus
       /split      : forces one page per file output, creates multiple output
                     files.
       /cons       : makes the program create an additional consensus line
                     (see below)
       /symbcons=  : influences the way the consensus line is displayed (see
                     below)
       /unix       : writes output files in unix style (LF only) (note2)
       /dos        : writes output files in DOS style (CR/LF) (note2)

       note1: on unix machines, use out=OUTPUT for terminal output
              on DOS machines, use out=con:
              on VMS machines, use out=tt:
       note2: if no mode is specified, the native style of the machine is
              used.

       On unix systems, the dash (-) has to be used instead of the slash (/)
       as separation character for command line parameters. For example, a
       valid unix command line is:

           boxshade -def -numdef -cons -symbcons=" .*"
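
       For scripts that have to serve both conventions, the slash-to-dash
       rule can be sketched in Python as follows. The helper to_unix_style
       is hypothetical and only rewrites the leading character; it does not
       run boxshade or check that a parameter exists.

```python
def to_unix_style(dos_arguments):
    """Rewrite parameters that start with '/' so that they start with '-'."""
    return ["-" + argument[1:] if argument.startswith("/") else argument
            for argument in dos_arguments]


# Build an argument list equivalent to the unix example above.
command = ["boxshade"] + to_unix_style(
    ["/def", "/numdef", "/cons", "/symbcons= .*"])
print(command)
```

       When such a list is handed to the program without going through a
       shell, the blank inside the symbcons value needs no extra quoting.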

   Shading strategies (similarity to consensus or single sequence)
       Starting with version 3, BOXSHADE has a new shading system. The first
       difference is the introduction of a threshold fraction of residues that
       must agree for there to be a consensus. Previously, the program assumed
       that SOME residue was always the consensus. If no two residues were the
       same, the first sequence provided the consensus residue. This threshold
       fraction can be any number between 0.0 and 1.0. The number of sequences
       that must agree for there to be a consensus is, as you might expect,
       this fraction times the total number of sequences in the alignment
       (fractions of a sequence count as one, e.g. 3.2 becomes 4).

       The second difference is the idea of 'consensus by similarity'; this
       tries to take account of situations where all the sequences may have
       (for example) R or K at a position, but neither in a majority. It would
       not be logical to shade one type of residue as 'identical' and the
       other as 'similar'; the threshold function might also eliminate both as
       being too few in number. Therefore, if no single residue is conserved
       (greater than the threshold) at a position, the program looks for a
       'group' of amino acids that fulfills the requirements. 'Groups' are
       defined in the .grp files. Users can tailor these to their personal
       prejudices. Any amino acid not listed is assumed not to be in a group.
       All members of a group are considered to be mutually similar, unlike
       in the .sim files, described below. If a consensus by similarity is
       found, all the residues in the consensus are shaded using the 'similar'
       shading defined by the user. If the user does not select 'shading by
       similarity', only identity-type consensus is looked at.

       If an identity-type consensus is found, and similarity shading is in
       operation, the program checks whether the remaining residues are
       similar to the consensus residue. Here the box_xxx.sim files are used.
       The main difference between relationships in these files and those in
       the .grp files is that, e.g., in a .grp file the line STA means that
       all three amino acids are mutually similar. In a .sim file, S TA means
       that both T and A are considered similar to S where there is a
       conserved S residue in more than the threshold number of sequences.
       However, it does NOT mean that T and A are similar to each other.

       Note that cases where two residues, or groups of residues, fulfill the
       threshold requirements (as could happen with values of the threshold
       fraction less than or equal to 0.5) are treated as having no consensus.

       This describes the main shading model, 'shading according to a
       consensus'. The alternative model is called 'shading according to a
       master sequence'. In this case the user is prompted for a sequence of
       the alignment, and that sequence is then taken to be the 'consensus'.
       Only those residues that are identical or similar to the chosen
       sequence become shaded. Output obtained with this option tends to be
       less shaded and neglects similarities between the other (non-chosen)
       sequences. Starting in V2.7, this 'master sequence' can be hidden.
       Thus, it only influences the shading of the other sequences without
       being shown itself.
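
       The consensus rules of the main shading model can be condensed into a
       short Python sketch. This is an interpretation of the text above, not
       the program's source: all function names are hypothetical, the groups
       and the similarity table stand in for the contents of the .grp and
       .sim files, gaps are not handled, and "must agree" is read as "at
       least the required number of sequences".

```python
import math
from collections import Counter


def required_count(threshold, n_sequences):
    """Sequences that must agree; fractions round up (3.2 becomes 4)."""
    # round() guards against float noise such as 0.7 * 10 == 7.000000000000001
    return math.ceil(round(threshold * n_sequences, 9))


def column_consensus(column, threshold, groups):
    """Classify one alignment column.

    Returns ("identical", residue), ("similar", group) or (None, None).
    """
    needed = required_count(threshold, len(column))
    counts = Counter(column)
    winners = [res for res, n in counts.items() if n >= needed]
    if len(winners) == 1:
        return "identical", winners[0]
    if len(winners) > 1:
        return None, None      # several candidates count as no consensus
    # No single residue qualifies: look for one group that does.
    group_winners = [group for group in groups
                     if sum(counts[res] for res in group) >= needed]
    if len(group_winners) == 1:
        return "similar", group_winners[0]
    return None, None


def is_similar(residue, consensus_residue, sim_table):
    """One-directional .sim lookup: is residue similar to the consensus?"""
    return residue in sim_table.get(consensus_residue, "")


groups = ["RK", "STA"]         # illustrative .grp-style lines
sim_table = {"S": "TA"}        # illustrative .sim-style entry "S TA"
print(column_consensus("RKRK", 0.8, groups))
print(column_consensus("SSSA", 0.5, groups))
print(column_consensus("SSAA", 0.5, groups))
print(is_similar("T", "S", sim_table), is_similar("A", "T", sim_table))
```

       The last call shows the asymmetry of the .sim relation: T is similar
       to a conserved S, but A is not thereby similar to T.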

   Consensus display
       Starting with version 2.5, BOXSHADE offers the possibility to create an
       additional line holding a consensus symbol. This line can be obtained
       either by using the command line qualifier /CONS or interactively by
       answering the question ' create consensus? '. The way this consensus
       line is displayed can be modified by the command line parameter
       SYMBCONS=xyz, by editing the respective entry in the .PAR file or
       interactively. Since the SYMBCONS syntax is not intuitive, here is a
       brief description. The SYMBCONS parameter consists of exactly three
       symbols:

       +   the first one stands for 'normal' sequence residues that are not
           involved in any similar/identical relationship.

       +   the second symbol represents positions that are similar in all
           sequences of the alignment. See the files BOX_PEP.SIM and
           BOX_DNA.SIM to see what residues are considered similar.

       +   the third symbol represents positions that are identical in all
           sequences of the alignment.

       A SYMBCONS parameter string " .*" (blank/point/asterisk) means: label
       all positions in the alignment with totally identical residues by an
       asterisk, all positions with all similar residues by a point and do not
       mark the other positions. The letter 'B' can be used instead of the
       blank; this is necessary e.g. when using the command line option
       /SYMBCONS=B.* which gives the same result as the above example. The
       option /SYMBCONS= .* would result in unexpected behaviour because
       MSDOS squeezes blanks out of the command line.

       Besides points, asterisks and other symbols, two characters have a
       special meaning when they appear in the SYMBCONS string: 'L' and 'U'.
       An 'L' means that a lowercase representation of the most abundant
       residue at that position is to be used instead of a fixed consensus
       symbol, while a 'U' means an uppercase representation of that residue.
       A possible application would be the SYMBCONS string " LU" where similar
       residues are represented by lowercase characters and identical by
       uppercase characters.
   Shareware/PD programs useful in conjunction with BOXSHADE
       Multiple alignment files to be used by BOXSHADE can be created,
       amongst others, by the following PD/freeware programs:

       +   PHYLIP by Joe Felsenstein, available by ftp from

       +   ESEE by Eric Cabot, available from the same sources as BOXSHADE
           (see above)

       +   CLUSTAL by Des Higgins, ditto

       For preview/conversion of POSTSCRIPT files, the program GHOSTSCRIPT
       from GNU software foundation is highly recommended. It is available
       from all major MSDOS ftp-sites (e.g. SIMTEL or There is also a version
       tested for use with boxshade available at although this might be not
       the most recent release. For Mac users, there is MacGhostscript, also
       available from the main archives (info-mac, umich and their mirrors).
       A *very* good tool for putting a preview image into an EPSF file, often
       a prerequisite for incorporating it into a drawing package, is PS2EPS,
       by Peter Lerup. This can be found on info-mac.

       For preview/conversion of HPGL files, the shareware program PRINTGL
       1.18 by Cary Ravitz is highly recommended. It is available from many
       MSDOS ftp sites and from

       - output on dot printers -
       Since PRINTGL offers a broad choice of printer types and is a nice
       program, I recommend its use for printing BOXSHADE output on
       non-POSTSCRIPT printers. Use HPGL output with the options

           0F1N for normal residues
           2F1N for identical residues
           3F1N for similar residues
           2F4N for conserved residues
           8 for character size
           not rotated

       (these are the standard parameters in BOX_PEP.PAR) for creating an
       HPGL file (let's call it TEST.PLT). Now use PRINTGL either
       interactively by calling PMI or use a command line like:

           PRINTGL /Fx/S0340/Waaac/Ptest.plt

       where test.plt is to be replaced by the name of the file to convert and
       the x in the expression /Fx is to be replaced by the letter of the
       printer you use. (See the PRINTGL documentation for further details.)


       The RTF output and PHYLIP input implementations are still experimental.
       Please tell me of your experiences with the program. The current DOS
       version supports only 13 sequences with 2000 residues each. These
       parameters can be easily changed in the source code. If you cannot
       compile the sources because you lack a Pascal compiler, contact the
       author for precompiled versions.


       There is no publication on BOXSHADE and none is planned. Most people
       just use it for figures in publications and don't mention anything;
       this is ok for the authors of BOXSHADE. If you really feel like
       mentioning BOXSHADE, you could acknowledge it either in the figure
       legend or in the Mat&Meth part on sequence analysis.



       seaview(1) kalign(1)


       Kay Hofmann
       ISREC, Bioinformatics Group,
                   Epalinges s/Lausanne             Switzerland

           Wrote Boxshade.

       Michael Baron
       BBSRC Institute for Animal Health,
                   GU24 0NF

           Wrote Boxshade.

       Hartmut Schirmer
       Technische Fakultaet,
                   Kaiserstr. 2

           C port of Boxshade. (don't send Kay or Michael any questions
           concerning the 'C' version of boxshade)

       Steffen Möller
           Wrote the manpage.

       Charles Plessy
           Updated the manpage


       Copyright © 1997 Kay Hofmann, Michael Baron and Hartmut Schirmer
       Copyright © 2003, 2007 Steffen Moeller, Charles Plessy

       The above copyright notices refer to the program and its manpage

       BOXSHADE is completely public-domain and may be passed around and
       modified without any notice to the authors.

       This manual page was written for the Debian(TM) system but may be used
       by others. Permission is granted to copy, distribute and/or modify this
       document under same terms as boxshade itself.