NAME
compalign - compare two multiple alignments
SYNOPSIS
compalign [-options] <trusted-alignment> <test-alignment>
DESCRIPTION
compalign calculates the fractional "identity" between the trusted
alignment and the test alignment. The two files must contain exactly
the same sequences, in exactly the same order.
The identity of the multiple sequence alignments is defined as the
averaged identity over all N(N-1)/2 pairwise alignments.
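As a rough illustration of the averaging step, the sketch below takes the unweighted mean of a per-pair score over every sequence pair. It is not taken from the compalign source: the function name average_pairwise_identity and the pair_identity callback are hypothetical, and reading "averaged" as an unweighted mean of the per-pair fractions is an assumption.

```python
from itertools import combinations


def average_pairwise_identity(trusted, test, pair_identity):
    """Unweighted mean of pair_identity over all N(N-1)/2 sequence pairs.

    trusted, test: lists of aligned sequences (strings), same order in both.
    pair_identity: hypothetical callback scoring one pair, called as
                   pair_identity(k1, k2, t1, t2) and returning a fraction.
    """
    if len(trusted) != len(test):
        raise ValueError("both alignments must contain the same sequences")
    pairs = list(combinations(range(len(trusted)), 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    scores = [
        pair_identity(trusted[i], trusted[j], test[i], test[j])
        for i, j in pairs
    ]
    return sum(scores) / len(pairs)
```

Passing the pair scorer in as an argument keeps the averaging separate from whatever definition of pairwise identity is used.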
The fractional identity between a trusted pairwise alignment and a test
pairwise alignment is in turn defined as follows (for aligned known,
i.e. trusted, sequences k1 and k2, and aligned test sequences t1 and
t2):

    matched columns / total columns

where total columns = the total number of columns in which there is
a valid (nongap) symbol in k1 or k2;

matched columns = the number of those columns in which one of the
following is true:

    k1 and k2 both have valid symbols at a given column, and t1 and
    t2 have the same symbols aligned in a column of the t1/t2
    alignment;

    k1 has a symbol aligned to a gap in k2, and that symbol in t1 is
    also aligned to a gap;

    k2 has a symbol aligned to a gap in k1, and that symbol in t2 is
    also aligned to a gap.
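The definition above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program's actual implementation: the names GAP_CHARS, residue_partners and pair_identity are hypothetical, only "-" and "." are treated as gap symbols, and rows are checked only for having the same number of residues.

```python
GAP_CHARS = set("-.")  # assumed gap symbols; the real program may accept others


def residue_partners(a, b):
    """For every residue of a, the index of the residue of b in the same
    column, or None if it faces a gap; likewise for b. Returns (for_a, for_b)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    for_a, for_b = [], []
    i = j = 0  # running residue indices into a and b
    for ca, cb in zip(a, b):
        a_res = ca not in GAP_CHARS
        b_res = cb not in GAP_CHARS
        if a_res:
            for_a.append(j if b_res else None)
        if b_res:
            for_b.append(i if a_res else None)
        if a_res:
            i += 1
        if b_res:
            j += 1
    return for_a, for_b


def pair_identity(k1, k2, t1, t2):
    """matched columns / total columns for one trusted pair and one test pair."""
    known_1, known_2 = residue_partners(k1, k2)
    test_1, test_2 = residue_partners(t1, t2)
    if len(known_1) != len(test_1) or len(known_2) != len(test_2):
        raise ValueError("trusted and test rows must hold the same sequence")
    total = matched = 0
    # Every trusted column with a symbol in k1 corresponds to one k1 residue.
    # It matches if the test pairs it with the same k2 residue, or if it
    # faces a gap in both alignments (None == None).
    for known, tested in zip(known_1, test_1):
        total += 1
        if known == tested:
            matched += 1
    # Remaining counted columns: a k2 symbol aligned to a gap in k1.
    for known, tested in zip(known_2, test_2):
        if known is None:
            total += 1
            if tested is None:
                matched += 1
    return matched / total if total else 0.0


print(pair_identity("AC-GT", "ACTG-", "ACGT-", "AC-TG"))  # 0.4
```

In the example, the trusted pair has five counted columns and the test pair reproduces only the first two (A/A and C/C), giving 2/5.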
Because scores are calculated for all possible pairs, the running time
is of order (N^2)L for N sequences of length L, so large sequence sets
can take a long time.
OPTIONS
Available options:

-h     Print short help and usage info.

-c     Only compare under marked #=CS consensus structure.

--informat <s>
       Specify that both alignments are in format <s> (MSF, for
       instance).

--quiet
       Suppress verbose header (used in regression testing).
SEE ALSO
afetch(1), alistat(1), compstruct(1), revcomp(1), seqsplit(1),
seqstat(1), sfetch(1), shuffle(1), sindex(1), sreformat(1),
stranslate(1), weight(1).
AUTHOR
Sean Eddy
HHMI/Department of Genetics
Washington University School of Medicine
4444 Forest Park Blvd., Box 8510
St Louis, MO 63108 USA
Phone: 1-314-362-7666
FAX:   1-314-362-2157
Email: eddy@genetics.wustl.edu
This manual page was written by Nelson A. de Oliveira <naoliv@gmail.com>,
for the Debian project (but may be used by others).
Mon, 01 Aug 2005 15:28:08 -0300