NAME

       seqstat - show statistics and format for a sequence file

SYNOPSIS

       seqstat [options] seqfile

DESCRIPTION

       seqstat  reads  a  sequence  file  seqfile and shows a number of simple
       statistics about it.

       The printed statistics include the name of the format, the residue type
       of  the first sequence (protein, RNA, or DNA), the number of sequences,
       the total number of residues, and the average and range of the sequence
       lengths.
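
       As an illustration only, the statistics listed above can be sketched
       in Python for a FASTA input. This is not seqstat's implementation:
       the function names (read_fasta, guess_type, summarize) are
       hypothetical, and the residue type guess is a crude alphabet check
       assumed for the example.

```python
def read_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # The name is the first word after '>'; the rest is the description.
            fields = line[1:].split(None, 1)
            name = fields[0] if fields else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def guess_type(seq):
    """Crude alphabet check (an assumption, not seqstat's own rule)."""
    letters = set(seq.upper())
    if letters <= set("ACGTN"):
        return "DNA"
    if letters <= set("ACGUN"):
        return "RNA"
    return "protein"


def summarize(text):
    """Return count, total residues, length range, average, and type."""
    seqs = [seq for _, seq in read_fasta(text)]
    if not seqs:
        raise ValueError("no sequences found")
    lengths = [len(s) for s in seqs]
    return {
        "type": guess_type(seqs[0]),  # type of the first sequence only
        "count": len(seqs),
        "total": sum(lengths),
        "smallest": min(lengths),
        "largest": max(lengths),
        "average": sum(lengths) / len(lengths),
    }


if __name__ == "__main__":
    example = ">seq1 first\nACGT\nACGT\n>seq2 second\nACG\n"
    for key, value in summarize(example).items():
        print(f"{key}: {value}")
```

       As in the description above, only the first sequence is used to
       decide the residue type, so a file that mixes types would be reported
       by its first record.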

OPTIONS

       -a     Show additional verbose information: a table with one line per
              sequence showing its name, length, and description line. These
              lines are prefixed with a * character so that they can easily
              be extracted with grep and sorted.

       -h     Print brief help; includes version number  and  summary  of  all
              options, including expert options.

       -B     (Babelfish).  Autodetect  and  read a sequence file format other
              than the default (FASTA). Almost any common sequence file format
              is recognized (including Genbank, EMBL, SWISS-PROT, PIR, and GCG
              unaligned sequence formats, and Stockholm, GCG MSF, and  Clustal
              alignment formats). See the printed documentation for a complete
              list of supported formats.
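
       The * prefix written by -a makes the per-sequence table easy to
       separate from the rest of the output. A minimal Python sketch follows,
       assuming the fields after the * are whitespace-separated in the order
       name, length, description. That layout is an assumption, the function
       name parse_table is hypothetical, and the sample text is made up for
       the example rather than captured from seqstat.

```python
def parse_table(output):
    """Collect the '*'-prefixed lines and sort them by length, longest first.

    Assumes each table line looks like '* name length description', with
    whitespace between fields (an assumption about the layout).
    """
    rows = []
    for line in output.splitlines():
        if not line.startswith("*"):
            continue  # skip header and summary lines
        fields = line[1:].split(None, 2)
        if len(fields) < 2 or not fields[1].isdigit():
            continue  # not a table line in the assumed layout
        name, length = fields[0], int(fields[1])
        description = fields[2].strip() if len(fields) > 2 else ""
        rows.append((name, length, description))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Made-up sample in the assumed layout, not real seqstat output.
    sample = "Format: FASTA\n* seq1 8 first\n* seq2 12 second one\n* seq3 3\n"
    for name, length, description in parse_table(sample):
        print(name, length, description)
```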

EXPERT OPTIONS

       --informat <s>
              Specify that the sequence file is in format <s>, rather than the
              default  FASTA  format.   Common examples include Genbank, EMBL,
              GCG, PIR, Stockholm, Clustal, MSF, or PHYLIP;  see  the  printed
              documentation  for  a  complete  list  of accepted format names.
              This option overrides the default expected  format  (FASTA)  and
              the -B Babelfish autodetection option.

       --quiet
              Suppress  the  verbose  header (program name, release number and
              date, the parameters and options in effect).

SEE ALSO

       afetch(1),   alistat(1),   compalign(1),   compstruct(1),   revcomp(1),
       seqsplit(1),    sfetch(1),    shuffle(1),    sindex(1),   sreformat(1),
       stranslate(1), weight(1).

AUTHOR

       Biosquid and its documentation are Copyright (C) 1992-2003
       HHMI/Washington University School of Medicine. Freely distributed
       under the GNU General Public License (GPL). See COPYING in the source
       code distribution for more details, or contact me.

       Sean Eddy
       HHMI/Department of Genetics
       Washington University School of Medicine
       4444 Forest Park Blvd., Box 8510
       St Louis, MO 63108 USA
       Phone: 1-314-362-7666
       FAX  : 1-314-362-2157
       Email: eddy@genetics.wustl.edu