NAME

       hmmsearch - search a sequence database with a profile HMM

SYNOPSIS

       hmmsearch [options] hmmfile seqfile

DESCRIPTION

       hmmsearch   reads   an  HMM  from  hmmfile  and  searches  seqfile  for
       significantly similar sequence matches.

       seqfile will be looked for first in the current working directory, then
       in  a  directory  named by the environment variable BLASTDB.  This lets
       users use existing BLAST databases, if BLAST has  been  configured  for
       the site.
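
       The lookup order described above can be sketched as follows. This is
       an illustration only, not HMMER's own code: the function name
       resolve_seqfile is hypothetical, and the sketch assumes BLASTDB names
       a single directory.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_seqfile(name: str,
                    cwd: Optional[Path] = None,
                    env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the first existing candidate for `name`, or None.

    Illustrative only: mirrors the documented order (current working
    directory first, then $BLASTDB), not HMMER's actual C code.
    """
    cwd = Path.cwd() if cwd is None else Path(cwd)
    env = os.environ if env is None else env

    candidates = [cwd / name]
    blastdb = env.get("BLASTDB")
    if blastdb:  # assumption: BLASTDB names exactly one directory
        candidates.append(Path(blastdb) / name)

    for path in candidates:
        if path.is_file():
            return path
    return None
```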

       hmmsearch  may take minutes or even hours to run, depending on the size
       of the sequence database. It is a good idea to redirect the output to a
       file.
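
       For scripted runs, one way to capture the output in a file is a small
       wrapper such as the sketch below. The helper names hmmsearch_argv and
       run_to_file are hypothetical, and the sketch assumes hmmsearch is on
       the PATH.

```python
import subprocess
from typing import List, Sequence


def hmmsearch_argv(hmmfile: str, seqfile: str,
                   options: Sequence[str] = ()) -> List[str]:
    """Build the argument vector: hmmsearch [options] hmmfile seqfile."""
    return ["hmmsearch", *options, hmmfile, seqfile]


def run_to_file(argv: Sequence[str], outpath: str) -> None:
    """Run a command and write its standard output to `outpath`.

    check=True raises CalledProcessError if the command exits non-zero.
    """
    with open(outpath, "w") as out:
        subprocess.run(list(argv), stdout=out, check=True)
```

       A call might look like run_to_file(hmmsearch_argv("globins.hmm",
       "proteins.fa"), "globins.out"), where all three file names are
       placeholders.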

       The output consists of four sections: a ranked list of the best scoring
       sequences, a ranked list of the best scoring  domains,  alignments  for
       all  the  best  scoring  domains,  and  a  histogram  of the scores.  A
       sequence score may be higher than a domain score for the same  sequence
       if  there  is  more than one domain in the sequence; the sequence score
       takes into account all the domains.  All sequences that pass the -E
       and -T cutoffs are shown in the first list; every domain found in
       those sequences is then shown in the second list of domain hits.  If
       desired, E-value and bit score thresholds may also be applied to the
       domain list using the --domE and --domT options.
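
       The two-stage reporting described above can be modelled roughly as
       follows. This is a sketch of the documented behaviour, not HMMER's
       implementation: the Domain and SeqHit types and the report function
       are hypothetical, and treating a value exactly at a threshold as
       passing is an assumption.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Domain:
    score: float   # per-domain bit score
    evalue: float  # per-domain E-value


@dataclass
class SeqHit:
    name: str
    score: float   # per-sequence bit score (all domains combined)
    evalue: float  # per-sequence E-value
    domains: List[Domain] = field(default_factory=list)


def report(hits: List[SeqHit],
           E: float = 10.0, T: float = -math.inf,
           domE: float = math.inf, domT: float = -math.inf
           ) -> Tuple[List[SeqHit], List[Tuple[str, Domain]]]:
    """Return (per-sequence list, per-domain list), best score first."""
    # Stage 1: -E and -T decide which sequences are listed.
    # Assumption: values exactly at a threshold pass.
    seqs = [h for h in hits if h.evalue <= E and h.score >= T]
    seqs.sort(key=lambda h: h.score, reverse=True)

    # Stage 2: only domains of listed sequences are considered,
    # then --domE and --domT are applied to each domain.
    doms = [(h.name, d) for h in seqs for d in h.domains
            if d.evalue <= domE and d.score >= domT]
    doms.sort(key=lambda nd: nd[1].score, reverse=True)
    return seqs, doms
```

       With the default thresholds, every domain of a listed sequence
       appears in the second list; raising domT or lowering domE trims
       only the domain list.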

OPTIONS

       -h     Print brief help; includes version number  and  summary  of  all
              options, including expert options.

       -A <n> Limits  the  alignment  output  to the <n> best scoring domains.
              -A0 shuts off the alignment output and can be used to reduce the
              size of output files.

       -E <x> Set  the  E-value cutoff for the per-sequence ranked hit list to
              <x>, where <x> is a positive real number. The default  is  10.0.
              Hits  with  E-values better than (less than) this threshold will
              be shown.

       -T <x> Set the bit score cutoff for the per-sequence ranked hit list to
              <x>,  where  <x>  is  a  real  number.   The default is negative
              infinity; by default, the threshold is controlled by E-value and
              not  by  bit  score.   Hits with bit scores better than (greater
              than) this threshold will be shown.

       -Z <n> Calculate E-values as if the sequence database contained <n>
              sequences. The default is the number of sequences actually
              read from seqfile.
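
       To see what -Z changes, the sketch below rescales an E-value from one
       database size to another. It assumes the E-value is the per-sequence
       P-value multiplied by the number of sequences (E = Z * P); the
       function name rescale_evalue is hypothetical.

```python
def rescale_evalue(evalue: float, z_old: int, z_new: int) -> float:
    """Rescale an E-value from a database of z_old sequences to z_new.

    Assumes E = Z * P, so the underlying P-value is evalue / z_old.
    """
    if z_old <= 0 or z_new <= 0:
        raise ValueError("database sizes must be positive")
    return evalue * z_new / z_old
```

       Under this assumption, a hit with E = 0.5 in a 1,000-sequence
       search corresponds to E = 50 when the search is treated as if it
       covered 100,000 sequences.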

EXPERT OPTIONS

       --compat
              Use the output format  of  HMMER  2.1.1,  the  1998-2001  public
              release; provided so 2.1.1 parsers don’t have to be rewritten.

       --cpu <n>
              Sets  the  maximum  number of CPUs that the program will run on.
              The default is to use all CPUs in  the  machine.  Overrides  the
              HMMER_NCPU  environment variable. Only affects threaded versions
              of HMMER (the default on most systems).

       --cut_ga
              Use Pfam GA (gathering threshold) score cutoffs.  Equivalent  to
              -T <GA1> --domT <GA2>, but the GA1 and GA2 cutoffs are read
              from the HMM file. hmmbuild puts  these  cutoffs  there  if  the
              alignment file was annotated in a Pfam-friendly alignment format
              (extended  SELEX  or  Stockholm  format)  and  the  optional  GA
              annotation line was present. If these cutoffs are not set in the
              HMM file, --cut_ga doesn’t work.

       --cut_tc
              Use Pfam  TC  (trusted  cutoff)  score  cutoffs.  Equivalent  to
              -T <TC1> --domT <TC2>, but the TC1 and TC2 cutoffs are read
              from the HMM file. hmmbuild puts  these  cutoffs  there  if  the
              alignment file was annotated in a Pfam-friendly alignment format
              (extended  SELEX  or  Stockholm  format)  and  the  optional  TC
              annotation line was present. If these cutoffs are not set in the
              HMM file, --cut_tc doesn’t work.

       --cut_nc
              Use Pfam NC (noise cutoff) score cutoffs. Equivalent to -T
              <NC1>  --domT  <NC2>,  but the NC1 and NC2 cutoffs are read from
              the HMM file. hmmbuild puts these cutoffs there if the alignment
              file was annotated in a Pfam-friendly alignment format (extended
              SELEX or Stockholm format) and the optional NC  annotation  line
              was  present.  If  these  cutoffs  are  not set in the HMM file,
              --cut_nc doesn’t work.

       --domE <x>
              Set the E-value cutoff for the per-domain  ranked  hit  list  to
              <x>,  where  <x>  is  a  positive  real  number.  The default is
              infinity; by default, all domains in the sequences  that  passed
              the first threshold will be reported in the second list, so that
              the number of domains  reported  in  the  per-sequence  list  is
              consistent with the number that appear in the per-domain list.

       --domT <x>
              Set  the  bit score cutoff for the per-domain ranked hit list to
              <x>, where <x>  is  a  real  number.  The  default  is  negative
              infinity;  by  default, all domains in the sequences that passed
              the first threshold will be reported in the second list, so that
              the  number  of  domains  reported  in  the per-sequence list is
              consistent with the number that appear in the  per-domain  list.
              Important  note:  only  one  domain  in a sequence is absolutely
              controlled by this parameter, or  by  --domE.   The  second  and
              subsequent  domains  in  a  sequence  have  a de facto bit score
              threshold of 0 because of the details of how HMMER works.  HMMER
              requires  at least one pass through the main model per sequence;
              to do more than one pass (more than one domain) the  multidomain
              alignment  must  have  a  better  score  than  the single domain
              alignment, and hence the extra domains must contribute  positive
              score. See the Users’ Guide for more detail.

       --forward
              Use  the  Forward  algorithm instead of the Viterbi algorithm to
              determine the per-sequence scores. Per-domain scores  are  still
              determined  by  the  Viterbi  algorithm.  Some  have argued that
              Forward is a  more  sensitive  algorithm  for  detecting  remote
              sequence   homologues;   my  experiments  with  HMMER  have  not
              confirmed this, however.

       --informat <s>
              Assert that the input seqfile is  in  format  <s>;  do  not  run
              Babelfish format autodetection. This increases the reliability of
              the program somewhat, because the Babelfish can  make  mistakes;
              particularly recommended for unattended, high-throughput runs of
              HMMER. Valid format strings include FASTA, GENBANK,  EMBL,  GCG,
              PIR,  STOCKHOLM, SELEX, MSF, CLUSTAL, and PHYLIP. See the Users’
              Guide for a complete list.

       --null2
              Turn off the post  hoc  second  null  model.  By  default,  each
              alignment  is  rescored by a postprocessing step that takes into
              account possible biased composition in either  the  HMM  or  the
              target sequence.  This is almost essential in database searches,
              especially with local alignment models. There is  a  very  small
              chance  that  this postprocessing might remove real matches, and
              in these cases --null2 may improve sensitivity at the expense of
              reducing specificity by letting biased composition hits through.

       --pvm  Run on a Parallel Virtual Machine (PVM). The PVM must already be
              running.  The  client program hmmsearch-pvm must be installed on
              all the PVM nodes.  Optional PVM support must have been compiled
              into HMMER.

       --xnu  Turn on XNU filtering of target protein sequences. Has no effect
              on nucleic acid sequences. In trial experiments,  --xnu  appears
              to perform less well than the default post hoc null2 model.
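
       For unattended, high-throughput runs, several of the options above
       are typically combined. The sketch below assembles such a command
       line using only options documented on this page. The function name
       unattended_argv and its keyword names are hypothetical, and the
       format string is passed through unchecked because the list of valid
       formats given above is not exhaustive.

```python
from typing import List, Optional


def unattended_argv(hmmfile: str, seqfile: str, *,
                    informat: Optional[str] = None,
                    cpus: Optional[int] = None,
                    cutoff: Optional[str] = None,
                    alignments: Optional[int] = None) -> List[str]:
    """Assemble hmmsearch arguments from options documented above."""
    argv = ["hmmsearch"]
    if informat is not None:
        argv += ["--informat", informat]   # skip format autodetection
    if cpus is not None:
        argv += ["--cpu", str(cpus)]       # cap the number of CPUs used
    if cutoff is not None:
        if cutoff not in ("ga", "tc", "nc"):
            raise ValueError("cutoff must be 'ga', 'tc', or 'nc'")
        argv.append(f"--cut_{cutoff}")     # Pfam cutoffs stored in the HMM file
    if alignments is not None:
        argv += ["-A", str(alignments)]    # 0 suppresses alignment output
    return argv + [hmmfile, seqfile]
```

       The returned list can be handed to any process launcher; nothing
       here checks that the HMM file actually carries the requested Pfam
       cutoffs.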

SEE ALSO

       Master  man  page,  with  full  list of and guide to the individual man
       pages: see hmmer(1).

       For complete documentation, see the  user  guide  that  came  with  the
       distribution    (Userguide.pdf);   or   see   the   HMMER   web   page,
       http://hmmer.wustl.edu/.

COPYRIGHT

       Copyright (C) 1992-2003 HHMI/Washington University School of Medicine.
       Freely distributed under the GNU General Public License (GPL).
       See the file COPYING in your distribution for details on redistribution
       conditions.

AUTHOR

       Sean Eddy
       HHMI/Dept. of Genetics
       Washington Univ. School of Medicine
       4566 Scott Ave.
       St Louis, MO 63110 USA
       http://www.genetics.wustl.edu/eddy/