NAME
HMMER - profile hidden Markov model software
SYNOPSIS
hmmalign
Align multiple sequences to a profile HMM.
hmmbuild
Build a profile HMM from a given multiple sequence alignment.
hmmcalibrate
Determine appropriate statistical significance parameters for a
profile HMM prior to doing database searches.
hmmconvert
Convert HMMER profile HMMs to other formats, such as GCG
profiles.
hmmemit
Generate sequences probabilistically from a profile HMM.
hmmfetch
Retrieve an HMM from an HMM database.
hmmindex
Create a binary SSI index for an HMM database.
hmmpfam
Search a profile HMM database with a sequence (i.e., annotate
various kinds of domains in the query sequence).
hmmsearch
Search a sequence database with a profile HMM (i.e., find
additional homologues of a modeled family).
DESCRIPTION
These programs use profile hidden Markov models (profile HMMs) to model
the primary structure consensus of a family of protein or nucleic acid
sequences.
OPTIONS
All HMMER programs give a brief summary of their command-line syntax
and options if invoked without any arguments. When invoked with the
single argument -h (i.e., help), a program reports more verbose
command-line usage information, including rarely used, experimental,
and expert options. The -h output also includes version numbers, which
are useful if you need to report a bug or problem to me.
Each HMMER program has its own man page briefly summarizing its
command-line usage. The User’s Guide that comes with the software
distribution includes a tutorial introduction and more detailed
descriptions of the programs.
See http://hmmer.wustl.edu/ for on-line documentation and the current
HMMER release.
In general, beginning users should not need any command-line options.
The defaults are set up for optimum performance in most situations.
Options that are single lowercase letters (e.g., -a) are "common"
options that are expected to be frequently used and will be important
in many applications. Options that are single uppercase letters
(e.g., -B) are usually less common, but may also be important in some
applications. Options that are full words (e.g., --verbose) are rarely
used, experimental, or expert options. Some experimental options are
only there for my own ongoing experiments with HMMER, and may not be
supported or documented adequately.
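The naming convention above can be illustrated with a small Python
sketch. It is not part of HMMER; the function name classify_option and
the category labels are hypothetical, chosen only to mirror the three
classes of options described in this section.

```python
def classify_option(opt: str) -> str:
    """Classify an option string by the naming convention described above.

    Hypothetical helper for illustration; HMMER itself has no such function.
    """
    if opt.startswith("--") and len(opt) > 2:
        # Full-word option: rarely used, experimental, or expert.
        return "expert"
    if len(opt) == 2 and opt[0] == "-" and opt[1].isalpha():
        # Single letter: lowercase is "common", uppercase is less common.
        return "common" if opt[1].islower() else "less common"
    return "unknown"


for opt in ("-a", "-B", "--verbose"):
    print(opt, "->", classify_option(opt))
```

The convention only describes how often an option is expected to be
used; what a given letter means still differs from program to program,
so check each program's own man page.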
SEQUENCE FILE FORMATS
In general, HMMER attempts to read most common biological sequence file
formats. It autodetects the format of the file. It also autodetects
whether the sequences are protein or nucleic acid. Standard IUPAC
degeneracy codes are allowed in addition to the usual 4-letter or
20-letter codes.
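To give a feel for what alphabet autodetection involves, here is a
minimal Python sketch of one possible heuristic. It is an assumption
made for illustration only and is not the method HMMER uses: the
function name guess_alphabet, the set of letters counted as nucleic,
and the 0.9 threshold are all hypothetical.

```python
# Letters treated as typical of nucleic acid sequences (an assumption).
NUCLEIC_LETTERS = set("ACGTUN")


def guess_alphabet(seq: str, threshold: float = 0.9) -> str:
    """Guess 'nucleic' or 'protein' from the fraction of nucleic letters.

    Illustrative heuristic only. Sequences rich in other IUPAC degeneracy
    codes (e.g. R, Y) may be misclassified by this simple rule.
    """
    letters = [c for c in seq.upper() if c.isalpha()]
    if not letters:
        return "unknown"
    fraction = sum(c in NUCLEIC_LETTERS for c in letters) / len(letters)
    return "nucleic" if fraction >= threshold else "protein"


print(guess_alphabet("ACGTACGTNNACGU"))
print(guess_alphabet("MKVLAAGIVGLLLAQ"))
```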
Unaligned sequences
Unaligned sequence files may be in FASTA, Swissprot, EMBL,
GenBank, PIR, Intelligenetics, Strider, or GCG format. These
formats are documented in the User’s Guide.
Sequence alignments
Multiple sequence alignments may be in CLUSTALW, SELEX, or GCG
MSF format. These formats are documented in the User’s Guide.
ENVIRONMENT VARIABLES
To make large, stable sequence and HMM databases easier to use, HMMER
looks for sequence files and HMM files in the current working directory
as well as in system directories specified by environment variables.
BLASTDB
Specifies the directory location of sequence databases. Example:
/seqlibs/blast-db/. In installations that use BLAST software,
this environment variable is likely to already be set.
HMMERDB
Specifies the directory location of HMM databases. Example:
/seqlibs/pfam/.
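The lookup described in this section can be sketched in Python as
follows. This is an illustration under stated assumptions, not HMMER's
actual code: the function name find_database is hypothetical, the
current working directory is assumed to be tried first, and each
environment variable is assumed to name a single directory.

```python
import os
from pathlib import Path
from typing import Optional


def find_database(name: str, env_var: str) -> Optional[Path]:
    """Return the first existing file called `name`, or None.

    Assumed search order: the current working directory, then the
    directory named by `env_var` (e.g. BLASTDB or HMMERDB) if it is set.
    """
    candidates = [Path.cwd() / name]
    env_dir = os.environ.get(env_var)
    if env_dir:
        candidates.append(Path(env_dir) / name)
    for path in candidates:
        if path.is_file():
            return path
    return None


# "Pfam" is a hypothetical HMM database file name used for illustration.
print(find_database("Pfam", "HMMERDB"))
```

With HMMERDB set to /seqlibs/pfam/, this sketch would resolve the bare
name Pfam to /seqlibs/pfam/Pfam when no file of that name exists in the
current working directory.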