cmemit - generate sequences from a covariance model

NAME

       cmemit - generate sequences from a covariance model

SYNOPSIS

       cmemit [options] cmfile seqfile

DESCRIPTION

       cmemit  reads  the  covariance model(s) (CMs) in cmfile and generates a
       number of sequences from the CM(s); or if the -c  option  is  selected,
       generates  a  single  majority-rule  consensus.  This can be useful for
       various application in  which  one  needs  a  simulation  of  sequences
       consistent  with  a  sequence  family  consensus.  By  default,  cmemit
       generates 10 sequences and outputs them in FASTA (unaligned) format  to
       seqfile.

GENERAL OPTIONS

       -h     Print  brief  help;  includes  version number and summary of all
              options, including expert options.

       -o <f> Save the synthetic sequences to file  <f>  rather  than  writing
              them to stdout.

       -n <n> Generate <n> sequences. Default is 10.

       -u     Write  the generated sequences in unaligned format (FASTA). This
              is the default, so this option is probably useless.

       -a     Write the generated sequences in an aligned  format  (STOCKHOLM)
              with consensus structure annotation rather than FASTA.

       -c     Predict  a  single  majority-rule  consensus sequence instead of
              sampling  sequences  from  the  CM´s  probability  distribution.
              Highly  conserved  residues  (base  paired  residues  that score
              higher than 3.0 bits, or single  stranded  residues  that  score
              higher  than 1.0 bits) are shown in upper case; others are shown
              in lower case.

       -l     Configure the CMs into local mode before emitting sequences. See
              the User’s Guide for more information on locally configured CMs.

       -s <n> Set the random seed to <n>, where <n> is a positive integer. The
              default  is  to use time() to generate a different seed for each
              run, which means that two different runs of cmemit on  the  same
              CM  will  give  different  results.  You  can use this option to
              generate reproducible results.

       --devhelp
              Print help, as with -h , but also include undocumented developer
              options.   These   options  are  not  listed  below,  are  under
              development or experimental, and are not guaranteed to even work
              correctly.  Use  developer  options  at  your own risk. The only
              resources for understanding what they actually do are the  brief
              one-line  description printed when --devhelp is enabled, and the
              source code.

EXPERT OPTIONS

       --rna  Specify that the emitted sequences be output as  RNA  sequences.
              This is true by default.

       --dna  Specify  that  the emitted sequences be output as DNA sequences.
              By default, the output alphabet is RNA.

       --tfile <f>
              Dump tabular sequence parsetrees (tracebacks) for  each  emitted
              sequence to file <f>.  Primarily useful for debugging.

       --exp <x>
              Exponentiate the emission and transition probabilities of the CM
              by <x> and then renormalize those distributions before  emitting
              sequences.  This  option changes the CM probability distribution
              of parsetrees relative to default. With <x> less  than  1.0  the
              emitted  sequences  will  tend  to  have  lower  bit scores upon
              alignment to the CM with cmalign.  With <x>  greater  than  1.0,
              the  emitted  sequences will tend to have higher bit scores upon
              alignment to the CM. This bit score difference will increase  as
              <x>  moves  further  away  from 1.0 in either direction.  If <x>
              equals 1.0, this option has no effect relative to default.  This
              option  is  useful  for  generating  sequences  that  are either
              difficult ( <x> < 1.0) or easy (  <x>  >  1.0)  for  the  CM  to
              distinguish as homologous from background, random sequence.

       --begin <n>
              Truncate the resulting alignment by removing all residues before
              consensus column <n>, where <n> is a positive integer no greater
              than the consensus length of the CM. Must be used in combination
              with --end and either -a or --shmm (a developer option).

       --end <n>
              Truncate the resulting alignment by removing all residues  after
              consensus column <n>, where <n> is a positive integer no greater
              than the consensus length of the CM. Must be used in combination
              with --begin and either -a or --shmm (a developer option).

COPYRIGHT

       Copyright (C) 2009 HHMI Janelia Farm Research Campus.
       Freely distributed under the GNU General Public License (GPLv3).
       See  the  file  COPYING  that  came  with  the  source  for  details on
       redistribution conditions.

AUTHOR

       Eric Nawrocki, Diana Kolbe, and Sean Eddy
       HHMI Janelia Farm Research Campus
       19700 Helix Drive
       Ashburn VA 20147
       http://selab.janelia.org/

NAME

SYNOPSIS

DESCRIPTION

GENERAL OPTIONS

EXPERT OPTIONS

SEE ALSO

COPYRIGHT

AUTHOR