RNAhybrid - calculate secondary structure hybridisations of RNAs

NAME

       RNAhybrid - calculate secondary structure hybridisations of RNAs

SYNOPSIS

       RNAhybrid  [-h]  [-b hit_number] [-e energy_cutoff] [-p p-value_cutoff]
       [-c] [-d xi,theta] [-s set_name] [-f  from,to]  [-m  max_target_length]
       [-n max_query_length] [-u iloop_upper_limit] [-v bloop_upper_limit] [-g
       (ps|png|jpg|all)] [-t target_file] [-q query_file] [target] [query]

DESCRIPTION

       RNAhybrid  is  a  tool  for   finding   minimum   free   energy   (mfe)
       hybridisations  of  a  long  (target)  and  a  short  (query)  RNA. The
       hybridisation is performed in a kind of domain  mode,  i.e.  the  short
       sequence  is  hybridised to the best fitting parts of the long one. The
       tool is primarily meant as a means for microRNA target prediction.   In
       addition  to  mfes,  the  program  calculates p-values based on extreme
       value distributions of length normalised energies.

OPTIONS

       -h     Give a short summary of command line options.

       -b hit_number
              Maximal number of hits to show. Up to hit_number hits are shown
              in order of increasing minimum free energy (reminder: larger
              energies are worse). Output stops earlier if the -e option is
              used and the energy cut-off has been exceeded (see -e option
              below), or if there are no more hits. Hits may only overlap at
              dangling bases (5’ or 3’ unpaired end of target).

       -c     Produce compact output. For each target/query pair, one line of
              output is generated. Each line is a colon (:) separated list of
              the following fields: target name, query name, minimum free
              energy, position in target, alignment line 1, line 2, line 3,
              line 4. If a target or a query is given on the command line
              (i.e. without -t or -q, respectively), its name in the output
              will be "command line".
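
              As an illustration of the field list above, the following
              Python sketch splits one compact line into named fields. It
              assumes exactly the eight fields listed here and names without
              colons; if the installed version emits a different field
              layout, the field list needs adjusting. CompactHit and
              parse_compact_line are hypothetical names, and the example line
              is made up, not real RNAhybrid output.

```python
from typing import NamedTuple, Tuple


class CompactHit(NamedTuple):
    """One line of compact output, following the field list in this page."""
    target_name: str
    query_name: str
    mfe: float
    position: int
    alignment: Tuple[str, ...]  # the four alignment lines


def parse_compact_line(line: str) -> CompactHit:
    """Split a colon-separated compact line into its eight fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 8:
        raise ValueError(f"expected 8 colon-separated fields, got {len(fields)}")
    target, query, mfe, position = fields[:4]
    return CompactHit(target, query, float(mfe), int(position), tuple(fields[4:]))


# Made-up example line for illustration only.
example = "command line:command line:-25.3:17: A   U :  GCG  : CGC  :U    A"
hit = parse_compact_line(example)
print(hit.target_name, hit.mfe, hit.position)
```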

       -d xi,theta
              xi and theta are the location and scale parameters,
              respectively, of an extreme value distribution (evd). Length
              normalised duplex energies are assumed to be distributed
              according to such an evd, and p-values are derived from it.
              For a length normalised energy en, we have
              P[X <= en] = 1 - exp(-exp(-(-en-xi)/theta)), where
              en = e/log(m*n) with m and n being the lengths of the target
              and the query, respectively. If the -d option is omitted, xi
              and theta are estimated from the maximal duplex energy of the
              query, assuming a linear dependence. The parameters of this
              linear dependence are coded into the program, but the option -s
              has to be given to choose the appropriate set. Note that the
              evd is mirrored, since good mfes are large negative values.
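
              The formula above can be evaluated directly. The Python sketch
              below is not taken from the RNAhybrid source; it assumes that
              log denotes the natural logarithm, and the xi and theta values
              in the usage lines are arbitrary placeholders rather than
              parameters of any shipped set.

```python
import math


def length_normalised_energy(e: float, m: int, n: int) -> float:
    """en = e / log(m*n); natural logarithm is an assumption of this sketch."""
    return e / math.log(m * n)


def evd_p_value(e: float, m: int, n: int, xi: float, theta: float) -> float:
    """P[X <= en] = 1 - exp(-exp(-(-en - xi) / theta)), as stated above."""
    en = length_normalised_energy(e, m, n)
    return 1.0 - math.exp(-math.exp(-(-en - xi) / theta))


# Placeholder parameters, chosen only to exercise the formula.
xi, theta = 2.0, 0.5
for energy in (-20.0, -30.0):
    print(energy, evd_p_value(energy, m=2000, n=22, xi=xi, theta=theta))
```

              Because the distribution is mirrored, a more negative energy
              yields a smaller p-value for fixed m, n, xi and theta.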

       -s set_name
              Used for a quick estimate of extreme value distribution
              parameters (see -d option above). Tells RNAhybrid which target
              dataset to assume. Valid values are 3utr_fly, 3utr_worm and
              3utr_human.

       -e energy_cutoff
              Only hits with a minimum free energy less than or equal to
              energy_cutoff are shown, in order of increasing energy
              (reminder: larger energies are worse). Output stops earlier if
              the -b option is used and the number of already reported hits
              has reached the maximal hit_number (see -b option above). Hits
              may only overlap at dangling bases (5’ or 3’ unpaired end of
              target).
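
              The interplay of -b and -e can be modelled as a simple stopping
              rule. The Python sketch below is a model of the behaviour
              described in this page, not the program's implementation: it
              ignores the overlap restriction, and it treats an omitted option
              as "no limit", which is an assumption of the sketch. select_hits
              is a hypothetical name.

```python
from typing import Iterable, List, Optional


def select_hits(energies: Iterable[float],
                hit_number: Optional[int] = None,
                energy_cutoff: Optional[float] = None) -> List[float]:
    """Return the energies that would be reported, best (lowest) first."""
    reported: List[float] = []
    for energy in sorted(energies):
        # Stop once the -b limit has been reached.
        if hit_number is not None and len(reported) >= hit_number:
            break
        # Stop once the -e cut-off has been exceeded.
        if energy_cutoff is not None and energy > energy_cutoff:
            break
        reported.append(energy)
    return reported


candidates = [-12.4, -27.9, -18.3, -21.0]
print(select_hits(candidates, hit_number=3, energy_cutoff=-20.0))
```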

       -p p-value_cutoff
              Only hits with  p-values  not  larger  than  p-value_cutoff  are
              reported.  See also options -d and -s.

       -f from,to
              Forces  all  structures  to  have  a helix from position from to
              position to with respect  to  the  query.  The  first  base  has
              position 1.

       -m max_target_length
              The  maximum  allowed  length  of a target sequence. The default
              value is 2000. This option only has an effect if a  target  file
              is given with the -t option (see below).

       -n max_query_length
              The  maximum  allowed  length  of  a query sequence. The default
              value is 30. This option only has an effect if a query  file  is
              given with the -q option (see below).

       -u iloop_upper_limit
              The maximum allowed number of unpaired nucleotides on either
              side of an internal loop.

       -v bloop_upper_limit
              The maximum allowed number of unpaired nucleotides in a bulge
              loop.

       -g (ps|png|jpg|all)
              Produce a plot of the hybridisation, either in ps, png or jpg
              format, or in all formats together. The plots are saved in
              files whose names are created from the target and query names
              ("command_line" if given on the command line). This option only
              works if the appropriate graphics libraries are present.

       -t target_file
              Each of the target sequences in target_file is hybridised with
              each of the queries (which are either read from the query_file
              or consist of the single query given on the command line; see
              -q below). The sequences in the target_file have to be in FASTA
              format, i.e. one line starting with a > directly followed by a
              name, then one or more lines with the sequence itself. Each
              individual sequence line must not have more than 1000
              characters. If no -t is given, either the last argument (if a
              -q is given) or the second to last argument (if no -q is given)
              to RNAhybrid is taken as the target.
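
              A target file that respects the line-length limit can be
              produced by wrapping sequences at a fixed width. The Python
              sketch below is one way to do this; write_fasta, the record
              name and the sequence are hypothetical, and the default width
              of 60 is a common convention rather than an RNAhybrid
              requirement.

```python
import io
from typing import Iterable, TextIO, Tuple

MAX_LINE = 1000  # per-line limit stated in this page


def write_fasta(records: Iterable[Tuple[str, str]], handle: TextIO,
                width: int = 60) -> None:
    """Write (name, sequence) pairs as FASTA, wrapping sequence lines."""
    if not 0 < width <= MAX_LINE:
        raise ValueError(f"width must be between 1 and {MAX_LINE}")
    for name, sequence in records:
        handle.write(f">{name}\n")  # name directly follows the '>'
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")


buffer = io.StringIO()
write_fasta([("utr_example", "ACGU" * 40)], buffer)
print(buffer.getvalue(), end="")
```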

       -q query_file
              See -t option above.
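
              The rules for positional arguments can be summarised in a small
              helper that assembles an argument list. The Python sketch below
              only builds the list and does not run the program; build_argv
              and the file name in the usage line are hypothetical. The
              ordering (target before query) follows the SYNOPSIS and the -t
              description.

```python
from typing import List, Optional, Sequence


def build_argv(target: Optional[str] = None, query: Optional[str] = None,
               target_file: Optional[str] = None,
               query_file: Optional[str] = None,
               extra: Sequence[str] = ()) -> List[str]:
    """Assemble an RNAhybrid argument list from files or literal sequences."""
    if (target is None) == (target_file is None):
        raise ValueError("give exactly one of target or target_file")
    if (query is None) == (query_file is None):
        raise ValueError("give exactly one of query or query_file")
    argv = ["RNAhybrid", *extra]
    if target_file is not None:
        argv += ["-t", target_file]
    if query_file is not None:
        argv += ["-q", query_file]
    # Literal sequences are positional: target first, then query.
    if target is not None:
        argv.append(target)
    if query is not None:
        argv.append(query)
    return argv


print(build_argv(target_file="targets.fasta", query="UGAGGUAGUAGGUUGUAUAGUU",
                 extra=["-s", "3utr_human"]))
```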

REFERENCES

       The energy parameters are taken from:

       Mathews  DH,  Sabina  J,  Zuker  M,  Turner  DH.   "Expanded   sequence
       dependence  of  thermodynamic  parameters  improves  prediction  of RNA
       secondary structure" J Mol Biol., 288 (5), pp 911-940, 1999

       The graphical output uses code from the Vienna RNA package:

       Hofacker IL.  "Vienna RNA secondary structure server."   Nucleic  Acids
       Research, 31 (13), pp 3429-3431, 2003

VERSION

       This man page documents version 2.0 of RNAhybrid.

AUTHORS

       Marc Rehmsmeier, Peter Steffen, Matthias Hoechsmann.

LIMITATIONS

       Character  dependent  energy  values are only defined for [acgtuACGTU].
       All other characters lead to values of zero in these cases.

BUGS

       In suboptimal hits, dangling ends appear as Ns  if  they  were  in  the
       first or last hybridising position of a previous hit.
