NAME
RNAcalibrate - calibrate statistics of secondary structure
hybridisations of RNAs
SYNOPSIS
RNAcalibrate [-h] [-d frequency_file] [-f from,to] [-k sample_size] [-l
mean,std] [-m max_target_length] [-n max_query_length] [-u
iloop_upper_limit] [-v bloop_upper_limit] [-s] [-t target_file] [-q
query_file] [target] [query]
DESCRIPTION
RNAcalibrate is a tool for calibrating minimum free energy (mfe)
hybridisations performed with RNAhybrid. It searches a random database
that can be given on the command line or otherwise generates random
sequences according to given sample size, length distribution
parameters and dinucleotide frequencies. Parameters of an extreme
value distribution (evd) are then fitted to the empirical distribution
of length-normalised minimum free energies. For each miRNA, the output
gives its name (or "command_line" if it was submitted on the command
line), the number of data points used for the evd fit, and the
location and scale parameters. These location and scale parameters can
then be given to RNAhybrid for the calculation of mfe p-values.
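As a rough illustration of what fitting an evd and deriving a p-value
can look like, the sketch below estimates location and scale of a
Gumbel (maximum) distribution by the method of moments and evaluates
the upper tail probability. This is not taken from the RNAcalibrate
sources: the Gumbel form, the method-of-moments estimator, the
convention that larger values are more extreme (for example negated
normalised energies), and the function names fit_gumbel_moments and
gumbel_pvalue are all assumptions made for this example.

```python
import math
import statistics

# Euler-Mascheroni constant, used in the mean of a Gumbel distribution.
EULER_GAMMA = 0.5772156649015329


def fit_gumbel_moments(values):
    """Estimate (location, scale) of a Gumbel (max) distribution.

    Method of moments: variance = (pi * scale)^2 / 6 and
    mean = location + gamma * scale. This estimator is an assumption
    for illustration, not necessarily what RNAcalibrate does.
    """
    mean = statistics.fmean(values)
    std = statistics.stdev(values)
    scale = std * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def gumbel_pvalue(x, location, scale):
    """Return P(X >= x) for a Gumbel (max) variate X."""
    z = (x - location) / scale
    # Clamp the exponent so math.exp cannot overflow for very small x.
    inner = math.exp(min(-z, 700.0))
    # 1 - exp(-inner), computed with expm1 for accuracy near zero.
    return -math.expm1(-inner)
```

With values collected from a random database, fit_gumbel_moments(values)
would yield a (location, scale) pair, and gumbel_pvalue(x, location,
scale) the probability of observing a value of at least x by chance
under the fitted distribution.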
OPTIONS
-h Give a short summary of command line options.
-d frequency_file
Generate random sequences according to dinucleotide frequencies
given in frequency_file. See example directory for example
files.
-f from,to
Force all structures to contain a helix from query position from
to query position to. Positions are counted from 1 at the first
base.
-k sample_size
Generate sample_size random sequences. Default value is 5000.
-l mean,std
Generate random sequences whose lengths are normally distributed
with mean mean and standard deviation std. Default values are 500
and 300, respectively.
-m max_target_length
The maximum allowed length of a target sequence. The default
value is 2000. This option only has an effect if a target file
is given with the -t option (see below).
-n max_query_length
The maximum allowed length of a query sequence. The default
value is 30. This option only has an effect if a query file is
given with the -q option (see below).
-u iloop_upper_limit
The maximum allowed number of unpaired nucleotides on either
side of an internal loop.
-v bloop_upper_limit
The maximum allowed number of unpaired nucleotides in a bulge
loop.
-s Generate random sequences according to the dinucleotide
distribution of the given targets (given either with the -t
option or on the command line). If no -t is given, either the
last argument (if a -q is given) or the second last argument (if
no -q is given) to RNAcalibrate is taken as a target. See -t
option.
-t target_file
Without the -s option, each of the target sequences in
target_file is subject to hybridisation with each of the queries
(which are either read from the query_file or consist of the
single query given on the command line; see -q below). The
sequences in the target_file have to be in FASTA format, i.e. one
line starting with a > directly followed by a name, then one or
more lines with the sequence itself. No individual sequence line
may have more than 1000 characters.
With the -s option, the target (or target file) dinucleotide
distribution is counted, and random sequences are generated
according to this distribution.
If no -t is given, random sequences are generated as described
above (see -d option).
-q query_file
See -t option above. If no -q is given, the last argument to
RNAcalibrate is taken as a query.
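The random sequence generation controlled by the -k, -l and -s options
can be pictured with the following minimal sketch. It counts
dinucleotides in a target, draws lengths from a normal distribution
and emits sequences from a first-order Markov chain. The Markov chain
scheme, the pseudocount of 1, the resampling of non-positive lengths
and all function names are assumptions for illustration; this page
does not document how RNAcalibrate implements these steps. Only the
default values (5000, 500, 300) are taken from the option descriptions
above.

```python
import random
from collections import Counter

ALPHABET = "ACGU"


def dinucleotide_counts(seq):
    """Count overlapping dinucleotides, treating T as U."""
    seq = seq.upper().replace("T", "U")
    return Counter(
        a + b for a, b in zip(seq, seq[1:])
        if a in ALPHABET and b in ALPHABET
    )


def sample_length(rng, mean=500.0, std=300.0):
    """Draw a length from a normal distribution, resampling if < 1.

    Resampling is an assumption; the real handling is not documented.
    """
    while True:
        n = round(rng.gauss(mean, std))
        if n >= 1:
            return n


def random_sequence(rng, counts, length):
    """Generate one sequence from a first-order Markov chain."""
    # The first base is weighted by how often each base starts a
    # dinucleotide; the +1 pseudocount (hypothetical) avoids zero weights.
    first = [sum(counts[a + b] for b in ALPHABET) + 1 for a in ALPHABET]
    seq = rng.choices(ALPHABET, weights=first)
    while len(seq) < length:
        prev = seq[-1]
        weights = [counts[prev + b] + 1 for b in ALPHABET]
        seq.append(rng.choices(ALPHABET, weights=weights)[0])
    return "".join(seq)


def random_database(target, sample_size=5000, mean=500.0, std=300.0, seed=0):
    """Build sample_size random sequences mimicking the target."""
    rng = random.Random(seed)
    counts = dinucleotide_counts(target)
    return [
        random_sequence(rng, counts, sample_length(rng, mean, std))
        for _ in range(sample_size)
    ]
```

Calling random_database(target_sequence) with the defaults corresponds
loosely to running with -s and the default -k and -l values; the seed
parameter exists only to make the sketch reproducible.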
REFERENCES
The energy parameters are taken from:
Mathews DH, Sabina J, Zuker M, Turner DH. "Expanded sequence
dependence of thermodynamic parameters improves prediction of RNA
secondary structure" J Mol Biol., 288 (5), pp 911-940, 1999
VERSION
This man page documents version 2.0 of RNAcalibrate.
AUTHORS
Marc Rehmsmeier, Peter Steffen, Matthias Hoechsmann.
LIMITATIONS
Character-dependent energy values are only defined for [acgtuACGTU].
All other characters lead to values of zero in these cases.
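Because characters outside [acgtuACGTU] silently contribute zero, it
can be useful to check input files beforehand. The sketch below parses
FASTA text as described under the -t option, rejects sequence lines
longer than 1000 characters and reports characters without defined
energy values. It is a standalone helper written for this example, not
part of RNAcalibrate; the names parse_fasta and undefined_characters
are hypothetical.

```python
# Characters for which energy values are defined (see LIMITATIONS).
VALID_CHARS = set("acgtuACGTU")
# Maximum length of a single sequence line (see the -t option).
MAX_LINE_LENGTH = 1000


def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA text."""
    records = []
    name, chunks = None, []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        else:
            if name is None:
                raise ValueError(f"line {number}: sequence before '>' header")
            if len(line) > MAX_LINE_LENGTH:
                raise ValueError(
                    f"line {number}: more than {MAX_LINE_LENGTH} characters"
                )
            chunks.append(line.strip())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def undefined_characters(sequence):
    """Return the sorted characters that have no defined energy values."""
    return sorted(set(sequence) - VALID_CHARS)
```

For each (name, sequence) pair returned by parse_fasta, a non-empty
result from undefined_characters(sequence) flags a sequence whose
hybridisation energies may be affected by the limitation above.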
SEE ALSO
RNAhybrid, RNAeffective