readseq - Reads and writes nucleic/protein sequences in various formats

NAME

       readseq - Reads and writes nucleic/protein sequences in various formats

SYNOPSIS

       readseq [-options] in.seq > out.seq

DESCRIPTION

       This manual page briefly documents the readseq command. It was written
       for the Debian GNU/Linux distribution because the original program does
       not have a manual page. Instead, it has documentation in text form; see
       below.

       readseq  reads  and  writes  biosequences  (nucleic/protein) in various
       formats.   Data  files  may  have  multiple  sequences.    readseq   is
       particularly  useful as it automatically detects many sequence formats,
       and interconverts among them.

FORMATS

       Formats which readseq currently understands:

         * IG/Stanford, used by Intelligenetics and others
         * GenBank/GB, genbank flatfile format
         * NBRF format
         * EMBL, EMBL flatfile format
         * GCG, single sequence format of GCG software
         * DNAStrider, for a common Mac program
         * Fitch format, limited use
         * Pearson/Fasta, a common format used by Fasta programs and others
         * Zuker format, limited use. Input only.
         * Olsen, format printed by Olsen VMS sequence editor. Input only.
         * Phylip3.2, sequential format for Phylip programs
         * Phylip, interleaved format for Phylip programs (v3.3, v3.4)
         * Plain/Raw, sequence data only (no name, document, numbering)
         + MSF multi sequence format used by GCG software
         + PAUP’s multiple sequence (NEXUS) format
         + PIR/CODATA format used by PIR
         + ASN.1 format used by NCBI
         + Pretty print with various options for nice looking output. Output
           only.
         + LinAll format, limited use (LinAll and ConStruct programs)
         + Vienna format used by ViennaRNA programs

       See the included "Formats" file for detail on file formats.

OPTIONS

       -help  Show summary of options.

       -a[ll] Select All sequences

       -c[aselower]
              Change to lower case

       -C[ASEUPPER]
              Change to UPPER CASE

       -degap[=-]
              Remove gap symbols

       -i[tem=2,3,4]
              Select Item number(s) from several

       -l[ist]
              List sequences only

       -o[utput=]out.seq
              Redirect Output

       -p[ipe]
              Pipe (command line, <stdin, >stdout)

       -r[everse]
              Change to Reverse-complement

       -v[erbose]
              Verbose progress

       -f[ormat=]#
              Format number for output, or

       -f[ormat=]Name
              Format name for output:

                  1. IG/Stanford           11. Phylip3.2
                  2. GenBank/GB            12. Phylip
                  3. NBRF                  13. Plain/Raw
                  4. EMBL                  14. PIR/CODATA
                  5. GCG                   15. MSF
                  6. DNAStrider            16. ASN.1
                  7. Fitch                 17. PAUP/NEXUS
                  8. Pearson/Fasta         18. Pretty (out-only)
                  9. Zuker (in-only)       19. LinAll
                 10. Olsen (in-only)       20. Vienna
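
       As an illustration of the table above, the following sketch maps a
       format name to the number accepted by -format=#. The function name
       format_number is hypothetical and is not part of readseq; it only
       restates the table in shell form.

```shell
#!/bin/sh
# Hypothetical helper (not part of readseq): look up the output format
# number for a format name, exactly as listed in the table above.
format_number() {
    case "$1" in
        IG/Stanford)   echo 1 ;;
        GenBank/GB)    echo 2 ;;
        NBRF)          echo 3 ;;
        EMBL)          echo 4 ;;
        GCG)           echo 5 ;;
        DNAStrider)    echo 6 ;;
        Fitch)         echo 7 ;;
        Pearson/Fasta) echo 8 ;;
        Zuker)         echo 9 ;;   # input only
        Olsen)         echo 10 ;;  # input only
        Phylip3.2)     echo 11 ;;
        Phylip)        echo 12 ;;
        Plain/Raw)     echo 13 ;;
        PIR/CODATA)    echo 14 ;;
        MSF)           echo 15 ;;
        ASN.1)         echo 16 ;;
        PAUP/NEXUS)    echo 17 ;;
        Pretty)        echo 18 ;;  # output only
        LinAll)        echo 19 ;;
        Vienna)        echo 20 ;;
        *) echo "unknown format: $1" >&2; return 1 ;;
    esac
}

# Print the option that selects Pearson/Fasta output by number.
echo "-format=$(format_number Pearson/Fasta)"
```

       The printed option could then be passed to readseq in place of a
       format name.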

       Pretty format options:

       -wid[th]=#
              Sequence line width

       -tab=# Left indent

       -col[space]=#
              Column space within sequence line on output

       -gap[count]
              Count gap chars in sequence numbers

       -nameleft, -nameright[=#]
              Name on left/right side [=max width]

       -nametop
              Name at top/bottom

       -numleft, -numright
              Seq index on left/right side

       -numtop, -numbot
              Index on top/bottom

       -match[=.]
              Use match base for 2..n species

       -inter[line=#]
              Blank line(s) between sequence blocks
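
       The pretty format options can be combined on one command line. The
       sketch below wraps such a call in a shell function. The function
       name pretty_print, the READSEQ variable and the default width of 60
       are assumptions made for this illustration; only the readseq
       options themselves come from the list above.

```shell
#!/bin/sh
# READSEQ is an assumption of this sketch: it allows the command to be
# replaced (for example by "echo") to inspect the arguments without
# running readseq itself.
READSEQ="${READSEQ:-readseq}"

# Hypothetical wrapper: pretty-print all sequences in a file with a given
# sequence line width, names on the left and sequence index on the right.
# $1 = input file, $2 = line width (defaults to 60, an arbitrary choice).
pretty_print() {
    file="$1"
    width="${2:-60}"
    "$READSEQ" "$file" -all -format=pretty -width="$width" -nameleft -numright
}
```

       For example, "pretty_print my.seq 50" would run readseq on my.seq
       with -width=50 and write the result to standard output.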

EXAMPLES

         readseq
                    -- for interactive use

         readseq my.1st.seq my.2nd.seq -all -format=genbank -output=my.gb
                    -- convert all of two input files to one genbank format
                       output file

         readseq my.seq -all -form=pretty -nameleft=3 -numleft -numright \
                 -numtop -match
                    -- output to standard output a file in a pretty format

         readseq my.seq -item=9,8,3,2 -degap -CASE -rev -f=msf -out=my.rev
                    -- select 4 items from input, degap, reverse, and
                       uppercase them

         cat *.seq | readseq -pipe -all -format=asn > bunch-of.asn
                    -- pipe a bunch of data through readseq, converting all
                       to asn
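
       When each input file should produce its own output file, the
       conversion can be run in a loop. The following is a minimal sketch
       that uses only options shown in the examples above. The function
       name convert_to_genbank, the READSEQ variable and the .gb output
       suffix are assumptions made for this illustration.

```shell
#!/bin/sh
# READSEQ is an assumption of this sketch: override it (for example with
# "echo") to preview the commands without running readseq.
READSEQ="${READSEQ:-readseq}"

# Hypothetical helper: convert every sequence in each given file to
# GenBank format, writing NAME.gb for an input named NAME.seq.
# Input names are assumed to end in a single extension such as ".seq".
convert_to_genbank() {
    for in_file in "$@"; do
        out_file="${in_file%.*}.gb"   # strip the last extension, add .gb
        "$READSEQ" "$in_file" -all -format=genbank -output="$out_file" || return 1
    done
}
```

       A call such as "convert_to_genbank my.1st.seq my.2nd.seq" would then
       write my.1st.gb and my.2nd.gb, stopping at the first file for which
       readseq reports failure.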

AUTHOR

       This   manual    page    was    written    by    Stephane    Bortzmeyer
       <bortzmeyer@debian.org>,  for  the  Debian GNU/Linux system (but may be
       used by others).
