clustalw - Multiple alignment of nucleic acid and protein sequences

NAME

       clustalw - Multiple alignment of nucleic acid and protein sequences

SYNOPSIS

       clustalw [-infile] file.ext [OPTIONS]

       clustalw [-help | -fullhelp]

DESCRIPTION

       Clustal W is a general purpose multiple alignment program for DNA or
       proteins.

       The program performs simultaneous alignment of many nucleotide or amino
       acid sequences. It is typically run interactively, providing a menu and
       online help. To use it in command-line (batch) mode instead, you must
       supply options on the command line, the minimum being -infile.
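
       As a minimal sketch of batch mode, the script below assembles a
       command from options documented in this page and runs it only when
       clustalw is installed and the input exists. The file name
       sequences.fasta is a placeholder, not something the program requires.

```shell
#!/bin/sh
# Placeholder input file (an assumption); replace with your own sequences.
infile="sequences.fasta"

# -infile names the input; -align requests a full multiple alignment.
cmd="clustalw -infile=$infile -align"
echo "$cmd"

# Run only when the program and the input file are both present.
# Note: $cmd is split on whitespace, so file names must not contain spaces.
if command -v clustalw >/dev/null 2>&1 && [ -f "$infile" ]; then
    $cmd
fi
```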

OPTIONS

   DATA (sequences)
       -infile=file.ext
           Input sequences.

       -profile1=file.ext and -profile2=file.ext
           Profiles (old alignments).

   VERBS (do things)
       -options
           List the command line parameters.

       -help or -check
           Outline the command line parameters.

       -fullhelp
           Output full help content.

       -align
           Do full multiple alignment.

       -tree
           Calculate NJ tree.

       -pim
           Output percent identity matrix (while calculating the tree).

       -bootstrap=n
           Bootstrap a NJ tree (n = number of bootstraps; default = 1000).

       -convert
           Output the input sequences in a different file format.

   PARAMETERS (set things)
       General settings:
           -interactive
               Read command line, then enter normal interactive menus.

           -quicktree
               Use FAST algorithm for the alignment guide tree.

           -type=
               PROTEIN or DNA sequences.

           -negative
               Protein alignment with negative values in matrix.

           -outfile=
               Sequence alignment file name.

           -output=
               GCG, GDE, PHYLIP, PIR or NEXUS.

           -outputorder=
               INPUT or ALIGNED.

           -case
               LOWER or UPPER (for GDE output only).

           -seqnos=
               OFF or ON (for Clustal output only).

           -seqnos_range=
               OFF or ON (NEW: for all output formats).

           -range=m,n
               Sequence range to write, starting at m and ending at m+n.

           -maxseqlen=n
               Maximum allowed input sequence length.

           -quiet
               Reduce console output to minimum.

           -stats=file
               Log some alignment statistics to file.
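
       The general settings combine naturally with the -convert verb. The
       sketch below, which assumes an existing alignment named aligned.aln
       and an output name aligned.phy (both placeholders), rewrites the
       input in PHYLIP format while keeping the input sequence order.

```shell
#!/bin/sh
# Placeholder file names (assumptions); adjust to your own data.
infile="aligned.aln"
outfile="aligned.phy"

# -convert writes the input in another format; -output selects the format,
# -outfile names the result, -outputorder=INPUT keeps the original order.
cmd="clustalw -infile=$infile -convert -output=PHYLIP -outfile=$outfile -outputorder=INPUT"
echo "$cmd"

# Run only when the program and the input file are both present.
if command -v clustalw >/dev/null 2>&1 && [ -f "$infile" ]; then
    $cmd
fi
```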

       Fast Pairwise Alignments:
           -ktuple=n
               Word size.

           -topdiags=n
               Number of best diags.

           -window=n
               Window around best diags.

           -pairgap=n
               Gap penalty.

           -score
               PERCENT or ABSOLUTE.

       Slow Pairwise Alignments:
           -pwmatrix=
               Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename.

           -pwdnamatrix=
               DNA weight matrix=IUB, CLUSTALW or filename.

           -pwgapopen=f
               Gap opening penalty.

           -pwgapext=f
               Gap extension penalty.

       Multiple Alignments:
           -newtree=
               File for new guide tree.

           -usetree=
               File for old guide tree.

           -matrix=
               Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename.

           -dnamatrix=
               DNA weight matrix=IUB, CLUSTALW or filename.

           -gapopen=f
               Gap opening penalty.

           -gapext=f
               Gap extension penalty.

           -endgaps
               No end gap separation penalty.

           -gapdist=n
               Gap separation penalty range.

           -nogap
               Residue-specific gaps off.

           -nohgap
               Hydrophilic gaps off.

           -hgapresidues=
               List of hydrophilic residues.

           -maxdiv=n
               Percent identity for delay.

           -type=
               PROTEIN or DNA.

           -transweight=f
               Transitions weighting.

           -iteration=
               NONE or TREE or ALIGNMENT.

           -numiter=n
               Maximum number of iterations to perform.
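
       A guide tree written with -newtree can be reused with -usetree, which
       makes it possible to compare alignment parameters without rebuilding
       the tree. The sketch below assumes a protein input proteins.fasta and
       a tree file guide.dnd (placeholder names); the gap penalties are
       illustrative values, not recommendations.

```shell
#!/bin/sh
# Placeholder file names (assumptions); adjust to your own data.
infile="proteins.fasta"
tree="guide.dnd"

# First run: align with the BLOSUM series and save the guide tree.
first="clustalw -infile=$infile -align -type=PROTEIN -matrix=BLOSUM -gapopen=10 -gapext=0.2 -newtree=$tree"
# Second run: reuse the saved guide tree with a different weight matrix.
second="clustalw -infile=$infile -align -type=PROTEIN -matrix=GONNET -usetree=$tree"
echo "$first"
echo "$second"

# Run only when the program and the input file are both present;
# the second run additionally needs the tree written by the first.
if command -v clustalw >/dev/null 2>&1 && [ -f "$infile" ]; then
    if $first && [ -f "$tree" ]; then
        $second
    fi
fi
```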

       Profile Alignments:
           -profile
               Merge two alignments by profile alignment.

           -newtree1=
               File for new guide tree for profile1.

           -newtree2=
               File for new guide tree for profile2.

           -usetree1=
               File for old guide tree for profile1.

           -usetree2=
               File for old guide tree for profile2.

       Sequence to Profile Alignments:
           -sequences
               Sequentially add profile2 sequences to profile1 alignment.

           -newtree=
               File for new guide tree.

           -usetree=
               File for old guide tree.
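
       The two profile modes differ only in the option that selects them. As
       a sketch, assuming two existing alignments family_a.aln and
       family_b.aln and a set of new sequences new_seqs.fasta (all
       placeholder names), the commands could be assembled as follows.

```shell
#!/bin/sh
# Placeholder file names (assumptions); adjust to your own data.
p1="family_a.aln"
p2="family_b.aln"
newseqs="new_seqs.fasta"

# -profile merges two existing alignments by profile alignment.
merge_cmd="clustalw -profile1=$p1 -profile2=$p2 -profile -outfile=merged.aln"
# -sequences adds the profile2 sequences one by one to the profile1 alignment.
add_cmd="clustalw -profile1=$p1 -profile2=$newseqs -sequences -outfile=extended.aln"
echo "$merge_cmd"
echo "$add_cmd"

# Run each command only when the program and its input files are present.
if command -v clustalw >/dev/null 2>&1; then
    if [ -f "$p1" ] && [ -f "$p2" ]; then
        $merge_cmd
    fi
    if [ -f "$p1" ] && [ -f "$newseqs" ]; then
        $add_cmd
    fi
fi
```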

       Structure Alignments:
           -nosecstr1
               Do not use secondary structure-gap penalty mask for profile 1.

           -nosecstr2
               Do not use secondary structure-gap penalty mask for profile 2.

           -secstrout=STRUCTURE or MASK or BOTH or NONE
               Output in alignment file.

           -helixgap=n
               Gap penalty for helix core residues.

           -strandgap=n
               Gap penalty for strand core residues.

           -loopgap=n
               Gap penalty for loop regions.

           -terminalgap=n
               Gap penalty for structure termini.

           -helixendin=n
               Number of residues inside helix to be treated as terminal.

           -helixendout=n
               Number of residues outside helix to be treated as terminal.

           -strandendin=n
               Number of residues inside strand to be treated as terminal.

           -strandendout=n
               Number of residues outside strand to be treated as terminal.

       Trees:
           -outputtree=nj OR phylip OR dist OR nexus

           -seed=n
               Seed number for bootstraps.

           -kimura
               Use Kimura's correction.

           -tossgaps
               Ignore positions with gaps.

           -bootlabels=node
               Position of bootstrap values in tree display.

           -clustering=
               NJ or UPGMA.
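
       The tree parameters modify the -tree and -bootstrap verbs. The sketch
       below assumes an existing alignment aligned.aln (a placeholder name);
       the bootstrap count and seed are arbitrary illustrative values.

```shell
#!/bin/sh
# Placeholder alignment file (an assumption); adjust to your own data.
aln="aligned.aln"

# NJ tree in PHYLIP format, with Kimura's correction, ignoring gapped positions.
tree_cmd="clustalw -infile=$aln -tree -outputtree=phylip -kimura -tossgaps -clustering=NJ"
# Bootstrapped tree with a fixed seed so the run can be repeated.
boot_cmd="clustalw -infile=$aln -bootstrap=500 -seed=42 -bootlabels=node"
echo "$tree_cmd"
echo "$boot_cmd"

# Run only when the program and the alignment file are both present.
if command -v clustalw >/dev/null 2>&1 && [ -f "$aln" ]; then
    $tree_cmd
    $boot_cmd
fi
```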

BUGS

       The Clustal bug tracking system can be found at
       http://bioinf.ucd.ie/bugzilla/buglist.cgi?quicksearch=clustal.

REFERENCES

       ·   Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA,
           McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD,
           Gibson TJ, Higgins DG. (2007).  Clustal W and Clustal X version
           2.0.[1] Bioinformatics, 23, 2947-2948.

       ·   Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG,
           Thompson JD. (2003).  Multiple sequence alignment with the Clustal
           series of programs.[2] Nucleic Acids Res., 31, 3497-3500.

       ·   Jeanmougin F, Thompson JD, Gouy M, Higgins DG, Gibson TJ. (1998).
           Multiple sequence alignment with Clustal X.[3] Trends Biochem Sci.,
           23, 403-405.

       ·   Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG.
           (1997).  The CLUSTAL_X windows interface: flexible strategies for
           multiple sequence alignment aided by quality analysis tools.[4]
           Nucleic Acids Res., 25, 4876-4882.

       ·   Higgins DG, Thompson JD, Gibson TJ. (1996).  Using CLUSTAL for
           multiple sequence alignments.[5] Methods Enzymol., 266, 383-402.

       ·   Thompson JD, Higgins DG, Gibson TJ. (1994).  CLUSTAL W: improving
           the sensitivity of progressive multiple sequence alignment through
           sequence weighting, position-specific gap penalties and weight
           matrix choice.[6] Nucleic Acids Res., 22, 4673-4680.

       ·   Higgins DG. (1994).  CLUSTAL V: multiple alignment of DNA and
           protein sequences.[7] Methods Mol Biol., 25, 307-318.

       ·   Higgins DG, Bleasby AJ, Fuchs R. (1992).  CLUSTAL V: improved
           software for multiple sequence alignment.[8] Comput. Appl. Biosci.,
           8, 189-191.

       ·   Higgins DG, Sharp PM. (1989).  Fast and sensitive multiple
           sequence alignments on a microcomputer.[9] Comput. Appl. Biosci.,
           5, 151-153.

       ·   Higgins DG, Sharp PM. (1988).  CLUSTAL: a package for
           performing multiple sequence alignment on a microcomputer.[10]
           Gene, 73, 237-244.

AUTHORS

       Des Higgins
           Copyright holder for Clustal.

       Julie Thompson
           Copyright holder for Clustal.

       Toby Gibson
           Copyright holder for Clustal.

       Charles Plessy <plessy@debian.org>
           Prepared this manpage in DocBook XML for the Debian distribution.

COPYRIGHT

       Copyright © 1988–2009 Des Higgins, Julie Thompson & Toby Gibson
       (Clustal)
       Copyright © 2008–2009 Charles Plessy (This manpage)

       The binaries and source code are made available and can be distributed
       subject to the following conditions:

       ·   Users are free to redistribute Clustal W or Clustal X in its
           unmodified form as long as it is not for commercial gain.

       ·   Anyone wishing to redistribute Clustal commercially should contact
           Toby Gibson at gibson@embl.de

       ·   Users who make changes or have ideas that they believe would be
           useful to the broader research community can send their
           suggestions to the Clustal development team at clustalw@ucd.ie,
           where they will be considered for inclusion in future releases.

       This manual page and its XML source can be used, modified, and
       redistributed as if it were in public domain.

NOTES

        1. Clustal W and Clustal X version 2.0.
           http://www.ncbi.nlm.nih.gov/pubmed/17846036

        2. Multiple sequence alignment with the Clustal series of programs.
           http://www.ncbi.nlm.nih.gov/pubmed/12824352

        3. Multiple sequence alignment with Clustal X
           http://www.ncbi.nlm.nih.gov/pubmed/9810230

        4. The CLUSTAL_X windows interface: flexible strategies for multiple
           sequence alignment aided by quality analysis tools.
           http://www.ncbi.nlm.nih.gov/pubmed/9396791

        5. Using CLUSTAL for multiple sequence alignments.
           http://www.ncbi.nlm.nih.gov/pubmed/8743695

        6. CLUSTAL W: improving the sensitivity of progressive multiple
           sequence alignment through sequence weighting, position-specific
           gap penalties and weight matrix choice.
           http://www.ncbi.nlm.nih.gov/pubmed/7984417

        7. CLUSTAL V: multiple alignment of DNA and protein sequences.
           http://www.ncbi.nlm.nih.gov/pubmed/8004173

        8. CLUSTAL V: improved software for multiple sequence alignment.
           http://www.ncbi.nlm.nih.gov/pubmed/1591615

        9. Fast and sensitive multiple sequence alignments on a microcomputer.
           http://www.ncbi.nlm.nih.gov/pubmed/2720464

       10. CLUSTAL: a package for performing multiple sequence alignment on a
           microcomputer.
           http://www.ncbi.nlm.nih.gov/pubmed/3243435
