
NAME

       srf2fastq - Converts SRF files to Sanger fastq format

SYNOPSIS

       srf2fastq  [options] srf_archive ...

DESCRIPTION

       srf2fastq  extracts  sequences  and  qualities  from  one  or  more SRF
       archives and writes them in Sanger fastq format to stdout.

       Note that Illumina also has its own fastq format (used in the GERALD
       directories), which differs slightly in that it uses log-odds scores
       for the quality values. The format described here uses the traditional
       Phred style of quality encoding.
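
       The difference between the two encodings can be illustrated with a
       short Python sketch. It is not taken from srf2fastq; it relies only on
       the standard definitions (Phred Q = -10 log10(p), log-odds
       Q = -10 log10(p / (1 - p))) and on the Sanger fastq convention of
       storing each quality as the character with ASCII code Q + 33. The
       function names are hypothetical.

```python
import math


def solexa_to_phred(q_logodds: float) -> float:
    """Convert a log-odds quality score to the equivalent Phred score.

    Follows from Q_phred = -10*log10(p) and Q_logodds = -10*log10(p/(1-p)).
    """
    return 10.0 * math.log10(10.0 ** (q_logodds / 10.0) + 1.0)


def phred_to_sanger_char(q_phred: float) -> str:
    """Encode a Phred score as a Sanger fastq quality character.

    Sanger fastq stores chr(Q + 33); Q is clamped to the printable
    range 0..93 (an assumption made here for safety).
    """
    q = max(0, min(93, round(q_phred)))
    return chr(q + 33)


if __name__ == "__main__":
    for q in (-5, 0, 10, 40):
        phred = solexa_to_phred(q)
        print(q, round(phred, 2), phred_to_sanger_char(phred))
```

       The two scales agree closely for high qualities and diverge only for
       low ones, which is why the formats differ "slightly".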

OPTIONS

       -c     Outputs calibrated confidence values using the ZTR CNF1 chunk
              type, giving a single quality value per base. Without this
              option, the original Illumina _prb.txt values are used; these
              consist of four quality values per base and are stored in the
              ZTR CNF4 chunks.

       -C     Masks out sequences tagged as bad quality.

       -s root
              Generates files on disk with filenames starting with root, one
              file per non-explicit element in the SRF/ZTR region (REGN) chunk.
              Typically this results in two files for  paired  end  runs.  The
              filename  suffixes  come from the names listed in the SRF region
              chunks.  This option conflicts with the -S parameter.

       -S     Splits sequences  into  regions,  but  sequentially  lists  each
              sequence region to stdout instead of splitting to separate files
              on disk. This option conflicts with the -s parameter.

       -n     When  using  -s  the  filename  suffixes  are  simply   numbered
              (starting  with  1) instead of using the names listed in the SRF
              region chunks.

       -a     Appends the region index to the sequence names, i.e. generates
              "name/1" and "name/2" for a paired read.

       -e     Includes any explicit sequence (ZTR region chunk of type 'E') in
              the sequence output. The explicit sequence is also included in
              the quality line. Currently this is used by ABI SOLiD to store
              the last base of the primer.

       -r region list
              Reverse complements the sequence and reverses the quality values
              for all regions in the region list. This is a comma-separated
              list of integer values enumerating the regions, starting from 1.
              Note that this option only works when either -s or -S is
              specified.
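
       As an illustration of what -r does to a region, the following Python
       sketch reverse complements a base-space sequence and reverses its
       quality string so that each quality stays attached to its base. It is
       a minimal sketch, not srf2fastq's own code: the function name is
       hypothetical, and only A, C, G, T and N are handled (colour-space
       SOLiD data is not covered).

```python
# Translation table for base-space nucleotides; N stays N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement_record(seq: str, qual: str) -> tuple[str, str]:
    """Return (reverse-complemented sequence, reversed quality string)."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    # Complement each base, then reverse; qualities are only reversed.
    return seq.translate(_COMPLEMENT)[::-1], qual[::-1]


if __name__ == "__main__":
    print(reverse_complement_record("ACGTN", "IHGF!"))
```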

EXAMPLES

       To extract only the good quality sequences from all srf files in the
       current directory, using calibrated confidence values (if available):

           srf2fastq -c -C *.srf > runX.fastq

       To extract a paired-end run into two separate files with sequences
       named name/1 and name/2:

           srf2fastq -s runX -a -n runX.srf

       To extract a paired-end run as a single file, alternating forward and
       reverse sequences, with the second read being reverse complemented:

           srf2fastq -S -r 2 runX.srf > runX.fastq
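
       A downstream script can regroup such alternating output into read
       pairs. The Python sketch below shows one way to do so. It assumes
       (this is not stated above) that every record occupies exactly four
       lines (@name, sequence, +, qualities) and that each read contributes
       exactly two regions; the function names are hypothetical.

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield (name, sequence, quality) from four-line fastq records."""
    while True:
        lines = [line.rstrip("\n") for line in islice(handle, 4)]
        if not lines:
            return  # clean end of input
        if len(lines) < 4 or not lines[0].startswith("@") \
                or not lines[2].startswith("+"):
            raise ValueError("malformed or truncated fastq record")
        yield lines[0][1:], lines[1], lines[3]


def read_pairs(handle):
    """Group consecutive records into (first, second) pairs."""
    records = read_fastq(handle)
    for first in records:
        second = next(records, None)
        if second is None:
            raise ValueError("odd number of records; expected pairs")
        yield first, second


if __name__ == "__main__":
    # Tiny in-memory sample standing in for runX.fastq.
    sample = "@r1\nACGT\n+\nIIII\n@r1\nTTGA\n+\nHHHH\n"
    for fwd, rev in read_pairs(io.StringIO(sample)):
        print(fwd, rev)
```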

AUTHOR

       James Bonfield, Steven Leonard - Wellcome Trust Sanger Institute

                                  December 10                     srf2fastq(1)