Package picard.sam
Class SamToFastq
java.lang.Object
  picard.cmdline.CommandLineProgram
    picard.sam.SamToFastq

Direct Known Subclasses:
SamToFastqWithTags
@DocumentedFeature public class SamToFastq extends CommandLineProgram
Extracts read sequences and qualities from the input SAM/BAM file and writes them to the output file in Sanger FASTQ format. See the MAQ FASTQ specification for details. This tool can be used in a pipe to run BWA MEM on unmapped BAM (uBAM) files efficiently.
In RC mode (RE_REVERSE, default is true), if a read is aligned to the reverse strand of the genome, its sequence from the input SAM file is reverse-complemented before it is written to FASTQ, in order to restore the original read sequence as it was generated by the sequencer.
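The RC behaviour described above can be illustrated with a minimal, self-contained sketch. The class and method names here (`ReverseComplementSketch`, `toFastqOrientation`) are hypothetical and are not part of Picard; the sketch only assumes that bases are complemented and reversed while qualities are reversed, as the RE_REVERSE argument description suggests.

```java
// Illustrative sketch only: hypothetical names, not Picard's implementation.
class ReverseComplementSketch {

    // Complement a single base; anything other than A/C/G/T (e.g. N) is kept as-is.
    static char complement(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'C': return 'G';
            case 'G': return 'C';
            case 'T': return 'A';
            default:  return base;
        }
    }

    // Reverse the sequence and complement every base.
    static String reverseComplement(String bases) {
        StringBuilder out = new StringBuilder(bases.length());
        for (int i = bases.length() - 1; i >= 0; i--) {
            out.append(complement(bases.charAt(i)));
        }
        return out.toString();
    }

    // Qualities are only reversed so that each score stays with its base.
    static String reverse(String quals) {
        return new StringBuilder(quals).reverse().toString();
    }

    // Returns {bases, quals} in the orientation that would be written to FASTQ.
    static String[] toFastqOrientation(String bases, String quals,
                                       boolean negativeStrand, boolean reReverse) {
        if (negativeStrand && reReverse) {
            return new String[] { reverseComplement(bases), reverse(quals) };
        }
        return new String[] { bases, quals };
    }

    public static void main(String[] args) {
        String[] out = toFastqOrientation("AACGN", "IIH#!", true, true);
        System.out.println(out[0] + " " + out[1]);
    }
}
```

With RE_REVERSE disabled, or for a forward-strand read, the sketch returns the stored bases and qualities unchanged.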
Usage example:
    java -jar picard.jar SamToFastq \
        I=input.bam \
        FASTQ=output.fastq
-
-
Field Summary
Fields
Modifier and Type    Field
String               CLIPPING_ACTION
String               CLIPPING_ATTRIBUTE
int                  CLIPPING_MIN_LENGTH
Boolean              COMPRESS_OUTPUTS_PER_RG
File                 FASTQ
boolean              INCLUDE_NON_PF_READS
boolean              INCLUDE_NON_PRIMARY_ALIGNMENTS
File                 INPUT
boolean              INTERLEAVE
File                 OUTPUT_DIR
boolean              OUTPUT_PER_RG
Integer              QUALITY
boolean              RE_REVERSE
Integer              READ1_MAX_BASES_TO_WRITE
int                  READ1_TRIM
Integer              READ2_MAX_BASES_TO_WRITE
int                  READ2_TRIM
String               RG_TAG
File                 SECOND_END_FASTQ
File                 UNPAIRED_FASTQ
-
Fields inherited from class picard.cmdline.CommandLineProgram
COMPRESSION_LEVEL, CREATE_INDEX, CREATE_MD5_FILE, GA4GH_CLIENT_SECRETS, MAX_ALLOWABLE_ONE_LINE_SUMMARY_LENGTH, MAX_RECORDS_IN_RAM, QUIET, REFERENCE_SEQUENCE, referenceSequence, specialArgumentsCollection, TMP_DIR, USE_JDK_DEFLATER, USE_JDK_INFLATER, VALIDATION_STRINGENCY, VERBOSITY
-
-
Constructor Summary
Constructors
SamToFastq()
-
Method Summary
Modifier and Type, Method, Description
protected static void
assertPairedMates(htsjdk.samtools.SAMRecord record1, htsjdk.samtools.SAMRecord record2)
protected String[]
customCommandLineValidation()
Put any custom command-line validation in an override of this method.
protected int
doWork()
Do the work after command line has been parsed.
protected Map<htsjdk.samtools.SAMReadGroupRecord,List<htsjdk.samtools.fastq.FastqWriter>>
generateAdditionalWriters(List<htsjdk.samtools.SAMReadGroupRecord> readGroups, htsjdk.samtools.fastq.FastqWriterFactory factory)
protected void
handleAdditionalRecords(htsjdk.samtools.SAMRecord currentRecord, Map<htsjdk.samtools.SAMReadGroupRecord,List<htsjdk.samtools.fastq.FastqWriter>> additionalWriters, htsjdk.samtools.SAMRecord read1, htsjdk.samtools.SAMRecord read2)
protected void
initializeAdditionalWriters()
-
Methods inherited from class picard.cmdline.CommandLineProgram
getCommandLine, getCommandLineParser, getCommandLineParserForArgs, getDefaultHeaders, getFaqLink, getMetricsFile, getPGRecord, getStandardUsagePreamble, getStandardUsagePreamble, getVersion, hasWebDocumentation, instanceMain, instanceMainWithExit, makeReferenceArgumentCollection, parseArgs, requiresReference, setDefaultHeaders, useLegacyParser
-
Field Detail
-
INPUT
@Argument(doc="Input SAM/BAM file to extract reads from", shortName="I") public File INPUT
-
FASTQ
@Argument(shortName="F", doc="Output FASTQ file (single-end fastq or, if paired, first end of the pair FASTQ).", mutex={"OUTPUT_PER_RG","COMPRESS_OUTPUTS_PER_RG","OUTPUT_DIR"}) public File FASTQ
-
SECOND_END_FASTQ
@Argument(shortName="F2", doc="Output FASTQ file (if paired, second end of the pair FASTQ).", optional=true, mutex={"OUTPUT_PER_RG","COMPRESS_OUTPUTS_PER_RG"}) public File SECOND_END_FASTQ
-
UNPAIRED_FASTQ
@Argument(shortName="FU", doc="Output FASTQ file for unpaired reads; may only be provided in paired-FASTQ mode", optional=true, mutex={"OUTPUT_PER_RG","COMPRESS_OUTPUTS_PER_RG"}) public File UNPAIRED_FASTQ
-
OUTPUT_PER_RG
@Argument(shortName="OPRG", doc="Output a FASTQ file per read group (two FASTQ files per read group if the group is paired).", optional=true, mutex={"FASTQ","SECOND_END_FASTQ","UNPAIRED_FASTQ"}) public boolean OUTPUT_PER_RG
-
COMPRESS_OUTPUTS_PER_RG
@Argument(shortName="GZOPRG", doc="Compress output FASTQ files per read group using gzip and append a .gz extension to the file names.", mutex={"FASTQ","SECOND_END_FASTQ","UNPAIRED_FASTQ"}) public Boolean COMPRESS_OUTPUTS_PER_RG
-
RG_TAG
@Argument(shortName="RGT", doc="The read group tag (PU or ID) to be used to output a FASTQ file per read group.") public String RG_TAG
-
OUTPUT_DIR
@Argument(shortName="ODIR", doc="Directory in which to output the FASTQ file(s). Used only when OUTPUT_PER_RG is true.", optional=true) public File OUTPUT_DIR
-
RE_REVERSE
@Argument(shortName="RC", doc="Re-reverse bases and qualities of reads with negative strand flag set before writing them to FASTQ", optional=true) public boolean RE_REVERSE
-
INTERLEAVE
@Argument(shortName="INTER", doc="Will generate an interleaved fastq if paired, each line will have /1 or /2 to describe which end it came from") public boolean INTERLEAVE
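As a rough illustration of interleaved output, the sketch below writes both ends of a pair back to back and appends /1 or /2 to the read name in each record's header line. The class and method names are hypothetical, and the exact header layout Picard emits is assumed here rather than taken from its source.

```java
// Illustrative sketch only: hypothetical names, not Picard's writer code.
class InterleaveSketch {

    // Format one four-line FASTQ record: @name, bases, '+', qualities.
    static String fastqRecord(String name, String bases, String quals) {
        return "@" + name + "\n" + bases + "\n+\n" + quals + "\n";
    }

    // Emit first end then second end, tagging the read name with /1 and /2.
    static String interleave(String readName,
                             String bases1, String quals1,
                             String bases2, String quals2) {
        return fastqRecord(readName + "/1", bases1, quals1)
             + fastqRecord(readName + "/2", bases2, quals2);
    }

    public static void main(String[] args) {
        System.out.print(interleave("read7", "ACGT", "IIII", "TTGA", "HHHH"));
    }
}
```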
-
INCLUDE_NON_PF_READS
@Argument(shortName="NON_PF", doc="Include non-PF reads from the SAM file into the output FASTQ files. PF means \'passes filtering\'. Reads whose \'not passing quality controls\' flag is set are non-PF reads. See GATK Dictionary for more info.") public boolean INCLUDE_NON_PF_READS
-
CLIPPING_ATTRIBUTE
@Argument(shortName="CLIP_ATTR", doc="The attribute that stores the position at which the SAM record should be clipped", optional=true) public String CLIPPING_ATTRIBUTE
-
CLIPPING_ACTION
@Argument(shortName="CLIP_ACT", doc="The action that should be taken with clipped reads: \'X\' means the reads and qualities should be trimmed at the clipped position; \'N\' means the bases should be changed to Ns in the clipped region; and any integer means that the base qualities should be set to that value in the clipped region.", optional=true) public String CLIPPING_ACTION
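The three clipping actions can be sketched as follows. This is a minimal illustration under stated assumptions, not Picard's code: the clip position is assumed to be 1-based with the clipped region running from that position to the end of the read, qualities are assumed to be a Phred+33 string of the same length as the bases, and CLIPPING_MIN_LENGTH is not modelled. The names `ClippingActionSketch` and `clip` are hypothetical.

```java
import java.util.Arrays;

// Illustrative sketch only: hypothetical names and assumed coordinate conventions.
class ClippingActionSketch {

    // Build a string of n copies of c (avoids String.repeat for older JDKs).
    private static String fill(char c, int n) {
        char[] out = new char[n];
        Arrays.fill(out, c);
        return new String(out);
    }

    // Returns {bases, quals} after applying the action from clipPosition (1-based, assumed).
    static String[] clip(String bases, String quals, int clipPosition, String action) {
        int keep = Math.max(0, Math.min(clipPosition - 1, bases.length()));
        int clipped = bases.length() - keep;

        if ("X".equals(action)) {
            // Trim both bases and qualities at the clip position.
            return new String[] { bases.substring(0, keep), quals.substring(0, keep) };
        }
        if ("N".equals(action)) {
            // Mask the clipped bases with N; qualities are left untouched.
            return new String[] { bases.substring(0, keep) + fill('N', clipped), quals };
        }
        // Any integer: set the base qualities in the clipped region to that value.
        int quality = Integer.parseInt(action);
        char encoded = (char) (quality + 33); // Phred+33 encoding, assumed
        return new String[] { bases, quals.substring(0, keep) + fill(encoded, clipped) };
    }

    public static void main(String[] args) {
        for (String action : new String[] { "X", "N", "2" }) {
            String[] out = clip("ACGTAC", "IIIIII", 4, action);
            System.out.println(action + ": " + out[0] + " " + out[1]);
        }
    }
}
```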
-
CLIPPING_MIN_LENGTH
@Argument(shortName="CLIP_MIN", doc="When performing clipping with the CLIPPING_ATTRIBUTE and CLIPPING_ACTION parameters, ensure that the resulting reads after clipping are at least CLIPPING_MIN_LENGTH bases long. If the original read is shorter than CLIPPING_MIN_LENGTH then the original read length will be maintained.") public int CLIPPING_MIN_LENGTH
-
READ1_TRIM
@Argument(shortName="R1_TRIM", doc="The number of bases to trim from the beginning of read 1.") public int READ1_TRIM
-
READ1_MAX_BASES_TO_WRITE
@Argument(shortName="R1_MAX_BASES", doc="The maximum number of bases to write from read 1 after trimming. If there are fewer than this many bases left after trimming, all will be written. If this value is null then all bases left after trimming will be written.", optional=true) public Integer READ1_MAX_BASES_TO_WRITE
-
READ2_TRIM
@Argument(shortName="R2_TRIM", doc="The number of bases to trim from the beginning of read 2.") public int READ2_TRIM
-
READ2_MAX_BASES_TO_WRITE
@Argument(shortName="R2_MAX_BASES", doc="The maximum number of bases to write from read 2 after trimming. If there are fewer than this many bases left after trimming, all will be written. If this value is null then all bases left after trimming will be written.", optional=true) public Integer READ2_MAX_BASES_TO_WRITE
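The interaction between the trim and maximum-bases arguments can be sketched as below: bases are first dropped from the beginning of the read, then the remainder is capped, and a null cap means everything left is written. The helper name `trimAndCap` is hypothetical, and the sketch assumes non-negative values for both arguments.

```java
// Illustrative sketch only: hypothetical helper, assumes trim >= 0 and maxBases >= 0.
class TrimSketch {

    // Apply a leading trim, then keep at most maxBases characters (null = no cap).
    // The same call would be applied to the base string and the quality string.
    static String trimAndCap(String s, int trim, Integer maxBases) {
        int start = Math.min(trim, s.length());
        int end = s.length();
        if (maxBases != null) {
            long capped = (long) start + maxBases; // long avoids int overflow
            end = (int) Math.min(end, capped);
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(trimAndCap("ACGTACGT", 2, 3));
        System.out.println(trimAndCap("ACGTACGT", 2, null));
    }
}
```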
-
QUALITY
@Argument(shortName="Q", doc="End-trim reads using the phred/bwa quality trimming algorithm and this quality.", optional=true) public Integer QUALITY
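For orientation, here is a sketch of the running-sum end-trimming approach commonly associated with BWA: walking from the 3' end, it accumulates (threshold minus base quality) and trims at the position where that sum peaks. Whether SamToFastq applies exactly this rule is an assumption; the class and method names are hypothetical, and qualities are taken as raw Phred values rather than ASCII.

```java
// Illustrative sketch only: BWA-style quality trimming under stated assumptions.
class QualityTrimSketch {

    // Returns the number of leading bases to keep (index of the first trimmed base).
    static int findTrimPoint(byte[] quals, int trimQual) {
        int length = quals.length;
        if (trimQual < 1 || length == 0) {
            return length; // nothing to trim
        }
        int score = 0;
        int maxScore = 0;
        int trimPoint = length;
        for (int i = length - 1; i >= 0; i--) {
            score += trimQual - quals[i];
            if (score < 0) {
                break; // a high-quality stretch outweighs the low-quality tail
            }
            if (score > maxScore) {
                maxScore = score;
                trimPoint = i;
            }
        }
        return trimPoint;
    }

    public static void main(String[] args) {
        byte[] quals = { 30, 30, 30, 2, 2 };
        System.out.println(findTrimPoint(quals, 10));
    }
}
```

In this sketch a read whose qualities all exceed the threshold is returned untrimmed, while a read that is low quality throughout is trimmed to length zero.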
-
INCLUDE_NON_PRIMARY_ALIGNMENTS
@Argument(doc="If true, include non-primary alignments in the output. Support of non-primary alignments in SamToFastq is not comprehensive, so there may be exceptions if this is set to true and there are paired reads with non-primary alignments.") public boolean INCLUDE_NON_PRIMARY_ALIGNMENTS
-
-
Method Detail
-
doWork
protected int doWork()
Description copied from class: CommandLineProgram
Do the work after the command line has been parsed. A RuntimeException may be thrown by this method and is reported appropriately.
- Specified by:
doWork in class CommandLineProgram
- Returns:
- program exit status.
-
initializeAdditionalWriters
protected void initializeAdditionalWriters()
-
generateAdditionalWriters
protected Map<htsjdk.samtools.SAMReadGroupRecord,List<htsjdk.samtools.fastq.FastqWriter>> generateAdditionalWriters(List<htsjdk.samtools.SAMReadGroupRecord> readGroups, htsjdk.samtools.fastq.FastqWriterFactory factory)
-
handleAdditionalRecords
protected void handleAdditionalRecords(htsjdk.samtools.SAMRecord currentRecord, Map<htsjdk.samtools.SAMReadGroupRecord,List<htsjdk.samtools.fastq.FastqWriter>> additionalWriters, htsjdk.samtools.SAMRecord read1, htsjdk.samtools.SAMRecord read2)
-
assertPairedMates
protected static void assertPairedMates(htsjdk.samtools.SAMRecord record1, htsjdk.samtools.SAMRecord record2)
-
customCommandLineValidation
protected String[] customCommandLineValidation()
Put any custom command-line validation in an override of this method. clp is initialized at this point and can be used to print usage and access argv. Any options set by the command-line parser can be validated.
- Overrides:
customCommandLineValidation in class CommandLineProgram
- Returns:
- null if command line is valid. If command line is invalid, returns an array of error messages to be written to the appropriate place.
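The return contract above (null when valid, otherwise an array of error messages) can be sketched in a standalone form. This is not Picard's validation logic: the class is hypothetical, the arguments are modelled as plain parameters, and the two checks are assumptions drawn from the argument descriptions, namely that paired-FASTQ mode means SECOND_END_FASTQ is set and that FASTQ and OUTPUT_PER_RG are mutually exclusive.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a standalone model of the null-or-errors contract.
class ValidationContractSketch {

    // Returns null if the combination is valid, else one message per problem found.
    static String[] validate(String fastq, String secondEndFastq,
                             String unpairedFastq, boolean outputPerRg) {
        List<String> errors = new ArrayList<>();

        // Assumed reading of "paired-FASTQ mode": SECOND_END_FASTQ was supplied.
        if (unpairedFastq != null && secondEndFastq == null) {
            errors.add("UNPAIRED_FASTQ may only be provided in paired-FASTQ mode");
        }
        // Mirrors the mutex declared on the FASTQ and OUTPUT_PER_RG arguments.
        if (outputPerRg && fastq != null) {
            errors.add("FASTQ and OUTPUT_PER_RG are mutually exclusive");
        }
        return errors.isEmpty() ? null : errors.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] errors = validate("out.fastq", null, "unpaired.fastq", false);
        if (errors != null) {
            for (String e : errors) {
                System.out.println(e);
            }
        }
    }
}
```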
-
-