SAM/BAM Manipulation

read_quality.py
Function: According to the SAM specification, if Q is the character that represents the base-calling quality in a SAM file, then Phred quality score = ord(Q) - 33. Here ord() is the Python function that returns the integer Unicode code point of a character; for example, ord('a') returns 97. The Phred quality score is widely used to measure the reliability of base calling: a score of 20 means there is a 1/100 chance that the base call is wrong, and a score of 30 means there is a 1/1000 chance. In general, Phred quality score = -10 × log10(P), where P is the probability that the base call is wrong.
Usage: read_quality.py -i Pairend_nonStrandSpecific_36mer_Human_hg19.bam -o output
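The relationship between the quality character, the Phred score, and the error probability can be sketched in a few lines of Python. This is only an illustration of the formulas above, not the code of read_quality.py; the function names and the sample quality string are made up for the example, and the offset of 33 is the one given by the SAM specification.

```python
import math

def phred_from_char(q, offset=33):
    """Phred quality score encoded by one SAM quality character."""
    return ord(q) - offset

def error_probability(phred):
    """Probability that a base call with this Phred score is wrong."""
    return 10 ** (-phred / 10)

def phred_from_probability(p):
    """Phred quality score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)

# Hypothetical quality string, used only for illustration.
qual = "II5?+"
scores = [phred_from_char(c) for c in qual]
for c, s in zip(qual, scores):
    print(c, s, error_probability(s))
```

For this sample string the scores are 40, 40, 20, 30 and 10, so the third base has a 1/100 chance of being wrong and the fourth a 1/1000 chance.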
java -jar picard.jar
Function: Adds comments to the header of a BAM file. This tool makes a copy of the input BAM file, with a modified header that includes the comments specified at the command line (prefixed by @CO). Use double quotes to wrap comments that include whitespace or special characters. Note that this tool cannot be run on SAM files.
Usage: java -jar picard.jar AddCommentsToBam I=input.bam O=modified_bam.bam C=comment_1 C="comment 2"
java -jar picard.jar
Function: Creates BFQ files from a BAM file for use by the maq aligner. BFQ is a binary version of the FASTQ file format.
Usage: java -jar picard.jar BamToBfq I=input.bam ANALYSIS_DIR=analysis_dir OUTPUT_FILE_PREFIX=output_file_1 PAIRED_RUN=false
java -jar picard.jar
Function: DEPRECATED: Use CollectHsMetrics instead. Calculates a set of Hybrid Selection specific metrics from an aligned SAM or BAM file. If a reference sequence is provided, AT/GC dropout metrics will be calculated, and the PER_TARGET_COVERAGE option can be used to output GC and mean coverage information for every target.
Usage: java -jar picard.jar CalculateHsMetrics
java -jar picard.jar
Function: Prints a SAM or BAM file to the screen.
Usage: java -jar picard.jar ViewSam
java -jar picard.jar
Function: Compares two metrics files. This tool compares the metrics and histograms generated by metric tools to determine whether the generated results are identical. It is useful for testing and comparing outputs when code changes are implemented, and is not meant for use by end users of this toolkit. The tool's output simply indicates whether the two metrics files are equal or not.
Usage: java -jar picard.jar CompareMetrics metricfile1.txt metricfile2.txt
bam2fq.py
Function: Convert alignments in BAM or SAM format into fastq format.
Usage: bam2fq.py -i test_PairedEnd_StrandSpecific_hg19.sam -o bam2fq_out1
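The core of such a conversion can be sketched in Python for a single SAM text line. This is a minimal illustration under stated assumptions, not the implementation of bam2fq.py: the function name sam_line_to_fastq is hypothetical, the input is assumed to be an uncompressed SAM alignment line with the standard 11 mandatory fields, and mate suffixes, secondary alignments and missing sequences are ignored. The SAM format stores reads aligned to the reverse strand (flag bit 0x10) reverse-complemented, so the sketch restores the original orientation.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def sam_line_to_fastq(line):
    """Convert one SAM alignment line into a four-line FASTQ record."""
    fields = line.rstrip("\n").split("\t")
    name, flag = fields[0], int(fields[1])
    seq, qual = fields[9], fields[10]
    if flag & 0x10:
        # Read is stored reverse-complemented: undo that for FASTQ output.
        seq = seq.translate(COMPLEMENT)[::-1]
        qual = qual[::-1]
    return f"@{name}\n{seq}\n+\n{qual}\n"

# Hypothetical records, used only for illustration.
forward = "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGG\tIIH5"
reverse = "r2\t16\tchr1\t100\t60\t4M\t*\t0\t0\tACGG\tIIH5"
print(sam_line_to_fastq(forward), end="")
print(sam_line_to_fastq(reverse), end="")
```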
samtools reheader
Function: Copies the header from a source dataset into a target dataset using the samtools reheader command.
Usage: samtools reheader [-iP] in.header.sam in.bam
java -jar picard.jar
Function: Collects metrics from reduced representation bisulfite sequencing (Rrbs) data.
Usage: java -jar picard.jar CollectRrbsMetrics R=reference_sequence.fasta I=input.bam M=basename_for_metrics_files
java -jar picard.jar
Function: Cleans the provided SAM/BAM file, soft-clipping beyond-end-of-reference alignments and setting MAPQ to 0 for unmapped reads.
Usage: java -jar picard.jar CleanSam
java -jar picard.jar
Function: Collects per-sample and aggregate (spanning all samples) metrics from the provided VCF file.
Usage: java -jar picard.jar CollectVariantCallingMetrics
samtools fasta
Function: Converts a BAM or CRAM into either FASTQ or FASTA format depending on the command invoked.
Usage: samtools fasta [options] in.bam
bamtools
Function: Print header from BAM file(s)
Usage: bamtools header -in input_alignments.bam -out output_alignments_header.txt
bamtools
Function: The command bamtools revert removes duplicate marks and restores original base qualities.
Usage: bamtools revert -in input_alignments.bam -out output_alignments_reverted.bam
java -jar picard.jar
Function: Manipulates interval lists. This tool offers multiple interval list file manipulation capabilities, including sorting, merging, subtracting, padding, customizing, and other set-theoretic operations. If given one or more inputs, the default operation is to merge and sort them. Other options, e.g. interval subtraction, are controlled by the arguments. The tool lists intervals with respect to a reference sequence. Both interval_list and VCF files are accepted as input. The interval_list file format is relatively simple and reflects the SAM alignment format to a degree. A SAM-style header must be present in the file that lists the sequence records against which the intervals are described. After the header, the file contains records, one per line in text format, with the following values tab-separated: sequence name (SN), start position (1-based), end position (1-based, end inclusive), strand (either + or -), and interval name (ideally unique names for intervals).
Usage: java -jar picard.jar IntervalListTools
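To make the record layout concrete, here is a minimal Python sketch that reads the five tab-separated fields of an interval_list record and skips the SAM-style header lines (those starting with @). It is an illustration only and not part of Picard; the names Interval, parse_interval_list and interval_length, as well as the sample lines, are hypothetical.

```python
from typing import NamedTuple

class Interval(NamedTuple):
    sequence: str  # sequence name, must match an @SQ SN value in the header
    start: int     # 1-based start position
    end: int       # 1-based end position, inclusive
    strand: str    # "+" or "-"
    name: str      # interval name, ideally unique

def parse_interval_list(lines):
    """Parse interval_list text lines, skipping the SAM-style header."""
    intervals = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue  # header or blank line
        seq, start, end, strand, name = line.split("\t")
        intervals.append(Interval(seq, int(start), int(end), strand, name))
    return intervals

def interval_length(iv):
    """Length in bases; both coordinates are inclusive."""
    return iv.end - iv.start + 1

# Hypothetical file content, used only for illustration.
example = [
    "@HD\tVN:1.6\n",
    "@SQ\tSN:chr1\tLN:1000\n",
    "chr1\t101\t200\t+\ttarget_1\n",
    "chr1\t501\t510\t-\ttarget_2\n",
]
parsed = parse_interval_list(example)
print([(iv.name, interval_length(iv)) for iv in parsed])
```

Because both coordinates are 1-based and the end is inclusive, the length of an interval is end - start + 1 rather than end - start.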