SAM/BAM Manipulation

samtools collate
Function: Shuffles and groups reads together by their names. A faster alternative to a full query name sort, collate ensures that reads of the same name are grouped together in contiguous groups, but doesn't make any guarantees about the order of read names between groups. The output from this command should be suitable for any operation that requires all reads from the same template to be grouped together.
Usage: samtools collate [options] in.sam|in.bam|in.cram [out.prefix]
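The difference between collating and a full name sort can be sketched in a few lines of Python. This is a conceptual illustration only, not how samtools implements collate; the `collate` function and the read names are hypothetical, and first-seen group order is just one arbitrary choice, since collate guarantees nothing about group order.

```python
from collections import OrderedDict


def collate(read_names):
    """Group identical read names contiguously, keeping groups in first-seen order."""
    groups = OrderedDict()
    for name in read_names:
        groups.setdefault(name, []).append(name)
    # Flatten the groups back into a single list of reads.
    return [name for group in groups.values() for name in group]


reads = ["r3", "r1", "r3", "r2", "r1", "r2"]
collated = collate(reads)       # mates are adjacent, groups are not name-ordered
name_sorted = sorted(reads)     # a full name sort also orders the groups
print(collated)
print(name_sorted)
```

Both results keep reads of the same name together; only the sorted list also orders the groups, which is the extra work collate avoids.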
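java -jar picard.jar CollectInsertSizeMetrics
Function: Collects metrics about the insert size distribution of a paired-end library.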
Usage: java -jar picard.jar CollectInsertSizeMetrics I=input.bam O=insert_size_metrics.txt H=insert_size_histogram.pdf M=0.5
java -jar picard.jar RenameSampleInVcf
Function: Renames a sample within a VCF or BCF file. It is intended to change the name of a sample in a VCF prior to merging with VCF files in which one or more samples have similar names. Note that the input must be a single-sample VCF and that NEW_SAMPLE_NAME is required.
Usage: java -jar picard.jar RenameSampleInVcf I=input.vcf O=renamed.vcf NEW_SAMPLE_NAME=sample123
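What renaming a sample amounts to for a text VCF can be sketched as follows, assuming tab-separated lines without trailing newlines. The `rename_sample` helper is hypothetical and is not Picard's implementation; it only rewrites the sample column of the `#CHROM` header line and enforces the single-sample requirement.

```python
def rename_sample(vcf_lines, new_name):
    """Return VCF lines with the single sample column renamed.

    Raises ValueError unless the #CHROM header lists exactly one sample.
    """
    out = []
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            fields = line.split("\t")
            samples = fields[9:]  # columns after FORMAT hold the sample names
            if len(samples) != 1:
                raise ValueError("expected a single-sample VCF")
            fields[9] = new_name
            line = "\t".join(fields)
        out.append(line)
    return out


vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\told_name",
    "chr1\t100\t.\tA\tG\t50\tPASS\tAF=0.5\tGT\t0/1",
]
renamed = rename_sample(vcf, "sample123")
print(renamed[1])
```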
java -jar picard.jar MarkIlluminaAdapters
Function: Reads a SAM or BAM file and rewrites it with new adapter-trimming tags.
Usage: java -jar picard.jar MarkIlluminaAdapters INPUT=input.sam METRICS=metrics.txt
java -jar picard.jar CollectRnaSeqMetrics
Function: Produces RNA alignment metrics for a SAM or BAM file.
Usage: java -jar picard.jar CollectRnaSeqMetrics I=input.bam O=output.RNA_Metrics REF_FLAT=ref_flat.txt STRAND=SECOND_READ_TRANSCRIPTION_STRAND RIBOSOMAL_INTERVALS=ribosomal.interval_list
java -jar picard.jar MakeSitesOnlyVcf
Function: Reads a VCF/VCF.gz/BCF and removes all genotype information from it while retaining all site-level information, including annotations based on genotypes (e.g. AN, AF). Output can be any supported variant format, including .vcf, .vcf.gz or .bcf.
Usage: java -jar picard.jar MakeSitesOnlyVcf I=input_variants.vcf O=output_variants.vcf
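For an uncompressed text VCF, a sites-only file is one that keeps the eight fixed columns (CHROM through INFO) and drops FORMAT and every sample column. The sketch below shows that idea under simplifying assumptions: `sites_only` is a hypothetical helper, lines are tab-separated without trailing newlines, and all `##` meta lines are passed through unchanged, which may differ from what Picard writes.

```python
def sites_only(vcf_lines):
    """Drop FORMAT and sample columns, keeping site-level columns only."""
    out = []
    for line in vcf_lines:
        if line.startswith("##"):
            out.append(line)  # meta-information lines pass through unchanged
        else:
            # Keep CHROM..INFO (8 columns) on the header and on every record.
            out.append("\t".join(line.split("\t")[:8]))
    return out


vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2",
    "chr1\t100\t.\tA\tG\t50\tPASS\tAN=4;AF=0.5\tGT\t0/1\t0/1",
]
stripped = sites_only(vcf)
print("\n".join(stripped))
```

Genotype-derived annotations such as AN and AF survive because they live in the INFO column.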
bamtools resolve
Function: Resolves paired-end reads. A resolving mode is required: -makeStats, -markPairs, or -twoPass.
Usage: bamtools resolve -twoPass -in input_alignments.bam -out output_alignments.bam
java -jar picard.jar LiftOverIntervalList
Function: Lifts over an interval list from one reference build to another. This tool adjusts the coordinates in an interval list derived from one reference to match a new reference, based on a chain file that describes the correspondence between the two references. It is based on the UCSC liftOver tool (see: http://genome.ucsc.edu/cgi-bin/hgLiftOver) and uses a UCSC chain file to guide its operation. It accepts either Picard interval_list files or VCF files as interval inputs.
Usage: java -jar picard.jar LiftOverIntervalList I=input.interval_list O=output.interval_list SD=reference_sequence.dict CHAIN=build.chain
java -jar picard.jar CollectGcBiasMetrics
Function: Collects metrics regarding GC bias. This tool collects information about the relative proportions of guanine (G) and cytosine (C) nucleotides in a sample. Regions of high and low G + C content have been shown to interfere with mapping/aligning, ultimately leading to fragmented genome assemblies and poor coverage in a phenomenon known as 'GC bias'. Detailed information on the effects of GC bias on the collection and analysis of sequencing data can be found at DOI: 10.1371/journal.pone.0062856.
Usage: java -jar picard.jar CollectGcBiasMetrics I=input.bam O=gc_bias_metrics.txt CHART=gc_bias_metrics.pdf S=summary_metrics.txt R=reference_sequence.fasta
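The quantity behind GC bias metrics is the fraction of G and C bases in a stretch of reference sequence. The sketch below computes it per window; the function names, the non-overlapping windows, and the default window size are illustrative assumptions, not a description of how CollectGcBiasMetrics scans the reference.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(reference, window=100):
    """GC fraction of each non-overlapping window; the last one may be shorter."""
    return [
        gc_fraction(reference[start:start + window])
        for start in range(0, len(reference), window)
    ]


profile = windowed_gc("GGCCAATTATGC", window=4)
print(profile)
```

Comparing read coverage across windows with low, medium, and high GC fraction is what reveals the bias.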
java -jar picard.jar CollectTargetedPcrMetrics
Function: Calculates PCR-related metrics from targeted sequencing data.
Usage: java -jar picard.jar CollectTargetedPcrMetrics I=input.bam O=pcr_metrics.txt R=reference_sequence.fasta AMPLICON_INTERVALS=amplicon.interval_list TARGET_INTERVALS=targets.interval_list
java -jar picard.jar LiftoverVcf
Function: Lifts over a VCF file from one reference build to another. This tool adjusts the coordinates of variants within a VCF file to match a new reference. The output file will be sorted and indexed using the target reference build. To be clear, REFERENCE_SEQUENCE should be the target reference build. The tool is based on the UCSC liftOver tool (see: http://genome.ucsc.edu/cgi-bin/hgLiftOver) and uses a UCSC chain file to guide its operation. Note that records may be rejected because they cannot be lifted over or because of sequence incompatibilities between the source and target reference genomes. Rejected records will be emitted with filters to the REJECT file, using the source genome coordinates.
Usage: java -jar picard.jar LiftoverVcf I=input.vcf O=lifted_over.vcf CHAIN=b37tohg19.chain REJECT=rejected_variants.vcf R=reference_sequence.fasta
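The core idea of a chain-guided liftover can be sketched as a lookup through ungapped aligned blocks. This is a deliberately simplified illustration: the `(src_start, dst_start, size)` tuples and the `lift_position` helper are hypothetical, only forward-strand blocks are handled, and real chain files encode blocks as sizes and gaps rather than absolute starts.

```python
def lift_position(pos, blocks):
    """Map a 0-based source position to the target build.

    blocks is a list of (src_start, dst_start, size) ungapped aligned blocks.
    Returns None when the position falls outside every block, which is the
    situation in which a record would be rejected.
    """
    for src_start, dst_start, size in blocks:
        if src_start <= pos < src_start + size:
            return dst_start + (pos - src_start)
    return None


# Two aligned blocks separated by a 10-base gap in the source build.
blocks = [(0, 100, 50), (60, 150, 40)]
inside_first = lift_position(10, blocks)
in_gap = lift_position(55, blocks)
inside_second = lift_position(60, blocks)
print(inside_first, in_gap, inside_second)
```

A position in the gap has no counterpart in the target build, so a tool has nothing to write for it except a rejected record.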
java -jar picard.jar SamToFastq
Function: Converts a SAM or BAM file to FASTQ. This tool extracts read sequences and base quality scores from the input SAM/BAM file and outputs them in FASTQ format. This can be used by way of a pipe to run BWA MEM on unmapped BAM (uBAM) files efficiently.
Usage: java -jar picard.jar SamToFastq I=input.bam FASTQ=output.fastq
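The conversion of a single record can be sketched as below. The `sam_record_to_fastq` helper is hypothetical and ignores details the real tool handles, such as read-pair suffixes and secondary alignments. It does show the step that is easy to forget: reads aligned to the reverse strand are stored reverse-complemented in SAM, so the sequence and qualities must be flipped back to the original read orientation.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_record_to_fastq(sam_line):
    """Convert one tab-separated SAM alignment line to a 4-line FASTQ record."""
    fields = sam_line.rstrip("\n").split("\t")
    name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
    if flag & 0x10:  # read is stored reverse-complemented
        seq = seq.translate(_COMPLEMENT)[::-1]
        qual = qual[::-1]
    return f"@{name}\n{seq}\n+\n{qual}\n"


forward = sam_record_to_fastq("r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tAACG\tIIJK")
reverse = sam_record_to_fastq("r2\t16\tchr1\t100\t60\t4M\t*\t0\t0\tAACG\tIIJK")
print(forward, reverse, sep="")
```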
java -jar picard.jar CollectOxoGMetrics
Function: Collects metrics to assess oxidative artifacts. This tool collects metrics quantifying the error rate resulting from oxidative artifacts. For a brief primer on oxidative artifacts, see the GATK Dictionary. This tool calculates the Phred-scaled probability that an alternate base call results from an oxidation artifact. This probability score is based on base context, sequencing read orientation, and the characteristic low allelic frequency. Please see the following reference for an in-depth discussion of the OxoG error rate.
Usage: java -jar picard.jar CollectOxoGMetrics I=input.bam O=oxoG_metrics.txt R=reference_sequence.fasta
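"Phred-scaled" means the probability p is reported as Q = -10 * log10(p), so higher scores correspond to smaller probabilities. A minimal sketch of that conversion follows; `phred_scale` is a hypothetical helper name, and the example probabilities are made up.

```python
import math


def phred_scale(p):
    """Convert a probability in (0, 1] to a Phred-scaled score."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -10 * math.log10(p)


for p in (0.1, 0.01, 0.001):
    # Each tenfold drop in probability adds 10 to the score.
    print(p, round(phred_scale(p), 2))
```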
java -jar picard.jar ExtractSequences
Function: Subsets intervals from a reference sequence to a new FASTA file. This tool takes a list of intervals, reads the corresponding subsequences from a reference FASTA file and writes them to a new FASTA file as separate records. Note that the reference FASTA file must be accompanied by an index file and the interval list must be provided in Picard list format. The names provided for the intervals will be used to name the corresponding records in the output file.
Usage: java -jar picard.jar ExtractSequences INTERVAL_LIST=regions_of_interest.interval_list R=reference.fasta O=extracted_IL_sequences.fasta
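The mapping from intervals to FASTA records can be sketched as below, assuming the reference is already in memory as a dict and every interval is on the forward strand. Picard interval lists use 1-based, inclusive coordinates, which is why the slice starts at `start - 1`. The `extract_sequences` helper and the sample data are hypothetical.

```python
def extract_sequences(reference, intervals):
    """Build FASTA text from named intervals.

    reference: dict mapping contig name to sequence string.
    intervals: iterable of (contig, start, end, name), 1-based inclusive.
    """
    records = []
    for contig, start, end, name in intervals:
        subseq = reference[contig][start - 1:end]  # 1-based inclusive to slice
        records.append(f">{name}\n{subseq}\n")
    return "".join(records)


reference = {"chr1": "ACGTACGTAC"}
intervals = [("chr1", 2, 5, "regionA"), ("chr1", 9, 10, "regionB")]
fasta = extract_sequences(reference, intervals)
print(fasta, end="")
```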
bamtools index
Function: Creates an index for a BAM file.
Usage: bamtools index -in <BAM FILE>