Reads Manipulation

Scythe
Function: Scythe uses a naive Bayesian approach to classify contaminant substrings in sequence reads. Because it takes base quality into account, it can be robust at picking out 3'-end adapters, which often include poor-quality bases.
Usage: scythe -a adapter_file.fasta -o trimmed_sequences.fasta sequences.fastq
cd-hit-dup
Function: cd-hit-dup is a simple tool for removing duplicates from sequencing reads, with an optional step to detect and remove chimeric reads.
Usage: cd-hit-dup -i R1.fa -i2 R2.fa -o output-R1.fa -o2 output-R2.fa [other options]
geneBody_coverage.py
Function: Calculate RNA-seq read coverage over the gene body.
Usage: geneBody_coverage.py -r hg19.housekeeping.bed -i bam_path.txt -o output
cd-hit-dup
Function: cd-hit-dup is a simple tool for removing duplicates from sequencing reads, with an optional step to detect and remove chimeric reads.
Usage: cd-hit-dup -i input.fq -o output.fq [other options]
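Example: A minimal sketch of the idea behind duplicate removal, assuming that "duplicate" means an identical sequence and that the first copy is the one kept. It is not cd-hit-dup's own algorithm; the function name and the (read_id, sequence) tuples are made up for illustration.

```python
def remove_exact_duplicates(reads):
    """Keep the first read seen for each distinct sequence.

    `reads` is an iterable of (read_id, sequence) tuples (hypothetical format).
    """
    seen = set()
    unique = []
    for read_id, sequence in reads:
        key = sequence.upper()  # treat lower- and upper-case bases as equal
        if key not in seen:
            seen.add(key)
            unique.append((read_id, sequence))
    return unique


# Made-up reads for illustration only
reads = [
    ("r1", "ACGTACGT"),
    ("r2", "TTGACCAA"),
    ("r3", "ACGTACGT"),  # exact duplicate of r1
]
unique_reads = remove_exact_duplicates(reads)
print([read_id for read_id, _ in unique_reads])
```

Using a set of sequences keeps the check for each read at roughly constant time, at the cost of holding every distinct sequence in memory.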
kraken-translate
Function: Translate Kraken classification output (sequences.kraken) into sequences.labels, a text file with two tab-delimited columns and one line for each classified sequence in sequences.fa. Unclassified sequences are not reported by kraken-translate.
Usage: kraken-translate --db $DBNAME sequences.kraken > sequences.labels
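Example: A minimal sketch for reading a two-column, tab-delimited file such as sequences.labels into a dictionary. It assumes the first column is the sequence ID and the second is its label; the function name and the sample rows are made up for illustration.

```python
import io


def read_labels(handle):
    """Return {sequence_id: label} from a two-column, tab-delimited file."""
    labels = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        seq_id, label = fields
        labels[seq_id] = label
    return labels


# Made-up content standing in for sequences.labels
example = io.StringIO(
    "read_1\tlabel for read_1\n"
    "read_7\tlabel for read_7\n"
)
labels = read_labels(example)
print(len(labels), "classified sequences")
```

For a real file, the same function can be called as read_labels(open("sequences.labels")).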
mismatch_profile.py
Function: Calculate the distribution of mismatches across reads.
Usage: mismatch_profile.py -l 101 -i ../test.bam -o out
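Example: A minimal sketch of what a per-position mismatch distribution looks like, assuming each read is already paired with the reference bases it aligns to, without gaps. It does not read BAM files and is not how mismatch_profile.py is implemented; the function name, the read_length parameter (only assumed to play the role of the -l value), and the input pairs are made up for illustration.

```python
def mismatch_counts_by_position(pairs, read_length):
    """Count mismatches at each read position.

    `pairs` is an iterable of (read_sequence, reference_sequence) strings
    that are aligned base for base (hypothetical input format).
    """
    counts = [0] * read_length
    for read_seq, ref_seq in pairs:
        for pos, (read_base, ref_base) in enumerate(zip(read_seq, ref_seq)):
            if pos >= read_length:
                break  # ignore bases beyond the requested length
            if read_base != ref_base:
                counts[pos] += 1
    return counts


# Made-up aligned pairs for illustration only
pairs = [
    ("ACGT", "ACGA"),  # mismatch at position 3
    ("ACGT", "TCGT"),  # mismatch at position 0
    ("ACGT", "ACGA"),  # mismatch at position 3
]
counts = mismatch_counts_by_position(pairs, read_length=4)
print(counts)
```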
cd-hit-dup
Function: cd-hit-dup is a simple tool for removing duplicates from sequencing reads, with an optional step to detect and remove chimeric reads.
Usage: cd-hit-dup -i R1.fq -i2 R2.fq -o output-R1.fq -o2 output-R2.fq [other options]
deletion_profile.py
Function: Calculate the distribution of deletions across reads.
Usage: deletion_profile.py -i sample.bam -l 101 -o out
fastqc
Function: Generate QC reports for FASTQ files.
Usage: fastqc [-o output dir] [--(no)extract] [-f fastq|bam|sam] [-c contaminant file] seqfile1 .. seqfileN
UMI-Tools
Function: Deduplicate reads using UMIs and mapping coordinates.
Usage: umi_tools dedup [OPTIONS] [--stdin=IN_BAM] [--stdout=OUT_BAM] > OUTFILE
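Example: A minimal sketch of deduplicating by UMI and mapping coordinates, assuming reads count as duplicates only when chromosome, position, strand, and UMI all match exactly. It does not attempt to handle sequencing errors in the UMI and is not the umi_tools implementation; the function name and the dictionary fields are made up for illustration.

```python
def dedup_by_umi_and_position(reads):
    """Keep the first read of each (chrom, pos, strand, umi) group."""
    seen = set()
    kept = []
    for read in reads:
        # Grouping key: mapping coordinates plus the UMI (assumed fields)
        key = (read["chrom"], read["pos"], read["strand"], read["umi"])
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


# Made-up reads for illustration only
reads = [
    {"name": "a", "chrom": "chr1", "pos": 100, "strand": "+", "umi": "AACG"},
    {"name": "b", "chrom": "chr1", "pos": 100, "strand": "+", "umi": "AACG"},  # duplicate of a
    {"name": "c", "chrom": "chr1", "pos": 100, "strand": "+", "umi": "TTGC"},  # different UMI
    {"name": "d", "chrom": "chr1", "pos": 250, "strand": "+", "umi": "AACG"},  # different position
]
kept = dedup_by_umi_and_position(reads)
print([read["name"] for read in kept])
```

Reads c and d survive because a shared position or a shared UMI alone is not enough to call two reads duplicates under this assumption.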