Two strategies are used to determine the read duplication rate:
* Sequence based: reads with identical sequences are regarded as duplicated reads.
* Mapping based: reads mapped to exactly the same genomic location are regarded as duplicated reads. For spliced reads, reads that map to the same starting position and are spliced the same way are regarded as duplicated reads.
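The two strategies can be illustrated with a minimal Python sketch. It is not the tool's implementation: the toy `reads` list, its tuple layout (sequence, chromosome, aligned blocks), and the function names are all hypothetical, chosen only to show how the two definitions of "duplicate" differ.

```python
from collections import Counter

# Hypothetical toy reads: (sequence, chromosome, aligned_blocks).
# aligned_blocks is a tuple of (start, end) segments; a spliced read has
# more than one segment, so two spliced reads only match if they share
# the same start AND the same splice pattern.
reads = [
    ("ACGT", "chr1", ((100, 104),)),
    ("ACGT", "chr1", ((100, 104),)),
    ("ACGA", "chr1", ((100, 104),)),
    ("TTGC", "chr2", ((50, 52), (90, 92))),
    ("TTGC", "chr2", ((50, 52), (90, 92))),
    ("TTGC", "chr2", ((50, 52), (95, 97))),
]

def sequence_based(reads):
    """Count how many times each distinct read sequence occurs."""
    return Counter(seq for seq, _, _ in reads)

def mapping_based(reads):
    """Count how many reads share the same chromosome and aligned blocks."""
    return Counter((chrom, blocks) for _, chrom, blocks in reads)

def occurrence_histogram(counts):
    """Map an occurrence count to the number of distinct keys seen that often."""
    return Counter(counts.values())

seq_counts = sequence_based(reads)
pos_counts = mapping_based(reads)
seq_hist = occurrence_histogram(seq_counts)
pos_hist = occurrence_histogram(pos_counts)
```

In this toy data the sequence `ACGA` is unique by sequence but counts as a duplicate by mapping, because it aligns to the same location as the two `ACGT` reads. The third `TTGC` read is the reverse case: identical in sequence, but spliced differently, so it is not a mapping-based duplicate.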
read_duplication.py -i Pairend_nonStrandSpecific_36mer_Human_hg19.bam -o output
--version | show program’s version number and exit |
-h, --help | show this help message and exit |
-i INPUT_FILE, --input-file=INPUT_FILE | Alignment file in BAM or SAM format. |
-o OUTPUT_PREFIX, --out-prefix=OUTPUT_PREFIX | Prefix of output file(s). |
-u UPPER_LIMIT, --up-limit=UPPER_LIMIT | Upper limit of reads’ occurrence. Only used for plotting, default=500 (times) |
-q MAP_QUAL, --mapq=MAP_QUAL | Minimum mapping quality (phred scaled) for an alignment to be considered as “uniquely mapped”. default=30 |
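As a rough illustration of what the `-q` and `-u` options control, the sketch below filters toy alignments by mapping quality and trims an occurrence histogram before plotting. The dictionary layout, the function names, and the inclusive comparisons (`>=` and `<=`) are assumptions made for this example, not a description of the tool's internals.

```python
def filter_unique(alignments, min_mapq=30):
    """Keep alignments whose mapping quality meets the threshold (assumed inclusive)."""
    return [a for a in alignments if a["mapq"] >= min_mapq]

def truncate_for_plot(histogram, upper_limit=500):
    """Drop occurrence counts above the plotting limit (assumed inclusive)."""
    return {occ: n for occ, n in sorted(histogram.items()) if occ <= upper_limit}

# Hypothetical alignments carrying only a name and a MAPQ value.
alignments = [
    {"name": "r1", "mapq": 60},
    {"name": "r2", "mapq": 29},
    {"name": "r3", "mapq": 30},
]
kept = [a["name"] for a in filter_unique(alignments, min_mapq=30)]

# Hypothetical histogram: occurrence count -> number of distinct reads.
histogram = {1: 10, 2: 5, 600: 1}
plotted = truncate_for_plot(histogram, upper_limit=500)
```

Here `r2` is discarded because its mapping quality is below 30, and the bin for reads occurring 600 times is left out of the plotted data only; under this reading of `-u`, the underlying counts are unchanged.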