Reads Manipulation


correctGCBias -b file.bam --effectiveGenomeSize 2150570000 -g mm9.2bit --GCbiasFrequenciesFile freq.txt -o gc_corrected.bam [options]
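The frequencies file passed with --GCbiasFrequenciesFile is the output of computeGCBias, so the usage line above is normally the second of two steps. The following is a minimal dry-run sketch of that sequence, not a definitive pipeline. The helper run_or_print and the DEEPTOOLS_RUN variable are hypothetical names introduced for illustration, and the computeGCBias call assumes it accepts the same four shared options; depending on the version it may need further options that are not shown here.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print the command unless DEEPTOOLS_RUN=1 is set,
# so the sketch can be inspected without deepTools installed.
run_or_print() {
    if [ "${DEEPTOOLS_RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

bam="file.bam"
genome="mm9.2bit"
egs=2150570000      # mm9 value from the option list
freq="freq.txt"

# Step 1 (assumed invocation): computeGCBias writes the frequencies file.
run_or_print computeGCBias -b "$bam" --effectiveGenomeSize "$egs" \
    -g "$genome" --GCbiasFrequenciesFile "$freq"

# Step 2: correctGCBias reads that file and writes the corrected BAM.
run_or_print correctGCBias -b "$bam" --effectiveGenomeSize "$egs" \
    -g "$genome" --GCbiasFrequenciesFile "$freq" -o gc_corrected.bam
```

Run as-is, the script only prints the two commands; setting DEEPTOOLS_RUN=1 would execute them.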


correctGCBias is a tool from the deepTools suite. The information on this page is based on deepTools version 3.5.1.

--bamfile, -b    Sorted BAM file to correct.
--effectiveGenomeSize    The effective genome size is the portion of the genome that is mappable. Large fractions of the genome are stretches of NNNN that should be discarded. Also, if repetitive regions were not included in the mapping of reads, the effective genome size needs to be adjusted accordingly. Common values are: mm9: 2150570000, hg19: 2451960000, dm3: 121400000 and ce10: 93260000. See Table 2 of or for several effective genome sizes. This value is needed to detect enriched regions that, if not discarded, could bias the results.
--genome, -g    Genome in 2bit format. Most genomes can be found here: Search for the .2bit ending. Otherwise, FASTA files can be converted to 2bit using faToTwoBit available here:
--GCbiasFrequenciesFile, -freq    The output file from computeGCBias, containing the observed and expected read frequencies per GC content.
--correctedFile, -o    Name of the corrected file. The file extension determines the output format: ".bam" for a BAM file, ".bw" for a bigWig file, or ".bg" for a bedGraph file.
--version    Show the program’s version number and exit.
--binSize, -bs    Size of the bins, in bases, for the bigWig/bedGraph output.
--region, -r    Region of the genome to limit the operation to. This is useful for reducing computing time when testing parameters. The format is chr:start:end, for example --region chr10 or --region chr10:456700:891000.
--numberOfProcessors, -p    Number of processors to use. Type “max/2” to use half the maximum number of processors or “max” to use all available processors.
--verbose, -v    Set to see processing messages.
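As an illustration of how the optional arguments combine, here is a hedged dry-run sketch of a quick test on a single region with bigWig output. The helper run_or_print, the DEEPTOOLS_RUN variable, the output name gc_test.bw, and the bin size of 50 are assumptions chosen for the example. The case statement only mirrors the extension rule described for --correctedFile; it is not the tool's own code.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print the command unless DEEPTOOLS_RUN=1 is set.
run_or_print() {
    if [ "${DEEPTOOLS_RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

region="chr10:456700:891000"   # example region from the option list
out="gc_test.bw"               # hypothetical output name

# Mirror the documented rule: the file ending selects the output format.
case "$out" in
    *.bam) fmt="BAM" ;;
    *.bw)  fmt="bigWig" ;;
    *.bg)  fmt="bedGraph" ;;
    *)     fmt="unknown" ;;
esac
printf 'output format: %s\n' "$fmt"

# Restrict the run to one region and use half of the available processors.
# --binSize applies because the output is a bigWig file.
run_or_print correctGCBias -b file.bam --effectiveGenomeSize 2150570000 \
    -g mm9.2bit --GCbiasFrequenciesFile freq.txt \
    --region "$region" --binSize 50 -p max/2 -o "$out"
```

Limiting the run to a region keeps test iterations short; dropping --region and switching the output name to one ending in .bam gives the full correction shown in the usage line.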
