computeGCBias -b file.bam --effectiveGenomeSize 2150570000 -g mm9.2bit -l 200 --GCbiasFrequenciesFile freq.txt [options]


computeGCBias is a tool from the deepTools suite. The information on this page is based on deepTools version 3.5.1.
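
When the tool is driven from a script, the usage line at the top of this page can be assembled as an argument list. The following is a minimal sketch, not part of deepTools: the helper name `build_compute_gc_bias_cmd` and its parameters are hypothetical, and only flags that appear on this page are used.

```python
import shlex


def build_compute_gc_bias_cmd(bam, genome_2bit, effective_genome_size,
                              freq_file, fragment_length=None, extra=()):
    """Assemble a computeGCBias argument list (hypothetical helper)."""
    cmd = [
        "computeGCBias",
        "-b", bam,
        "--effectiveGenomeSize", str(effective_genome_size),
        "-g", genome_2bit,
        "--GCbiasFrequenciesFile", freq_file,
    ]
    # -l is optional here because, for paired-end data, the fragment
    # length is computed from the BAM file.
    if fragment_length is not None:
        cmd += ["-l", str(fragment_length)]
    cmd += list(extra)  # any further options, passed through unchanged
    return cmd


cmd = build_compute_gc_bias_cmd(
    "file.bam", "mm9.2bit", 2150570000, "freq.txt", fragment_length=200
)
# Print the command for inspection instead of executing it.
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once deepTools is installed and the input files exist.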

--bamfile, -b    Sorted BAM file.
--effectiveGenomeSize    The effective genome size is the portion of the genome that is mappable. Large fractions of the genome are stretches of NNNN that should be discarded. Also, if repetitive regions were not included in the mapping of reads, the effective genome size needs to be adjusted accordingly. Common values are: mm9: 2150570000, hg19: 2451960000, dm3: 121400000 and ce10: 93260000. See Table 2 of or for several effective genome sizes. This value is needed to detect enriched regions that, if not discarded, can bias the results.
--genome, -g    Genome in 2bit format. Most genomes can be found here: Search for the .2bit ending. Otherwise, FASTA files can be converted to 2bit using the UCSC program faToTwoBit, available for different platforms at
--fragmentLength, -l    Fragment length used for the sequencing. If paired-end reads are used, the fragment length is computed from the BAM file.
--sampleSize    Number of sampling points to be considered.
--extraSampling    BED file containing genomic regions for which extra sampling is required because they are underrepresented in the genome.
--version    Show the program's version number and exit.
--region, -r    Region of the genome to limit the operation to. This is useful when testing parameters, as it reduces the computing time. The format is chr:start:end, for example --region chr10 or --region chr10:456700:891000.
--blackListFileName, -bl    A BED or GTF file containing regions that should be excluded from all analyses. Currently this works by rejecting genomic chunks that happen to overlap an entry. Consequently, for BAM files, if a read partially overlaps a blacklisted region or a fragment spans over it, then the read/fragment might still be considered. Please note that you should adjust the effective genome size, if relevant.
--numberOfProcessors, -p    Number of processors to use. Type “max/2” to use half the maximum number of processors or “max” to use all available processors.
--verbose, -v    Set to see processing messages.
--GCbiasFrequenciesFile, -freq    Path to save the file containing the observed and expected read frequencies per %GC-content. This file is needed to run the correctGCBias tool. This is a text file.
--plotFileFormat    Image format type. Possible choices: png, pdf, svg, eps. If given, this option overrides the image format based on the plotFile ending.
--biasPlot    If given, a diagnostic image summarizing the GC-bias will be saved.
--regionSize    To plot the reads per %GC over a region, the size of the region is required. By default, the bin size is set to 300 bases, which is close to the standard fragment size for Illumina machines. However, if the depth of sequencing is low, a larger bin size will be required; otherwise many bins will not overlap with any read.
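
The note on region size and sequencing depth can be made concrete with some rough arithmetic. This is only a sketch under a simplifying assumption that is not taken from deepTools: read starts are treated as uniformly and independently distributed over the effective genome, so the number of reads per bin is approximately Poisson. The function names are hypothetical.

```python
import math


def expected_reads_per_bin(n_reads, region_size, effective_genome_size):
    """Mean number of read starts per bin, assuming uniform coverage."""
    return n_reads * region_size / effective_genome_size


def fraction_empty_bins(n_reads, region_size, effective_genome_size):
    """Poisson estimate of the share of bins that contain no read start."""
    lam = expected_reads_per_bin(n_reads, region_size, effective_genome_size)
    return math.exp(-lam)


# Example: one million reads on mm9 with the default 300-base bin,
# compared with a ten times larger bin.
mm9 = 2150570000
lam_300 = expected_reads_per_bin(1_000_000, 300, mm9)
empty_300 = fraction_empty_bins(1_000_000, 300, mm9)
empty_3000 = fraction_empty_bins(1_000_000, 3000, mm9)
print(f"300 bases:  {lam_300:.3f} reads per bin, {empty_300:.1%} bins empty")
print(f"3000 bases: {empty_3000:.1%} bins empty")
```

Under this assumption, most 300-base bins would be empty at such a low depth, which is the situation in which a larger --regionSize helps.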

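The chunk-based behaviour described for --blackListFileName can be illustrated with a small interval test. This is a hedged sketch of the general idea, not the deepTools implementation: the function names are hypothetical, and intervals are assumed to be half-open (start inclusive, end exclusive), as in BED files.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def keep_chunks(chunks, blacklist):
    """Drop every chunk that overlaps any blacklist entry on the same chromosome."""
    kept = []
    for chrom, start, end in chunks:
        hit = any(
            chrom == b_chrom and overlaps(start, end, b_start, b_end)
            for b_chrom, b_start, b_end in blacklist
        )
        if not hit:
            kept.append((chrom, start, end))
    return kept


chunks = [("chr10", 0, 1000), ("chr10", 1000, 2000), ("chr10", 2000, 3000)]
blacklist = [("chr10", 1500, 1600)]
kept = keep_chunks(chunks, blacklist)
print(kept)
```

Because whole chunks are rejected rather than individual reads, the 100 blacklisted bases in this example remove the entire 1000-base chunk from the analysis, which is why the effective genome size may need adjusting.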