Category

ChIP Analysis


Usage

java -Xmx20G -jar multigps.jar <options - see below>
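
As a rough illustration, a run with one condition, two replicates, and a shared control might be assembled as in the sketch below. The file names, the condition label ES, and the replicate labels rep1 and rep2 are hypothetical, and the exact spelling of the format value (BED) is an assumption. The options themselves are those listed in the Manual section, typed with two ASCII hyphens. The sketch only prints the command; replace the final echo with an actual invocation to run it.

```shell
# Hypothetical inputs: replace with your own files and labels.
GENINFO=mm9.info          # chromosome lengths file (see geninfo option)
OUT=es_multigps           # output prefix, also the output directory name

# Two signal replicates for condition "ES". The control is given without
# a REPNAME, so it serves as the control for all replicates of ES.
CMD="java -Xmx20G -jar multigps.jar \
--geninfo $GENINFO \
--exptES-rep1 es_rep1.bed \
--exptES-rep2 es_rep2.bed \
--ctrlES es_input.bed \
--format BED --threads 4 --out $OUT"

# Dry run: print the assembled command instead of executing it.
echo "$CMD"
```

Keeping the command in a variable makes it easy to log exactly what was run alongside the output directory.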


Manual

‒‒out : Output file prefix. All output will be put into a directory with the prefix name.
‒‒threads : Number of threads to use during binding event detection. Default is 1 thread.
‒‒verbose : Flag to print intermediate files and some extra output.
‒‒config : All options can be specified in a name<space>value text file, where name is the name of the option without the “‒‒”. Options specified in a config file are overridden by command-line arguments.
‒‒geninfo : This file should list the lengths of all chromosomes on separate lines using the format chrName<tab>chrLength. You can generate a suitable file from UCSC 2bit format genomes using the UCSC utility “twoBitInfo”. The chromosome names should be exactly the same as those used in your ChIP-seq read files. See some examples below.
‒‒seq : A directory containing fasta format files corresponding to every named chromosome is required if you want to run motif-finding or use a motif-prior within MultiGPS.
‒‒exptCONDNAME-REPNAME : Defines a file containing reads from a signal experiment. Replace CONDNAME and REPNAME with appropriate condition and replicate labels.
‒‒ctrlCONDNAME-REPNAME : Optional arguments. Defines a file containing reads from a control experiment. Replace CONDNAME and REPNAME with appropriate labels to match a signal experiment (i.e. to tell MultiGPS which condition/replicate this is a control for). If you leave out a REPNAME, this file will be used as a control for all replicates of CONDNAME.
‒‒format : Format of data files. All files must be the same format if specifying experiments on the command line. Supported formats are SAM/BAM, BED, and IDX index files.
‒‒design : A file that specifies the data files and their condition/replicate relationships.
‒‒fixedpb : Fixed per-base limit (the maximum number of reads counted at any single base position).
‒‒poissongausspb : Filter per base using a Poisson threshold parameterized by a local Gaussian sliding window (i.e. look at neighboring positions to decide what the per-base limit should be). Default behavior is to estimate a global per-base limit from a Poisson distribution parameterized by the number of reads divided by the number of mappable bases in the genome. The per-base limit is set as the count corresponding to the 10^-7 probability level from the Poisson.
‒‒nonunique : Flag to use non-unique reads.
‒‒mappability : Fraction of the genome that is mappable for these experiments. Default=0.8.
‒‒nocache : Flag to turn off caching of the entire set of experiments (i.e. run slower with less memory).
‒‒noscaling : Flag to turn off auto estimation of signal vs control scaling factor.
‒‒medianscale : Flag to use scaling by median ratio of binned tag counts. Default = scaling by NCIS.
‒‒regressionscale : Flag to use scaling by regression on binned tag counts. Default = scaling by NCIS.
‒‒sesscale : Flag to use scaling by SES (Diaz, et al. Stat Appl Genet Mol Biol. 2012).
‒‒fixedscaling : Multiply control counts by total tag count ratio and then by this factor. Default: scaling by NCIS.
‒‒scalewin : Window size for estimating scaling ratios. Default is 10Kbp. Use something much smaller if scaling via SES (e.g. 200bp).
‒‒plotscaling : Flag to plot diagnostic information for the chosen scaling method.
‒‒d : Binding event read distribution file for initializing models. The true distribution of reads around binding events is estimated during MultiGPS training. See examples here for file format. A default initial distribution appropriate for ChIP-seq data is used if this option is not specified.
‒‒r : Maximum number of training rounds for updating binding event read distributions. Default = 3.
‒‒nomodelupdate : Flag to turn off binding model updates.
‒‒minmodelupdateevents : Minimum number of events to support an update of the read distribution. Default = 500.
‒‒nomodelsmoothing : Flag to turn off binding model smoothing (default = smooth with a cubic spline).
‒‒splinesmoothparam : Smoothing parameter for smoothing cubic spline. Default = 30.
‒‒gaussmodelsmoothing : Flag to turn on Gaussian model smoothing (default = smooth with a cubic spline).
‒‒gausssmoothparam : Gaussian smoothing std dev. Default = 3.
‒‒jointinmodel : Flag to allow joint events in model updates (default=do not).
‒‒fixedmodelrange : Flag to keep binding model range fixed to initial size. Default: automatically adapt range.
‒‒prlogconf : Poisson log threshold for potential region scanning. Default = -6.
‒‒fixedalpha : Impose this alpha. The alpha parameter is a sparse prior on binding events in the MultiGPS model. It can be interpreted as a minimum number of reads that each binding event must be responsible for in the model. Default: estimate alpha automatically.
‒‒alphascale : Alpha scaling factor. Increasing this parameter results in stricter binding event calls. Default = 1.0.
‒‒mlconfignotshared : Flag to not share component configs in the ML step. This mainly affects the quantification of binding levels for binding events that are not shared but are located at nearby locations across experiments.
‒‒exclude : File containing a set of regions to ignore during MultiGPS training. It’s a good idea to exclude the mitochondrial genome and other ‘blacklisted’ regions that contain artifactual accumulations of reads in both ChIP-seq and control experiments. MultiGPS will waste time trying to model binding events in these regions, even though they will not typically appear significantly enriched over the control (and thus will not be reported to the user). See the format of an exclude region file here (example for mm9).
‒‒noposprior : Flag to turn off inter-experiment positional prior (default=on).
‒‒probshared : Probability that events are shared across conditions (default=0.9).
‒‒nomotifs : Flag to turn off motif-finding & motif priors.
‒‒nomotifprior : Flag to turn off motif priors only.
‒‒memepath : Path to the meme bin dir (default: meme is in $PATH).
‒‒memenmotifs : Number of motifs MEME should find for each condition (default=3).
‒‒mememinw : Minimum motif width arg for MEME (default=6).
‒‒mememaxw : Maximum motif width arg for MEME (default=18).
‒‒memeargs : Additional args for MEME (default: -dna -mod zoops -revcomp -nostatus).
‒‒q : Q-value (corrected p-value) threshold for reporting binding events; events with a Q-value above this threshold are not reported. Default = 0.001.
‒‒minfold : Minimum event fold-change vs scaled control. Default = 1.5.
‒‒nodifftests : Flag to turn off differential enrichment tests.
‒‒rpath : Path to the R bin dir (default: R is in $PATH). Note that you need to install edgeR separately.
‒‒edgerod : EdgeR overdispersion parameter. Default = 0.15.
‒‒diffp : P-value threshold for reporting differential enrichment. Default = 0.01.
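
The config file mentioned for the config option can be sketched from the shell as follows. The option names come from the list above; the values and file names are made up, and the layout (one option per line, name first without the leading dashes, then a space, then the value) is an assumption about the expected format. How flag-type options are written in a config file is not covered here.

```shell
# Write a hypothetical config file: one "name value" pair per line.
cat > multigps.config <<'EOF'
geninfo mm9.info
format BED
threads 4
q 0.001
minfold 1.5
EOF

# Command-line arguments take precedence over the config file, so this
# run (printed here, not executed) would use 8 threads rather than 4.
echo java -Xmx20G -jar multigps.jar --config multigps.config --threads 8 --out es_multigps
```

Putting stable settings in the config file and overriding only what changes between runs keeps command lines short.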
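
For the genome info file described under the geninfo option, the sketch below writes a small sample by hand and checks that each line has a chromosome name, a tab, and a numeric length. The file name sample.info and the lengths shown are illustrative only; with the UCSC tools installed, a real file would normally come from twoBitInfo, as noted in the comment.

```shell
# With UCSC tools installed, the real file would come from something like:
#   twoBitInfo mm9.2bit mm9.info
# Here a small sample is written by hand (lengths are illustrative).
printf 'chr1\t197195432\nchr2\t181748087\nchrM\t16299\n' > sample.info

# Count lines that are not exactly chrName<tab>chrLength with a numeric length.
BAD=$(awk -F'\t' 'NF != 2 || $2 !~ /^[0-9]+$/ { n++ } END { print n + 0 }' sample.info)
echo "malformed lines: $BAD"
```

A check like this is cheap to run before a long job, since a mismatch between these chromosome names and those in the read files is an easy mistake to make.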
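
The default global per-base limit described under the poissongausspb option can be approximated with a short calculation. This is only a sketch: the read count and genome size are hypothetical, and the rule used here (the smallest count k whose upper-tail probability P(X > k) falls below 10^-7) is an assumption about how the threshold is read off the Poisson distribution, not necessarily what MultiGPS does internally.

```shell
# Assumed inputs (hypothetical): 20 million reads, a 2.7 Gbp genome,
# and a mappable fraction of 0.8 (the documented mappability default).
LIMIT=$(awk -v reads=20000000 -v genome=2700000000 -v mapp=0.8 -v p=1e-7 'BEGIN {
    lambda = reads / (genome * mapp)   # expected reads per mappable base
    term = exp(-lambda)                # P(X = 0)
    cdf = term
    k = 0
    # Raise k until the upper tail P(X > k) drops below p.
    while (1 - cdf >= p) {
        k++
        term *= lambda / k             # P(X = k) derived from P(X = k - 1)
        cdf += term
    }
    print k
}')
echo "per-base limit: $LIMIT"
```

Deeper sequencing or a smaller mappable genome raises lambda and therefore the limit, which is why a fixed limit may be too strict or too lenient across experiments of different depth.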

