Category

Sam/Bam Manipulation


Usage

java -jar picard.jar IlluminaBasecallsToSam BASECALLS_DIR=/BaseCalls/ LANE=001 READ_STRUCTURE=25T8B25T RUN_BARCODE=run15 IGNORE_UNEXPECTED_BARCODES=true LIBRARY_PARAMS=library.params
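
When the tool is launched from a pipeline script, the KEY=VALUE arguments can be assembled programmatically. The Python sketch below only rebuilds and prints the command line shown above; the helper name build_command is hypothetical, and nothing here actually runs Picard.

```python
import shlex


def build_command(options, jar="picard.jar", tool="IlluminaBasecallsToSam"):
    """Return the argument list for a Picard call that uses KEY=VALUE options."""
    args = ["java", "-jar", jar, tool]
    # Each option is passed as one KEY=VALUE token, as in the usage line.
    args.extend(f"{key}={value}" for key, value in options.items())
    return args


# Values copied from the usage line above.
options = {
    "BASECALLS_DIR": "/BaseCalls/",
    "LANE": "001",
    "READ_STRUCTURE": "25T8B25T",
    "RUN_BARCODE": "run15",
    "IGNORE_UNEXPECTED_BARCODES": "true",
    "LIBRARY_PARAMS": "library.params",
}

command = build_command(options)
print(shlex.join(command))
```

The list form can be handed to subprocess.run(command, check=True) if the script should execute the tool itself.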


Manual

BASECALLS_DIR (File)    The basecalls directory. Required.
BARCODES_DIR (File)    The barcodes directory with _barcode.txt files (generated by ExtractIlluminaBarcodes). If not set, use BASECALLS_DIR. Default value: null.
LANE (Integer)    Lane number. Required.
OUTPUT (File)    Deprecated (use LIBRARY_PARAMS). The output SAM or BAM file. Format is determined by extension. Required. Cannot be used in conjunction with option(s) LIBRARY_PARAMS, BARCODE_PARAMS.
RUN_BARCODE (String)    The barcode of the run. Prefixed to read names. Required.
SAMPLE_ALIAS (String)    Deprecated (use LIBRARY_PARAMS). The name of the sequenced sample. Required. Cannot be used in conjunction with option(s) LIBRARY_PARAMS, BARCODE_PARAMS.
READ_GROUP_ID (String)    ID used to link RG header record with RG tag in SAM record. If these are unique in SAM files that get merged, merge performance is better. If not specified, READ_GROUP_ID will be set to <first 5 chars of RUN_BARCODE>.<LANE>. Default value: null.
LIBRARY_NAME (String)    Deprecated (use LIBRARY_PARAMS). The name of the sequenced library. Default value: null. Cannot be used in conjunction with option(s) LIBRARY_PARAMS, BARCODE_PARAMS.
SEQUENCING_CENTER (String)    The name of the sequencing center that produced the reads. Used to set the RG.CN tag. Default value: BI. This option can be set to 'null' to clear the default value.
RUN_START_DATE (Date)    The start date of the run. Default value: null.
PLATFORM (String)    The name of the sequencing technology that produced the read. Default value: illumina. This option can be set to 'null' to clear the default value.
READ_STRUCTURE (String)    A description of the logical structure of clusters in an Illumina run, i.e. a description of the structure IlluminaBasecallsToSam assumes the data to be in. It should consist of integer/character pairs describing the number of cycles and the type of those cycles (B for Sample Barcode, M for molecular barcode, T for Template, and S for skip). E.g. if the input data consists of 80 base clusters and we provide a read structure of "28T8M8B8S28T" then the sequence may be split up into four reads:
    * read one with 28 cycles (bases) of template
    * read two with 8 cycles (bases) of molecular barcode (e.g. unique molecular barcode)
    * read three with 8 cycles (bases) of sample barcode
    * 8 cycles (bases) skipped
    * read four with 28 cycles (bases) of template
    The skipped cycles would NOT be included in an output SAM/BAM file or in read groups therein. Required.
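
The READ_STRUCTURE format described above can be checked before a run with a few lines of Python. This is a minimal sketch, not Picard's own parser: the function name parse_read_structure and the validation rules are assumptions made for illustration, based only on the integer/character pairs listed in the option description.

```python
import re

# Cycle types named in the READ_STRUCTURE description.
CYCLE_TYPES = {
    "T": "template",
    "B": "sample barcode",
    "M": "molecular barcode",
    "S": "skip",
}


def parse_read_structure(read_structure):
    """Split a string such as '28T8M8B8S28T' into (cycles, type) pairs."""
    # The whole string must be a run of <integer><type> pairs.
    if not re.fullmatch(r"(?:\d+[TBMS])+", read_structure):
        raise ValueError(f"not a valid read structure: {read_structure!r}")
    return [(int(n), t) for n, t in re.findall(r"(\d+)([TBMS])", read_structure)]


segments = parse_read_structure("28T8M8B8S28T")
total_cycles = sum(cycles for cycles, _ in segments)
# Skipped cycles are not written to the output, so drop them here.
output_reads = [(cycles, CYCLE_TYPES[t]) for cycles, t in segments if t != "S"]

print(total_cycles)
for cycles, kind in output_reads:
    print(cycles, kind)
```

For the example string the cycle counts add up to the 80 bases per cluster mentioned above, and four reads remain once the skipped segment is removed.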
BARCODE_PARAMS (File)    Deprecated (use LIBRARY_PARAMS). Tab-separated file for creating all output BAMs for a barcoded run with a single IlluminaBasecallsToSam invocation. Columns are BARCODE, OUTPUT, SAMPLE_ALIAS, and LIBRARY_NAME. A row with BARCODE=N is used to specify a file for no barcode match. Required. Cannot be used in conjunction with option(s) LIBRARY_PARAMS, SAMPLE_ALIAS (ALIAS), OUTPUT (O), LIBRARY_NAME (LIB).
LIBRARY_PARAMS (File)    Tab-separated file for creating all output BAMs for a lane with a single IlluminaBasecallsToSam invocation. The columns are OUTPUT, SAMPLE_ALIAS, LIBRARY_NAME, and BARCODE_1, BARCODE_2 ... BARCODE_X, where X = number of barcodes per cluster (optional). A row with BARCODE_1 set to 'N' is used to specify a file for no barcode match. You may also provide any 2-letter RG header attributes (excluding PU, CN, PL, and DT) as columns in this file, and the values for those columns will be inserted into the RG tag for the BAM file created for a given row. Required. Cannot be used in conjunction with option(s) SAMPLE_ALIAS (ALIAS), OUTPUT (O), LIBRARY_NAME (LIB), BARCODE_PARAMS.
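
A LIBRARY_PARAMS file with the columns listed above can be generated with Python's csv module. The sketch below is illustrative only: the function name library_params_text, the file names, sample names, library names and barcode sequences are all made up, and it assumes one barcode per cluster.

```python
import csv
import io


def library_params_text(rows, num_barcodes=1):
    """Render rows as tab-separated text with the LIBRARY_PARAMS columns."""
    columns = ["OUTPUT", "SAMPLE_ALIAS", "LIBRARY_NAME"]
    columns += [f"BARCODE_{i}" for i in range(1, num_barcodes + 1)]
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical samples; the last row catches reads with no barcode match.
rows = [
    {"OUTPUT": "sample_a.bam", "SAMPLE_ALIAS": "sample_a",
     "LIBRARY_NAME": "lib_a", "BARCODE_1": "ACGTACGT"},
    {"OUTPUT": "sample_b.bam", "SAMPLE_ALIAS": "sample_b",
     "LIBRARY_NAME": "lib_b", "BARCODE_1": "TGCATGCA"},
    {"OUTPUT": "unmatched.bam", "SAMPLE_ALIAS": "unmatched",
     "LIBRARY_NAME": "unmatched", "BARCODE_1": "N"},
]

text = library_params_text(rows)
print(text, end="")
```

Writing the returned text to library.params gives a file that could be passed as LIBRARY_PARAMS=library.params. Extra two-letter RG attribute columns would need to be added to the column list.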
ADAPTERS_TO_CHECK (IlluminaAdapterPair)    Which adapters to look for in the read. Default value: [INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM]. Possible values: {PAIRED_END, INDEXED, SINGLE_END, NEXTERA_V1, NEXTERA_V2, DUAL_INDEXED, FLUIDIGM, TRUSEQ_SMALLRNA, ALTERNATIVE_SINGLE_END}. This option may be specified 0 or more times. This option can be set to 'null' to clear the default list.
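NUM_PROCESSORS (Integer)    The number of threads to run in parallel. If NUM_PROCESSORS = 0, the number of cores is automatically set to the number of cores available on the machine. If NUM_PROCESSORS < 0, then the number of cores used will be the number available on the machine less NUM_PROCESSORS. Default value: 0.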
FIRST_TILE (Integer)    If set, this is the first tile to be processed (used for debugging). Note that tiles are not processed in numerical order. Default value: null.
TILE_LIMIT (Integer)    If set, process no more than this many tiles (used for debugging). Default value: null.
FORCE_GC (Boolean)    If true, call System.gc() periodically. This is useful in cases in which the -Xmx value passed is larger than the available memory. Default value: true. This option can be set to 'null' to clear the default value. Possible values: {true, false}
APPLY_EAMSS_FILTER (Boolean)    Apply EAMSS filtering to identify inappropriately quality scored bases towards the ends of reads and convert their quality scores to Q2. Default value: true. This option can be set to 'null' to clear the default value. Possible values: {true, false}
MAX_READS_IN_RAM_PER_TILE (Integer)    Configure SortingCollections to store this many records before spilling to disk. For an indexed run, each SortingCollection gets this value/number of indices. Default value: 1200000. This option can be set to 'null' to clear the default value.
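
The per-collection share described for indexed runs is a simple division. The Python sketch below shows the arithmetic for the default value; the function name records_per_collection, the run size of 96 indices and the use of integer division are assumptions for illustration, not taken from the tool's source.

```python
def records_per_collection(max_reads_in_ram_per_tile, num_indices):
    """Records each SortingCollection may hold before spilling to disk."""
    if num_indices < 1:
        raise ValueError("num_indices must be at least 1")
    # Assumption: the share is rounded down to a whole number of records.
    return max_reads_in_ram_per_tile // num_indices


# Default MAX_READS_IN_RAM_PER_TILE with a hypothetical 96-index run.
share = records_per_collection(1200000, 96)
print(share)
```

With these inputs each collection would hold 12500 records before spilling, so heavily multiplexed lanes spill to disk sooner unless the option is raised.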
MINIMUM_QUALITY (Integer)    The minimum quality (after transforming 0s to 1s) expected from reads. If qualities are lower than this value, an error is thrown. The default of 2 is what Illumina's spec describes as the minimum, but in practice lower values have been observed. Default value: 2. This option can be set to 'null' to clear the default value.
INCLUDE_NON_PF_READS (Boolean)    Whether to include non-PF reads. Default value: true. This option can be set to 'null' to clear the default value. Possible values: {true, false}
IGNORE_UNEXPECTED_BARCODES (Boolean)    Whether to ignore reads whose barcodes are not found in LIBRARY_PARAMS. Useful when outputting BAMs for only a subset of the barcodes in a lane. Default value: false. This option can be set to 'null' to clear the default value. Possible values: {true, false}
MOLECULAR_INDEX_TAG (String)    The tag to use to store any molecular indexes. If more than one molecular index is found, they will be concatenated and stored here. Default value: RX. This option can be set to 'null' to clear the default value.
MOLECULAR_INDEX_BASE_QUALITY_TAG (String)    The tag to use to store any molecular index base qualities. If more than one molecular index is found (i.e. the READ_STRUCTURE contains more than one "M" operator), their qualities will be concatenated and stored here. Default value: QX. This option can be set to 'null' to clear the default value.
TAG_PER_MOLECULAR_INDEX (String)    The list of tags to store each molecular index. The number of tags should match the number of molecular indexes. Default value: null. This option may be specified 0 or more times.

