GIREMI manual | BioQueue Encyclopedia

Usage

giremi [options] in1.bam [in2.bam [...]]

Manual

NOTE:

The BAM files should contain all final mapped reads in all chromosomes.
If multiple BAM files are provided as input to GIREMI, they are treated as replicates and combined into one data set, yielding a single set of editing sites. To analyze biological replicates separately, run GIREMI on each file individually.

Required:

-f, --fasta-ref FILE reference genome sequence file in fasta format (NOTE: the faidx index file generated by samtools should be saved in the same directory as this fasta file)
-l, --positions FILE the list of all filtered SNVs after removing likely sequencing errors or SNVs due to other artifacts (see our paper for details)
-o, --output FILE write output to FILE.res

Options:

-m, --min INT minimal number of total reads covering candidate editing sites [default: 5]
-p, --paired-end INT 1:paired-end RNA-Seq reads; 0:single-end [default: 1]
-s, --strand INT 0:non-strand specific RNA-Seq; 1: strand-specific RNA-Seq and read 1 (first read for the paired-end reads) is sense to RNA; 2: strand-specific RNA-Seq and read 1 is anti-sense to RNA [default: 0]
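
As an illustration of how the required arguments and options above fit together, here is a minimal Python sketch that assembles a giremi argument list. The helper name build_giremi_command and all file names are hypothetical; only the flags themselves come from this manual. The list could be passed to subprocess.run if giremi is installed and on the PATH.

```python
from typing import List, Sequence


def build_giremi_command(
    bam_files: Sequence[str],
    fasta_ref: str,
    positions: str,
    output: str,
    min_reads: int = 5,
    paired_end: int = 1,
    strand: int = 0,
) -> List[str]:
    """Assemble a giremi argument list from the options documented above."""
    if not bam_files:
        raise ValueError("at least one BAM file is required")
    if paired_end not in (0, 1):
        raise ValueError("paired_end must be 0 (single-end) or 1 (paired-end)")
    if strand not in (0, 1, 2):
        raise ValueError("strand must be 0, 1, or 2")
    return [
        "giremi",
        "-f", fasta_ref,        # reference genome (its faidx index must sit beside it)
        "-l", positions,        # filtered SNV list
        "-o", output,           # results are written to <output>.res
        "-m", str(min_reads),
        "-p", str(paired_end),
        "-s", str(strand),
        *bam_files,             # several BAM files are treated as replicates
    ]


# Hypothetical file names, for illustration only.
cmd = build_giremi_command(["rep1.bam", "rep2.bam"], "genome.fa", "snvs.txt", "sample")
print(" ".join(cmd))
```

With these inputs the two BAM files are passed in one call, so they would be combined as replicates; calling the helper once per file corresponds to analyzing replicates separately.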

Required format of the file containing the list of SNVs (-l option):

column 1 : The name of the chromosome or scaffold
column 2 : The starting position of the SNV in the chromosome or scaffold (0-based)
column 3 : The ending position of the SNV in the chromosome or scaffold (1-based)
column 4 : The name of the gene harboring this SNV; “Inte”: the SNV resides in the Intergenic region
column 5 : A flag, 1: the SNV belongs to dbSNP; 0: otherwise
column 6 : Strand (+ or -); “#” for “Inte” gene
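
The six columns above can be sketched as a small Python formatter. This is an illustration under stated assumptions, not part of GIREMI: the helper name format_snv_line is hypothetical, the tab delimiter is an assumption (the manual does not name the separator), and each SNV is assumed to span a single base, so a 1-based position p gives start p-1 and end p.

```python
def format_snv_line(chrom: str, position: int, gene: str,
                    in_dbsnp: bool, strand: str) -> str:
    """Format one single-base SNV as a six-column line.

    `position` is the 1-based coordinate of the SNV.
    """
    if position < 1:
        raise ValueError("position must be a 1-based coordinate (>= 1)")
    if gene == "Inte":
        strand = "#"  # intergenic SNVs carry "#" in the strand column
    elif strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    fields = [
        chrom,
        str(position - 1),        # column 2: start, 0-based
        str(position),            # column 3: end, 1-based
        gene,                     # column 4: gene name or "Inte"
        "1" if in_dbsnp else "0",  # column 5: dbSNP flag
        strand,                   # column 6: strand
    ]
    return "\t".join(fields)  # tab delimiter is an assumption


# Made-up coordinates and gene name, for illustration only.
print(format_snv_line("chr1", 1000, "GENE1", False, "+"))
print(format_snv_line("chr2", 500, "Inte", True, "+"))
```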

Format of the output file:

NOTE: This output file includes a rich list of information about the SNVs. Not all sites in this file are predicted as RNA editing sites; see the ifRNAE field.

chr : Name of the chromosome or scaffold
coordinate : Position of the SNVs in the chromosome or scaffold (1-based)
strand : Strand information
ifSNP : 1: the SNV is included in dbSNP; 0: otherwise
gene : Name of the gene harboring this SNV
reference_base : The nucleotide of this SNV in the reference chromosome (+ strand)
upstream_1base : The upstream neighboring nucleotide of this SNV in the reference chromosome (+ strand)
downstream_1base : The downstream neighboring nucleotide of this SNV in the reference chromosome (+ strand)
major_base : The major nucleotide of the SNV in the RNA-seq data
major_count : Number of reads with the major nucleotide
tot_count : Total number of reads covering this SNV in the RNA-Seq data
major_ratio : The ratio of major nucleotide (major_count/tot_count)
MI : The mutual information of this SNV if a value exists
pvalue_mi : P-value from the MI test if applicable
estimated_allelic_ratio : Estimated allelic ratio of the gene harboring this SNV
ifNEG : 1: this SNV was a negative control in the training data
RNAE_t : Type of RNA editing or RNA-DNA mismatches (A-to-G, etc)
A,C,G,T : Numbers of reads with specific nucleotides at this site
ifRNAE : 1: the SNV is predicted as an RNA editing site based on MI analysis; 2: the SNV is predicted as an RNA editing site based on GLM; 0: the SNV is not predicted as an RNA editing site
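
Because the output lists every SNV, not only predicted sites, a common first step is to keep only rows whose ifRNAE value is 1 or 2. The Python sketch below shows one way to do that. It assumes the .res file is tab-delimited with a header row whose names match the fields above; the manual does not state this, so check it against a real output file. The function name and the sample rows are made up for illustration.

```python
import csv
import io

# Meaning of each ifRNAE code, as described in the field list above.
IFRNAE_LABELS = {
    "0": "not predicted",
    "1": "predicted by MI analysis",
    "2": "predicted by GLM",
}


def predicted_editing_sites(handle):
    """Yield rows whose ifRNAE field marks a predicted editing site (1 or 2)."""
    reader = csv.DictReader(handle, delimiter="\t")  # header row is an assumption
    for row in reader:
        if row["ifRNAE"] in ("1", "2"):
            yield row


# Made-up, abbreviated rows for illustration; a real file has more fields.
sample = io.StringIO(
    "chr\tcoordinate\tstrand\tRNAE_t\tifRNAE\n"
    "chr1\t1000\t+\tA-to-G\t1\n"
    "chr1\t2000\t-\tC-to-T\t0\n"
    "chr2\t3000\t+\tA-to-G\t2\n"
)
sites = list(predicted_editing_sites(sample))
for row in sites:
    print(row["chr"], row["coordinate"], IFRNAE_LABELS[row["ifRNAE"]])
```

To read a real file, replace the StringIO object with an open file handle, for example open("sample.res", newline=""), where sample.res is a hypothetical output name.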
