Category

Genome Annotation


Usage

genePredToGtf database genePredTable output.gtf


Manual

This tool is part of UCSC Genome Browser's Utility Tools.

If database is 'file', then genePredTable is interpreted as a file rather than as a table in the database.

Options

  • -utr: add 5UTR and 3UTR features
  • -honorCdsStat: use cdsStartStat/cdsEndStat when defining start/end codon records
  • -source=src: set the source name to use
  • -addComments: add comments before each set of transcript records, which allows for easier visual inspection

Note: use a refFlat table or extended genePred table or file to include the gene_name attribute in the output. This will not work with a refFlat table dump file. If you are using a genePred file that starts with a numeric bin column, drop it using the UNIX cut command:

cut -f 2- in.gp | genePredToGtf file stdin out.gtf
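
Whether a given genePred file carries a bin column is not always obvious, so the check can be scripted. The following is a minimal sketch, not part of the UCSC tools: strip_bin is a hypothetical helper name, and it assumes that a file has a bin column when the first field of its first line is an integer and the fourth field is the strand (+ or -).

```shell
# Hypothetical helper (not a UCSC utility): print a genePred file to stdout,
# dropping the first column only when it looks like a numeric bin column.
# Assumption: with a bin column, field 1 is an integer and field 4 is the strand.
strip_bin() {
    if head -n 1 "$1" | awk -F'\t' '{ exit !($1 ~ /^[0-9]+$/ && $4 ~ /^[+-]$/) }'; then
        cut -f 2- "$1"   # bin column detected: drop it
    else
        cat "$1"         # no bin column: pass the file through unchanged
    fi
}

# Made-up one-line genePred record with a leading bin column, for demonstration only.
demo_gp=$(mktemp)
printf '585\tNM_demo\tchr1\t+\t100\t500\t150\t450\t2\t100,300,\t200,500,\n' > "$demo_gp"
strip_bin "$demo_gp"
rm -f "$demo_gp"
```

The output of the helper could then be piped into the converter in the same way as above, for example strip_bin in.gp | genePredToGtf file stdin out.gtf.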

Examples

Dump the GENCODE V26 Basic gene annotations (table wgEncodeGencodeBasicV26, human genome assembly hg38) from UCSC and save them to a file named utilityOutputBasic26.gtf:

genePredToGtf hg38 wgEncodeGencodeBasicV26 utilityOutputBasic26.gtf

If you have downloaded a table from UCSC (for example, the knownGene table for hg19), it will be in genePred format, and you can use the local file as input for the genePredToGtf conversion:

wget ftp://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/knownGene.txt.gz  # download the table from UCSC
zcat knownGene.txt.gz | cut -f1-10 | genePredToGtf file stdin knownGene.gtf  # convert the table to gtf format
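
After a conversion, a quick sanity check is to tally the records per feature type, which is the third column of a GTF file. The sketch below is only an illustration: gtf_feature_counts is a hypothetical helper name, the GTF fragment is made up rather than real genePredToGtf output, and comment lines are assumed to start with '#'.

```shell
# Hypothetical helper: count GTF records per feature type (column 3),
# skipping lines that start with '#' (assumed to be comments).
gtf_feature_counts() {
    grep -v '^#' "$1" | cut -f 3 | sort | uniq -c | awk '{ print $2 "\t" $1 }'
}

# Tiny made-up GTF fragment, used only to demonstrate the helper.
demo_gtf=$(mktemp)
printf '%s\n' '# example comment line' > "$demo_gtf"
printf 'chr1\tdemo\ttranscript\t101\t500\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n' >> "$demo_gtf"
printf 'chr1\tdemo\texon\t101\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n' >> "$demo_gtf"
printf 'chr1\tdemo\texon\t301\t500\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n' >> "$demo_gtf"

gtf_feature_counts "$demo_gtf"   # one line per feature type: name, tab, count
rm -f "$demo_gtf"
```

Run against a real output file, for example gtf_feature_counts knownGene.gtf, this gives a rough view of which record types were written.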

 

