Category

Genomic Interval Manipulation


Usage

bedClip [options] input.bed chrom.sizes output.bed


Manual

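bedClip removes BED records that fall outside the chromosome boundaries defined in a chrom.sizes file or, with -truncate, trims them to fit. This tool is part of the UCSC Genome Browser utilities.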

Required arguments

  • input.bed: Input BED file.
  • chrom.sizes: A two-column, tab-separated file or URL: <chromosome name> <size in bases>. If the assembly is hosted by UCSC, chrom.sizes can be a URL of the form https://hgdownload.soe.ucsc.edu/goldenPath/db/bigZips/db.chrom.sizes, where db is the assembly name. For example, the chromosome sizes for hg38 are available at https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes. Alternatively, use the script fetchChromSizes to download the chrom.sizes file. If the assembly is not hosted by UCSC, a chrom.sizes file can be generated by running twoBitInfo on the assembly's .2bit file.
  • output.bed: Output file that receives the filtered/clipped records.
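Since bedClip depends on the two-column, tab-separated layout described above, it can be worth checking a chrom.sizes file before using it. The following is only a minimal sketch: check_chrom_sizes is a hypothetical helper, not one of the UCSC utilities, and it assumes awk is available.

```shell
# Hypothetical helper (not a UCSC utility): report every line of a
# chrom.sizes file that does not consist of exactly two tab-separated
# columns with a positive integer in the second column.
# Usage: check_chrom_sizes hg38.chrom.sizes
check_chrom_sizes() {
    awk -F'\t' '
        NF != 2 || $2 !~ /^[0-9]+$/ || $2 + 0 <= 0 {
            printf "line %d is malformed: %s\n", NR, $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }   # non-zero exit status if any line failed
    ' "$1"
}
```

The function prints nothing and returns 0 when every line matches the expected layout, so it can be chained in front of a bedClip call with &&.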

Options

  • -truncate: Truncate items that extend past the ends of a chromosome instead of dropping them (dropping is the default).
  • -verbose=2: Print a list of the lines that were clipped, with the reason for each.
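The difference between the default behaviour and -truncate can be illustrated with a small awk sketch. This is an approximation of the rule described above, not bedClip's own code: clip_sketch is a hypothetical name, the input is assumed to be tab-separated, and dropping intervals that become empty after clamping is an assumption of this sketch.

```shell
# Sketch of the drop-versus-truncate rule (an approximation, not bedClip).
# Usage: clip_sketch input.bed chrom.sizes [truncate]
# Note: unlike bedClip, unknown chromosomes are silently treated as size 0.
clip_sketch() {
    awk -F'\t' -v OFS='\t' -v truncate="${3:-}" '
        NR == FNR { size[$1] = $2; next }    # first file: chrom.sizes
        {
            start = $2 + 0; end = $3 + 0; max = size[$1] + 0
            if (start < 0 || end > max) {
                if (truncate == "") next     # default: drop the record
                if (start < 0) start = 0     # truncate: clamp to [0, max]
                if (end > max) end = max
            }
            if (start >= end) next           # assumption: drop empty intervals
            $2 = start; $3 = end
            print
        }
    ' "$2" "$1"
}
```

Called without a third argument, the sketch drops any record that crosses a chromosome end; called with a third argument such as truncate, it clamps the coordinates instead.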

Examples

For the human reference genome hg38, chromosome 1 is 248956422 bases long. In the following example, the input BED file demo.bed contains three records. After running bedClip, the records that fall outside the defined chromosome range are removed:

$ head demo.bed
chr1    -100    105048    invalid_start_position
chr1    104896    105048    valid
chr1    248957000    248958000    positions_larger_than_chromosome_size

$ bedClip demo.bed GRCh38_no_alt_analysis_set_GCA_000001405.15.genome demo.clip.bed

$ head demo.clip.bed
chr1    104896    105048    valid

bedClip assumes that every chromosome named in the input BED file has its length defined in the chrom.sizes file. If one does not, an error like the following is raised:

$ head demo2.bed
chr1    104896    105048    valid
1    104896    105048    inconsistent_chromosome_name

$ bedClip demo2.bed GRCh38_no_alt_analysis_set_GCA_000001405.15.genome demo2.clip.bed
Chromosome 1 isn't in GRCh38_no_alt_analysis_set_GCA_000001405.15.genome line 2 of demo2.bed: 1:104896-105048
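An error of this kind typically comes from a naming mismatch, such as 1 versus chr1 in the example above. One way to catch it before running bedClip is to list the chromosome names that appear in the BED file but not in the chrom.sizes file. The following is a minimal sketch: missing_chroms is a hypothetical helper, not a UCSC utility, and it assumes every line of the BED file is a record (no track or browser header lines).

```shell
# Hypothetical pre-check (not a UCSC utility): print each chromosome name
# used in a BED file that is absent from the chrom.sizes file, once each.
# Usage: missing_chroms input.bed chrom.sizes
missing_chroms() {
    awk '
        NR == FNR { known[$1] = 1; next }            # first file: chrom.sizes
        !($1 in known) && !seen[$1]++ { print $1 }   # report unknown names once
    ' "$2" "$1"
}
```

Empty output means every chromosome name in the BED file has a size defined, so bedClip should not stop with the error shown above.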

 

File formats this tool works with
BED
