Category

Fetch Data


Usage

faSomeRecords in.fa listFile out.fa


Manual

This tool is part of the UCSC Genome Browser utilities. It extracts a chosen subset of records from a FASTA file.

Arguments

  • in.fa: FASTA file where all the sequences are stored.
  • listFile: path to the file that lists the IDs of the sequences you want to extract, one ID per line. Note: don't include the leading > in the list.
  • out.fa: Extracted sequences will be written into this file.
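
If you are unsure which IDs a FASTA file contains, one way to build a list file is to pull the IDs out of the header lines. The sketch below is only an illustration using standard Unix tools: the file names demo.fa and demo_ids.txt and the demo records are made up, and it assumes the ID is the first word after the > (as in the chr19 example further down).

```shell
# Create a tiny demo FASTA file (hypothetical contents, for illustration only).
printf '>chr19  some description\nACGT\n>chr22  another description\nGGCC\n>chrM\nTTAA\n' > demo.fa

# Keep only the header lines, drop the leading '>', and keep the first word (the ID).
grep '^>' demo.fa | sed 's/^>//' | awk '{print $1}' > demo_ids.txt

# Show the resulting list: one ID per line, without '>'.
cat demo_ids.txt
```

You can then delete the lines you don't need from demo_ids.txt and use the file as listFile.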

Options

  • -exclude: output sequences not in the list file.
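
As a rough sketch of how -exclude might be used, the snippet below builds a small demo input and asks for every record that is not in the list. The file names (demo_in.fa, demo_list.txt, demo_rest.fa) and the demo records are hypothetical, and the snippet simply skips the run if faSomeRecords is not installed.

```shell
# Hypothetical demo input: three short records.
printf '>chr19 demo\nACGT\n>chr22 demo\nGGCC\n>chrM demo\nTTAA\n' > demo_in.fa

# List of IDs to leave out (no leading '>').
printf 'chr19\nchr22\n' > demo_list.txt

if command -v faSomeRecords >/dev/null 2>&1; then
    # Write every record whose ID is NOT in demo_list.txt.
    faSomeRecords -exclude demo_in.fa demo_list.txt demo_rest.fa
    # List the headers of the records that were kept.
    grep '^>' demo_rest.fa
else
    echo "faSomeRecords not found on PATH; skipping demo" >&2
fi
```

With this input, only the chrM record should end up in demo_rest.fa if the option behaves as described above.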

Examples

To extract the sequences of chromosome 19 and chromosome 22 from a reference genome FASTA file (assume it's called in.fa), you need a list file (assume it's called listFile) with the following content:

$ cat listFile

chr19
chr22

The following command will extract the sequences and save them into a file called out.fa:

$ faSomeRecords in.fa listFile out.fa

You can check the extracted sequences with the head command:

$ head out.fa

>chr19  AC:CM000681.2  gi:568336005  LN:58617616  rl:Chromosome  M5:85f9f4fc152c58cb7913c06d6b98573a  AS:GRCh38  hm:multiple
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN

