Category

Sequence Analysis


Usage

faFrag [options] in.fa start end out.fa


Manual

faFrag extracts a piece of DNA from a FASTA file. It is part of the UCSC Genome Browser utilities.

Required arguments

  • in.fa: Input FASTA file.
  • start: Start position for DNA extraction (zero-based).
  • end: End position for DNA extraction. The base at this position is not included, so end - start bases are written.
  • out.fa: Output FASTA file.
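
As a quick sanity check before running the tool, the expected fragment length can be computed from the two coordinates. This is a minimal sketch that assumes the interval is half-open (the end position is excluded), which agrees with the base count reported in the example further down; the variable names are illustrative only.

```shell
# Coordinates taken from the ISG15 example in this manual.
start=1013497
end=1014540

# Assumption: half-open interval, so the fragment length is end - start.
length=$((end - start))
echo "$length"
```

If the number printed here differs from the "Wrote N bases" message that faFrag reports, the coordinates were probably given in a different convention (for example one-based, inclusive).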

Options

  • -mixed: Preserve mixed-case in the FASTA file.

Examples

Get a subsequence from a FASTA file

The following example extracts the sequence of the gene ISG15 (chr1:1013497-1014540) from the reference sequence of human chromosome 1 (chr1.fa):

$ faFrag chr1.fa 1013497 1014540 isg15.fa
Wrote 1043 bases to isg15.fa

$ head isg15.fa
>chr1:1013497-1014540
gcggctgagaggcagcgaactcatctttgccagtacaggagcttgtgccg
tggcccacagcccacagcccacagccatggtaaggcagatgtcacaggtg
gggggaggtgggctctgtgccagccaattttcgtctccctcccccagcca
aggtctcccaggggtgcagggagagcggagctgctcagagcttggccagg
ttctaagtgtgctcctgaaagcaggtcacccctgagatcctcagggtggg
gcacagaggggcaccctagcaggtaaagggaggccacgggatggcggtgg
gcagctggccttctagtaacgagccctcagtgccttctgtgcctggggtc
cctgccggcgggatgtagaggacagacaggagggagcactgtccctgggt
acaggagctcgccctgcagccagtgccttgtgtgtggtgggcctggggct
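
To confirm that an output file really holds the reported number of bases, one option is to count the sequence characters with standard Unix tools. The sketch below uses a small hypothetical file, demo.fa, as a stand-in for a real faFrag output such as isg15.fa; it does not call faFrag itself.

```shell
# Hypothetical demo file standing in for a faFrag output such as isg15.fa.
printf '>demo:0-12\nacgtac\ngtacgt\n' > demo.fa

# Drop the header line, remove newlines, and count the remaining characters.
# The final tr strips the padding that some wc implementations add.
bases=$(grep -v '^>' demo.fa | tr -d '\n' | wc -c | tr -d ' ')
echo "$bases"
```

Running the same pipeline on isg15.fa should give the figure from the "Wrote ... bases" message.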

Note: If a FASTA file contains multiple sequences, faFrag works only on the first one and prints a warning like the following:

More than one sequence in GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta, just using first

If this is not the behavior you want, use faOneRecord to extract the sequence that contains your region (chr1 in this case) into its own file, then run faFrag on that file:

$ faOneRecord GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta chr1 > chr1.fa
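
To find out in advance whether a file holds more than one sequence, and which record names are available, the header lines can be inspected with grep. This is a sketch using standard Unix tools rather than a UCSC utility; multi_demo.fa is a hypothetical two-record file created only for illustration.

```shell
# Hypothetical two-record file standing in for a multi-sequence FASTA.
printf '>chr1\nacgt\n>chr2\nttga\n' > multi_demo.fa

# Each record starts with a '>' header line, so counting those lines
# counts the sequences in the file.
records=$(grep -c '^>' multi_demo.fa)
echo "$records"

# List the record names: the text after '>' up to the first space.
names=$(grep '^>' multi_demo.fa | sed 's/^>//' | cut -d' ' -f1)
echo "$names"
```

A count greater than 1 means faFrag would use only the first record, so the record of interest should be extracted first.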

 

File formats this tool works with
FASTA
