faCount file(s).fa
This tool is part of UCSC Genome Browser's utilities.
faCount counts the occurrences of A, C, G, T, N, and CpG in each sequence record of the given input files and prints the results to stdout.
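To make the per-record counting concrete, here is a minimal Python sketch of the same idea. It is not faCount's actual implementation: the helper names `read_fasta` and `count_bases` are hypothetical, and the sketch assumes that counting is case-insensitive and that a CpG is any occurrence of the dinucleotide `CG`, including one that spans a line break within a record.

```python
import io
from collections import Counter


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA file-like object."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            # Keep only the first word of the header as the record name.
            name, chunks = (fields[0] if fields else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def count_bases(seq):
    """Count A/C/G/T/N and CG dinucleotides (assumed case-insensitive)."""
    seq = seq.upper()
    counts = Counter(seq)
    return {
        "len": len(seq),
        "A": counts["A"],
        "C": counts["C"],
        "G": counts["G"],
        "T": counts["T"],
        "N": counts["N"],
        # Lines are joined before counting, so a CG split across two
        # FASTA lines is still found.
        "cpg": seq.count("CG"),
    }


if __name__ == "__main__":
    demo = io.StringIO(">seq1 demo record\nACGT\nNNCG\n>seq2\nccgg\n")
    print("#seq", "len", "A", "C", "G", "T", "N", "cpg", sep="\t")
    for name, seq in read_fasta(demo):
        c = count_bases(seq)
        print(name, *(c[k] for k in ("len", "A", "C", "G", "T", "N", "cpg")), sep="\t")
```

The whole sequence of a record is held in memory here for simplicity; for chromosome-sized records a streaming counter that remembers the last base of the previous line would be the more economical choice.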
Count base occurrences for the human reference genome (hg38):
$ faCount GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta | head
#seq  len        A         C         G         T         N         cpg
chr1  248956422  67070277  48055043  48111528  67244164  18475410  2375159
chr2  242193529  71791213  48318180  48450903  71987932  1645301   2192670
chr3  198295559  59689091  39233483  39344259  59833302  195424    1673293
chr4  190214555  58561236  36236976  36331025  58623430  461888    1503429
chr5  181538259  54053328  35315012  35401468  54213385  2555066   1523709
chr6  170805979  51345477  33646690  33713330  51373025  727457    1511189
chr7  159345973  47058248  32317984  32378859  47215040  375842    1622825
chr8  145138636  43333530  29030173  29103787  43300646  370500    1338200
chr9  138394717  35736329  25099811  25170662  35783748  16604167  1255728
Add the -summary option to get the overall statistics across all sequences, as totals and as fractions of the total length:
$ faCount GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta -summary | head
#seq   len         A          C          G          T          N          cpg
total  3099922541  866420001  598683433  600854940  868918077  165046090  29303965
prcnt  1.0         0.2795     0.1931     0.1938     0.2803     0.0532     0.0095
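The summary rows can also be recomputed from the per-sequence table, which is handy when only a subset of sequences is of interest. The following Python sketch assumes the whitespace-separated column layout shown in this document; the helper names `parse_facount` and `summarize` are hypothetical, and rows named `total` or `prcnt` are skipped on the assumption that no real sequence uses those names.

```python
FIELDS = ("len", "A", "C", "G", "T", "N", "cpg")


def parse_facount(text):
    """Parse faCount's per-sequence table into {name: {field: count}}."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the '#seq' header, and any summary rows.
        if not parts or parts[0].startswith("#") or parts[0] in ("total", "prcnt"):
            continue
        rows[parts[0]] = dict(zip(FIELDS, (int(p) for p in parts[1:8])))
    return rows


def summarize(rows):
    """Return (totals, fractions of total length) over all parsed rows."""
    total = {f: sum(r[f] for r in rows.values()) for f in FIELDS}
    length = total["len"]
    fractions = {f: (total[f] / length if length else 0.0) for f in FIELDS}
    return total, fractions


if __name__ == "__main__":
    # Two rows copied from the per-sequence output shown earlier.
    table = """#seq len A C G T N cpg
chr1 248956422 67070277 48055043 48111528 67244164 18475410 2375159
chr2 242193529 71791213 48318180 48450903 71987932 1645301 2192670
"""
    total, frac = summarize(parse_facount(table))
    print("total", *(total[f] for f in FIELDS))
    print("prcnt", *(round(frac[f], 4) for f in FIELDS))
```

The fractions are relative to the total length including N, which matches how the `prcnt` row above sums to roughly 1.0 across A, C, G, T, and N.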
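Count base occurrences on both the forward and reverse strands of the human reference genome (hg38) with the -strands option. Each sequence is counted together with its reverse complement, so len, N, and cpg double, the A and T counts become equal, and the C and G counts become equal: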
$ faCount GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta -strands | head
#seq  len        A          C         G         T          N         cpg
chr1  497912844  134314441  96166571  96166571  134314441  36950820  4750318
chr2  484387058  143779145  96769083  96769083  143779145  3290602   4385340
chr3  396591118  119522393  78577742  78577742  119522393  390848    3346586
chr4  380429110  117184666  72568001  72568001  117184666  923776    3006858
chr5  363076518  108266713  70716480  70716480  108266713  5110132   3047418
chr6  341611958  102718502  67360020  67360020  102718502  1454914   3022378
chr7  318691946  94273288   64696843  64696843  94273288   751684    3245650
chr8  290277272  86634176   58133960  58133960  86634176   741000    2676400
chr9  276789434  71520077   50270473  50270473  71520077   33208334  2511456
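The pattern in the -strands output (doubled length, A equal to T, C equal to G) follows from adding the counts of the reverse complement to those of the forward strand. The Python sketch below illustrates that arithmetic on a toy sequence; it is an assumption about what the option computes, inferred from the numbers above, and the helper names are hypothetical.

```python
from collections import Counter

# Complement table for the bases considered here; N stays N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_bases(seq):
    """Count length, A/C/G/T/N, and CG dinucleotides on one strand."""
    counts = Counter(seq)
    result = {b: counts[b] for b in "ACGTN"}
    result["len"] = len(seq)
    result["cpg"] = seq.count("CG")
    return result


def count_both_strands(seq):
    """Add forward-strand and reverse-complement counts field by field."""
    seq = seq.upper()
    fwd = count_bases(seq)
    rev = count_bases(reverse_complement(seq))
    # The strands are counted separately so that no artificial CG is
    # created at a junction between the two sequences.
    return {key: fwd[key] + rev[key] for key in fwd}


if __name__ == "__main__":
    both = count_both_strands("ACGTTNCG")
    print(both)
    # On both strands, A and T each equal forward A + forward T.
    print(both["A"] == both["T"], both["C"] == both["G"])
```

The chr1 rows in this document are consistent with this: the forward-strand A and T counts (67070277 and 67244164) sum to 134314441, the value reported for both A and T under -strands. Because `CG` is its own reverse complement, the CpG count simply doubles.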