Reads Manipulation


cd-hit-dup -i input.fa -o output.fa [other options]


    -i        Input file (FASTQ or FASTA);
    -i2       Second input file (FASTQ or FASTA);
    -o        Output file;
    -o2       Output file for R2;
    -d        Description length (default 0, truncate at the first whitespace character)
    -u        Length of prefix to be used in the analysis (default 0, for full/maximum length);
    -m        Match length (true/false, default true);
    -e        Maximum number/percent of mismatches allowed;
    -f        Filter out chimeric clusters (true/false, default false);
    -s        Minimum length of common sequence shared between a chimeric read
              and each of its parents (default 30, minimum 20);
    -a        Abundance cutoff (default 1 without chimeric filtering, 2 with chimeric filtering);
    -b        Abundance ratio between a parent read and a chimeric read (default 1);
    -p        Dissimilarity control for chimeric filtering (default 1);

Share your experience or ask a question