cutadapt -a AACCGGTT -o output.fastq input.fastq
The sequence of the adapter is given with the -a option. You need to replace AACCGGTT with your actual adapter sequence. Reads are read from the input file input.fastq and written to the output file output.fastq.
Cutadapt searches for the adapter in all reads and removes it when it finds it. All reads that were present in the input file will also be present in the output file, some of them trimmed, some of them not. Even reads that were trimmed entirely (because the adapter was found at the very beginning) are output. All of this can be changed with command-line options, explained further down.
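To make the trimming behavior concrete, here is a minimal sketch that mimics 3’ adapter removal with plain shell parameter expansion. It is not how cutadapt is implemented: the helper name trim_3prime is made up for illustration, and it only handles exact adapter matches, whereas the real tool also tolerates mismatches and partial adapter occurrences.

```shell
# Hypothetical helper (not part of cutadapt): exact-match 3' adapter trimming.
trim_3prime() {
    read_seq=$1
    adapter=$2
    # Remove the first occurrence of the adapter and everything after it.
    trimmed=${read_seq%%"$adapter"*}
    printf '%s\n' "$trimmed"
}

trim_3prime ACGTACGTAACCGGTTTTTT AACCGGTT   # adapter in the middle: prints ACGTACGT
trim_3prime ACGTACGTACGT AACCGGTT           # no adapter: read is printed unchanged
trim_3prime AACCGGTTACGT AACCGGTT           # adapter at the start: prints an empty line
```

The third call corresponds to a read that is trimmed entirely: an (empty) record is still produced rather than dropped.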
Input files for cutadapt need to be in one of these formats:
- FASTA (file name extensions .fasta, .fa, .fna)
- FASTQ (file name extensions .fastq, .fq)
- Any of the above, compressed as .gz (even .bz2 and .xz are supported)

Input and output file formats are recognized from the file name extension. You can override the input format with the --format option.
You can even use this – without any adapter trimming – to convert from FASTQ to FASTA:
cutadapt -o output.fasta input.fastq.gz
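To show what such a conversion amounts to, the following sketch performs it with awk instead of cutadapt. The function name fastq_to_fasta is hypothetical, and the sketch assumes every FASTQ record occupies exactly four lines (no wrapped sequences).

```shell
# Hypothetical helper: convert 4-line FASTQ records on stdin to FASTA on stdout.
fastq_to_fasta() {
    awk '
        NR % 4 == 1 { print ">" substr($0, 2) }   # header: replace leading "@" with ">"
        NR % 4 == 2 { print }                     # sequence line is kept as-is
    '
}

# Two small made-up reads; quality lines are dropped in the output.
printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n' | fastq_to_fasta
```

The separator and quality lines (lines 3 and 4 of each record) have no counterpart in FASTA, so they are simply discarded.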
Cutadapt supports compressed input and output files. Whether an input file needs to be decompressed or an output file needs to be compressed is detected automatically by inspecting the file name: If it ends in .gz, then gzip compression is assumed. You can therefore run cutadapt like this and it works as expected:
cutadapt -a AACCGGTT -o output.fastq.gz input.fastq.gz
All of cutadapt’s options that expect a file name support this.
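The idea of deciding on format and compression purely from the file name can be sketched in a few lines of shell. This is an illustration of the principle only, not cutadapt's actual code; describe_file is a made-up helper name.

```shell
# Hypothetical helper: guess format and compression from a file name.
describe_file() {
    name=$1
    # First look at the outermost extension to find the compression.
    case $name in
        *.gz)  compression=gzip;  name=${name%.gz} ;;
        *.bz2) compression=bzip2; name=${name%.bz2} ;;
        *.xz)  compression=xz;    name=${name%.xz} ;;
        *)     compression=none ;;
    esac
    # Then look at the remaining extension to find the sequence format.
    case $name in
        *.fasta|*.fa|*.fna) format=fasta ;;
        *.fastq|*.fq)       format=fastq ;;
        *)                  format=unknown ;;
    esac
    printf '%s %s\n' "$format" "$compression"
}

describe_file input.fastq.gz   # prints: fastq gzip
describe_file output.fasta     # prints: fasta none
```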
Files compressed with bzip2 (.bz2) or xz (.xz) are also supported, but only if the Python installation includes the proper modules. xz files require Python 3.3 or later.
Concatenated bz2 files are not supported on Python versions before 3.3. These files are created by utilities such as pbzip2 (parallel bzip2).
Concatenated gz files are supported on all supported Python versions.
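For readers unfamiliar with the term, a concatenated gz file is simply several gzip streams appended to one another. The sketch below builds one from two made-up reads and checks that decompression yields the content of both parts; it assumes the standard gzip and mktemp utilities are available.

```shell
tmpdir=$(mktemp -d)

# Compress two records separately and append the second stream to the first.
printf '@read1\nACGT\n+\nIIII\n' | gzip -c >  "$tmpdir/reads.fastq.gz"
printf '@read2\nTTGA\n+\nIIII\n' | gzip -c >> "$tmpdir/reads.fastq.gz"

# Decompressing the combined file yields the lines of both records.
concatenated_lines=$(gzip -dc "$tmpdir/reads.fastq.gz" | awk 'END { print NR }')
echo "lines after decompression: $concatenated_lines"

rm -r "$tmpdir"
```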
If no output file is specified via the -o option, then the output is sent to the standard output stream. Instead of the example command line from above, you can therefore also write:
cutadapt -a AACCGGTT input.fastq > output.fastq
There is one difference in behavior if you use cutadapt without -o: The report is sent to the standard error stream instead of standard output. You can redirect it to a file like this:
cutadapt -a AACCGGTT input.fastq > output.fastq 2> report.txt
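How the two redirections keep reads and report apart can be tried without cutadapt. In the sketch below, fake_trimmer is a made-up stand-in that writes one read to standard output and an invented report line to standard error; none of its output is real cutadapt output.

```shell
# Hypothetical stand-in for a tool that writes reads to stdout and a report to stderr.
fake_trimmer() {
    printf '@read1\nACGT\n+\nIIII\n'
    printf 'report: 1 read processed (made-up text)\n' >&2
}

tmpdir=$(mktemp -d)

# ">" captures standard output, "2>" captures standard error.
fake_trimmer > "$tmpdir/output.fastq" 2> "$tmpdir/report.txt"

reads=$(cat "$tmpdir/output.fastq")
report=$(cat "$tmpdir/report.txt")
printf 'reads file:\n%s\nreport file:\n%s\n' "$reads" "$report"

rm -r "$tmpdir"
```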
Wherever cutadapt expects a file name, you can also write a dash (-) in order to specify that standard input or output should be used. For example:
tail -n 4 input.fastq | cutadapt -a AACCGGTT - > output.fastq
The tail -n 4 prints out only the last four lines of input.fastq, which are then piped into cutadapt. Thus, cutadapt will work only on the last read in the input file.
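The claim that four lines equal one read holds only when every FASTQ record spans exactly four lines. Under that assumption, the following sketch builds a small made-up input file and shows what tail -n 4 would hand to the next command in the pipe.

```shell
tmpdir=$(mktemp -d)

# A made-up input file with two four-line FASTQ records.
printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n' > "$tmpdir/input.fastq"

# The last four lines are exactly the last record.
last_read=$(tail -n 4 "$tmpdir/input.fastq")
printf '%s\n' "$last_read"

rm -r "$tmpdir"
```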
In most cases, you should probably use - at most once for an input file and at most once for an output file, in order not to get mixed output.
You cannot combine - and gzip compression since cutadapt needs to know the file name of the output or input file. If you want to have a gzip-compressed output file, use -o with an explicit name.
One last “trick” is to use /dev/null as an output file name. This special file discards everything you send into it. If you only want to see the statistics output, for example, and do not care about the trimmed reads at all, you could use something like this:
cutadapt -a AACCGGTT -o /dev/null input.fastq
Cutadapt can do a lot more in addition to removing adapters. There are various command-line options that make it possible to modify and filter reads and to redirect them to various output files. Each read is processed in the following way:
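1. Read modification options are applied, in the order in which they are listed in the help shown by cutadapt --help under the “Additional read modifications” heading. Adapter trimming itself does not appear in that list and is done after quality trimming and before length trimming (--length/-l).
2. Filtering options are applied, in the order in which they are listed in the help shown by cutadapt --help under the “Filtering of processed reads” heading.

Cutadapt supports trimming of multiple types of adapters: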
Adapter type | Command-line option |
---|---|
3’ adapter | -a ADAPTER |
5’ adapter | -g ADAPTER |
Anchored 3’ adapter | -a ADAPTER$ |
Anchored 5’ adapter | -g ^ADAPTER |
5’ or 3’ (both possible) | -b ADAPTER |
Linked adapter | -a ADAPTER1...ADAPTER2 |
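The difference between regular and anchored adapters can be illustrated with exact string matching in the shell. A regular 3’ adapter removes the adapter and everything after it; the sketch below covers the 5’ and anchored cases. All function names are hypothetical, and unlike cutadapt the sketch ignores mismatches and partial matches.

```shell
# Regular 5' adapter (like -g ADAPTER): remove the adapter and everything before it.
trim_5prime() {
    s=$1
    out=${s#*"$2"}
    printf '%s\n' "$out"
}

# Anchored 5' adapter (like -g ^ADAPTER): trim only if the read starts with the adapter.
trim_anchored_5prime() {
    s=$1
    case $s in
        "$2"*) s=${s#"$2"} ;;
    esac
    printf '%s\n' "$s"
}

# Anchored 3' adapter (like -a ADAPTER$): trim only if the read ends with the adapter.
trim_anchored_3prime() {
    s=$1
    case $s in
        *"$2") s=${s%"$2"} ;;
    esac
    printf '%s\n' "$s"
}

trim_5prime          TTTTAACCGGTTACGT AACCGGTT   # prints ACGT
trim_anchored_5prime AACCGGTTACGT     AACCGGTT   # prints ACGT
trim_anchored_5prime TTAACCGGTTACGT   AACCGGTT   # not at the start: read unchanged
trim_anchored_3prime ACGTAACCGGTT     AACCGGTT   # prints ACGT
trim_anchored_3prime ACGTAACCGGTTCC   AACCGGTT   # not at the end: read unchanged
```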