To trim a 3’ adapter with cutadapt, use a command of this form:
cutadapt -a AACCGGTT -o output.fastq input.fastq
The sequence of the adapter is given with the -a option. You need to replace AACCGGTT with your actual adapter sequence. Reads are read from the input file input.fastq and written to the output file output.fastq.
Cutadapt searches for the adapter in all reads and removes it when it finds it. All reads that were present in the input file will also be present in the output file, some of them trimmed, some of them not. Even reads that were trimmed entirely (because the adapter was found at the very beginning) are output. All of this can be changed with command-line options, explained further down.
Input files for cutadapt need to be in one of the supported formats, such as FASTA or FASTQ (files compressed as .gz, .bz2 or .xz are supported).
Input and output file formats are recognized from the file name extension. The input format can also be overridden with a command-line option.
You can even use this – without any adapter trimming – to convert from FASTQ to FASTA:
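A minimal sketch of such a conversion, assuming cutadapt is installed and on the PATH. The one-read input file and the output name output.fasta are made up for illustration; the output format is taken from the .fasta extension, as described above.

```shell
# Create a tiny one-read FASTQ file (made-up data) so the example is self-contained.
printf '@read1\nACGTACGTAACCGGTT\n+\nIIIIIIIIIIIIIIII\n' > input.fastq

# No adapter option is given; the .fasta extension selects FASTA output.
# Assumption: cutadapt is on the PATH; the step is skipped otherwise.
if command -v cutadapt >/dev/null 2>&1; then
    cutadapt -o output.fasta input.fastq
fi
```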
Cutadapt supports compressed input and output files. Whether an input file needs to be decompressed or an output file needs to be compressed is detected automatically by inspecting the file name: If it ends in .gz, then gzip compression is assumed. You can therefore run cutadapt like this and it works as expected:
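As an illustration, the following sketch runs the adapter-trimming command from the beginning of this section on gzip-compressed files. The file names and the single read are placeholders, and cutadapt is assumed to be on the PATH.

```shell
# Create a tiny gzip-compressed FASTQ file (made-up data).
printf '@read1\nACGTACGTAACCGGTT\n+\nIIIIIIIIIIIIIIII\n' | gzip -c > input.fastq.gz

# Both file names end in .gz, so the input is decompressed and the output
# is compressed without any extra options.
if command -v cutadapt >/dev/null 2>&1; then
    cutadapt -a AACCGGTT -o output.fastq.gz input.fastq.gz
    # Look at the trimmed read without unpacking the file on disk.
    gzip -dc output.fastq.gz
fi
```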
All of cutadapt’s options that expect a file name support this.
Files compressed with bzip2 (.bz2) or xz (.xz) are also supported, but only if the Python installation includes the proper modules. xz files require Python 3.3 or later.
Concatenated bz2 files are not supported on Python versions before 3.3. These files are created by utilities such as pbzip2 (parallel bzip2).
Concatenated gz files are supported on all supported Python versions.
If no output file is specified via the -o option, then the output is sent to the standard output stream. Instead of the example command line from above, you can therefore also write:
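One way to write this, sketched with a made-up one-read input file and assuming cutadapt is on the PATH, is to drop -o and let the shell redirect standard output:

```shell
# Create a tiny one-read FASTQ file (made-up data).
printf '@read1\nACGTACGTAACCGGTT\n+\nIIIIIIIIIIIIIIII\n' > input.fastq

# Without -o, the trimmed reads go to standard output, which the shell
# redirects into output.fastq.
if command -v cutadapt >/dev/null 2>&1; then
    cutadapt -a AACCGGTT input.fastq > output.fastq
fi
```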
There is one difference in behavior if you use cutadapt without -o: The report is sent to the standard error stream instead of standard output. You can redirect it to a file like this:
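A sketch of such a redirection follows. The file name report.txt is an arbitrary choice for this example, the input read is made up, and cutadapt is assumed to be on the PATH.

```shell
# Create a tiny one-read FASTQ file (made-up data).
printf '@read1\nACGTACGTAACCGGTT\n+\nIIIIIIIIIIIIIIII\n' > input.fastq

# Reads go to standard output (captured in output.fastq); the report goes to
# standard error (captured in report.txt, a name chosen for this example).
if command -v cutadapt >/dev/null 2>&1; then
    cutadapt -a AACCGGTT input.fastq > output.fastq 2> report.txt
fi
```

Keeping the two streams in separate files means the report text cannot end up mixed into the FASTQ output.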
Wherever cutadapt expects a file name, you can also write a dash (-) in order to specify that standard input or output should be used. For example:
tail -n 4 prints out only the last four lines of input.fastq, which are then piped into cutadapt. Thus, cutadapt will work only on the last read in the input file.
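The pipeline described above can be sketched as follows. The two reads are made up so that the effect of tail is visible, and cutadapt is assumed to be on the PATH.

```shell
# Two made-up reads; only the second one should reach cutadapt.
printf '@read1\nACGTACGTAACCGGTT\n+\nIIIIIIIIIIIIIIII\n' > input.fastq
printf '@read2\nTTTTGGGGAACCGGTT\n+\nIIIIIIIIIIIIIIII\n' >> input.fastq

# Show what tail passes on: the four lines that make up the last read.
tail -n 4 input.fastq

# The dash tells cutadapt to read from standard input instead of a file.
if command -v cutadapt >/dev/null 2>&1; then
    tail -n 4 input.fastq | cutadapt -a AACCGGTT - > output.fastq
fi
```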
In most cases, you should probably use - at most once for an input file and at most once for an output file, in order not to get mixed output.
You cannot combine - and gzip compression since cutadapt needs to know the file name of the output or input file. If you want to have a gzip-compressed output file, use -o with an explicit name.
One last “trick” is to use /dev/null as an output file name. This special file discards everything you send into it. If you only want to see the statistics output, for example, and do not care about the trimmed reads at all, you could use something like this:
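A sketch of this trick, again with a made-up one-read input file and assuming cutadapt is on the PATH:

```shell
# Create a tiny one-read FASTQ file (made-up data).
printf '@read1\nACGTACGTAACCGGTT\n+\nIIIIIIIIIIIIIIII\n' > input.fastq

# The trimmed reads are discarded; only the report is printed to the terminal.
if command -v cutadapt >/dev/null 2>&1; then
    cutadapt -a AACCGGTT -o /dev/null input.fastq
fi
```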
Cutadapt can do a lot more in addition to removing adapters. There are various command-line options that make it possible to modify and filter reads and to redirect them to various output files. Each read is processed in the following way:
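First, read modification options are applied, in the order in which they are listed in the help shown by cutadapt --help under the “Additional read modifications” heading. Adapter trimming itself does not appear in that list and is done after quality trimming and before length trimming.
Then, filtering options are applied, in the order in which they are listed in the help shown by cutadapt --help under the “Filtering of processed reads” heading.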
Cutadapt supports trimming of multiple types of adapters, including anchored 3’ adapters, anchored 5’ adapters, and adapters that can occur at either the 5’ or the 3’ end (both possible).