Overview of the LoCost SARS-CoV-2 protocol

LoCost overview

The LoCost workflow provides laboratory and bioinformatics protocols for whole-genome amplicon sequencing of SARS-CoV-2. It has been designed for use with the Oxford Nanopore Technologies (ONT) MinION sequencer, whose portability and relatively low cost make it suitable for field deployment. Sequencing can also be performed on the larger ONT platforms (GridION and PromethION) if your laboratory is equipped with them. With ONT barcodes, up to 95 samples can be processed at once: each sample is uniquely labelled, allowing multiple samples to be run together in a single laboratory reaction. The laboratory protocol can be downloaded from here, and the bioinformatics protocol here.

Primer scheme

The scheme divides the primer pairs into two pools, so that neighbouring, overlapping amplicons are never amplified in the same PCR reaction. The primer-binding sites at the ends of each amplicon will carry the primer sequence rather than the sample's own sequence, potentially masking genomic variation. Therefore, primer sites are trimmed from the read data, and the overlap between amplicons provides the “true” genomic sequence. For example, in the schematic below you can see that the true sequence for the primer binding sites of amplicon 2 will be obtained from amplicons 1 and 3.

Fig 1: Primer pool schematic
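
The pooling logic can be illustrated with a minimal sketch. It assumes amplicons simply alternate between two pools in genome order, and it uses made-up coordinates (400 bp amplicons tiled every 320 bp); the names Amplicon, assign_pools and overlaps_within_pool are hypothetical and are not part of any ARTIC tool.

```python
from typing import NamedTuple


class Amplicon(NamedTuple):
    name: str
    start: int  # first position covered by the amplicon (primer included)
    end: int    # position just past the amplicon (primer included)


def assign_pools(amplicons: list[Amplicon]) -> dict[int, list[Amplicon]]:
    """Alternate amplicons, sorted by start position, between pool 1 and pool 2."""
    pools: dict[int, list[Amplicon]] = {1: [], 2: []}
    for index, amplicon in enumerate(sorted(amplicons, key=lambda a: a.start)):
        pools[1 if index % 2 == 0 else 2].append(amplicon)
    return pools


def overlaps_within_pool(pool: list[Amplicon]) -> bool:
    """Return True if any two neighbouring amplicons in the same tube overlap."""
    ordered = sorted(pool, key=lambda a: a.start)
    return any(left.end > right.start for left, right in zip(ordered, ordered[1:]))


# Hypothetical tiling: each amplicon shares 80 bp with the next one.
tiled = [Amplicon(f"amplicon_{i + 1}", i * 320, i * 320 + 400) for i in range(6)]
pools = assign_pools(tiled)

print(overlaps_within_pool(tiled))     # all amplicons in one tube would overlap
print(overlaps_within_pool(pools[1]))  # odd-numbered amplicons only
print(overlaps_within_pool(pools[2]))  # even-numbered amplicons only
```

Splitting the amplicons this way keeps overlapping neighbours in different tubes, while the overlaps still exist in the combined data to cover each other's primer sites.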

The ARTIC SARS-CoV-2 primer scheme (v5.4.3) can be purchased pre-pooled.

Workflow directionality

It is crucial that sample processing is performed in separate cabinets as described below, and that once samples or reagents have moved from one step to the next, they do not move back. Moving material backwards poses a significant risk of contamination, and once contamination has occurred it is extremely difficult to remove.

Fig 2: Workflow directionality

Amplification & Sequencing

LoCost is intended to be used on clinical samples, but can be used for other sample types if desired. It is assumed that RNA has already been extracted from the samples. The first steps of the laboratory work are cDNA synthesis, followed by PCR for amplification of the target genome. After this, it follows a streamlined version of the Oxford Nanopore Native Barcoding protocol, consisting of barcode ligation, sequencing adapter ligation, and sequencing on the Nanopore flowcell.

cDNA synthesis

This is a very simple step which only requires adding the reverse-transcription mix (LunaScript) to your samples (one tube for each sample). If your sample is DNA (rather than RNA) this step should be skipped. To avoid contamination of reagents, it is recommended that the LunaScript mix be added to the reaction tubes in a ‘mastermix’ cabinet, and then the samples added to the tubes in a separate ‘sample addition’ cabinet.

PCR Genome Amplification

As described above, the PCR primers are divided into two pools. As such, each sample will undergo two PCR reactions, so you will have two tubes per sample. These will be combined again later. To avoid contamination of reagents, it is recommended that the PCR mix be added to the reaction tubes in a ‘mastermix’ cabinet, and then the DNA samples added to the tubes in a separate ‘sample addition’ cabinet. After PCR it is vital that these samples are not returned to either of these cabinets, as PCR amplicons will easily contaminate pre-PCR areas. Post-PCR samples should be handled in a completely separate area.


Fig 3: Overview of the ARTIC LoCost laboratory protocol

The next stages follow ONT’s protocol, with some steps streamlined for time and cost-efficiency. Because ONT updates its protocols as it develops and refines its products, it is important to check that you are using the correct reagents and protocols for the products you have purchased.

Native Barcoding

After PCR is complete, pools 1 and 2 for each sample are diluted and combined, so once again you have one tube per sample. Each sample then undergoes the ‘End Prep’ reaction to prepare the amplicons for barcode ligation. Each sample is then subjected to the Native Barcoding reaction. A unique barcode must be used for each sample (expansion packs with additional barcodes can be purchased from ONT). Ensure accurate records are kept of which barcode is assigned to which sample. The barcoded samples are then pooled together so that you now have a single tube containing your library. The library is cleaned using SPRI magnetic beads to remove excess oligos and reagents, and then quantified.
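
Record keeping for barcodes can be as simple as a checked lookup table. The sketch below is one possible approach, not part of the LoCost protocol: the function name assign_barcodes, the example sample IDs and the "barcodeNN" label format are assumptions made for illustration, and the limit of 95 is taken from the sample count given in the overview.

```python
def assign_barcodes(sample_ids: list[str], max_barcodes: int = 95) -> dict[str, str]:
    """Give every sample its own barcode label, refusing duplicates or oversized runs."""
    if len(set(sample_ids)) != len(sample_ids):
        raise ValueError("sample IDs must be unique")
    if len(sample_ids) > max_barcodes:
        raise ValueError(f"{len(sample_ids)} samples exceed the {max_barcodes} available barcodes")
    # Assumed label format: 'barcode01', 'barcode02', ... in the order samples are listed.
    return {
        sample: f"barcode{number:02d}"
        for number, sample in enumerate(sample_ids, start=1)
    }


# Hypothetical sample IDs.
sheet = assign_barcodes(["patient_A", "patient_B", "negative_control"])
for sample, barcode in sheet.items():
    print(f"{sample}\t{barcode}")
```

Keeping the table in a file alongside the run makes it possible to trace each consensus sequence back to its sample after demultiplexing.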

Adapter Ligation

The ONT adapters are what allow the DNA strands to be captured by the pores in the flowcell, passed through, and so sequenced. The adapter ligation takes place in a single reaction, followed again by SPRI clean-up and quantification.

MinION sequencing

The final stage is to prepare the library for sequencing with the appropriate ONT reagents, and to prime the flowcell. It is important to check that you are adding the correct amount of DNA library. If you load too much you risk overloading the pores and reducing sequencing efficiency, so if your concentration is high you should dilute the library. If you load too little you may get a low sequencing yield; however, if your concentration is low, add what you can. You will then load the library onto the flowcell and begin sequencing using the MinKNOW software.
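
The loading decision described above is a simple mass-over-concentration calculation. The following sketch shows the arithmetic only; the function name loading_volumes and all the numbers used (a 15 ng target and a 12 µl maximum volume) are hypothetical, so take the real target amount and volume from the protocol for your kit.

```python
def loading_volumes(
    concentration_ng_per_ul: float,
    target_ng: float,
    max_volume_ul: float,
) -> tuple[float, float]:
    """Return (library_ul, diluent_ul) needed to load target_ng in max_volume_ul."""
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    # Volume that contains the target mass, capped at what fits in the loading mix.
    library_ul = min(target_ng / concentration_ng_per_ul, max_volume_ul)
    # Any remaining volume is made up with diluent.
    return library_ul, max_volume_ul - library_ul


# Concentrated library: only part of the volume is library, the rest is diluent.
print(loading_volumes(10.0, target_ng=15.0, max_volume_ul=12.0))

# Dilute library: the target cannot be reached, so load the full volume undiluted.
print(loading_volumes(0.5, target_ng=15.0, max_volume_ul=12.0))
```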

Sequencing with MinKNOW

An overview is provided with the LoCost protocol. However, ONT makes updates and changes over time, so some elements may differ from what is described. Key points to remember are to ensure you select the correct flowcell and reagent kits, and that you select double-ended barcoding.

Nanopore technology works by detecting changes in electrical current as a DNA strand passes through a pore. The basecalling software interprets these signals and ‘translates’ them into As, Ts, Gs, and Cs. MinKNOW can perform ‘live’ basecalling during the sequencing run, shortening the time from sample to sequence. Alternatively, basecalling can be performed after the sequencing run using Guppy if preferred (see documentation on the ONT website). Double-ended barcoding should also be used if basecalling this way. There are three basecalling accuracy modes. ‘Fast’ is, as you would expect, the quickest, but also the least accurate. High Accuracy and Super Accurate are more accurate, but slower and more computationally intensive. We recommend High Accuracy as a balance.

Run monitoring

MinKNOW will display information regarding the ‘success’ of the run when live basecalling is enabled, including Q scores over time, read lengths, and read-counts of the barcodes. If provided with a reference genome it can also show alignment hits, by read count, and alignment coverage. It also provides information on the ‘health’ of the run, such as pore health and activity (this is displayed whether or not you are using live basecalling).
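
To interpret the Q scores shown during a run, it can help to convert them into error rates. The sketch below uses the standard Phred relationship, Q = -10 log10(P), where P is the estimated probability that a base call is wrong; the function names are hypothetical and this is not MinKNOW code.

```python
import math


def error_probability(q_score: float) -> float:
    """Convert a Phred-scaled Q score into the probability of a wrong base call."""
    return 10 ** (-q_score / 10)


def q_score(error_prob: float) -> float:
    """Convert an error probability (0 < p <= 1) back into a Phred-scaled Q score."""
    return -10 * math.log10(error_prob)


for q in (7, 10, 20):
    print(f"Q{q}: about {error_probability(q):.1%} of bases expected to be wrong")
```

On this scale each increase of 10 in Q corresponds to a tenfold drop in the expected error rate.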

RAMPART is a piece of command-line software which provides more in-depth analysis of your sequencing run in real time. As well as data on read length, depth, and coverage, it also provides real-time phylogenetic analysis of your sample(s). Use of RAMPART requires a reasonable level of experience with the command line, and so it is an optional element of the ARTIC workflow. The required documentation is available on GitHub.

ARTIC Field bioinformatics pipeline

Reads need to have been demultiplexed and ‘polished’ prior to running the pipeline. The pipeline processes one barcode at a time (though this can be automated to work through all barcodes if you are confident with the command line). The reads are aligned to the reference genome, producing a consensus sequence for your sample (FASTA file), a list of detected variants (VCF file), and a BAM file for visualisation. N.B. This pipeline will soon be updated.
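
Working through all barcodes can be automated with a short loop. The sketch below only plans the runs: it assumes the demultiplexed reads sit in one sub-directory per barcode with names starting "barcode", and it takes the actual pipeline command from a caller-supplied function, because the exact command depends on the pipeline version you have installed. The names barcode_directories and plan_runs, and the "echo" placeholder command, are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable


def barcode_directories(demultiplexed_dir: Path) -> list[Path]:
    """List sub-directories named like 'barcode01', sorted by name."""
    return sorted(
        path
        for path in demultiplexed_dir.iterdir()
        if path.is_dir() and path.name.startswith("barcode")
    )


def plan_runs(
    demultiplexed_dir: Path,
    build_command: Callable[[Path], list[str]],
) -> dict[str, list[str]]:
    """Map each barcode name to the command the caller wants to run for it."""
    return {
        path.name: build_command(path)
        for path in barcode_directories(demultiplexed_dir)
    }


# Demonstration on a throwaway directory; 'echo' stands in for the real pipeline command.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("barcode02", "barcode01", "unclassified"):
        (root / name).mkdir()
    plan = plan_runs(root, lambda directory: ["echo", directory.name])

for barcode, command in plan.items():
    print(barcode, command)
```

Each planned command could then be passed to subprocess.run in turn, once the placeholder has been replaced with the command given in the bioinformatics protocol.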

Further data analyses

See the Beginner’s Guide to ARTIC for further information on data analyses, and some of the tools available.