Minimum coverage required?

The pipeline seems to normalise coverage down to a mean of roughly 400x in *trimmed.sorted.bam, presumably via artic minion --normalise 200?
Is this a minimum for submission to https://gisaid.org?
Have you set a minimum coverage threshold for accurate variant calling?


From empirical testing, 100x is an adequate coverage level to resolve most variants.

In the pipeline we normalise to 200 (per direction) because we’ve found that more coverage than this slows down the pipeline considerably (particularly the nanopolish version).
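
As an illustration of what per-direction normalisation does, here is a simplified sketch in Python. This is not the pipeline's actual implementation; the normalise function, the (start, end, strand) read tuples, and the keep/skip heuristic are all illustrative assumptions, with only the cap of 200 per direction taken from the description above.

```python
# Illustrative sketch of per-direction coverage normalisation: keep a read
# only while the per-strand depth at its start position is below the cap.
# NOT the artic pipeline's actual algorithm, just the general idea.
from collections import defaultdict

def normalise(reads, cap=200):
    """reads: iterable of (start, end, strand) tuples, sorted by start."""
    depth = {"+": defaultdict(int), "-": defaultdict(int)}
    kept = []
    for start, end, strand in reads:
        if depth[strand][start] >= cap:
            continue  # this strand already has enough coverage here
        kept.append((start, end, strand))
        for pos in range(start, end):
            depth[strand][pos] += 1
    return kept
```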

However, the vast majority of “simple mutations” can be resolved easily at 20x, which is the default setting in the pipeline. In nanopore sequencing the hardest contexts are SNPs neighbouring or within homopolymers, but there are also other k-mer pairs that can be tricky to resolve. In general, the harder the context, the more evidence you need to call the variant confidently. This is reflected in the QUAL value (a sum of log-likelihoods) reported in the VCF. QUAL will typically rise or fall with more coverage, and this trajectory is useful in determining whether the variant is real. We typically divide QUAL by read depth to give a useful filter: values below 3 are rejected.
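
For example, a minimal version of that filter could look like the sketch below. The field positions follow the standard VCF layout; the depth key ("TotalReads"), the passes_filter helper, and the variants.vcf filename are assumptions, so substitute whatever depth annotation your caller actually writes.

```python
# Sketch of the QUAL/depth filter described above: reject variants where
# QUAL divided by read depth falls below 3. The "TotalReads" INFO key is
# an assumption -- use whichever depth annotation your VCF contains.
def passes_filter(vcf_line, min_qual_per_read=3.0):
    fields = vcf_line.rstrip("\n").split("\t")
    qual = float(fields[5])  # QUAL is the 6th standard VCF column
    info = dict(kv.split("=", 1) for kv in fields[7].split(";") if "=" in kv)
    depth = float(info["TotalReads"])  # assumed depth key
    return qual / depth >= min_qual_per_read

with open("variants.vcf") as vcf:  # hypothetical input file
    for line in vcf:
        if line.startswith("#"):
            continue  # skip header lines
        print("PASS" if passes_filter(line) else "FAIL", line.split("\t")[1])
```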

To account for any very hard-to-call mutations coupled with lower coverage, we use a masking model to apply Ns to regions that fail the filter, in order to represent the uncertainty. This also works well with phylogenetic approaches, where missing regions are effectively imputed.
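
As a sketch of that masking step: the mask_consensus helper, the interval format, and the example coordinates below are all illustrative assumptions; in practice the failed regions would come from the filter described above.

```python
# Sketch of masking: replace consensus bases with N wherever a position
# falls inside a failed region, making the uncertainty explicit in the FASTA.
def mask_consensus(seq, failed_regions):
    """failed_regions: list of (start, end) half-open, 0-based intervals."""
    seq = list(seq)
    for start, end in failed_regions:
        for pos in range(start, min(end, len(seq))):
            seq[pos] = "N"
    return "".join(seq)

# Hypothetical example: mask two low-confidence windows.
consensus = mask_consensus("ACGT" * 10, [(3, 6), (20, 24)])
```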

We first introduced this idea in our Ebola paper (https://www.nature.com/articles/nature16996); the Supplementary Methods there give more detail on our approach.

Finally, we have drafted a more extensive guide to interpreting nanopolish output, which we will post soon.
