Medaka/Longshot pipeline

For those who have been using 1.1.0-rc1 of the pipeline with the experimental Medaka mode over the past week or so: two users have noted that the Longshot step can occasionally filter out genuine variants (which are detected by Nanopolish). This appears to be the result of a very stringent strand-bias filter that is not suitable for ONT data.

This has now been fixed in the latest version of the GitHub repository. If you have run the pipeline in Medaka mode over the past week, you may want to re-process your results. The Nanopolish mode is unaffected.

Additionally, a bug that could prevent the pipeline from running when a variant is detected at the same location in two pools has also been fixed in the latest master version. If you encountered this bug, no consensus sequence would have been generated for that sample.

I will roll these fixes into a new release candidate 1.1.0-rc2 today or tomorrow.

Sorry for any inconvenience.


Cheers. I wrote a small script to recover the lost variants using the pass/fail VCFs and edit the consensus FASTA accordingly, in case someone doesn't feel like re-running the whole pipeline (or for those looking for multiple haplotypes).
N.B. it uses 0.5 as the default variant allele frequency for the consensus.
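
For anyone curious what that kind of recovery step involves, here is a minimal Python sketch. It is not the script mentioned above, just an illustration under some loud assumptions: the consensus is still in reference coordinates (no indels applied), only single-base substitutions are handled, and the allele frequency sits in the VCF INFO column under a key named `AF`. That key, and the function names `parse_vcf_snps` and `recover_variants`, are hypothetical; check which fields your own fail VCF actually carries before relying on anything like this.

```python
def parse_vcf_snps(vcf_text):
    """Yield (pos, ref, alt, info) for single-base substitutions in VCF text."""
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        pos, ref, alt, info_field = int(fields[1]), fields[3], fields[4], fields[7]
        if len(ref) != 1 or len(alt) != 1:
            continue  # indels and multi-allelic records are ignored in this sketch
        info = dict(kv.split("=", 1) for kv in info_field.split(";") if "=" in kv)
        yield pos, ref, alt, info


def recover_variants(consensus, fail_vcf_text, min_af=0.5, af_key="AF"):
    """Apply filtered SNPs whose allele frequency is at least min_af.

    Assumes `consensus` shares coordinates with the reference used for calling,
    and that `af_key` (hypothetical) names the allele-frequency INFO field.
    """
    seq = list(consensus)
    applied = []
    for pos, ref, alt, info in parse_vcf_snps(fail_vcf_text):
        if float(info.get(af_key, 0)) < min_af:
            continue  # below the frequency cutoff
        if pos > len(seq) or seq[pos - 1].upper() != ref.upper():
            continue  # masked base, already changed, or coordinate mismatch
        seq[pos - 1] = alt  # VCF positions are 1-based
        applied.append(pos)
    return "".join(seq), applied
```

Passing the consensus sequence and the contents of the fail VCF returns the edited sequence plus the list of positions that were changed, so you can review exactly what was put back before overwriting anything.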