Nanopore long-read-only genome assembly of clinical Enterobacterales isolates is complete and accurate
Nagy D., Pennetta V., Rodger G., Hopkins K., Jones CR., Hopkins S., Crook D., Walker AS., Robotham J., Hopkins KL., Ledda A., Williams D., Hope R., Brown CS., Stoesser N., Lipworth S.
Given recent improvements in Nanopore sequencing accuracy, whole bacterial genome sequence reconstruction using Oxford Nanopore Technologies (‘Nanopore’) long-read-only sequencing may offer a lower-cost, higher-throughput alternative to ‘hybrid’ assembly for pathogen surveillance. We evaluated the accuracy, including plasmid reconstruction, of Nanopore long-read-only genome assemblies of Enterobacterales. We sequenced 92 genomes from clinical Enterobacterales isolates, collected in England under a national surveillance programme, with long-read Nanopore (R10.4.1, Dorado v5.0.0 super-high-accuracy basecalled) and short-read Illumina (NovaSeq) sequencing approaches. Genomes were assembled using three long-read-only assemblers (Flye, Hybracter long and Autocycler) and three hybrid assemblers (Hybracter hybrid, and Unicycler in normal and bold modes). Three polishing modalities (Medaka v2 with subsampled or un-subsampled long reads; Polypolish+Pypolca with short reads) were investigated. Autocycler circularised the most chromosomes [87/92 (95%)]. Excluding Flye, plasmid sequence reconstruction was comparable among assemblers, all recovering 90–96% of plasmids, although the ‘ground truth’ was uncertain. Flye performed worse than other assemblers on almost all metrics. Autocycler+Medaka (un-subsampled long reads) was the most accurate long-read-only assembler/polisher combination, comparable to hybrid assemblies [median 0 (IQR: 0–0) single nucleotide variants (SNVs) and 0 (IQR: 0–1) insertions/deletions (indels) per genome; median quality value (Q score) 100 (IQR: 64–100)], with only 4/92 genome sequences having >10 SNVs/indels. For both Flye and Autocycler assemblies, Medaka polishing with un-subsampled long reads yielded small improvements in indels but not in SNVs. Seven-locus multi-locus sequence types and antimicrobial resistance, virulence and stress gene annotations were equivalent across assembler/polisher combinations.
Nanopore long-read-only bacterial genome assembly with Autocycler combined with Medaka polishing (using un-subsampled reads) is similarly accurate to, and possibly more complete than, hybrid assembly, representing a viable alternative for incorporating high-quality genomic data, including plasmids, into Enterobacterales surveillance.
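For readers less familiar with the quality value reported above, the following is a minimal sketch of how a Phred-scaled Q score relates to a per-genome error count, using the standard definition Q = -10 log10(error rate). The abstract does not state how Q was computed here; the function name `q_score`, the cap of 100 for error-free assemblies, and the 5 Mb genome length are all illustrative assumptions rather than details taken from the study.

```python
import math


def q_score(errors: int, genome_length: int, cap: float = 100.0) -> float:
    """Phred-scaled quality value for an assembly (hypothetical helper).

    errors        -- total SNVs plus indels relative to a reference
    genome_length -- assembly length in base pairs
    cap           -- value returned when no errors are found (assumed; the
                     true Q is unbounded when the error rate is zero)
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    if errors < 0:
        raise ValueError("errors must be non-negative")
    if errors == 0:
        return cap
    error_rate = errors / genome_length
    # Standard Phred scale: Q = -10 * log10(per-base error probability)
    return min(cap, -10.0 * math.log10(error_rate))


# Illustrative values only: a 5 Mb genome with 0 or 5 errors.
print(q_score(0, 5_000_000))
print(round(q_score(5, 5_000_000), 1))
```

Under these assumptions, an error-free genome reaches the cap, which would be consistent with a median Q of 100 alongside a median of 0 SNVs and 0 indels, while five errors in a 5 Mb genome correspond to roughly Q60.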