PacBio Long Reads Improve Metagenomic Assemblies, Gene Catalogs, and Genome Binning

Xie, Haiying and Yang, Caiyun and Sun, Yamin and Igarashi, Yasuo and Jin, Tao and Luo, Feng (2020) PacBio Long Reads Improve Metagenomic Assemblies, Gene Catalogs, and Genome Binning. Frontiers in Genetics, 11. ISSN 1664-8021

Abstract

PacBio long-read sequencing offers several potential advantages for DNA assembly, including more complete gene profiling of metagenomic samples. However, its lower single-pass accuracy can make gene discovery and assembly difficult for low-abundance organisms. To evaluate the application and performance of PacBio long reads and Illumina HiSeq short reads in metagenomic analyses, we directly compared various assemblies of PacBio and Illumina sequencing reads from two anaerobic digestion microbiome samples taken from a biogas fermenter. On the PacBio platform, 1.58 million long reads (19.6 Gb) were produced with an average length of 7,604 bp. On the Illumina HiSeq platform, 151.2 million read pairs (45.4 Gb) were produced. Hybrid assemblies combining PacBio long reads with HiSeq contigs improved the assembly statistics, increasing the average contig length, the contig N50, and the number of large contigs. Interestingly, depth-based hybrid assemblies yielded a higher percentage of complete genes (98.86%) than assemblies based on HiSeq contigs only (40.29%), because the PacBio reads were long enough to span many short repetitive elements and capture multiple genes in a single read. Additionally, incorporating PacBio long reads considerably reduced contig numbers and increased the completeness of the genome reconstruction, which was poorly assembled and binned when HiSeq data were used alone. From this comparison of PacBio long reads with Illumina HiSeq short reads on complex microbiome samples, we conclude that PacBio long reads can produce longer contigs, more complete genes, and better genome binning, thereby offering more information about metagenomic samples.
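
The assembly statistics named in the abstract (average contig length, contig N50, and number of large contigs) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `contig_stats`, the `large_cutoff` parameter, and its default of 1,000 bp are assumptions made for illustration, and the input lengths below are toy values rather than data from the study. N50 is computed in the standard way, as the length of the contig at which the running total of lengths, taken from longest to shortest, first reaches half of the total assembly size.

```python
def contig_stats(lengths, large_cutoff=1000):
    """Summarize an assembly from a list of contig lengths (in bp).

    large_cutoff is a hypothetical threshold for "large" contigs;
    the abstract does not state which cutoff the authors used.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")

    ordered = sorted(lengths, reverse=True)  # longest contig first
    total = sum(ordered)

    # N50: walk down the sorted lengths until half the assembly is covered.
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    return {
        "mean_length": total / len(ordered),
        "n50": n50,
        "large_contigs": sum(1 for length in ordered if length >= large_cutoff),
    }


if __name__ == "__main__":
    # Toy contig lengths, not values from the paper.
    print(contig_stats([2, 3, 4, 5, 6, 7, 8, 9, 10], large_cutoff=5))
```

Comparing the output of such a function for a short-read-only assembly and a hybrid assembly is one simple way to express the kind of improvement the abstract reports.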

Item Type: Article
Subjects: South Asian Archive > Medical Science
Depositing User: Unnamed user with email support@southasianarchive.com
Date Deposited: 20 Feb 2023 10:43
Last Modified: 21 May 2024 13:33
URI: http://article.journalrepositoryarticle.com/id/eprint/166
