ARPHA Conference Abstracts : Conference Abstract
Haplotype-level DNA metabarcoding from freshwater macroinvertebrate community samples
Joeselle M. Serrana‡,§, Kozo Watanabe§
‡ Graduate School of Science and Engineering, Ehime University, Matsuyama, Japan
§ Center for Marine Environmental Studies (CMES), Ehime University, Matsuyama, Japan
Open Access


DNA metabarcoding is a robust method for environmental impact assessments of freshwater ecosystems. It enables the simultaneous identification of multiple species in complex mixed-community samples of different origins, using extracellular and total genomic DNA. The development and evaluation of DNA metabarcoding protocols with haplotype-level resolution require attention, specifically for basic population genetic applications, i.e., analyses that estimate the genetic diversity and dispersal abilities of the species present in bulk community samples. Several studies have proposed using DNA metabarcoding for population genetics, and a few have provided preliminary applications and proofs of concept, but these are restricted to particular taxa. Further exploration and assessment of laboratory and bioinformatics strategies are therefore required to unlock the potential of metabarcoding-based population-level ecological assessments. Here, we assessed the ability to infer haplotype information for freshwater macroinvertebrate species from DNA metabarcoding community sequence data. Using mock samples with known Sanger-sequenced haplotypes, we also assayed the effect of PCR cycle number on the detection and reduction of spurious haplotypes obtained from DNA metabarcoding. We tested our haplotyping strategy on a mock sample containing 20 specimens from four species with known haplotypes based on the 658-bp Folmer region of the mitochondrial cytochrome c oxidase subunit I (mtCOI) gene.

The read processing and denoising steps resulted in 14 zero-radius operational taxonomic units (ZOTUs) of 421-bp length, 12 of which had a 100% match with 12 of the mock haplotype sequences. The remaining eight haplotypes, which were not detected in the DNA metabarcoding dataset, were all of the A. decemseta samples (0.01, 0.05, and 0.10 ng/μL DNA template concentrations), two E. bulba samples (0.01 and 0.05 ng/μL), one E. latifolium sample (0.01 ng/μL), and two K. tibialis samples (0.01 and 0.10 ng/μL). Because most of the undetected samples had low template concentrations, we infer that the initial DNA template concentration influences amplification from a mock community sample. This observation agrees with previous studies reporting that samples or taxa with low DNA template concentrations have a lower detection probability. Accordingly, in mixed-community samples, abundant taxa or samples with high biomass tend to have higher detection probabilities than those that are rare, small, or low in biomass. Differences in biomass therefore affect haplotype detection, since sequences from large specimens are the most likely to be retained after read processing. These factors need to be addressed if metabarcoding-based haplotyping is to be used for abundance-based analyses in population genetics applications. The phylogenetic analysis (Fig. 1) revealed that the two ZOTUs without taxonomic matches clustered with one of the species from the mock sample. This supports our observation that only the samples with low concentrations were unrepresented in the DNA metabarcoding data. Two of the 14 ZOTUs lacked a 100% match to the mock reference sequences and therefore count as false-positive detections; nevertheless, the phylogenetic approach allowed us to identify them as A. decemseta sequences.
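The matching step described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in the study: it assumes that a "100% match" means the shorter ZOTU occurs as an exact substring of a longer reference haplotype, and the function names, identifiers, and toy sequences are all hypothetical.

```python
def match_zotus(zotus, haplotypes):
    """Map each ZOTU id to the reference haplotype ids that contain it exactly.

    Assumption: a 100% match means the ZOTU is an exact substring of the
    (longer) reference haplotype. A ZOTU can hit several references if they
    differ only outside the amplified fragment.
    """
    matches = {}
    for zotu_id, zotu_seq in zotus.items():
        matches[zotu_id] = [
            hap_id
            for hap_id, hap_seq in haplotypes.items()
            if zotu_seq.upper() in hap_seq.upper()
        ]
    return matches


def summarize(matches, haplotypes):
    """Return (detected haplotypes, undetected haplotypes, unmatched ZOTUs)."""
    detected = {hap for hits in matches.values() for hap in hits}
    undetected = sorted(set(haplotypes) - detected)
    unmatched = sorted(z for z, hits in matches.items() if not hits)
    return sorted(detected), undetected, unmatched


# Toy sequences, invented for illustration only (not data from the study).
haplotypes = {
    "hapA": "ACGTACGTTAGC",
    "hapB": "ACGTACCTTAGC",
    "hapC": "ACGAACGTTAGC",
}
zotus = {
    "Zotu1": "GTACGTTA",  # exact fragment of hapA
    "Zotu2": "GTACCTTA",  # exact fragment of hapB
    "Zotu3": "GTACGATA",  # matches no reference
}

matches = match_zotus(zotus, haplotypes)
detected, undetected, unmatched = summarize(matches, haplotypes)
```

In this toy case, hapC is an undetected haplotype and Zotu3 is a ZOTU without a reference match, the two categories highlighted in the results above.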

Figure 1. Neighbor-joining tree of the mock haplotype and zero-radius operational taxonomic unit (ZOTU) sequences. Sequences are highlighted by species. The red bar represents a ZOTU without a taxonomic match, and the text in red represents a haplotype without a ZOTU match. The two ZOTUs that did not have a taxonomic match against the mock sample sequences (ZOTU10 and ZOTU14) clustered with the A. decemseta sequences.
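As a rough illustration of how an unmatched ZOTU can be tied to a species, the sketch below assigns a query sequence to the reference with the smallest uncorrected p-distance. This is a simpler stand-in for the neighbor-joining tree in Fig. 1, not the method used in the study; it assumes the sequences are already aligned to equal length, and all names and toy sequences are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)


def nearest_species(query, references):
    """Return (species, haplotype id, distance) of the closest reference.

    `references` maps a haplotype id to a (species, aligned sequence) tuple.
    """
    best_id = min(references, key=lambda h: p_distance(query, references[h][1]))
    species, seq = references[best_id]
    return species, best_id, p_distance(query, seq)


# Toy aligned sequences, invented for illustration only.
references = {
    "sp1_h1": ("species_1", "ACGTACGTAC"),
    "sp1_h2": ("species_1", "ACGTACGTAA"),
    "sp2_h1": ("species_2", "TTGTCCGAAC"),
}
unmatched_zotu = "ACGTACGGAC"  # one substitution away from sp1_h1

result = nearest_species(unmatched_zotu, references)
```

A nearest-reference assignment like this ignores tree topology and branch support, so it should be read as a quick sanity check rather than a replacement for the phylogenetic analysis.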

The number of quality-passing reads increased with PCR cycle number, while the relative abundance of each ZOTU remained consistent across cycle numbers. This suggests that increasing the cycle number from 24 to 64 did not affect the relative abundance of reads passing the quality filter. As DNA metabarcoding becomes more established and laboratory protocols and bioinformatics pipelines continue to be developed, our proof-of-concept study demonstrates that the method can be used to infer intraspecific variability, showing promise for applications in population-based genetic studies.
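The cycle-number comparison above amounts to converting read counts into relative abundances per cycle number and checking how much each ZOTU's share shifts. A minimal sketch follows, assuming a per-cycle table of read counts; the function names and the counts are hypothetical and are not results from the study.

```python
def relative_abundance(counts):
    """Convert a {ZOTU id: read count} table into proportions."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads in this sample")
    return {zotu: n / total for zotu, n in counts.items()}


def max_shift(tables):
    """Largest change in any ZOTU's relative abundance across cycle numbers.

    `tables` maps a PCR cycle number to a {ZOTU id: read count} table.
    A ZOTU missing at a given cycle number is treated as abundance 0.
    """
    rel = {cycles: relative_abundance(c) for cycles, c in tables.items()}
    zotus = {z for r in rel.values() for z in r}
    return max(
        max(r.get(z, 0.0) for r in rel.values())
        - min(r.get(z, 0.0) for r in rel.values())
        for z in zotus
    )


# Invented read counts for two cycle numbers (illustration only).
reads_by_cycle = {
    24: {"Zotu1": 600, "Zotu2": 300, "Zotu3": 100},
    64: {"Zotu1": 1210, "Zotu2": 590, "Zotu3": 200},
}

shift = max_shift(reads_by_cycle)
```

Here the total read count doubles between the two cycle numbers, yet the largest change in any ZOTU's share is about half a percentage point, which is the kind of pattern the text describes as consistent relative abundance.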


Keywords

DNA metabarcoding; freshwater macroinvertebrates; intraspecific; genetic diversity; haplotype; zero-radius operational taxonomic units

Presenting author

Joeselle M. Serrana

Presented at

1st DNAQUA International Conference (March 9-11, 2021)