Development and evaluation of a meat mitochondrial metagenomic (3MG) method for composition determination of meat from fifteen mammalian and avian species

Jiang, Mei; Xu, Shu-Fei; Tang, Tai-Shan; Miao, Li; Luo, Bao-Zheng; Ni, Yang; Kong, Fan-De; Liu, Chang

doi:10.1186/s12864-021-08263-0

Research
Open access
Published: 07 January 2022

Mei Jiang¹,
Shu-Fei Xu²,
Tai-Shan Tang³,
Li Miao⁴,
Bao-Zheng Luo⁵,
Yang Ni⁶,
Fan-De Kong² &
…
Chang Liu¹

BMC Genomics volume 23, Article number: 36 (2022)

Abstract

Background

Bioassessment and biomonitoring of meat products aim to identify and quantify adulterants and contaminants, such as meat from unexpected sources and microbes. Several methods for determining the biological composition of mixed samples have been used, including metabarcoding, metagenomics and mitochondrial metagenomics. In this study, we aimed to develop a method based on next-generation DNA sequencing to estimate the composition of samples that might contain meat from 15 mammalian and avian species that are commonly related to meat bioassessment and biomonitoring.

Results

In this project, we found that the meat composition from 15 species could not be identified with the metabarcoding approach because of the lack of universal primers or insufficient discrimination power. Consequently, we developed and evaluated a meat mitochondrial metagenomics (3MG) method. The 3MG method has four steps: (1) extraction of sequencing reads from mitochondrial genomes (mitogenomes); (2) assembly of mitogenomes; (3) mapping of mitochondrial reads to the assembled mitogenomes; and (4) biomass estimation based on the number of uniquely mapped reads. The method was implemented in a Python script called 3MG. The analysis of simulated datasets showed that the method can determine contaminant composition at a proportion of 2% with a relative error of < 5%. To evaluate the performance of 3MG, we constructed and analysed mixed samples derived from 15 animal species in equal mass. Then, we constructed and analysed mixed samples derived from two animal species (pig and chicken) in different ratios. DNA was extracted and used in constructing 21 libraries for next-generation sequencing. The analysis of the 15 species mix with the method showed the successful identification of 12 of the 15 (80%) animal species tested. The analysis of the mixed samples of the two species revealed correlation coefficients of 0.98 for pork and 0.98 for chicken between the number of uniquely mapped reads and the mass proportion.

Conclusion

To the best of our knowledge, this study is the first to demonstrate the potential of the non-targeted 3MG method as a tool for accurately estimating biomass in meat mix samples. The method has potential broad applications in meat product safety.

Background

Meat represents a significant portion of daily human consumption. However, meat adulteration has become a global issue. Valuable and expensive meat, such as beef and mutton, is often found to be mixed with cheaper chicken, duck, pork, mink and other animal meat [1, 2]. For instance, two of the nine beef samples examined by Erol et al. contained horse and deer meat [3]. Such adulteration harms consumers’ rights and interests [4] and disrupts market order [5]. Therefore, identifying adulterated ingredients in meat and meat products is essential.

Based on next-generation DNA sequencing, many methods for determining the biological composition of mixed samples have been developed, including metabarcoding [6], metagenomics [7, 8] and mitochondrial metagenomics (MMG) [9]. The metabarcoding approach depends on the PCR amplification of a particular marker for species determination. The metagenomics approach consists of two steps for species determination and biomass quantification, namely, shotgun sequencing and mapping of reads to whole nuclear genomes. MMG is essentially a metagenomic method that uses mitochondrial genomes (mitogenomes) instead of nuclear genomes as references. The PCR amplification-dependent metabarcoding method is the workhorse for the molecular determination of biological composition.

Numerous markers have been tested on animals, including 18S rRNA genes from the nuclear genome and the 16S rRNA gene and cytochrome c oxidase I (COX1, CO1 or COI) gene from the mitogenome [10]. However, these PCR-dependent methods have limitations. Firstly, they require universal primers targeting particular markers, and such primers are usually lacking across all taxa [11]. The use of different markers, and of different primer pairs for the same marker, complicates data integration. Secondly, even with universal primers, template DNA molecules with different sequences have different melting properties, leading to amplification bias [12]. Consequently, the direct quantification of template DNA molecules with different sequences is difficult.

All-Food-Seq (AFS) is a recently developed metagenomics method [8], in which the non-targeted deep sequencing of total genomic DNA from foodstuff, followed by bioinformatics analysis, can identify species from all kingdoms of life with high accuracy. It facilitates the quantitative measurement of the main ingredients and detection of unanticipated food components. Conceptually, the AFS method has set up a framework for ultimate bio-surveillance.

However, the AFS method has several practical limitations. Firstly, the method is probably too complex for routine bioassessment and biomonitoring because a whole genome has a high degree of complexity. Secondly, although whole-genome databases have expanded rapidly, obtaining high-quality whole-genome sequences for a species requires many years, and the effect of genomic diversity on bioassessment and biomonitoring is unknown. Thirdly, the AFS study used simulated data rather than experimental data.

MMG delimits closely related species from mixed samples [13, 14]. This method is desirable for several reasons. Firstly, a mitogenome and its genes are common phylogenetic, DNA barcoding and metabarcoding markers. Secondly, the structures of mitogenomes are conserved, whereas sequences can be highly diverse. Thirdly, mitogenomes are small and easy to obtain and can be directly reconstructed using bioinformatics methods. Fourthly, large numbers of mitogenomes are available in public databases. More than 10,000 mitogenomes had been included in GenBank by December 2020. The performance and accuracy of metabarcoding and MMG in biomass estimation in invertebrate community samples have been evaluated [15]. Overall, MMG yields more informative predictions of biomass content from bulk macroinvertebrate communities than metabarcoding. However, although MMG has been applied to ecological assessment [9, 16–21], the use of MMG in mammalian and avian meat mixed samples has not been examined to the best of our knowledge.

In this study, we intended to use either metabarcoding or MMG to detect the potential mixing of meat from 15 mammalian and avian species selected on the basis of a market survey. Preliminary studies suggested that the most commonly used metabarcoding markers, COI and 16S, are unsuitable for simultaneously detecting meat from these 15 species. Thus, we tested MMG in mixed meat samples. This approach, called ‘meat mitochondrial metagenomics (3MG)’, circumvents the problems of marker selection, PCR bias and sequencing bias. Additionally, this approach takes advantage of the availability of mitogenomes for many species. The results showed that it can accurately determine the biological composition of meat mix samples and accurately estimate biomass. The method has a wide range of applications in food and pharmaceutical industries involving animal products.

Materials and methods

Meat samples and mock mixed meat samples

We prepared mock samples with meat from the legs of 15 mammalian and avian species: Anas platyrhynchos (duck), Bos taurus (cattle), Camelus bactrianus (camel), Canis lupus familiaris (dog), Equus caballus (horse), Gallus gallus (chicken), Mus musculus (mouse), Mustela putorius voucher (ferret), Myocastor coypus (nutria), Nyctereutes procyonoides (raccoon dog), Oryctolagus cuniculus (rabbit), Ovis aries (sheep), Rattus norvegicus (rat), Sus scrofa domesticus (pig) and Vulpes vulpes (fox). Efforts were made to obtain meat samples with homogeneous compositions both within and between species. We obtained camel, nutria, fox, donkey and deer meat from breeding farms. Nanjing Medical University provided the mouse, rabbit and rat samples. The Entry-exit Inspection and Quarantine Bureau provided the other meat samples. Detailed information regarding sample origin, particularly cities and institutions, is provided in Table 1.

Table 1 Information for meat samples used in this study

Two methods were used in mixing the samples. One mix contained meat samples in equal amounts from 15 species. This mix was referred to as the ‘mix containing meat from 15 species’ or ‘M1’. The other mix contained meat from S. scrofa domesticus (pig) and G. gallus (chicken) in the following proportions: 10:0 (‘sample 1; mix containing two species’ or ‘M2-S1’), 8:2 (M2-S2), 6:4 (M2-S3), 4:6 (M2-S4), 2:8 (M2-S5) and 0:10 (M2-S6). Each M1 or M2 sample had three replicates.

Loop-mediated isothermal amplification (LAMP)

We performed loop-mediated isothermal amplification (LAMP) experiments to validate the composition of mock samples (M1 and M2). LAMP methods for detecting ingredients that contain cattle, sheep, pig, chicken and duck meat were developed by the Technology center of Xiamen Entry-Exit Inspection and Quarantine Bureau of the People’s Republic of China [22, 23, 24]. The probe and primer sequences, which target the cytB genes of the corresponding species, are provided in Table S1. The reaction mix contained isothermal master mix (15 μL), primer mix (FIP, 2 μL; BIP, 2 μL; F3, 1 μL; B3, 1 μL) and DNA (1 μL). We added RNase-free water to a final reaction volume of 50 μL. The experimental conditions were as follows: amplification at 60 °C for 90 min and annealing from 98 °C to 80 °C at a rate of 0.05 °C per second.

DNA extraction, library construction and next-generation sequencing (NGS)

We extracted genomic DNA with a modified sodium dodecyl sulfate (SDS)-based method [25]. The integrity and concentration of the extracted DNA were assessed by electrophoresis in a 1% (w/v) agarose gel and with a spectrophotometer (Nanodrop 2000; Thermo Fisher Scientific, USA). The extracted DNA samples (100 ng) were subjected to library construction using the NEBNext® Ultra™ II DNA library prep kit for Illumina® (New England BioLabs, USA) according to the manufacturer’s recommendations. Each library had an insert size of 500 bp. The quantity and quality of the libraries were analysed using an Agilent 2100 Bioanalyser (Agilent Technologies, USA). We sequenced the libraries using HiSeq X reagent kits (Illumina, USA) on an Illumina HiSeq X sequencer. We deposited the data generated in this study in the NCBI Sequence Read Archive. The accession numbers are SRR9107560 and SRR9140737.

Construction of mitogenome reference databases

We constructed a database (15MGDB), which had complete mitogenome sequences from the 15 species. The 15 mitogenome sequences were downloaded from GenBank, with the following accession numbers: A. platyrhynchos (NC_009684), B. taurus (NC_006853), C. bactrianus (NC_009628), C. lupus familiaris (NC_002008), E. caballus (NC_001640), G. gallus (NC_001323), M. musculus (NC_005089), M. putorius voucher (NC_020638), M. coypus (NC_035866), O. cuniculus (NC_001913), O. aries (NC_001941), N. procyonoides (NC_013700), R. norvegicus (NC_001665), S. scrofa domesticus (NC_012095), V. vulpes (NC_008434). The sequences in 15MGDB were used in constructing a searchable database with the makeblastdb command from the BLAST+ (v2.7.1) software package [26] and with the option ‘-dbtype nucl -parse_seqids’.
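The database-building step can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the file name `15MGDB.fasta`, the database name and the helper `makeblastdb_command` are hypothetical, and only the `-dbtype nucl -parse_seqids` options come from the text (`-in` and `-out` are standard makeblastdb arguments).

```python
import os
import shutil
import subprocess


def makeblastdb_command(fasta_path, db_name):
    """Build the makeblastdb call for a nucleotide database with parsed sequence IDs."""
    return [
        "makeblastdb",
        "-in", fasta_path,      # concatenated mitogenome FASTA (hypothetical name)
        "-dbtype", "nucl",      # option stated in the text
        "-parse_seqids",        # option stated in the text
        "-out", db_name,
    ]


if __name__ == "__main__":
    cmd = makeblastdb_command("15MGDB.fasta", "15MGDB")
    print(" ".join(cmd))
    # Only run when BLAST+ is installed and the FASTA file is actually present.
    if shutil.which("makeblastdb") and os.path.exists("15MGDB.fasta"):
        subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell quoting issues when file paths contain spaces.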

Development of the 3MG analysis pipeline

The 3MG pipeline was developed using Python 2.7.15 with the following third-party software: the pandas module for Python, BBMap (v35.66; https://sourceforge.net/projects/bbmap/), MITOBim (v1.9.1) [27], BLAST+ (v2.7.1) [26], bowtie2 (v2.3.4) [28] and samtools (v1.9) [29]. The source code, sample data and instructions for using a locally installed copy of the 3MG pipeline, as well as a Singularity container for running the 3MG pipeline, can be found at the following link: http://1kmpg.cn/3mg/.

Determination of 3MG detection errors using simulation

We generated 21 sets of data through simulation. Reads from an M2-S1 sample containing 100% pork were used as the background. Reads were then extracted from the M2-S6 sample, which contained only chicken, with the seqtk program (v1.3-r106) and the command ‘seqtk sample -s100’. The reads extracted from M2-S6 were mixed with those from M2-S1 in the following percentages: 0.01, 0.1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100%. We prepared five replicates of simulated data for each percentage level and used default seeds. The resulting sample sets were then analysed using the 3MG pipeline. We calculated relative detection errors with the following formula, as described previously [8]:

$$\text{Relative error}=\frac{\left|\,\text{Number of chicken reads detected}-\text{Number of chicken reads in the sample}\,\right|}{\text{Number of chicken reads in the sample}}$$
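As a minimal sketch, the relative error defined above can be computed as follows. The function name and the example read counts are hypothetical and are not taken from Table 2.

```python
def relative_error(detected_reads, true_reads):
    """Return |detected - true| / true for the spiked-in (chicken) reads."""
    if true_reads <= 0:
        raise ValueError("true_reads must be positive")
    return abs(detected_reads - true_reads) / true_reads


# Hypothetical example: 600 chicken reads in the sample, 585 detected.
print(f"{relative_error(585, 600):.3f}")
```

The formula is undefined when the sample contains no chicken reads, so the sketch rejects that case explicitly.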

Comparison of the reference and assembled mitogenomes

We aligned the assembled sequences with their reference sequences for each species using the CLUSTALW2 (v2.0.12) program [30] with the options ‘-type=dna -output=phylip’. We used these aligned sequences in constructing phylogenetic trees with the maximum likelihood (ML) method implemented in RAxML (v8.2.4) [31]. We calculated the intra-specific and inter-specific distances among mitogenomes using the distmat program from EMBOSS (v6.3.1) [28] with the option ‘-nucmethod=0’, which reports uncorrected distances without any correction for multiple substitutions. Finally, we calculated the p-distances among mitogenomes with MEGA (v7) [32].
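For orientation, a p-distance is simply the proportion of aligned sites at which two sequences differ. The sketch below illustrates the idea only; it does not reproduce the MEGA or distmat implementations, and skipping columns that contain a gap or an `N` is an assumption of this example.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') or an ambiguous base ('N') in either sequence
    are skipped; this handling is an assumption of the sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy alignment: 2 mismatches over 8 comparable sites.
print(p_distance("ACGTACGT", "ACGTTCGA"))
```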

Detection of other contaminating biological composition

The taxon content of reads unmapped to mitogenomes was analysed using the RDP classifier (v2.12) [33]. The unmapped reads were assigned to the COX1 and 16S rRNA databases with an assignment confidence cutoff of 0.8. The 16S rRNA database is part of the RDP Classifier package. The COX1 database was downloaded from https://github.com/terrimporter/CO1Classifier/releases/tag/v3.2 [34]. The results were visualised using MEGAN (v6) [35] with the following LCA parameters: ‘minSupportPercent = 0.02, minSupport = 1, minScore = 50.0, maxExpected = 0.01, topPercent = 10.0 and readAssignmentMode = readCount’.

Results

Evaluation of the metabarcoding method for the 15 mammalian and avian species

To determine whether the mixture containing 15 species can be identified using metabarcoding, we analysed the availability of universal primers and the ability of their amplified products (if applicable) to distinguish the 15 species. For the COX1 gene, no primer matched the sequences from all the species. For instance, the maximum number of matched species was five when the primer pair I-B1 and COI-C04 was used (Table S2). For the 16S rRNA gene, only one primer set, 16Sbr-H, matched the sequences of all the species, and the amplified products showed a high degree of variation that was sufficient for distinguishing the 15 species (Fig. S1). For the 18S rRNA gene, only the primer Uni18S was found in the sequences of all the species, but the amplified products were highly conserved and could not be used in distinguishing the 15 species (Fig. S2). The performance of COI metabarcoding has previously been compared with that of shotgun mitogenome sequencing, and shotgun sequencing can provide highly significant correlations between read number and biomass [17]. As a result, we focused on developing the metagenomic approach for the direct biomass estimation of meat samples from the 15 species.

Development of 3MG method

The 3MG pipeline can be divided into four steps (Fig. 1). The first step is ‘extracting mitochondrial reads’. We searched next-generation sequencing (NGS) reads against 15MGDB by using the BLASTN command with the following parameters: ‘-evalue 1e-5’ and ‘-outfmt 6’. We extracted the matched reads using the ‘filterbyname.sh’ command in the BBMap software package (v35.66). The extracted reads were called ‘mitochondrial reads’ and used in the subsequent procedures.
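The read-selection logic of this first step can be sketched as below. The example assumes the standard 12-column BLAST tabular output (format 6), in which the first column is the query (read) ID and the eleventh column is the E-value; the function name and the toy lines are hypothetical and do not come from the 3MG source code.

```python
def matched_read_ids(blast_lines, max_evalue=1e-5):
    """Collect read IDs with at least one hit at or below the E-value cutoff.

    Expects BLAST tabular lines: qseqid sseqid pident length mismatch
    gapopen qstart qend sstart send evalue bitscore (tab separated).
    """
    ids = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        read_id, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            ids.add(read_id)
    return ids


# Toy input: read1 hits two mitogenomes, read2 has only a weak hit.
example = [
    "read1\tNC_012095\t99.0\t150\t1\t0\t1\t150\t100\t249\t1e-70\t270",
    "read2\tNC_001323\t80.0\t30\t6\t0\t1\t30\t5\t34\t0.5\t20",
    "read1\tNC_006853\t90.0\t150\t15\t0\t1\t150\t100\t249\t1e-40\t180",
]
print(sorted(matched_read_ids(example)))
```

Using a set means a read that hits several mitogenomes is extracted only once; the IDs can then be written one per line for the read-extraction tool.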

The second step is assembling mitogenomes from mitochondrial reads. The mitogenomes in the public database might have originated from a particular individual or subspecies. Thus, the sequences from the samples might differ from those in the public database because of intra-specific variations. To ensure accurate qualitative and quantitative analyses, we assembled the mitogenomes from the NGS reads using MITOBim (v1.9.1) [27] with the default parameters. The mitogenome sequences downloaded from GenBank were used as references and are called reference mitogenomes in the subsequent text.

The third step is mapping reads to the assembled mitogenomes. We mapped the reads to each assembled sequence of the species with bowtie2 (v2.3.4) [28], using default parameters. We then extracted the mapped reads using samtools (v1.9) [29] with the following command: ‘samtools view -bF 4’.
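In the SAM format, bit 0x4 of the FLAG field marks an unmapped read, so the `-F 4` filter keeps only mapped reads. The following sketch mimics that filter on plain-text SAM lines for illustration; the function name and the toy records are hypothetical, and the pipeline itself relies on samtools rather than on code like this.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit that is set when a read is unmapped


def mapped_read_names(sam_lines):
    """Return names of reads whose FLAG lacks the 'unmapped' bit (like -F 4)."""
    names = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):  # blank or header line
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if not flag & UNMAPPED_FLAG:
            names.add(name)
    return names


# Toy SAM records: r1 and r3 are mapped, r2 is unmapped (FLAG 4).
example = [
    "@HD\tVN:1.0",
    "r1\t0\tpig\t100\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t16\tpig\t200\t42\t4M\t*\t0\t0\tACGT\tIIII",
]
print(sorted(mapped_read_names(example)))
```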

The fourth step is identifying and counting reads uniquely mapped to the assembled mitogenomes. The mitogenome sequences were highly conserved, so some reads may be mapped to the mitogenomes of multiple species. We calculated the p-distances among these mitogenomes to determine how conserved they were. The p-distances among the 15 mitogenomes ranged from 0.14 to 0.49 (Fig. S3). These numbers indicated a high degree of mitogenome sequence conservation. Any given mapped read may therefore have originated from more than one species. To overcome this problem, we developed a custom Python script to remove non-specific reads. Specifically, we obtained 15 files recording the mapped reads of each species. We compared the mapped reads of the target species with those of the other 14 species and deleted non-specific reads that appeared in multiple files. We then calculated the proportion of reads uniquely mapped to the mitogenome of a particular species among all mitochondrial reads. When the proportion was greater than 2% (the cutoff of 2% was set according to the results in the ‘Determination of detection sensitivity for 3MG methods based on simulated datasets’ section), the species was called ‘present’.
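The uniqueness filter and the 2% presence call described in this step can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the authors' script: the per-species read lists are represented as sets of read IDs, and the function name and toy data are hypothetical.

```python
from collections import Counter


def unique_read_proportions(mapped, total_mito_reads, cutoff=0.02):
    """Summarise uniquely mapped reads per species.

    mapped: dict of species name -> set of read IDs mapped to that species.
    Returns dict of species -> (unique_count, proportion, is_present).
    """
    # Count in how many species' read sets each read ID appears.
    occurrences = Counter(read for reads in mapped.values() for read in reads)
    result = {}
    for species, reads in mapped.items():
        unique = sum(1 for read in reads if occurrences[read] == 1)
        proportion = unique / total_mito_reads
        result[species] = (unique, proportion, proportion > cutoff)
    return result


# Toy example: r3 maps to all three species and is therefore discarded.
toy = {
    "pig": {"r1", "r2", "r3"},
    "cattle": {"r3", "r4"},
    "sheep": {"r3"},
}
for name, summary in unique_read_proportions(toy, total_mito_reads=40).items():
    print(name, summary)
```

Counting occurrences once, rather than comparing each file against the 14 others, keeps the work linear in the total number of mapped reads.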

Determination of detection sensitivity for 3MG methods based on simulated datasets

We constructed 21 simulated datasets (Table 2, SD01-SD21) mixed with 30,000 pork (major composition) and chicken mitochondrial reads (minor composition) to determine detection sensitivity. The percentages of chicken mitochondrial reads were 0.01, 0.1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100% of all the mitochondrial reads in the simulated dataset. We prepared five replicates at each proportion. We analysed the data using the 3MG analysis pipeline. We calculated the relative detection error at each proportion (Table 2). At high proportions, the detected quantities were similar to the simulated quantities. At 2–100%, we detected the minor composition with a relative error of < 5%. However, the accuracy was significantly reduced at 1, 0.1 and 0.01%. These results indicated that the method can detect a contaminant at a proportion of 2% and has an error rate of < 5%.
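To make the scale of these datasets concrete, the sketch below lists how many chicken reads each percentage would correspond to. It assumes that every simulated dataset holds 30,000 mitochondrial reads in total and that the percentage applies to that total; this reading of the design, like the function name, is an assumption of the example.

```python
PERCENTAGES = [0.01, 0.1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
               20, 30, 40, 50, 60, 70, 80, 90, 100]


def simulated_counts(total_reads=30000, percentages=PERCENTAGES):
    """Return (percentage, chicken_reads, pork_reads) for each dataset."""
    rows = []
    for pct in percentages:
        chicken = round(total_reads * pct / 100)
        rows.append((pct, chicken, total_reads - chicken))
    return rows


for pct, chicken, pork in simulated_counts():
    print(f"{pct:>6}%  chicken={chicken:>5}  pork={pork:>5}")
```

Under this assumption the lowest levels contain only a handful of chicken reads, so a difference of a few reads already produces a large relative error, which is in line with the reduced accuracy reported below 2%.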

Table 2 Relative detection errors determined using the simulated dataset

Qualitative determination of biological composition with the 3MG method

Sequencing results and characterisation

We constructed mixed samples containing meat from 15 animals (M1). The animals fall into several categories. For example, pork, cattle, chicken, lamb, duck and rabbit are primarily used for human consumption. Ferret, nutria, raccoon dog and fox are commonly used in the fur industry. Dogs are companion animals. Camel and horse are used for multiple purposes. Rat and mouse co-inhabit with humans, and their meat can potentially contaminate other meat for human consumption. Adulteration with meat not meant for human consumption has been reported, particularly through the addition of meat of fur animals to pork or beef or the substitution of pork with horse meat [2]. The motivation is primarily economic gain, as profits can be made when expensive meat is replaced with cheap meat. Some of these adulterations can be culturally offensive, for example, adding pork to food for Muslim consumers [36, 37].

We constructed three M1 samples labelled ‘R1’, ‘R2’ and ‘R3’. We obtained 23.45, 24.10 and 28.56 GB of data for the three replicates, respectively (Table 3). The percentages of bases with quality scores ≥ 30 were 89.51, 88.97 and 89.97%, respectively.

Table 3 Summary of sequencing data for samples M1 and M2

Qualitative analysis of M1 sample’s biological composition

We analysed the NGS data generated from the M1 samples using the 3MG pipeline. In step one, 331,866 (0.43%), 222,702 (0.28%) and 267,495 (0.28%) reads were mapped to the mitogenomes and were extracted as mitochondrial reads (Table 4). In step two, we successfully assembled all 15 mitogenomes from mitochondrial reads. We constructed a phylogenetic tree using 15 pairs of assembled and reference mitogenomes (Fig. 2). The alignment of the 15 pairs of mitogenomes is shown in Fig. S4. The reference and assembled mitogenomes for the same species were clustered together (left part of Fig. 2).

Table 4 Summary of reads mapped to the mitogenomes of the 15 species

To compare the intra-specific and inter-specific distances, we calculated the distances shown in the right part of Fig. 2. Intra-specific distance was the distance between the assembled and reference mitogenomes for a particular species. By contrast, inter-specific distance was the average distance between the assembled mitogenome of the focal species and those of the other 14 species. Intra-specific nucleotide distances were much smaller than the inter-specific distances. Thus, we assembled the mitogenomes of specific species from the metagenomic data with high accuracy. Our assembled mitogenomes are unlikely to contain chimeric sequences because the inter-specific distances were significantly larger than the intra-specific distances.

In step three, we mapped these mitochondrial reads to the 15 assembled mitogenomes. Approximately 10,000–90,000 reads were mapped to each mitogenome (Table 4). However, an average of 52.07% of the reads were mapped to multiple species. Thus, using the total number of mapped reads would lead to the overestimation of the meat content of a particular species. For example, 79.77% of the reads mapped to the B. taurus (cattle) mitogenome were non-specific, and 63.70% of the reads mapped to the O. aries (sheep) mitogenome were non-specific. Using the total number of reads in estimating the beef and lamb content would therefore lead to overestimation. Hence, 3MG determines biological composition according to the number of reads uniquely mapped to a particular mitogenome.

In step four, we identified reads uniquely mapped to the mitogenome of each species. The proportion of unique reads to all mitochondrial reads was more than 2% for 12 species in at least one replicate sample (Table 4). The mapped read proportions for B. taurus, O. cuniculus and R. norvegicus were approximately 1%. In summary, through the analysis of the 15 species mix with this 3MG pipeline, 12 of 15 (80%) species were successfully identified with a confidence level of 95%.

Validation of 3MG analysis results for M1 samples by using LAMP experiments

LAMP is commonly used in detecting the biological composition of meat products. We used LAMP results to evaluate the accuracy of the 3MG results. Unfortunately, LAMP protocols are available for the meat of only five of the 15 species (pig, sheep, cattle, duck and chicken). Thus, only these five species in the M1 samples were tested. The experiments were conducted separately for each target species (Fig. 3). The results confirmed the presence of meat from pig (Fig. 3A), sheep (Fig. 3B), cattle (Fig. 3C), duck (Fig. 3D) and chicken (Fig. 3E) in our experimental samples, consistent with the results obtained from the 3MG method.

Quantitative determination of biological composition in mixed sample

Sequencing results and characteristics

To determine the performance of 3MG in estimating the biological composition in biomass, we prepared a series of samples by mixing meat from S. scrofa domesticus (pig) and G. gallus (chicken) in different proportions. We performed DNA extraction, library construction, DNA sequencing and DNA analyses in the same way as those for the M1 samples. The sequencing results are summarised in Table 3. We generated an average of 2.95 GB of data with 19,749,625 raw reads for each M2 sample. Approximately 90% of bases had quality scores greater than 30.

Quantitative analysis of M2 samples’ biological composition

The numbers of reads mapped to the pig and chicken mitogenomes are shown in Table S2. The proportion of reads uniquely mapped to the pig mitogenome was taken as the meat content estimated with 3MG and was plotted against the known meat content (Fig. 4A). Regression analysis showed that the estimated and known pork content had a correlation coefficient of 0.98. Similarly, the estimated meat content for chicken, based on relative read counts, was plotted against the known meat content (Fig. 4B). Regression analysis showed that the estimated and known chicken content had a correlation coefficient of 0.98. The high correlation coefficients between the estimated and known content suggested that the 3MG method can use the percentage of uniquely mapped reads in quantitatively determining biological composition in a meat mix.
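A correlation coefficient of this kind can be computed as in the sketch below, which assumes a Pearson coefficient (the text does not name the statistic). The known proportions follow the M2 design; the estimated values are hypothetical placeholders for illustration only and are not measurements from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


known = [0, 20, 40, 60, 80, 100]     # pork mass proportion (%) in the M2 design
estimated = [1, 18, 43, 57, 83, 99]  # hypothetical estimates, NOT measured values
print(round(pearson_r(known, estimated), 3))
```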

Validation of 3MG analysis results using LAMP experiments

We conducted a LAMP experiment to determine the quantity of pork and chicken in different ratios. We then compared the LAMP results with those obtained from the 3MG method. The peak time for detecting composition was called meat content estimated by LAMP and plotted against the known meat content of pig (Fig. 4C) and chicken (Fig. 4D). Regression analyses showed that the estimated and known meat content had correlation coefficients of 0.99 (pork) and 0.96 (chicken). Consequently, the 3MG results were consistent with the LAMP results. However, the variations in the LAMP results for chicken were significantly higher than those in the 3MG results. This observation suggested that the 3MG results were more stable than the LAMP results, at least for chicken meat.

Estimation of the relative correction factor for DNA–biomass ratio from different species

We mixed the meat of 15 species to construct M1 samples in equal mass ratios as described earlier. However, the number of reads mapped to each mitogenome of the 15 species varied significantly (Table 4). This discrepancy was likely due to the differences in mitogenome DNA content among the 15 species at the same meat biomass. As a result, a correction factor was needed when the meat mass was estimated from uniquely mapped read counts for a particular species. Using the number of reads uniquely mapped to the S. scrofa domesticus mitogenome as the baseline, we calculated the relative correction factors for the other species. The correction factors were 1.00 for A. platyrhynchos, 0.77 for C. bactrianus, 1.46 for C. lupus familiaris, 0.59 for E. caballus, 0.65 for G. gallus, 0.32 for M. musculus, 0.40 for M. putorius voucher, 1.24 for M. coypus, 0.43 for N. procyonoides, 0.31 for O. aries and 1.94 for V. vulpes. This set of relative correction factors might correlate with the relative copy numbers of mitogenome in the muscle tissues of each species. They can be used in estimating meat content for different species. Detailed discussions on the ratios of nuclear DNA to mitochondrial DNA and DNA to biomass are provided in the following text.
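One way such correction factors could be applied is sketched below. The direction of the correction is an assumption of this example: each factor is taken to be the number of unique reads of a species relative to pig at equal meat mass, so read counts are divided by the factor before normalisation. The three factors are copied from the text; the function name and the read counts are hypothetical.

```python
# Relative correction factors quoted in the text (pig is the baseline, 1.00).
CORRECTION = {
    "S. scrofa domesticus": 1.00,
    "G. gallus": 0.65,
    "E. caballus": 0.59,
}


def corrected_mass_fractions(unique_reads, correction=CORRECTION):
    """Convert unique read counts into estimated mass fractions.

    Assumes factor = (unique reads of species) / (unique reads of pig)
    at equal meat mass, so each count is divided by its factor.
    """
    scaled = {sp: n / correction[sp] for sp, n in unique_reads.items()}
    total = sum(scaled.values())
    if total == 0:
        raise ValueError("no reads to normalise")
    return {sp: value / total for sp, value in scaled.items()}


# Hypothetical counts: fewer chicken reads, yet a similar estimated mass.
print(corrected_mass_fractions({"S. scrofa domesticus": 1000, "G. gallus": 650}))
```

If the factors were defined in the opposite direction, the division would become a multiplication; the definition should be checked against Table 4 before any real use.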

Detection of other contaminating biological composition

To determine the presence of unexpected biological composition in the samples, we classified the unmapped reads with the RDP classifier and analysed them using MEGAN6. The unmapped reads can be divided into four categories: Bacteria, Archaea, Eukaryota and ‘not assigned’ (Fig. 5). Five genera belonging to Eukaryota were annotated: Myocastor, Canis, Sus, Anas and Gallus. These reads may have been too divergent to be mapped to the mitogenomes in the 3MG process. Overall, we detected few contaminants from other mammals and bacteria in our mock mix samples.

Discussion

Meat adulteration and contamination can affect consumers’ well-being, disrupt market order and insult religious beliefs. Hence, the development of qualitative, quantitative and unbiased methods for detecting the composition of meat products is of great importance. In the present study, we found that meat composition from 15 species cannot be identified with the metabarcoding approach because of the lack of universal primers or the needed discrimination power. Therefore, we developed a meat mitochondrial metagenomics (3MG) method to determine the composition of the 15 meat types most commonly found in food markets.

The evaluation of detection sensitivity for the 3MG methods based on simulated datasets indicated that the method can detect a contaminating composition at a proportion of 2% and has an error rate of < 5%. This method successfully identified the presence of 12 of 15 (80%) species with the threshold of detection sensitivity. This result showed that the method can simultaneously detect the presence of multiple species with high sensitivity. It can detect a wide variety of adulterated meat in the market. In addition, the analyses of the two species mixed samples revealed correlation coefficients of 0.98 for pork and 0.98 for chicken between the number of uniquely mapped reads and the mass proportion. The 3MG results were more stable than the LAMP results, at least for chicken meat, indicating that the method can use the percentage of uniquely mapped reads in quantitatively determining biological composition in a meat mix.

To the best of our knowledge, this study is the first to demonstrate the usefulness of the mitochondrial metagenomics method in detecting meat composition and estimating biomass. This method has several advantages over methods based on PCR amplification and particular markers. It is a non-targeted approach and does not need to assume the biological composition of samples. Consequently, it is likely to have a lower false-negative detection rate. Given that PCR-based methods require species-specific primers, they often fail to amplify sequences not matched by the primers. Furthermore, the 3MG method is not affected by problems in PCR reactions, such as the generation of multiple PCR bands resulting from non-specific amplification. The detection of multiple components with PCR-based methods requires simultaneous PCR reactions specific to each component. This approach can be quite expensive. By contrast, the 3MG method can potentially reduce the cost in this case. The 3MG method may facilitate the analysis of high-value products, such as medical and health-promoting products. In general, the 3MG method is suitable for non-targeted biomonitoring and requires meat components with an abundance above specific levels, whereas PCR-based methods are suitable for targeted biomonitoring and can detect biological components at considerably lower abundance levels.

We showed that the mitogenome DNA of the 15 mammalian and avian species represents 0.28–0.43% of the total DNA. The generation of 1 GB of data costs around US$ 10, and 1 GB of data can produce sufficient mitochondrial reads for determining biological composition qualitatively and quantitatively. Mitogenomes from animals are relatively small and easy to assemble. By December 2020, more than ten thousand animal mitogenomes had been deposited in the NCBI RefSeq database (https://www.ncbi.nlm.nih.gov/genome/browse#!/organelles/). Thus, it is reasonable to expect that the mitogenomes of all species used in food and medicine will be available soon. Owing to the rapid drop in sequencing costs, the fast accumulation of mitogenomes and the availability of integrated bioinformatics software tools, we can expect the broad application of the 3MG method in the near term.
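
The yield argument above can be made concrete with a back-of-the-envelope calculation. The sketch below is an assumption-laden illustration rather than part of the 3MG script: it reads "1 GB" as 10^9 sequenced bases, assumes a mitogenome length of 16,500 bp (a typical size for vertebrate mitogenomes), and assumes that mitochondrial reads are shared equally among species, which real samples will not satisfy exactly. The function name `expected_mito_coverage` is hypothetical.

```python
def expected_mito_coverage(total_bases, mito_fraction, mitogenome_length, n_species=1):
    """Estimate the mean mitogenome coverage per species.

    total_bases       -- total sequenced bases in the dataset
    mito_fraction     -- share of the DNA that is mitochondrial (e.g. 0.0028)
    mitogenome_length -- mitogenome length in bp (assumed, not measured)
    n_species         -- number of species assumed to share the reads equally
    """
    if n_species < 1 or mitogenome_length <= 0:
        raise ValueError("n_species must be >= 1 and mitogenome_length > 0")
    mito_bases = total_bases * mito_fraction
    return mito_bases / n_species / mitogenome_length


TOTAL_BASES = 1_000_000_000   # assumption: "1 GB" taken as 1e9 bases
MITOGENOME_LENGTH = 16_500    # assumption: typical vertebrate mitogenome size

for fraction in (0.0028, 0.0043):      # range reported in the text
    for n_species in (1, 15):
        cov = expected_mito_coverage(
            TOTAL_BASES, fraction, MITOGENOME_LENGTH, n_species
        )
        print(f"mito fraction {fraction:.2%}, {n_species:2d} species: ~{cov:.0f}x")
```

Under these assumptions, 1 GB of data corresponds to roughly 170× to 260× mitogenome coverage for a single species, or about 11× to 17× per species in an equal 15-species mix.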

One problem encountered in this study was that beef was detected using LAMP but was not detected in the mock sample with the 3MG method. One explanation is that the cattle mitogenome has a relatively small percentage of unique sequence. Our data showed that only 17.12–23.93% of the reads mapped to the cattle mitogenome were unique to it (Table 4), whereas the LAMP primers were specific enough to amplify the cattle sequence successfully. Hence, the proportion of unique regions in a mitogenome is essential for its successful detection with the 3MG method. In addition, neither rabbit nor rat meat was detected with the 3MG method. One explanation is that mitochondrial DNA makes up a low proportion of the total cellular DNA in these samples. Our data showed that the total number of reads mapped to the mitochondrial genomes of these two species was significantly lower than that mapped to the other species (Table 4). Additional studies are needed to optimise the 3MG method for the detection of such species in mixed samples.

Several improvements can be made to the 3MG method. Firstly, internal controls can be added to the samples for the accurate determination of the amount of mito-DNA for particular animal species. As described previously [14, 38], the internal controls can be commonly used metabarcoding markers, particularly COI and 18S. Although the lack of universal primers limits the use of these markers in metabarcoding analysis, they should be well suited to serve as internal controls for 3MG analysis. Secondly, correction factors should be estimated for biomass estimation based on read counts. For meat biomonitoring, biomass is more commonly used than read counts. Thus, an appropriate conversion rate from read count to biomass is needed for each meat type. It should be emphasized that the sampling location may affect the results of biomass estimation. For example, meat from different locations of the legs might have different ratios of fat and fibers, resulting in variation in the DNA extraction rate and the nuclear to mitochondrial DNA ratio. In this study, we tried to take samples with homogeneous compositions, both intraspecifically and interspecifically, to minimize this effect. The sample heterogeneity problem is difficult to solve not only for the 3MG method but also for traditional detection methods in general. Therefore, we need to estimate two types of correction factors. The first is the nuclear to mitochondrial DNA ratio (also known as the nuclear–mito ratio). The second is the DNA to biomass ratio (DNA–mass ratio). Given that meat might contain different proportions of fat, a high degree of variation in the nuclear–mito ratio and the DNA–mass ratio is expected among different species.

Thirdly, many studies have focused on the meat from 15 mammalian and avian species used in food safety biomonitoring. Meat from many other animal species is commonly consumed but was not tested in the current study. For instance, fish represents another large group of widely consumed animal meat. The 3MG method developed in the current study can be applied to fish meat in theory, and parameter optimisation and validation of 3MG for the assessment of fish meat are interesting subjects for future work. Lastly, we should build an extensive reference database of unique mitogenomic DNA sequences from different varieties of related animal species, given that many animals, such as chickens, pigs, cattle and sheep, have many endemic varieties. Building an extensive database containing variety-specific mitochondrial genome sequences will facilitate the identification of the sources of particular animal species.
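
To illustrate how the two correction factors described above could enter a read-count-to-biomass conversion, the following sketch assumes a simple proportional model that is not taken from the 3MG implementation: the number of uniquely mapped reads for a species is assumed to be proportional to its mass, its DNA–mass ratio, the mitochondrial share of its DNA (derived from the nuclear–mito ratio), and the fraction of its mitochondrial reads that map uniquely. The class `SpeciesFactors`, the function `estimate_mass_fractions` and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SpeciesFactors:
    dna_per_mass: float        # DNA yield per unit of meat mass (DNA-mass ratio)
    nuclear_mito_ratio: float  # nuclear DNA amount / mitochondrial DNA amount
    unique_fraction: float     # share of mito reads that map uniquely

    def unique_reads_per_mass(self):
        """Relative number of uniquely mapped reads expected per unit mass."""
        mito_share = 1.0 / (1.0 + self.nuclear_mito_ratio)
        return self.dna_per_mass * mito_share * self.unique_fraction


def estimate_mass_fractions(unique_reads, factors):
    """Convert uniquely mapped read counts into relative mass fractions."""
    corrected = {}
    for species, count in unique_reads.items():
        per_mass = factors[species].unique_reads_per_mass()
        if per_mass <= 0:
            raise ValueError(f"non-positive correction factor for {species}")
        corrected[species] = count / per_mass
    total = sum(corrected.values())
    if total == 0:
        raise ValueError("no uniquely mapped reads to convert")
    return {species: value / total for species, value in corrected.items()}


# Hypothetical factors and read counts, for illustration only.
factors = {
    "pork": SpeciesFactors(dna_per_mass=1.0, nuclear_mito_ratio=249.0, unique_fraction=0.8),
    "chicken": SpeciesFactors(dna_per_mass=1.0, nuclear_mito_ratio=249.0, unique_fraction=0.4),
}
unique_reads = {"pork": 800, "chicken": 400}

mass_fractions = estimate_mass_fractions(unique_reads, factors)
for species, fraction in mass_fractions.items():
    print(f"{species}: {fraction:.2f}")
```

In this toy example the two species receive equal estimated mass fractions even though pork has twice as many uniquely mapped reads, because its assumed unique fraction is also twice as high. In practice, each factor would have to be estimated experimentally for every meat type.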

The current research has laid the foundation for developing accurate and standard procedures for detecting the composition of meat qualitatively and quantitatively. Such methods will be necessary for the bioassessment and biomonitoring of meat products worldwide and will contribute significantly to meat safety management.

Availability of data and materials

The next-generation sequencing data for this research are available in the SRA database (https://www.ncbi.nlm.nih.gov/sra). The BioSample accession number for the raw sequence reads of the mixture of 15 commonly used animal meats is SAMN11812028. The raw data can be downloaded with the SRA accession number SRR9107560.

References

Cawthorn D-M, Steinman HA, Hoffman LC. A high incidence of species substitution and mislabelling detected in meat products sold in South Africa. Food Control. 2013;32(2):440–9.
Premanandh J. Horse meat scandal – a wake-up call for regulatory authorities. Food Control. 2013;34(2):568–9.
Ayaz Y, Ayaz ND, Erol I. Detection of species in meat and meat products using enzyme-linked immunosorbent assay. J Muscle Foods. 2006;17(2):214–20.
Zhang W, Xue J. Economically motivated food fraud and adulteration in China: an analysis based on 1553 media reports. Food Control. 2016;67:192–8.
Peng GJ, Chang MH, Fang M, Liao CD, Tsai CF, Tseng SH, et al. Incidents of major food adulteration in Taiwan between 2011 and 2015. Food Control. 2016;72:145–52.
Bik HM, Porazinska DL, Creer S, Caporaso JG, Knight R, Thomas WK. Sequencing our way towards understanding global eukaryotic biodiversity. Trends Ecol Evol. 2012;27(4):233–43.
Carvalho DC, Palhares RM, Drummond MG, Gadanho M. Food metagenomics: next generation sequencing identifies species mixtures and mislabeling within highly processed cod products. Food Control. 2017;80:183–6.
Ripp F, Krombholz CF, Liu Y, Weber M, Schafer A, Schmidt B, et al. All-food-Seq (AFS): a quantifiable screen for species in biological samples by deep DNA sequencing. BMC Genomics. 2014;15:639.
Crampton-Platt A, Yu DW, Zhou X, Vogler AP. Mitochondrial metagenomics: letting the genes out of the bottle. GigaScience. 2016;5:15.
Rennstam Rubbmark O, Sint D, Horngacher N, Traugott M. A broadly applicable COI primer pair and an efficient single-tube amplicon library preparation protocol for metabarcoding. Ecol Evol. 2018;8(24):12335–50.
Deagle BE, Jarman SN, Coissac E, Pompanon F, Taberlet P. DNA metabarcoding and the cytochrome c oxidase subunit I marker: not a perfect match. Biol Lett. 2014;10(9):20140562.
Polz MF, Cavanaugh CM. Bias in template-to-product ratios in multitemplate PCR. Appl Environ Microbiol. 1998;64(10):3724–30.
Tang M, Tan M, Meng G, Yang S, Su X, Liu S, et al. Multiplex sequencing of pooled mitochondrial genomes-a crucial step toward biodiversity analysis using mito-metagenomics. Nucleic Acids Res. 2014;42(22):e166.
Ji Y, Huotari T, Roslin T, Schmidt NM, Wang J, Yu DW, et al. SPIKEPIPE: a metagenomic pipeline for the accurate quantification of eukaryotic species occurrences and intraspecific abundance change using DNA barcodes or mitogenomes. Mol Ecol Resour. 2020;20(1):256–67.
Bista I, Carvalho GR, Tang M, Walsh K, Zhou X, Hajibabaei M, et al. Performance of amplicon and shotgun sequencing for accurate biomass estimation in invertebrate community samples. Mol Ecol Resour. 2018;18(5):1020–34.
Choo LQ, Crampton-Platt A, Vogler AP. Shotgun mitogenomics across body size classes in a local assemblage of tropical Diptera: phylogeny, species diversity and mitochondrial abundance spectrum. Mol Ecol. 2017;26(19):5086–98.
Crampton-Platt A, Timmermans MJ, Gimmel ML, Kutty SN, Cockerill TD, Vun Khen C, et al. Soup to tree: the phylogeny of beetles inferred by mitochondrial metagenomics of a Bornean rainforest sample. Mol Biol Evol. 2015;32(9):2302–16.
Gillett CP, Crampton-Platt A, Timmermans MJ, Jordal BH, Emerson BC, Vogler AP. Bulk de novo mitogenome assembly from pooled total DNA elucidates the phylogeny of weevils (Coleoptera: Curculionoidea). Mol Biol Evol. 2014;31(8):2223–37.
Gómez-Rodríguez C, Crampton-Platt A, Timmermans MJTN, Baselga A, Vogler AP. Validating the power of mitochondrial metagenomics for community ecology and phylogenetics of complex assemblages. Methods Ecol Evol. 2015;6(8):883–94.
Tang M, Hardman CJ, Ji Y, Meng G, Liu S, Tan M, et al. High-throughput monitoring of wild bee diversity and abundance via mitogenomics. Methods Ecol Evol. 2015;6(9):1034–43.
Liu S, Wang X, Xie L, Tan M, Li Z, Su X, et al. Mitochondrial capture enriches mito-DNA 100 fold, enabling PCR-free mitogenomics biodiversity analysis. Mol Ecol Resour. 2016;16(2):470–9.
Xu S, Kong F, Miao L, Lin S. Establishment and application of fluorescent loop-mediated isothermal amplification for detecting chicken or duck derived ingredients. Chin J Anim Quarantine. 2018;35(2):77–81.
Xu S, Kong F, Miao L, Cai Z, Lin Z, Zhao R. Establishment of the loop-mediated isothermal amplification for the detection of bovine-derived materials. China Anim Health Inspect. 2017;33(12):94–9.
Xu S, Kong F, Miao L, Lin S, Lin Z. Establishment of loop-mediated isothermal amplification for detection of mutton-derived ingredients based on isothermal amplification platform. J Econ Anim. 2016;20(4):200–6.
Zhou J, Bruns MA, Tiedje JM. DNA recovery from soils of diverse composition. Appl Environ Microbiol. 1996;62(2):316–22.
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421.
Hahn C, Bachmann L, Chevreux B. Reconstructing mitochondrial genomes directly from genomic next-generation sequencing reads--a baiting and iterative mapping approach. Nucleic Acids Res. 2013;41(13):e129.
Rice P, Longden I, Bleasby A. EMBOSS: the European molecular biology open software suite. Trends Genet. 2000;16(6):276–7.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map (SAM) format and SAMtools. Bioinformatics. 2009;25(16):2078–9.
Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, et al. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23(21):2947–8.
Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30(9):1312–3.
Kumar S, Stecher G, Tamura K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 2016;33(7):1870–4.
Wang Q, Garrity GM, Tiedje JM, Cole JR. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol. 2007;73(16):5261–7.
Porter TM, Hajibabaei M. Automated high throughput animal CO1 metabarcode classification. Sci Rep. 2018;8(1):4226.
Huson DH, Auch AF, Qi J, Schuster SC. MEGAN analysis of metagenomic data. Genome Res. 2007;17(3):377–86.
Aida AA, Che Man YB, Wong CMVL, Raha AR, Son R. Analysis of raw meats and fats of pigs using polymerase chain reaction for Halal authentication. Meat Sci. 2005;69(1):47–52.
Nakyinsige K, Man YBC, Sazili AQ. Halal authenticity issues in meat and meat products. Meat Sci. 2012;91(3):207–14.
Harrison JG, John Calder W, Shuman B, Alex Buerkle C. The quest for absolute abundance: the use of internal standards for DNA-based community ecology. Mol Ecol Resour. 2021;21(1):30–43.
