Mitogenomic phylogeny of Callithrix with special focus on human transferred taxa

Background Callithrix marmosets are a relatively young primate radiation, whose phylogeny is not yet fully resolved. These primates are naturally para- and allopatric, but three species with highly invasive potential have been introduced into the southeastern Brazilian Atlantic Forest by the pet trade. There, these species hybridize with each other and endangered, native congeners. We aimed here to reconstruct a robust Callithrix phylogeny and divergence time estimates, and identify the biogeographic origins of autochthonous and allochthonous Callithrix mitogenome lineages. We sequenced 49 mitogenomes from four species (C. aurita, C. geoffroyi, C. jacchus, C. penicillata) and anthropogenic hybrids (C. aurita x Callithrix sp., C. penicillata x C. jacchus, Callithrix sp. x Callithrix sp., C. penicillata x C. geoffroyi) via Sanger and whole genome sequencing. We combined these data with previously published Callithrix mitogenomes to analyze five Callithrix species in total. Results We report the complete sequence and organization of the C. aurita mitogenome. Phylogenetic analyses showed that C. aurita was the first to diverge within Callithrix 3.54 million years ago (Ma), while C. jacchus and C. penicillata lineages diverged most recently 0.5 Ma as sister clades. MtDNA clades of C. aurita, C. geoffroyi, and C. penicillata show intraspecific geographic structure, but C. penicillata clades appear polyphyletic. Hybrids, which were identified by phenotype, possessed mainly C. penicillata or C. jacchus mtDNA haplotypes. The biogeographic origins of mtDNA haplotypes from hybrid and allochthonous Callithrix were broadly distributed across natural Callithrix ranges. Our phylogenetic results also evidence introgression of C. jacchus mtDNA into C. aurita. Conclusion Our robust Callithrix mitogenome phylogeny shows C. aurita lineages as basal and C. jacchus lineages among the most recent within Callithrix. We provide the first evidence that parental mtDNA lineages of anthropogenic hybrid and allochthonous marmosets are broadly distributed inside and outside of the Atlantic Forest. We also show evidence of cryptic hybridization between allochthonous Callithrix and autochthonous C. aurita. Our results encouragingly show that further development of genomic resources will allow to more clearly elucidate Callithrix evolutionary relationships and understand the dynamics of Callithrix anthropogenic introductions into the Brazilian Atlantic Forest. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07533-1.


Background
Callithrix species represent a relatively young radiation, and divergence among lineages within the genus is estimated to be between approximately 0.7 and 2.5 million years ago (Ma) [1][2][3]. Two major subgroups occur within the genus, the aurita group (C. aurita/C. flaviceps) and the jacchus group (C. geoffroyi/C. kuhlii/C. jacchus/C. penicillata), but the phylogeny of these lineages is not yet fully resolved. Callithrix species are naturally para-and allopatric across the Brazilian Atlantic Forest, Cerrado, and Caatinga biomes ( Fig. 1) [4,5], and natural hybridization occurs between some species [6]. However, C. geoffroyi, C. jacchus, and C. penicillata have high invasive potential [7,8] and have spread widely outside of their native ranges due to the legal and illegal pet trades. These species have established several allochthonous populations in the southeastern Brazilian Atlantic Forest [6,9,10] and hybridize with other allochthonous and autochthonous congeners [6,[9][10][11], including endangered C. aurita and C. flaviceps [12,13]. Yet, determining evolutionary relationships between autochthonous, allochthonous, and hybrid Callithrix populations across Brazil is complicated by the unresolved Callithrix phylogeny.
In general, mitochondrial DNA (mtDNA) can be utilized for an initial look into evolutionary relationships among taxa (e.g., [14,15]) as well as track dispersal and gene flow patterns of allochthonous species [16]. MtDNA sequence data can also provide initial genetic insight in the direction of introgression (if sex-biased) when two species hybridize due to incongruences between phenotypes and haplotypes (e.g., [14,15]). The effective population size of mtDNA is one quarter of that of nuclear DNA from a diploid, bisexual population, which allows mtDNA lineages to coalescence relatively more quickly [17]. MtDNA is also considered a relatively fast mutating genetic marker [18]. As a result, lineage sorting and reciprocal monophyly are expected to occur faster in mtDNA than nuclear DNA, which can provide insight into shallow evolutionary relationships expected for young radiations.
One major challenge in applying genetic and genomic methods in Callithrix studies is an overall lack of genomic resources and sample material for most Callithrix species. Studies of Callithrix species have utilized mtDNA markers that generally resulted in polytomies and/or poorly supported branching patterns, as well as polyphyly for C. penicillata and C. kuhlii [19][20][21][22][23]. Also, the few available genetic studies of allochthonous and hybrid Callithrix within the Atlantic Forest, all conducted within Rio de Janeiro state, used portions of mtDNA or the Y-chromosome that could not fully resolve the evolutionary relationships of Callithrix lineages (e.g., [11,23]). Nonetheless, [24] obtained a wellresolved phylogeny for the jacchus group using complete mitogenomes, but they only sampled one individual/species with unknown provenances.
To build upon the above previous Callithrix studies, we have conducted the largest to-date geographical sampling of Callithrix mitogenomes across Brazil ( Fig. 1) with the following aims: (1) improve resolution of phylogenetic relationships and divergence times estimates between Callithrix mtDNA haplotypes; (2) determine which Callithrix mtDNA lineages are autochthonous across Callithrix ranges; and (3) identify allochthonous Callithrix mtDNA lineages in the southeastern Atlantic Forest and their possible biogeographic origins. We sequenced, for the first time, the complete mitogenome of C. aurita, and in total obtained 49 new mitogenome sequences from four species (C. aurita, C. geoffroyi, C. jacchus, C. penicillata), and four hybrid types (C. aurita x Callithrix sp., C. penicillata x C.jacchus, Callithrix sp. x Callithrix sp., C. penicillata x C. geoffroyi) for these analyses.

Results
Using Illumina whole genome sequencing (WGS) and Sanger sequencing approaches, we sequenced complete mitogenomes from 49 Callithrix (Fig. 1, Table 1, and  Table S1). We combined these new mitogenomes with previously published primate mitogenome sequences for downstream analyses (listed in Table S1). The length of the resulting sequence alignment after combining all of these mitogenomes was 17,132 bases. Sampled individuals that possessed the same mtDNA haplotypes are listed in Table S2. The organization of the C. aurita mitogenome was consistent with previously published Callithrix mitogenomes from [24]. This mitogenome includes 12 protein-coding genes, two rRNAs, and 14 tRNAs on the heavy strand and one protein-coding gene and eight tRNAs on the light strand, as well as the control region (Table S3). The length of the C. aurita mitogenome presented in Table S3 was 16,471 bases.

Phylogenetic trees and divergence times of Callithrix mitochondrial clades
Maximum-likelihood (ML) and Bayesian inference produced well-supported phylogenetic trees that show mostly congruent phylogenetic relationships between the aurita and jacchus groups (Fig. 2, Figures S1-S3). The main difference in the topology of the ML and Bayesian trees was in grouping patterns of some haplotypes within the C. jacchus clade described below. A number of nodes in the ML tree possessed 100% bootstrap support but most had bootstrap scores of > 70% ( Figure S1). Most nodes in the Bayesian trees had posterior probabilities of 1 (Fig. 2, Figures S2-S3). Major node names and divergence times within and outside the Callithrix clade are shown in Fig. 2, Figure S3, Table 2, and Table S4.
Callithrix diverged from Cebuella approximately 6.83 Ma ( Fig. 2 node E) and the initial split within Callithrix, separating C. aurita and the jacchus group, occurred approximately 3.54 Ma (Fig. 2 node D) (Table 2). Thus, C. aurita formed the Callithrix basal clade, and C. geoffroyi formed the most basal clade within the jacchus group by arising 1.18 Ma (node C). Callithrix penicillata haplotypes grouped into three polyphyletic clades that corresponded to three different biome regions, an Atlantic Forest-Cerrado transition area, Cerrado, and Caatinga. The first of these C. penicillata clades to diverge after C. geoffroyi was the Atlantic Forest-Cerrado transition clade at 0.92 Ma. Afterward, the C. penicillata Cerrado clade appeared at 0.87 Ma, followed by the C. kuhlii clade at 0.82 Ma (Fig. 2 node B). The C. penicillata Caatinga clade and the C. jacchus clades represent the two youngest clades within the phylogeny, splitting about 0.51 Ma (Fig. 2 node A). As the C. jacchus clade showed some of the shallowest branch tips among Callithrix haplotypes and poor phylogenetic resolution, a ParsimonySplits network was constructed for haplotypes within this clade (Fig. 3).

Ancestral origins and biogeography of Callithrix Mitogenomes
The ancestral origins of Callithrix phylogenetic mitogenome clades and subclades based on BMM biogeographic analysis were largely concordant with the assigned Brazilian states and regions of origin of sampled mitogenomic haplotypes ( Fig. 4 and Table S5). BMM analyses resulted in > 70% posterior probability of an ancestral origin for Node 93, which represented the basal node of the C. aurita clade, in Rio de Janeiro state. Within the C. aurita clade, node 92 showed > 97% posterior probability of an ancestral original of Rio de Janeiro state for two haplotypes sampled within this region from C. aurita-phenotype individuals and a C. aurita x Callithrix sp. hybrid. On the other hand, BMM analysis for nodes 89-91, which represent the other C. aurita subclade, assigned posterior probabilities between 44 and 65% for an origin of the Minas Gerais state portion of the natural C. aurita range. These haplotypes were obtained from C. aurita-phenotype individuals sampled in Minas Gerais, São Paulo, and Rio de Janeiro states, as well as a C. aurita x Callithrix sp. hybrid from São Paulo state.
Node 87 in Fig. 4 represents the basal node of the C. geoffroyi clade, and BMM analyses calculated a collective posterior probability of over 75% of this clade originating within the natural range of C. geoffroyi. With a BMM posterior probability of 91.93% that node 85 originated in southeastern Espírito Santo state, biogeographic analysis accurately reflected the sampling origin of haplotypes BJT70 and BJT169. These haplotypes come from C. geoffroyi-phenotype individuals, as well as one Callithrix sp. x Callithrix sp. hybrid. For the other C. geoffroyi subclade, BMM analyses posterior probabilities support an ancestral origin of associated haplotypes within the natural distribution of C. geoffroyi.  For the three C. penicillata clades, BMM analysis showed high posterior probabilities for each clade's corresponding geographic area as also being each respective clade's ancestral region. Nodes 79-81 ( Fig. 4), which represent the C. penicillata Atlantic Forest-Cerrado transition clade, each possessed > 98% posterior probabilities of originating in the Atlantic Forest-Cerrado transition zone of Minas Gerais. This clade contained several haplotypes from C. penicillata-phenotype individuals sampled in this transition zone, as well as a hybrid sampled in São Paulo state. The BMM posterior probability for the central Brazil Cerrado being the ancestral region for node 77 ( Fig. 4), which encompassed the C. penicillata Cerrado clade, was 98.16%. The C. penicillata Cerrado clade included haplotypes from C. penicillata-phenotype individuals sampled in Brasília. Finally, nodes 69-73 ( Fig. 4), representing the C. penicillata Caatinga clade, possessed BMM posterior probability support between 96.14-99.60% for the Caatinga of Bahia state as the ancestral region of this clade. The clade contained haplotypes from C. penicillataphenotype animals sampled at CEMAFAUNA, which clustered with a haplotype from a C. penicillata x C.  Figure S3). Major nodes are identified by capital letters, and blue bars at nodes indicate 95% highest posterior densities (HPD) of divergence times. Node support is shown for major nodes where either posterior probability was < 1 in the BEAST tree, posterior probability was < 1 in the MRBAYES tree, or bootstrap support < 70% in the ML tree. Haplotype colors at tips correspond to the 'Species and Hybrid Phenotypes' legend, and indicate phenotypes associated with each given haplotype Table 2 Divergence times in million years (Ma) for Callithrix species and select nodes (MRCA = Most recent common ancestor; values in brackets = 95% highest posterior density). Node names follow major node destinations shown in Fig. 2  jacchus hybrid with a C. jacchus phenotype sampled at CPRJ. Two haplotypes, representing eleven C. penicillata x C. geoffroyi hybrids sampled in Viçosa as well as a C. jacchus x C. penicillata hybrid, clustered within the C. penicillata Caatinga group. BMM biogeographic analysis of the C. jacchus clade calculated high posterior probability (> 99%) that haplotypes associated with nodes 60-64 originated in Ceará and/or Pernambuco states, regions whose dominant biome is the Caatinga. These haplotypes were obtained from marmosets with C. jacchus phenotypes sampled at CEMAFAUNA and the Guarulhos Zoo, as well as three C. aurita phenotype individuals sampled within São Paulo. For nodes 65-68, BMM analyses calculated posterior probabilities of the associated haplotypes originating first from Pernambuco, and then from Ceará and/or Pernambuco. In particular, nodes 66 and 67 had respective posterior probabilities of 49.68 and 76.88% of originating in Pernambuco state. Haplotypes associated with these nodes came from a C. aurita x Callithrix sp. hybrid sampled in São Paulo state and a CEMAFAUNA C. jacchus-phenotype individual.

Genetic distance between Callithrix phylogenetic clades
Pairwise genetic distances between the above established phylogenetic clades are shown in Table 3 as measures of D xy . The C. aurita clade was the most genetically distant from all other Callithrix clades, with D xy = 0.055-0.056. The smallest genetic distance can be observed between C. jacchus and the C. penicillata Caatinga clade at D xy = 0.009. The remaining pairwise genetic distances varied between D xy = 0.013-0.015, but the C. geoffroyi clade was the most distant relative to all other jacchus group clades.

Discussion
Callithrix mitochondrial phylogenetic relationships and divergence times Our ML and Bayesian phylogenies were generally well supported and corroborated Callithrix divergence patterns from previous nuclear and mtDNA studies [3,[24][25][26]. In ours and these previous phylogenies, the C. aurita clade was the most basal within the genus, the C. geoffroyi clade was most basal within the jacchus group, and C. penicillata and C. jacchus was the most recently diverged sister clade. Finally, our mtDNA analysis also showed that C. penicillata mitochondrial clades are polyphyletic, similar to the results obtained by [21,23]. The latter two studies also showed that C. kuhlii mitochondrial clades are polyphyletic. Given the recent divergence times of Callithrix species, Callithrix polyphyly may be explained by incomplete lineage sorting when ancestral polymorphisms at a given locus are not fixed before population divergence [27]. Another possibility to  explain C. penicillata and C. kuhlii polyphyly may be due to past hybridization between these species and other Callithrix taxa or perhaps recent migrations of C. kuhlii outside of its native range. However, we do not believe is likely for any of these cases. For these alternative scenarios, we would expect to find at least some instances of allochthonous C. kuhlii mtDNA lineages, which to our knowledge have not been yet been reported. Additionally, we did not observe any discordance involving of C. kuhlii genotype/phenotype with that of any other Callithrix species, nor did mitogenome haplotypes from any hybrids sampled in this or previous studies group with C. kuhlii phylogenetic clades. Finally, locations where we sampled Callithrix species within native ranges were far removed from any natural hybridization zones with C. kuhlii, so natural secondary contact between C. kuhlii and other Callithrix species was unlikely at our sampling locations. Thus, incomplete lineage sorting is the most parsimonious explanation for Callithrix polyphyly observed in this and previous Callithrix studies.
The Callithrix divergence time estimates from our study, being between approximately 0.5 and 6.8 Ma, are within the range of previously published estimates [6,25,26]. These time estimates place the divergence of Callithrix species into the Pleistocene. In this epoch, climatic oscillations that may have promoted para-and allopatric speciation in South America, including that of Callithrix species, through repeated contractions and expansions of forested refuge [28,29].

Biogeography origins of autochthonous and Allochthonous Callithrix Mitogenomes
Previous phylogenetic and biogeographic analyses support a Callithrix origin in the southeastern Brazilian Atlantic Forest and then a northward expansion [3,4,[24][25][26]. Our biogeographic analysis of Callithrix mitogenome lineages reflected similar biogeographic patterns for Callithrix species, as well as a natural geographic separation of major phylogenetic clades. When considering sampled mitogenome lineages of known provenance, the biogeographic origins of our reconstructed phylogenetic clades was strongly influenced by the geographic origin of our samples across natural Callithrix ranges. For example, within the C. geoffroyi clade, the Minas Gerais state lineage formed a separate clade from the Espírito Santo state lineages. Callithrix penicillata possesses the largest natural geographic distribution of all Callithrix species [5] and biogeographic origins of paraphyletic clades were defined by where samples were collected within the Cerrado, Atlantic Forest, and Caatinga biomes. Although most sampled C. jacchus clade haplotypes did not possess known provenance and showed shallow tips, biogeographic analyses showed strong evidence that these haplotypes likely originated from within the Caatinga biome. Further evidence that C. jacchus mtDNA lineages tend to group geographically under denser sampling was shown by [23] with a geographically broader sampling of C. jacchus mtDNA D-loop sequences. We did observe a single haplotype from an individual with a C. jacchus phenotype group within the C. penicillata Caatinga clade. This individual represents a cryptic hybrid C. penicillata x C. jacchus that was sampled in captivity at CPRJ. Since this hybrid was sampled in southeastern Brazil, far removed from any of the natural C. penicillata x C. jacchus hybrid zones that occur in northeastern Brazil [6], this marmoset is likely an anthropogenic hybrid.
Allochthonous Callithrix species began appearing in portions of the southeastern Brazilian Atlantic Forest within approximately the last 20-30 years ( [30,31], pers. obs. C. Igayara). Our biogeographic analyses can be used to infer the probable origins of parental populations of these allochthonous Callithrix species and anthropogenic hybrid Callithrix found in the southeastern Brazilian Atlantic Forest. Overall, biogeographic patterns show that the parental populations of these Callithrix likely possess multiple geographic origins from within and

Implications of biological invasions for Callithrix genetic integrity, hybridization, and conservation
Several species of non-native fauna and flora have been introduced to the Brazilian Atlantic Forest [32] -one of the most anthropogenically disturbed, yet highly biodiverse biomes on Earth [33,34]. Deliberate and accidental relocations of species beyond their natural geographic ranges by humans may lead to the establishment of nonautochthonous populations and biological invasions within new geographic localities. Such introductions alter the ecological relationships among taxa, and in cases of closely related species, gene flow may occur due to hybridization [35][36][37]. Indeed, our analyses show, for the first time, evidence of introgression of genetic material from allochthonous Callithrix species into the genetic background of an endangered, autochthonous Callithrix species. Two mtDNA haplotypes that grouped within the C. jacchus clade were associated with three individuals with pure C. aurita phenotypes sampled within São Paulo state. Two of these individuals were sampled in two different regions of São Paulo state in municipalities (Mogi Das Cruzes and São José dos Campos) that lie 60 km apart. These data not only show the first genetic evidence for cryptic hybridization within the aurita group marmosets, but also suggest two independent occurrences of a C. jacchus female mating with a C. aurita male that led to genetic introgression. Under scenarios of biological invasions, theoretical and empirical data show that hybridization between allochthonous species and endangered, native species creates extinction risk for the latter [38]. Our initial mtDNA data strongly prompt for the development of diagnostic genetic markers to detect the actual extent of allochthonous Callithrix genetic introgression in C. aurita populations, particularly within São Paulo state.
Contemporary anthropogenic hybrid Callithrix and allochthonous Callithrix species are normally found in urban or peri-urban areas of southeastern Brazil, due to releases of exotic pet marmosets into such locales [6,9,10,31], where they may encounter autochthonous Callithrix species. Indeed, cases exist of native C. aurita and C. flaviceps meeting up and interbreeding with hybrid and allochthonous Callithrix at urban fringes [6, 9-11, 39, 40]. Such interactions likely facilitate gene flow from invasive C. jacchus and C. penicillata into marmoset populations in southeastern Brazil, with consequences that may include outbreeding depression, admixture, hybrid swamping, or introgressive replacement [41][42][43][44].
Invasive C. jacchus and C. penicillata represent a potential risk for genetic extinction of the other two jacchus group species, C. geoffroyi and C. kuhlii, and [10] recently showed that C. penicillata is encroaching on the range of C. geoffroyi. We sampled one Callithrix sp.
x Callithrix sp. hybrid in Santa Teresa, Espírito Santo state, a city that straddles the native ranges of C. flaviceps and C. geoffroyi. Although this hybrid possessed a C. geoffroyi clade mitogenome lineage, the individual had a phenotype that strongly suggested some level of C. penicillata or C. jacchus ancestry-a white "star" on the forehead [30]. Anthropogenic hybridization of jacchus group species generally results in the formation of hybrid swarms, admixed populations that lost parental phenotypes and genotypes [2,6,30]. Should large numbers of exotic C. jacchus or C. penicillata ever invade native ranges of C. kuhlii or C. geoffroyi, the latter two species may be threatened with genetic swamping by the former two species, a process through which parental lineages are replaced by hybrids that have admixed genetic ancestry [38]. As C. kuhlii is considered vulnerable [45], biological invasions by other marmosets present potential conservation risks for this species.
Brazil already possesses several legal instruments for the conservation and protection of wildlife (discussed in [46]). These instruments include national species plans that legally lay out and action plans for the protection of specific groups of endangered species. A national species plan already exists that includes C. aurita and C. flaviceps, the National Action Plan for the Conservation of Atlantic Forest Primates and Collared Sloth (PAN PPMA, [46]), and this plan may eventually need to include C. kuhlii. The PAN PPMA considers hybridization as a major threat to the survival of the aurita group marmoset species. Thus, the expanded perspective on marmoset hybridization provided by this work should be considered within the context of Brazilian legal instruments that protect endangered marmosets. Such an evaluation is important for incorporating new biological information about marmoset hybridization, as it may call for adopting new legal measures or modifying existing ones to further protect endangered Brazilian fauna.

Conclusions
We provide a robust Callithrix phylogeny based on the largest to-date geographical sampling of Callithrix mitogenomes across Brazil, showing that the aurita group is basal to the jacchus group. Our divergence time estimates show these two groups diverged approximately 3.54 Ma, and within the jacchus group, C. jacchus diverged most recently from the C. penicillata Caatinga clade approximately 0.51 Ma. With future sampling of C. flaviceps, full mitogenomes can likely be utilized to fully resolve the Callithrix phylogeny. Nonetheless, we used our current wellsupported phylogenies and biogeographic analyses to elucidate, for the first time, evolutionary relationships among autochthonous, allochthonous, and anthropogenic hybrid marmosets across Brazil. We show that parental populations of allochthonous and anthropogenic hybrid marmosets within the southeastern Brazilian Atlantic Forest incorporate local populations and populations broadly distributed outside of the regions. We also show, for the first time, evidence of allochthonous Callithrix species genetic introgression into the genetic background of endangered, autochthonous C. aurita. At this time, further determination is needed of the ancestry of Callithrix anthropogenic hybrids in southeastern Brazil as well as the fitness and viability of these hybrids. Such data will help determine to what extent anthropogenic hybrids and allochthonous Callithrix species threaten the genetic integrity, or ability of a population to preserve its genotypes over generations [47], of autochthonous Atlantic Forest Callithrix species.

Samples
In 2011, skin samples were collected from two C. penicillata individuals that were captured in Brasília, Federal District. Between 2010 and 2016, skin tissue was collected from: (1) wild marmosets in Minas Gerais and Espírito Santo states as well as the Brazilian Federal District; (2) captive-born, wild-caught, and confiscated marmosets housed at the Guarulhos Municipal Zoo, Guarulhos, São Paulo, CEMAFAUNA (Centro de Manejo de Fauna da Caatinga), Petrolina, Pernambuco, and Centro de Primatologia do Rio de Janeiro (CPRJ), Guapimirim, Rio de Janeiro; (3) a wild group from Natividade, Rio de Janeiro that was caught and housed at CPRJ; (4) a captive-born C. geoffroyi sample donated by the Callitrichid Research Center (CRC), Omaha, Nebraska, US; (5) a captive born C. jacchus donated by the New England Primate Research Center (NEPRC, now closed), Southborough, Massachusetts, US. Sampling consisted of a total of 49 Callithrix individuals as described in Table 1,  Table S1, and Fig. 1. Table S1 also lists information on utilized sequences that were published elsewhere. Marmoset capture and sampling methodology has been described elsewhere [23]. All individuals were allowed to recover after sample collection, and wild marmosets were released at their point of capture. Specimens were classified phenotypically as pure C. aurita, C. geoffroyi, C. jacchus and C. penicillata or hybrid (C. aurita x Callithrix sp., C. jacchus x C. penicillata, and C. penicillata x C. geoffroyi) based on published descriptions [2,11,23,30].

Laboratory protocols
DNA from skin samples was extracted using a standard proteinase K/phenol/chloroform protocol [48]. Buffers used for extraction, precipitation and elution of DNA from blood and skin tissue are listed elsewhere [24]. DNA from the Callitrichid Research Center samples was extracted at Arizona State University (ASU). DNA from Brasília individuals was extracted at Northern State Fluminense University, Rio de Janeiro State, Brazil, and then exported to ASU (CITES permit #11BR007015/DF). DNA from all other individuals was extracted at the Federal University of Viçosa (UFV), Viçosa, Minas Gerais, Brazil.
Mitogenomes were obtained for a subset of the samples (Table S1) following the long-range PCR (LR-PCR) methodology of [24], and sequenced on an ABI 3730 sequencer with the BigDye Cycle Sequencing Kit (Applied Biosystems) by the ASU School of Life Science DNA Core Laboratory. The remainder of mitogenomes was obtained from whole genome sequencing (WGS). Individual WGS sequencing libraries were prepared at UFV and ASU with Illumina Nextera DNA Flex Library Prep Kits (catalog #20018704) following manufacturer's instructions. Individual libraries were barcoded with Illumina Nextera DNA CD Indexes (catalog # 20018707), and pooled in equimolar amounts and sequenced on an Illumina NextSeq using v2 chemistry for 2 × 150 cycles at the ASU Genomic Core Facilities.

Mitogenome alignment and data analysis
Genetic samples collected since 2015 have been registered in the Brazilian CGen SISGEN database (Supplementary Table S6) and newly sequenced mitogenomes have been deposited in GenBank (Table S1). Trace files of resulting forward and reverse reads from LR-PCR products for each individual sequence were inspected by eye and merged into a single contig for each sampled individual using SEQMAN PRO software from the DNAStar Lasergene Core 10 suite (DNASTAR, Madison, WI). Mitogenomes from WGS data were assembled with NOVOPlasty 2.6.4 [49] (scripts available at https:// github.com/Callithrix-omics/Callithrix_mtDNA.git). We downloaded mitogenome sequences of several primate species from GenBank (Supporting Information Table  S1). All mitogenomes were aligned in MAAFT (https:// www.ebi.ac.uk/Tools/msa/mafft/) with default settings and this MAFFT alignment was confirmed visually in Mesquite 3.5 [50]. Gene, tRNA, rRNA, and control region features within the newly generated marmoset mitogenomes were manually annotated based on the GenBank record of C. kuhlii (Accession number KR817257). To check for the presence of nuclear mitochondrial DNA (numts) in mitochondrial sequence data, we followed the strategy described in [24].
We kept mitogenomes in their entirety, but trimmed part of tRNA-Phe, 12 s rRNA and the control region to accommodate the length of all utilized sequences. Mitogenome haplotypes were determined with DnaSP 6.12.03 [51], and haplotypes were used for phylogenetic reconstruction. Individuals that possess identical mtDNA haplotypes are listed in Table S3, and these groups are represented in phylogenetic reconstructions by a single haplotype. We added data from several other New World monkeys (Table S1) and reconstructed phylogenetic trees with ML and Bayesian algorithms using IQ-TREE 2.0.3 [52] and MrBayes 3.2.6 [53,54], respectively. For the ML phylogeny, we used the optimal substitution model (GTR + F + R4) as calculated with ModelFinder [55,56] in IQ-TREE under the Bayesian Information Criterion (BIC). We performed the ML analysis in IQ-TREE with 10,000 ultrafast bootstrap (BS) replications [57]. In MrBayes, we used the closest available substitution model GTR + G. The Bayesian tree was reconstructed via Markov Chain Monte Carlo (MCMC) runs with 10,000,000 generations and tree and parameter sampling occurring every 100 generations. Upon completion of the two runs, the first 25% of generations were discarded as burn-in. To check convergence of all parameters and the adequacy of the burn-in, we assessed the uncorrected potential scale reduction factor (PSRF) [58] and that all parameter Estimated Sample Size (ESS) values were above 200. We calculated posterior probabilities (PP) and a phylogram with mean branch lengths from the posterior density of trees using MrBayes. Phylogenetic trees were visualized and edited with Fig-Tree 1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/). Pairwise genetic distances between each of the resulting Callithrix mitochondrial clades was measured in DnaSP 6.12.03 as D xy , the average number of per site nucleotide substitutions between clades.
The divergence time calculation was performed with the BEAST 2.4.8 package [59] using a relaxed lognormal clock model of lineage variation [60] and by applying a Yule tree prior and the best-fit model of sequence evolution as obtained by ModelFinder. To calibrate the molecular clock, we applied fossil data to constrain the splits between Cebinae and Saimirinae and between Callicebinae and Pitheciinae with hard minimum and soft maximum bounds using a log normal prior following settings and fossils described in detail in [61]. Briefly, for the Cebinae -Saimirinae split, we used an offset of 12.6, mean of 1.287 and standard deviation of 0.8, which translates into a median divergence of 16.2 million years ago (Ma) (95% highest posterior density [HPD]: 13.4-30.0 Ma). For the Callicebinae -Pitheciinae split, we used an offset of 15.7, mean of 1.016 and standard deviation of 0.8, resulting in a median divergence of 18.5 Ma (95% HPD: 16.3-28.9 Ma). We performed two independent runs each with 50 million generations and tree and parameter sampling setting in every 5000 generations. To assess the adequacy of a 10% burn-in and convergence of all parameters, we inspected the trace of the parameters across generations using Tracer 1.6 [62]. We combined sampling distributions of both replicates with LogCombiner 2.4.8 and summarized trees with a 10% burn-in using TreeAnnotator 2.4.8 (both programs are part of the BEAST package). A ParimonySplits network of a subset of mtDNA haplotypes was made with default settings in SplitsTree4 [63].
To reconstruct the biogeographic history of Callithrix mitochondrial lineages, we applied the Bayesian Binary Method (BBM) in Reconstruct-Ancestral-States-in-Phylogenies 4.0 (RASP) [64,65]. The ML phylogeny obtained with IQTREE was used for the BBM analysis, which was conducted as two independent runs of 10 chains that ran for 5,000,000 generations and sampled every 100 generations. The fixed Jukes-Cantor+Gamma evolutionary model was implemented for each run. For haplotypes states of origin within a given phylogenetic clade, presence and absence of each associated taxon was determined using a combination of information of known provenance for sampled individuals and recognized Callithrix geographical distribution following [5]. For haplotypes obtained from exotic Callithrix species and anthropogenic hybrids, we noted where each of these haplotypes clustered among resulting phylogenetic clades, and assigned probable origin for these haplotypes according to the likely natural geographic range associated with each clade.
Additional file 1: Table S1. Metadata for newly collected samples as well as primate mitogenome sequences obtained from GenBank. The 'Sample' column gives ID of each new sampled individual or the species for sequences obtained from previous studies. The 'Accession' column gives GenBank accession numbers for each sequence, the 'Assembly Method This Study' column states the manner in which new Callithrix mitogenomes were sequenced and assembled (S=Sanger; N= NOVOPlasty2.6.4). The 'Phenotype' column indicates whether the sampled individual possessed a pure species or hybrid phenotype, and capital letters in parentheses next to C. penicillata x C. geoffroyi category are specific phenotype classifications following Figure Table S2. Each cell lists individuals that possess the same mtDNA haplotypes. Table S3.
Organization of the C. aurita mitogenome based on 16,471 sequenced bases of individual BJT065 (Accession number MT041703). Table S4. Divergence times for Callithrix species and select nodes (MRCA = Most recent common ancestor; values in brackets = 95% highest posterior density). Node names follow major node designations shown in Figure  S3 as capital letters. Table S5. BMM posterior probabilities for Fig. 4 nodes. Location abbreviations follow Fig. 4. Table S6. Summary of record numbers for collected samples that have been entered into the Brazilian CGEN SISGEN sample database (ES=Espírito Santo, MG=Minas Gerais, PE= Pernambuco, RJ=Rio de Janeiro, SP=São Paulo).
Additional file 2: Figure S1. Maximum-likelihood (ML) tree showing phylogenetic relationships among Callithrix haplotypes as calculated from mitogenome sequences. Numbers at nodes indicate bootstrap support for a given node, otherwise node bootstrap support was 100%. Haplotype colors at tips correspond to the 'Species and Hybrid Phenotypes' legend, and indicate phenotypes associated with each given haplotype. Figure S2. Bayesian tree showing phylogenetic relationships among Callithrix species and hybrid haplotypes from mitogeomes sequences. Numbers at nodes indicate posterior probability for a given node, otherwise node posterior probability was 1. Haplotype colors at tips correspond to the 'Species and Hybrid Phenotypes' legend, and indicate phenotypes associated with each given haplotype. Figure S3. BEAST tree showing phylogenetic relationships and divergence ages in million years (Ma) among Callithrix haplotypes and other New World primates as calculated from mitogeome sequences. Major nodes are identified by capital letters, and blue bars at all nodes indicate 95% highest posterior densities (HPD) of divergence times. Haplotype colors at tips correspond to the 'Species and Hybrid Phenotypes' legend, and indicate phenotypes associated with each given haplotype.