Skip to main content

The evolution of Homo sapiens denisova and Homo sapiens neanderthalensis miRNA targeting genes in the prenatal and postnatal brain

Abstract

Background

As the evolution of miRNA genes has been found to be one of the important factors in formation of the modern type of man, we performed a comparative analysis of the evolution of miRNA genes in two archaic hominines, Homo sapiens neanderthalensis and Homo sapiens denisova, and elucidated the expression of their target mRNAs in bain.

Results

A comparative analysis of the genomes of primates, including species in the genus Homo, identified a group of miRNA genes having fixed substitutions with important implications for the evolution of Homo sapiens neanderthalensis and Homo sapiens denisova. The mRNAs targeted by miRNAs with mutations specific for Homo sapiens denisova exhibited enhanced expression during postnatal brain development in modern humans. By contrast, the expression of mRNAs targeted by miRNAs bearing variations specific for Homo sapiens neanderthalensis was shown to be enhanced in prenatal brain development.

Conclusions

Our results highlight the importance of changes in miRNA gene sequences in the course of Homo sapiens denisova and Homo sapiens neanderthalensis evolution. The genetic alterations of miRNAs regulating the spatiotemporal expression of multiple genes in the prenatal and postnatal brain may contribute to the progressive evolution of brain function, which is consistent with the observations of fine technical and typological properties of tools and decorative items reported from archaeological Denisovan sites. The data also suggest that differential spatial-temporal regulation of gene products promoted by the subspecies-specific mutations in the miRNA genes might have occurred in the brains of Homo sapiens denisova and Homo sapiens neanderthalensis, potentially contributing to the cultural differences between these two archaic hominines.

Background

Early in the 21st century, fossils of an ancient man who lived 50-45 thousand years (ky) ago were found in the Denisova cave in the Altai mountains [1]. The Homo sapiens denisova (H. s. d.) nuclear genome was first sequenced in 2010 [2] and then re-sequenced in 2012 [3]. The Homo sapiens neanderthalensis (H. s. n.) nuclear genome was first sequenced in 2010 [4] and for a second time in 2014 [5]. A comparative analysis of the archaic hominine (H. s. d., H. s. n.) and Homo sapiens sapiens (H. s. s.) genomes assessed the genetic contribution of H. s. d. and H. s. n to the genetic and biological profile of the modern H. s. s. populations [6]. It was shown that the H. s. n. lineage is a sister lineage to H. s. d. [2]. The population split of Neanderthals and Denisovans from modern humans was estimated to have occurred 550-765 ky ago and the split time of Neanderthals and Denisovans 380-473 ky ago [5]. Considering that the evolution of miRNA genes could be one of the key factors in the formation of the modern type of man [710], we analyzed the evolution of miRNA genes in H. s. n. and H. s. d. and the expression of their target mRNAs. Using improved versions of the H. s. d. and H. s. n. genomes, we performed computer-assisted comparisons of the H. s. n., H. s. d. and H. s. s. genomes to reveal the structural and functional organization of microRNAs (miRNAs) as well as the mRNAs targeted by these miRNAs. In both the Neanderthal and Denisovan genomes, we found miRNA genes with fixed substitutions in mature miRNAs and multiple substitutions in pre-miRNA regions involved in pre-miRNA processing.

Our analysis of spatiotemporal gene expression in human tissues demonstrated that the miRNAs bearing new genetic variants fixed in the Denisovan genome regulated target mRNAs with the highest levels of expression in the postnatal human brain.

Results and discussion

To identify miRNA genes that evolved divergently in the genus Homo, we first used the map and pre-miRNA gene sequence data for 1595 experimentally confirmed H. s. s. miRNAs from the miRBase database rel. 19 [11]. We selected the H. s. d. and H. s. n. pre-miRNAs with highly confident sequences that are orthologous to H. s. s. pre-miRNAs. Using the selection procedure described in the Methods section (Selection of miRNA genes in H. s. d. and H. s. n. with no match in H. s. s.), we identified 1298 H. s. d. and 1329 H. s. n. pre-miRNAs perfectly matched to H. s. s. pre-miRNA gene sequences. In addition, we identified and selected for further study 106 H. s. d. and 102 H. s. n. diverged genes for pre-miRNA sequences, which were different from H. s. s. pre-miRNA genes by at least a single nucleotide. The other 191 H. s. d. and 164 H. s. n. pre-miRNAs did not pass quality sequence control in the selection procedur

Further, we explored the alignment of the genomes of six primates available in Ensembl rel. 69 [12]. Each of the selected H. s. d. and H. s. n. pre-miRNA genes non-identical to H. s. s. orthologs had at least one ortholog in another primate. This result indicated that these pre-miRNA coding genes were present in the genome of the common ancestor of H. s. s., H. s. d. and H. s. n. long before the evolutionary split between these hominines. In following up this analysis, we confirmed the existence of orthologs for all selected H. s. d. and H. s. n. miRNA genes except one H. s. d. miRNA gene.

Next, we analyzed all substitutions found in the archaic hominine miRNA genes selected as described above. First, we excluded a few doubtful nucleotide substitutions observed in the sequencing data for the pre-miRNA genes of H. s. d. and H. s. n. (see Methods section). Then, we mapped the remaining genetic variants found in the pre-miRNA genes of H. s. d. and H. s. n. onto the secondary structures of the corresponding pre-miRNAs in H. s. s. available at the miRBase rel. 19 [11]. We selected the H. s. d. and H. s. n. pre-miRNAs with (i) nucleotide substitutions (deletions, insertions) in the regions corresponding to the sequences of mature miRNAs (and/or miRNAs*) responsible for binding to target mRNAs or (ii) multiple (two or more) densely spaced substitutions within a pre-miRNA region involved in pre-miRNA processing. From these pre-miRNA groups, we excluded the pre-miRNAs that occur in the H. s. s. genome in more than one copy. The aim of this exclusion was to select pre-miRNAs with unique functions. The selection yielded H. s. d. and H. s. n. pre-miRNAs with nucleotide substitutions that might contribute to significant functional differences from the H. s. s. pre-miRNAs.

The functional annotation of the diverged pre-miRNA genes in archaic hominines was performed as follows. First, we selected the miRNAs expressed in the central nervous system (CNS) using the miRGator 3.0 [13] and ChIP-seq data for H3K4me3 (histone H3 trimethyl K4) modified histones marking active promoters in primate cortical neurons [14]. Second, the potential target mRNAs of these central nervous system (CNS)-active miRNAs were identified using data from miRGator 3.0 [13]. miRGator 3.0 contains experimental miRNA/target-mRNAs co-expression data obtained under various conditions. Therefore, we extracted from miRGator 3.0 genes with co-expression correlation coefficient with miRNA r≤-0.9 [13]. We used the co-expression evidence to identify putative interactions between miRNAs and their target mRNAs instead of the CLIP-seq experiments results because the latter technology requires cell lysis, which facilitates the interaction of components that are usually segregated by cellular compartments [15], and many miRNA/mRNAs interactions identified by CLIP-seq are non-canonical, which in turn does not mediate repression [16]. However, co-expression data indicate pathways (not genes) targeted by miRNA but do not allow the identification of direct miRNA/mRNA interactions. The latter has a side advantage because, in this approach, we selected against target genes with limited effect on tissue functioning [17]. Third, to identify the human brain structures (topologically different brain regions) showing increased levels of the target mRNAs (henceforth referred to as the target brain structures), we performed a randomization test using the Human Allen Brain Atlas [18] and the BrainSpan Atlas [19]. We employed the randomization test for this purpose because the standard analytic annotation enrichment techniques are inapplicable for miRNA functional enrichment analysis [20].

The characteristics of the nucleotide substitutions in each of the CNS-active pre-miRNA genes diverged in Denisovan and Neanderthal are shown in Tables 1 and 2. These miRNA genes were stratified into four groups. The first group consists of 9 miRNAs with fixed mutations unique to the H. s. d. lineage. The second group includes 5 miRNAs with fixed mutations unique to the H. s. n. lineage. Two other miRNA gene groups were compiled of 14 H. s. d. and 18 H. s. n. miRNA genes bearing variants that match to one of the polymorphic alleles in H. s. s. Most polymorphisms in these miRNAs are SNPs (single nucleotide polymorphisms). Several substitutions are observed in the seed region (based on Table 1 and 2) in six out of 14 H. s. d. and five out of 18 H. s. n. pre-miRNAs common to human polymorphisms. Only one indel in the seed region was observed in hsa-mir-3161 in both H. s. n. and H. s. d. The minority of miRNAs with polymorphisms in the seed region is not sufficient to claim that the general capacity of these miRNA pools have altered binding specificity to target mRNAs. This result allowed us to use the third and fourth pools as a control to compare the target specificities with miRNA bearing fixed variants specific for H. s. n. and H. s. d.

Table 1 Homo sapiens denisova miRNAs that differ from Homo sapiens sapiens miRNAs.
Table 2 Homo sapiens neanderthalensis miRNAs that differ from Homo sapiens sapiens miRNAs.

Next, we searched for the brain regions and CNS development stage showing the most abundant expression of mRNAs targeted by the miRNAs that diverged in archaic hominines (see Methods section). For this purpose, we used data in (1) the BrainSpan Exon microarray (BrainSpan) [19] and (2) the Allen Human Brain Atlas as updated March 7, 2013 (AHBA) [18].

The results of the analysis of BrainSpan data [19] demonstrated that mRNAs targeted by miRNAs bearing fixed mutations specific for H. s. d. are enriched in the postnatal stage of brain development, whereas mRNAs targeted by miRNAs bearing fixed mutations specific for H. s. n. are most abundant in the prenatal brain (Figure 1). In contrast, no development stage-dependent expression was found for mRNAs targeted by miRNAs matched to polymorphic variants in H. s. s. (Figure 2). Thus, we can anticipate the differential effect of genetic variations found in archaic hominines on brain function and development in Denisovans and Neanderthals. The effects are not likely to be due to bias in the number of mRNA targets. The number of targeted genes is similar between miRNAs bearing fixed mutations specific for H. s. n. and H. s. d. (Ensembl gene IDs): 1260 target genes for H. s. n. and 1547 for H. s. d. (excluding targets of hsa-mir-671-3p miRNA that possess specific variants common for H. s. d. and H. s. n.). In addition, the number of targeted genes is similar between H. s. n. and H. s. d. miRNAs matched to polymorphic variants in H. s. s. (Ensembl gene IDs): 4072 target genes for H. s. n. and 3653 for H. s. d. It is interesting that the highest expression of transcripts targeted by miRNAs with polymorphisms unique for H. s. d. is confined to the broad thalamus regions and the prefrontal cortex (Figure 3). Nevertheless, there are multiple forebrain regions known to be important for speech that exhibit higher expression of miRNA targets from the set with polymorphisms unique for H. s. d. (Figure 3): the inferiolateral temporal cortex, medial and ventrolateral prefrontal cortex. It is of interest that the thalamus plays a significant role in the shaping of the human language-ready brain [21].

Figure 1
figure1

Target stages of human brain development for H. s. d . and H. s. n . miRNA pools expressed in neural tissues and having unique mutations. Y-axis, comparisons of Cohen's d values of the regulatory potential for selected H. s. d. (positive values) and H. s. n. (negative values) miRNA pools; X- axis, age: pcw - post conception weeks, m - postnatal months, y - postnatal years.

Figure 2
figure2

Target stages of human brain development for H. s. d . and H. s. n . miRNA pools expressed in neural tissues and having mutations shared with known human polymorphisms. Y-axis, comparisons of Cohen's d values of the regulatory potential for selected H. s. d. (positive values) and H. s. n. (negative values) miRNA pools; X-axis, age: pcw - post conception weeks, m - postnatal months, y - postnatal years.

Figure 3
figure3

Target structures of human brain for tested and control H. s. d . and H. s. n . miRNA pools expressed in neural tissues and having unique mutations. Y-axis, comparisons of Cohen's d values of the regulatory potential for selected H. s. d. (positive values) and H. s. n. (negative values) miRNA pools; X- axis, large brain structures.

In contrast, when we consider miRNAs of ancient humans expressed in the central nervous system and having variants shared with known human polymorphisms, the picture changes drastically (Figure 4): all brain structures are targeted more by polymorphic H. s. n. miRNAs than by polymorphic H. s. d. miRNAs. The larger signal in H. s. n., shown in Figures 2 and 4, is likely due to a higher number of miRNAs expressed in the central nervous system and sharing variation with the human population for H. s. n. (18, see Table 2) than for H. s. d. (14, see Table 1). Considering common miRNAs sharing human polymorphisms in these organisms (10 miRNAs), the specific numbers are 4 and 8, respectively (351 H. s. d.-specific targets versus 972 H. s. n.-specific targets (Ensembl gene IDs)). Given that the largest number of gene targets was found for the polymorphic hsa-mir-149 (5p) miRNA (Table 3) in H. s. n., this difference between the two hominine subspecies can be attributed to this particular miRNA.

Figure 4
figure4

Target structures of human brain for tested and control H. s. d . and H. s. n . miRNA pools expressed in neural tissues and having mutations shared with known human polymorphisms. Y-axis, comparisons of Cohen's d values of the regulatory potential for selected H. s. d. (positive values) and H. s. n. (negative values) miRNA pools; X- axis, large brain structures.

Table 3 Number of target genes for each miRNA shown in Tables 1 and 2, based on MiRGator 3.0 data [13].

The fine target structures identified using AHBA data for gene expression in the left hemisphere of the brain [18] are presented in Tables 4 and 5. For identification of the fine target brain structures, we used the difference in regulatory potential between the H. s. d. and H. s. n. samples (Table 4, 5) as a marker (see Methods section). Importantly (Table 4), the difference between the target structures of unique H. s. d. and H. s. n. miRNAs expressed in the central nervous system tissues is not significant (effect size measured as the absolute value of difference between Cohen's d statistics<0.5) for all fine brain structures except two: the fusiform gyrus for H. s. d. miRNAs and the precentral gyrus for H. s. n. miRNAs. The picture obtained based on the target mRNAs for miRNAs expressed in the central nervous system tissues with known human polymorphisms (Table 4) is the same (compare Table 4 and 5): for the vast majority of fine brain structures, the difference between targeting by H. s. d. and H. s. n. is not significant, except for two thalamus structures and the lateral orbital gyrus, targeted by H. s. d. miRNAs. It is important that the fusiform and lateral orbital gyri are responsible for face perception and socialization, respectively [22, 23]. Taken together, these findings indicate an important role of unique H. s. d. miRNAs in the development of specific brain structures in Denisovans, which likely enabled them to reach an unparalleled level of craftsmanship [1].

Table 4 Results of the identification of brain structures of the left hemisphere characterized by the difference in regulatory potential for H.s. d. and H. s. n. miRNAs expressed in neural tissues and having unique substitutions.
Table 5 Results of the identification of brain structures of the left hemisphere characterized by the difference between regulatory potential for H.s. d. and H. s. n. miRNAs expressed in neural tissues and having substitutions shared with known human polymorphisms.

In addition to the identification of target brain structures for H. s. d. and H. s. n. miRNA activity, we found highly enriched functional categories of target gene annotations. We used the DAVID 6.7 functional annotation system [24] and the SP_PIR_KEYWORDS data class, with genes expressed in the human brain as background (16767 genes from BrainSpan). The results of the functional enrichment test are shown in Table 6. The most important difference between the target genes for unique H. s. d. and H. s. n. miRNAs expressed in the central nervous system tissues is the transport category enriched in target genes of H. s. d. miRNAs. Similar observations can be made considering miRNAs expressed in central nervous system tissues with substitutions shared with known human polymorphisms: "functional response" categories (such as activator, transcription regulation, synapse) enriched in target genes of H. s. d. and the developmental category enriched in target genes of such H. s. n. miRNAs. It would be reasonable to speculate that that the evolutionary changes in miRNA-transregulators might contribute to alterations in functional activities in specific brain regions in Denisovans, whereas in Neanderthals, the mutations in miRNAs could promote alterations in brain development and structure.

Table 6 Annotation enrichment of genes targeted by H. s. d. and H. s. n. miRNAs expressed in neural tissues, based on DAVID 6.7 [24]; BrainSpan gene list (16767 genes expressed in human brain) used as a background.

Conclusions

In this work, we identified miRNA genes of archaic humans bearing sequence variations in comparison with the modern human genome sequence: 29 genes for H. s. d. and 31 genes for H. s. n. Almost one-third of those genes contain variations specific for archaic humans (11 genes for H. s. d. and 9 for H. s. n.). The analysis of human gene expression data resulted in 9 H. s. d. and 5 H. s. n. genes with specific archaic variations and the expression of their human orthologs in the central nervous system.

The detailed analysis of the human brain gene expression data demonstrated that the brain regions with the most abundant expression of mRNAs targeted by the H. s. d. miRNAs with fixed mutations are confined to the thalamus and the prefrontal cortex (especially the fusiform gyrus), no large brain regions are targeted by H. s. n. miRNAs with specific mutations. The only small brain regions targeted by H. s. n. miRNAs with such mutations are the superior lateral aspect and the central sulcus of the precentral gyrus.

We identified differences in gene expression pattern during human brain development for the targets of human orthologs for the selected miRNAs. Targets for miRNA genes with mutations specific for H. s. d. were expressed predominantly in the later prenatal and early postnatal development stages. Targets for miRNAs with mutations specific for H. s. n. were expressed predominantly in early prenatal brain development stages. These results may reflect a potential association between the changes in the Denisovan miRNA genes reported in this study and the brain development of H. s. d., which likely allowed them to reach their high level of craftsmanship.

Methods

Selection of miRNA genes in H. s. d. and H. s. n. with no match in H. s. s.

We used data on the chromosomal localization of 1595 experimentally confirmed H. s. s. pre-miRNAs in the miRBase database rel. 19 [11] and selected the best-sequenced H. s. d. and H. s. n. pre-miRNAs orthologous to H. s. s. pre-miRNA genes. The H. s. d. and H. s. n. genomes appear as short nucleotide sequences (reads) mapped onto the human genome [3]. To ensure that only the best-sequenced H. s. d. and H. s. n. pre-miRNAs would remain, we used filtering methods.

  1. (I)

    The consensus sequences of H. s. d. and H. s. n. pre-miRNAs were combined from reads with a quality of mapping onto H. s. s. pre-miRNA genes not less than 15 [3].

  2. (II)

    The consensus sequences of H. s. d. and H. s. n. pre-miRNAs included only those read nucleotides (a) for which the Phred sequence quality [25] was not less than 30 [2] and (b) that were located in the middle part of the reads: the first and last three read nucleotides were discarded according to [3].

  3. (III)

    Any position in the consensus with coverage of less than 5 was assumed to be undetermined [3].

  4. (IV)

    The consensus sequences of H. s. d. and H. s. n. pre-miRNAs with undetermined positions were discarded.

  5. (V)

    The consensus sequences of H. s. d. and H. s. n. pre-miRNAs that (a) had nucleotide substitutions that H. s. s. lacked and (b) were combined from reads mapped onto the human genome with a quality of less than 30 were discarded [2, 3].

  6. (VI)

    If there were two or more polymorphic states at a consensus position in H. s. d. and H. s. n. pre-miRNAs, only those alternative states that (a) had comparable frequencies (the difference between the frequencies would not exceed 1.3) and (b) passed filtering stages I-V were considered.

  7. (VII)

    Additionally, we manually analyzed all mapped reads of selected H. s. d. and H. s. n. pre-miRNAs for the presence of PCR artifacts. We did not find any PCR artifacts in the positions of selected H. s. d. and H. s. n. pre-miRNAs.

To select pre-miRNA coding genes that were present in the genome of the common ancestor of H. s. s., H. s. d. and H. s. n. we analyzed the multiple genome alignment of six primates from Ensembl rel. 69 [12]. The minimum length of any ortholog (excluding unreadable and polymorphic nucleotides) was considered to be not less than 70% of the length of the corresponding H. s. s. pre-miRNA.

We mapped all the nucleotide substitutions found by comparing the pre-miRNA orthologs in the H. s. s./H. s. d. and H. s. s./H. s. n. pairs onto the secondary structures of the corresponding pre-miRNAs in H. s. s. contained in the miRBase database rel. 19 [11].

We selected H. s. d. and H. s. n. pre-miRNA genes with (i) nucleotide substitutions (deletions, insertions) in the regions corresponding to the sequences of mature miRNAs (and/or miRNAs*) or (ii) multiple substitutions within a region not larger than 1/20 of the pre-miRNA length. In the first case (i), the altered mature miRNAs are of special interest because they change the pattern of complementary interactions between miRNAs and their target mRNAs in H. s. d. or H. s. n. and make it different from H. s. s. In the second case (ii), the probability of observing multiple changes within a small region of H. s. d. or H. s. n. pre-miRNA by random chance is extremely low. For instance, the estimate of the total number of differences between H. s. d. and H. s. s. is approximately 1,650,000 [3]. Under the assumption of a uniform distribution of mutation fixation events in a 6400 bp long sequence (1/20 of the length of 1600 pre-miRNA genes, each gene being approximately 80 bp in length), the estimate of the probability of two mutations being fixed simultaneously is less than 0.002.

To ensure that the mutations in our selection of miRNAs were not DNA sequencing errors, we analyzed the frequencies of all polymorphic consensuses of H. s. d. and H. s. n. DNA (see steps V and VI of the filtering protocol), considering their tetranucleotide context. Thus, we used data in the UCSC Genome Browser phyloP46way to select evolutionarily conserved regions of the human genome (phyloP conservation score > 0.1; regions were ≥7 bp in length, which eventually totaled 454775413 positions or ~15% of the human genome). The occurrence of the polymorphic variants of the consensuses in the H. s. d. and H. s. n. genomes was then assessed considering their tetranucleotide context. We selected pre-miRNAs with mutations that either do not occur in the 454,775,413-strong selection of evolutionary conservative positions analyzed (p ≤ 2·10-9) or are single occurrences (p = 2·10-9).

To ensure that the mutations in the set of selected miRNAs were not a result of PCR amplification in DNA sequencing (see step VII of the filtering protocol), we performed a manual analysis of H. s. d. and H. s. n. short reads alignment on a reference H. s. s. genome (hg19). We found no traces of the PCR amplification of short reads in miRNA genome positions: any single read in these positions started and stopped at unique genome locations.

Using the blastn program in BLAST 2.2.26+ [26] on the pre-miRNAs containing substitutions/polymorphisms in mature miRNAs at the last filtering stage, we discarded pre-miRNAs that occurred in more than one copy in the H. s. s. genome (E-value cut-off: 1·10-6; blastn default settings).

Identification of genes targeted by H. s. s. miRNAs orthologous to H. s. d. and H. s. n. miRNAs

For each orthologous H. s. s. miRNA, we identified the genes encoding its target mRNAs. This identification was performed using miRGator 3.0 [13] (we used only those target mRNAs whose expression correlated with microRNA expression with r<-0.9) and the ChIP-seq data for H3K4me3 histones marking active promoters in primate cortical neurons [14]. We used the ID converter from bioDBnet [27] to convert the HUGO name list of target mRNA IDs (from miRGator 3.0 prediction) into a list of Ensembl gene IDs and a list of Refseq IDs.

For functional annotation of the mRNAs targeted by the selected miRNAs, we chose two independent sources of experimental data that contained the most complete quantitative information on mRNA expression in human central nervous system tissues. One of these sources was BrainSpan [19] (Exon microarray, 16767 genes), which integrates normalized quantitative data on gene expression provided by expression time series experiments in broad brain regions. The other source of data was the Allen Human Brain Atlas, as updated March 2013 [18] (20791 genes), which contains normalized quantitative data on the expression of mRNAs in tiny brain regions. The expression of these mRNAs was analyzed in six brains: H0351.1009, H0351.1012, H0351.1015, H0351.1016, H0351.2001 and H0351.2002. If the microarray contained more than one probe for measuring the expression of the same mRNA, then only the probe producing the maximum value that significantly differed from background values was taken for analysis. In addition, to improve the robustness of our results, we used only data that were consistent for at least 4 out of 6 brains.

Identification of tissues with increased levels of mRNA expression regulated by the pool of miRNAs

In our work, we are interested in the tissues and brain structures most likely affected by the pool of miRNAs with structural changes between ancient and modern humans. Unfortunately, the data on miRNA expression in the vast majority of human tissues and brain structures are currently incomplete both because the number of functional miRNAs is incomplete [30] and because of the number of functional miRNA/target-mRNA interactions under ongoing debates regarding the usage of various factors in miRNA/target-mRNA binding [31, 32]. Therefore, we used an indirect approach to identify tissues subject to dynamical changes in miRNA expression. Because miRNAs are negative post-transcriptional regulators of gene expression, we suppose that changes in their expression levels will largely affect the tissues with high abundance of their mRNA targets. Based on this assumption, likely candidate tissues are the ones with high expression of the target mRNAs for the set of miRNAs under consideration. To express quantitatively the degree of post-transcriptional regulation by specific miRNAs in different tissues, we introduced the miRNAs' regulatory potential.

Let W(k) denote the set of human mRNAs expressed in the k-th tissue and targeted by all known human miRNAs. It should be noted that the overwhelming majority of genes in the human genome (~20000) [33] can be targeted by various miRNAs [13, 30, 34]. Therefore, the set of all genes in the genome is a good approximation of the miRNA targetome. Thus, we approximated W(k) as the set of all human mRNAs expressed in the k-th tissue. W(k Q ) denotes the subset of target mRNAs regulated in the k-th tissue by the miRNA pool, Q, and J(k Q ) denotes the size of this subset.

The regulatory potential, L k Q , exerted by the miRNA pool, Q, on mRNAs with expression in the k-th tissue was estimated using the following formula:

L k Q = j W ( k Q ) E j k
(1)

Here, E j k is the level of expression of the j-th target mRNA in the subset W(k Q ) in the k-th tissue, and E j k values were obtained using data from BrainSpan [18] or AHBA[19]. The L k Q value obtained in this manner corresponds to the number of mRNA molecules that can be regulated in the k-th tissue (or brain structure) by mature miRNAs in the pool Q and characterizes its post-transcriptional regulatory potential in that tissue.

In search of the tissues targeted by the miRNAs from the pool Q, we selected the ones with the highest regulatory potential L k Q in comparison with the background level of expression of all miRNA targets expressed in the k-th tissue, W(k). To estimate the significance of the difference between the pool's Q regulatory potential and the background, we used a resampling test (see Figure 5).

Figure 5
figure5

A toy example of the resampling test to identify tissues with increased levels of mRNAs expression regulated by the pool Q of the miRNAs in tissue k. A). Calculation of the regulatory potential for the miRNA pool Q, L k Q . B). Calculation of the regulatory potential L k r a n d Q for the resampled miRNA pool Q in W(k Qrand ) subset. C). Calculation of the regulatory potential L krand in the W(k rand ) subset. D). Distribution of L k r a n d Q (blue) and L krand (brown) values in 105 resampling experiments and their comparison by Cohen's d statistics.

First, we performed random sampling without replacement of J(k Q )/2 mRNAs from the set W(k Q ) of human mRNAs targeted by miRNAs in the pool Q (the W(k Qrand ) subset). For the W(k Qrand ) subset, we calculated L k r a n d Q using Formula (1). The sampling procedure was repeated 105 times. We estimated the mean and standard deviation of the L k r a n d Q values. The selection of half the number of mRNAs in the W(k Q ) subset allowed us to account for uncertainties in miRNA annotation (for example, the possibility that the miRNA dataset under consideration is incomplete and has incomplete annotation of target mRNAs).

Second, we performed sampling of mRNAs from the background distribution, using the W(k) set of all miRNA targets expressed in the k-th tissue. For this purpose, we selected J(k Q )/2 mRNAs from the W(k) randomly without replacement. We termed this subset W(k rand ). We calculated

L k r a n d = j W ( k r a n d ) E j k ,
(2)

as the estimate of the background miRNA regulatory potential; here, E j k is the expression level of the j-th target mRNA in the subset W(k rand ). This procedure was repeated 105 times and allowed us to estimate the mean and standard deviation of L krand .

We calculated the difference between the L krand and L k r a n d Q value distributions using Cohen's d statistic (effect size value) [35, 36]. The p-value of L k r a n d L k r a n d Q was calculated using Student's t-statistic. The difference between the L krand and L k r a n d Q value distributions was considered significant at p-value less or equal to 0.01, implying that the k-th tissue was targeted by the miRNAs from the pool Q. It is important to note that the data resampling approach described above does not rely on distributional assumptions and simultaneously allowed us to control Type I error, and corrects for any hidden data correlation [3739]. Therefore, we did not need to perform multiple testing corrections (e.g., Bonferroni correction) of the significance level [3639] and therefore could use the selected p-value threshold directly.

References

  1. 1.

    Derevyanko AP: The upper paleolithic in Africa and Eurasia and tho origin of anatomically modern humans. 2011, Novosibirsk: Institute of Archaeology and Ethnography SB RAS Press

    Google Scholar 

  2. 2.

    Reich D, Green RE, Kircher M, Krause J, Patterson N, Durand EY, Viola B, Briggs AW, Stenzel U, Johnson PL, Maricic T, Good JM, Marques-Bonet T, Alkan C, Fu Q, Mallick S, Li H, Meyer M, Eichler EE, Stoneking M, Richards M, Talamo S, Shunkov MV, Derevianko AP, Hublin JJ, Kelso J, Slatkin M, Pääbo S: Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature. 2010, 468 (7327): 1053-1060.

    PubMed  CAS  Article  Google Scholar 

  3. 3.

    Meyer M, Kircher M, Gansauge MT, Li H, Racimo F, Mallick S, Schraiber JG, Jay F, Prüfer K, de Filippo C, Sudmant PH, Alkan C, Fu Q, Do R, Rohland N, Tandon A, Siebauer M, Green RE, Bryc K, Briggs AW, Stenzel U, Dabney J, Shendure J, Kitzman J, Hammer MF, Shunkov MV, Derevianko AP, Patterson N, Andrés AM, Eichler EE, Slatkin M, Reich D, Kelso J, Pääbo S: A high-coverage genome sequence from an archaic Denisovan individual. Science. 2012, 338 (6104): 222-226.

    PubMed  CAS  Article  Google Scholar 

  4. 4.

    Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz MH, Hansen NF, Durand EY, Malaspinas AS, Jensen JD, Marques-Bonet T, Alkan C, Prüfer K, Meyer M, Burbano HA, Good JM, Schultz R, Aximu-Petri A, Butthof A, Höber B, Höffner B, Siegemund M, Weihmann A, Nusbaum C, Lander ES, Russ C, Novod N, Affourtit J, Egholm M, Verna C, Rudan P, Brajkovic D, Kucan Z, Gusic I, Doronichev VB, Golovanova LV, Lalueza-Fox C, de la Rasilla M, Fortea J, Rosas A, Schmitz RW, Johnson PL, Eichler EE, Falush D, Birney E, Mullikin JC, Slatkin M, Nielsen R, Kelso J, Lachmann M, Reich D, Pääbo S: A draft sequence of the Neandertal genome. Science. 2010, 328 (5979): 710-722.

    PubMed  CAS  Article  Google Scholar 

  5. 5.

    Prüfer K, Racimo F, Patterson N, Jay F, Sankararaman S, Sawyer S, Heinze A, Renaud G, Sudmant PH, de Filippo C, Li H, Mallick S, Dannemann M, Fu Q, Kircher M, Kuhlwilm M, Lachmann M, Meyer M, Ongyerth M, Siebauer M, Theunert C, Tandon A, Moorjani P, Pickrell J, Mullikin JC, Vohr SH, Green RE, Hellmann I, Johnson PL, Blanche H, Cann H, Kitzman JO, Shendure J, Eichler EE, Lein ES, Bakken TE, Golovanova LV, Doronichev VB, Shunkov MV, Derevianko AP, Viola B, Slatkin M, Reich D, Kelso J, Pääbo S: The complete genome sequence of a Neanderthal from the Altai Mountains. Nature. 2014, 505 (7841): 43-49.

    PubMed  Google Scholar 

  6. 6.

    Reich D, Patterson N, Kircher M, Delfin F, Nandineni MR, Pugach I, Ko AM, Ko YC, Jinam TA, Phipps ME, Saitou N, Wollstein A, Kayser M, Pääbo S, Stoneking M: Denisova admixture and the first modern human dispersals into Southeast Asia and Oceania. Am J Hum Genet. 2011, 89 (4): 516-528.

    PubMed  CAS  Article  Google Scholar 

  7. 7.

    Hu HY, Guo S, Xi J, Yan Z, Fu N, Zhang X, Menzel C, Liang H, Yang H, Zhao M, Zeng R, Chen W, Pääbo S, Khaitovich P: MicroRNA expression and regulation in human, chimpanzee, and macaque brains. PLoS Genet. 2011, 7 (10): e1002327-

    PubMed  CAS  Article  Google Scholar 

  8. 8.

    Hu HY, He L, Fominykh K, Yan Z, Guo S, Zhang X, Taylor MS, Tang L, Li J, Liu J, Wang W, Yu H, Khaitovich P: Evolution of the human-specific microRNA miR-941. Nat Commun. 2012, 3: 1145-

    PubMed  Article  Google Scholar 

  9. 9.

    Somel M, Liu X, Tang L, Yan Z, Hu H, Guo S, Jiang X, Zhang X, Xu G, Xie G, Li N, Hu Y, Chen W, Pääbo S, Khaitovich P: MicroRNA-driven developmental remodeling in the brain distinguishes humans from other primates. PLoS Biol. 2011, 9 (12): e1001214-

    PubMed  CAS  Article  Google Scholar 

  10. 10.

    Gunbin KV, Afonnikov DA, Kolchanov NA, Derevianko AP: The Importance of Changes to Microrna in the Evolution of Homo Neanderthalensis and Homo sapiens denisova. Archaeology, Ethnology and Anthropology of Eurasia. 2012, 40: 22-30.

    Article  Google Scholar 

  11. 11.

    Kozomara A, Griffiths-Jones S: miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res. 2011, 39 (Database issue): D152-D157.

    PubMed  CAS  Article  Google Scholar 

  12. 12.

    Flicek P, Amode MR, Barrell D, Beal K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fairley S, Fitzgerald S, Gil L, Gordon L, Hendrix M, Hourlier T, Johnson N, Kähäri AK, Keefe D, Keenan S, Kinsella R, Komorowska M, Koscielny G, Kulesha E, Larsson P, Longden I, McLaren W, Muffato M, Overduin B, Pignatelli M, Pritchard B, Riat HS, Ritchie GR, Ruffier M, Schuster M, Sobral D, Tang YA, Taylor K, Trevanion S, Vandrovcova J, White S, Wilson M, Wilder SP, Aken BL, Birney E, Cunningham F, Dunham I, Durbin R, Fernández-Suarez XM, Harrow J, Herrero J, Hubbard TJ, Parker A, Proctor G, Spudich G, Vogel J, Yates A, Zadissa A, Searle SM: Ensembl 2012. Nucleic Acids Res. 2012, 40 (Database issue): D84-D90.

    PubMed  CAS  Article  Google Scholar 

  13. 13.

    Cho S, Jang I, Jun Y, Yoon S, Ko M, Kwon Y, Choi I, Chang H, Ryu D, Lee B, Kim VN, Kim W, Lee S: MiRGator v3.0: a microRNA portal for deep sequencing, expression profiling and mRNA targeting. Nucleic Acids Res. 2013, 41 (Database issue): D252-D257.

    PubMed  CAS  Article  Google Scholar 

  14. 14.

    Shulha HP, Crisci JL, Reshetov D, Tushir JS, Cheung I, Bharadwaj R, Chou HJ, Houston IB, Peter CJ, Mitchell AC, Yao WD, Myers RH, Chen JF, Preuss TM, Rogaev EI, Jensen JD, Weng Z, Akbarian S: Human-specific histone methylation signatures at transcription start sites in prefrontal neurons. PLoS Biol. 2012, 10 (1): e1001427-

    PubMed  CAS  Article  Google Scholar 

  15. 15.

    Mili S, Steitz JA: Evidence for reassociation of RNA-binding proteins after cell lysis: implications for the interpretation of immunoprecipitation analyses. RNA. 2004, 10 (11): 1692-1694.

    PubMed  CAS  Article  Google Scholar 

  16. 16.

    Agarwal V, Bell GW, Nam JW, Bartel DP: Predicting effective microRNA target sites in mammalian mRNAs. Elife. 2015, 4: e05005-

    Article  Google Scholar 

  17. 17.

    Guo H, Ingolia NT, Weissman JS, Bartel DP: Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature. 2010, 466 (7308): 835-840.

    PubMed  CAS  Article  Google Scholar 

  18. 18.

    Ball S, Gilbert TL, Overly CC: The human brain online: an open resource for advancing brain research. PLoS Biol. 2012, 10 (12): e1001453-

    PubMed  CAS  Article  Google Scholar 

  19. 19.

    BrainSpan: Atlas of the Developing Human Brain [Internet]. Funded by ARRA Awards 1RC2MH089921-01, 1RC2MH090047-01, and 1RC2MH089929-01. © 2011. Available from: http://www.brainspan.org/

  20. 20.

    Bleazard T, Lamb JA, Griffiths-Jones S: Bias in microRNA functional enrichment analysis. Bioinformatics. 2015, 31 (10): 1592-1598.

    PubMed  Article  Google Scholar 

  21. 21.

    Boeckx C, Benítez-Burraco A: The shape of the human language-ready brain. Front Psychol. 2014, 5: 282-

    PubMed  Google Scholar 

  22. 22.

    Nestor PG, Nakamura M, Niznikiewicz M, Thompson E, Levitt JJ, Choate V, Shenton ME, McCarley RW: In search of the functional neuroanatomy of sociality: MRI subdivisions of orbital frontal cortex and social cognition. Soc Cogn Affect Neurosci. 2013, 8 (4): 460-467.

    PubMed  Article  Google Scholar 

  23. 23.

    Ma Y, Han S: Functional dissociation of the left and right fusiform gyrus in self-face recognition. Hum Brain Mapp. 2012, 33 (10): 2255-2267.

    PubMed  Article  Google Scholar 

  24. 24.

    Huang da W, Sherman BT, Lempicki RA: Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009, 4 (1): 44-57.

    PubMed  Article  Google Scholar 

  25. 25.

    Ewing B, Green P: Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Research. 1998, 8 (3): 186-194.

    PubMed  CAS  Article  Google Scholar 

  26. 26.

    Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL: BLAST+: architecture and applications. BMC Bioinformatics. 2009, 10: 421-

    PubMed  Article  Google Scholar 

  27. 27.

    Mudunuri U, Che A, Yi M, Stephens RM: bioDBnet: the biological database network. Bioinformatics. 2009, 25 (4): 555-556.

    PubMed  CAS  Article  Google Scholar 

  28. 28.

    1000 Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, Kang HM, Marth GT, McVean GA: An integrated map of genetic variation from 1,092 human genomes. Nature. 2012, 491 (7422): 56-65.

    Article  Google Scholar 

  29. 29.

    Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, Sirotkin K: dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001, 29 (1): 308-311.

    PubMed  CAS  Article  Google Scholar 

  30. 30.

    Londin E, Loher P, Telonis AG, Quann K, Clark P, Jing Y, Hatzimichael E, Kirino Y, Honda S, Lally M, Ramratnam B, Comstock CE, Knudsen KE, Gomella L, Spaeth GL, Hark L, Katz LJ, Witkiewicz A, Rostami A, Jimenez SA, Hollingsworth MA, Yeh JJ, Shaw CA, McKenzie SE, Bray P, Nelson PT, Zupo S, Van Roosbroeck K, Keating MJ, Calin GA, Yeo C, Jimbo M, Cozzitorto J, Brody JR, Delgrosso K, Mattick JS, Fortina P, Rigoutsos I: Analysis of 13 cell types reveals evidence for the expression of numerous novel primate- and tissue-specific microRNAs. Proc Natl Acad Sci U S A. 2015, 112 (10): E1106-E1115.

    PubMed  CAS  Article  Google Scholar 

  31. 31.

    Cloonan N: Re-thinking miRNA-mRNA interactions: intertwining issues confound target discovery. Bioessays. 2015, 37 (4): 379-388.

    PubMed  CAS  Article  Google Scholar 

  32. 32.

    Clark PM, Loher P, Quann K, Brody J, Londin ER, Rigoutsos I: Argonaute CLIP-Seq reveals miRNA targetome diversity across tissue types. Sci Rep. 2014, 4: 5947-

    PubMed  CAS  Article  Google Scholar 

  33. 33.

    Ezkurdia I, Juan D, Rodriguez JM, Frankish A, Diekhans M, Harrow J, Vazquez J, Valencia A, Tress ML: Multiple evidence strands suggest that there may be as few as 19,000 human protein-coding genes. Hum Mol Genet. 2014, 23 (22): 5866-5878.

    PubMed  Article  Google Scholar 

  34. 34.

    Dweep H, Gretz N: miRWalk2.0: a comprehensive atlas of microRNA-target interactions. Nat Methods. 2015, 12 (8): 697-

    PubMed  CAS  Article  Google Scholar 

  35. 35.

    Sullivan GM, Feinn R: Using Effect Size-or Why the P Value Is Not Enough. J Grad Med Educ. 2012, 4 (3): 279-282.

    PubMed  Article  Google Scholar 

  36. 36.

    Nakagawa S: A farewell to Bonferroni: the problems of low statistical power and publication bias. Behavioral Ecology. 2004, 15 (6): 1044-1045.

    Article  Google Scholar 

  37. 37.

    Westfall PH, Young SS: Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment. 1993, New York: Wiley-Interscience, 1st

    Google Scholar 

  38. 38.

    North BV, Curtis D, Sham PC: A note on the calculation of empirical P values from Monte Carlo procedures. Am J Hum Genet. 2002, 71 (2): 439-441.

    PubMed  CAS  Article  Google Scholar 

  39. 39.

    Sham PC, Purcell SM: Statistical power and significance testing in large-scale genetic studies. Nat Rev Genet. 2014, 15 (5): 335-346.

    PubMed  CAS  Article  Google Scholar 

  40. 40.

    Gong J, Liu C, Liu W, Wu Y, Ma Z, Chen H, Guo AY: An update of miRNASNP database for better SNP selection by GWAS data, miRNA expression and online tools. Database (Oxford). 2015, 2015: bav029-

    Article  Google Scholar 

Download references

Acknowledgements

This work and its publication was supported in part by grant 13-06-12063-OFI-m from the RFBR (data collection, algorithms development), grant 14.B25.31.0033 from the Russian Federation Government (Resolution No. 220) (gene expression analysis), budget project VI.61.1.2 (computational experiments), Russian Science Foundation grant No 14-15-01121 (Alzheimer's Disease gene network analysis), grant No 14-15-00077 (miRNA and targets in brain analysis) and grant No 14-50-00036 (to A.P.D.). The authors are grateful to Yurii Shvarev for critical reading of the manuscript and to Valentin Suslov for helpful discussion. The Siberian Branch of the Russian Academy of Sciences (SB RAS) Siberian Supercomputing Center (CCU "Bioinformatics") and the Novosibirsk State University High-Performance Computing Center are gratefully acknowledged for providing computer facilities.

This article has been published as part of BMC Genomics Volume 16 Supplement 13, 2015: Selected articles from the 7th International Young Scientists School "Systems Biology and Bioinformatics" (SBB'2015): Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/16/S13.

Author information

Affiliations

Authors

Corresponding author

Correspondence to Dmitry A Afonnikov.

Additional information

Competing interests

The authors have declared that no competing interests exist.

Authors' contributions

K.V.G., A.D.A. and E.I.R. conceived the project. K.V.G. performed all data analysis. N.A.K., E.I.R. and A.P.D. coordinated the project. All authors contributed to the final manuscript preparation, discussed the results and their implications, and have read and approved the final manuscript.

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Gunbin, K.V., Afonnikov, D.A., Kolchanov, N.A. et al. The evolution of Homo sapiens denisova and Homo sapiens neanderthalensis miRNA targeting genes in the prenatal and postnatal brain. BMC Genomics 16, S4 (2015). https://doi.org/10.1186/1471-2164-16-S13-S4

Download citation

Keywords

  • miRNA
  • functionally important substitutions
  • Homo sapiens neanderthalensis
  • Homo sapiens denisova