De novo assembly of the dual transcriptomes of a polymorphic raptor species and its malarial parasite
© Pauli et al. 2015
Received: 9 October 2015
Accepted: 27 November 2015
Published: 9 December 2015
Studies of non-model species are important for understanding the molecular processes underpinning phenotypic variation under natural ecological conditions. The common buzzard (Buteo buteo; Aves: Accipitriformes) is a widespread and common Eurasian raptor with three distinct plumage morphs that differ in several fitness-related traits, including parasite infestation. To provide a genomic resource for plumage polymorphic birds in general and to search for candidate genes relating to fitness, we generated a transcriptome from a single dead buzzard specimen plus easily accessible, minimally invasive samples from live chicks.
We not only de novo assembled a near-complete buzzard transcriptome, but also obtained a significant fraction of the transcriptome of its malaria-like parasite, Leucocytozoon buteonis. By identifying melanogenesis-related transcripts that are differentially expressed in light ventral and dark dorsal feathers, but which are also expressed in other regions of the body, we also identified a suite of candidate genes that could be associated with fitness differences among the morphs. These include several immune-related genes, providing a plausible link between melanisation and parasite load. qPCR analysis of a subset of these genes revealed significant differences between ventral and dorsal feathers and an additional effect of morph.
This new resource provides preliminary insights into genes that could be involved in fitness differences between the buzzard colour morphs, and should facilitate future studies of raptors and their malaria-like parasites.
KeywordsTranscriptome Non-model organism Common buzzard Buteo buteo Plasmodium Melanin pathway Molecular markers
High-throughput DNA sequencing is transforming our understanding of the molecular mechanisms underlying physiological, morphological and even behavioural traits [1, 2]. In particular, although laboratory organisms make excellent models for understanding links between genotype and phenotype, new sequencing approaches have opened the way to studying natural populations [3–6]. This has brought more varied and realistic biological systems into reach and offers the opportunity to explore genotype-environment interactions.
High sequencing costs and a paucity of reference genomes initially restricted large-scale sequencing and marker development to a handful of model systems [7, 8]. However, these limitations have largely disappeared thanks to technological advances that allow increasing amounts of data to be generated at tumbling costs [1, 2]. Moreover, by directing sequencing effort towards the transcriptome, which is usually an order of magnitude smaller than the full genome, costs and downstream bioinformatic analysis can be further reduced.
A large avian order that differs considerably from common model species and for which few genomic resources are available is the Accipitriformes (comprising buzzards, hawks, eagles, harriers, kites, vultures and osprey). This group is globally distributed across all of the continents except for Antarctica . As top predators and scavengers, these species are of great ecological importance, and many are of conservation concern . Raptors are also interesting from the perspective of plumage colour polymorphism, which is found in many member genera including most Buteo hawk and buzzard species . This often takes the form of discrete colour morphs, which not only differ in the extent of plumage melanisation, but often also in important fitness-related traits [11–15].
Melanin-based colouration is widespread throughout the animal kingdom, with the pigments involved ranging from yellow, through orange and red to brown and black . These pigments are produced in specialised organelles, the melanosomes . Polymorphisms within a transmembrane receptor of these organelles, the melanocortin 1 receptor (MC1R), have been linked to variation in plumage or pelage colouration in several species of birds and mammals [reviewed by 18]. However, the molecular mechanisms linking colouration to fitness remain unclear. This is partly due to the fact that mutations at this locus are believed to have minimal pleiotropic effects [18, 19].
One intriguing possibility is that fitness differences among the morphs may relate to the differential effects of parasites on the three colour morphs. The dark morph has a tendency to be more heavily infested with the blood sucking fly Carnus haemapterus while the lighter morphs tend to carry higher loads of the malaria-like blood parasite Leucocytozoon buteonis (formerly known as L. toddi) . Leucocytozoon in particular can dramatically reduce host fitness [23–25] by causing anaemia and organ damage . Leucocytozoon is closely related to malaria-causing Plasmodium  and has a similar life-history involving stages in the liver and blood cells of the vertebrate host .
Here, we construct a de novo transcriptome for the common buzzard, thereby generating the first genomic resource for a plumage-polymorphic member of the Accipitriformes (see  for the genomes of two species of Falconiformes, a divergent and independently evolved group of predatory birds [29, 30]). In parallel, we also partially sequenced and de novo assembled the parasitic L. buteonis transcriptome, which is again a first for this genus. To achieve this, we sampled RNA from growing feathers, circulating blood and several different organs, including the liver, which is the primary host organ of early Leucocytozoon infection stages. As a first application of this new resource, we also analysed tissue-specific patterns of transcript presence and absence in order to identify transcripts involved in melanogenesis that might also affect biological processes beyond plumage colouration. Using this approach, we identified several candidate genes that could be tested for a role in morph-specific fitness differences.
All samples were collected from nestling buzzards (Buteo buteo) in a study population in western Germany during the breeding season of 2012. Trees containing nests were climbed and the chicks were lowered to the ground to collect feather and blood samples. Samples were obtained from chicks selected to equally represent all three colour morphs. To allow quantification of differences in feather gene expression relating to colour, we sampled RNA from both dorsal (dark) and ventral (light) contour feathers. Feather samples were transferred to RNAlater immediately and stored at room temperature for up to 12 h before being frozen at –80 °C. For comparison, approximately 300μl of blood was taken from the brachial vein, transferred to RNAlater immediately and stored at room temperature for up to 12 h before again being frozen at –80 °C.
We also collected samples from the lung, liver, skin, heart muscle, breast muscle and brain of a single chick of the light morph that had fallen from its nest before we climbed the tree and which died of its injuries. The chick was frozen whole at –80°C within a few minutes of death and transferred to the laboratory, where each organ was dissected while the rest of the body was kept frozen. All tissue samples were taken within 5 min of the respective organ having been thawed, and were then transferred immediately to RNAlater where they were stored at –80 °C.
RNA isolation, cDNA generation and sequencing
RNA was extracted by a guanidinium thiocyanate-phenol-chloroform method using TRIzol® (Life Technologies). Samples from organs and feathers were dry-blotted and immediately ground up in liquid nitrogen. Extraction then followed the standard TRIzol® protocol. Blood samples were first centrifuged at 12,000 g for 15 min and the supernatant RNAlater discarded. The remaining sample was then lysed and homogenized in 1ml of TRIzol® by repeated pipetting before continuing with the standard protocol. The resulting RNA extracts were then used to create four pools: blood, ventral feathers, dorsal feathers and organs. We did not prepare separate sub-libraries for the three colour morphs as this would have required a total of twelve normalised libraries, which was beyond the scope of our budget. The blood pool contained equal amounts of RNA from each of 30 different individuals, 10 of each morph. Dorsal and ventral feathers each contained equal amounts of RNA from 15 individuals, five of each morph. The organs pool contained equal amounts of RNA from each tissue type collected from the single dead chick. The RNA concentrations of each sample were measured on a Qubit (Life Technologies). Final RNA concentrations of the four pools were measured on a Bioanalyzer (Agilent). cDNA generation, library barcoding (one barcode each for organs, blood, dorsal and ventral feathers) and normalisation of the four pools was performed by vertis Biotechnologie (Freising, Germany). The normalisation was based on a kinetic denaturation-reassociation technique described at http://www.vertis-biotech.com/index.php?ip=132. A single MiSeq run was then conducted at the Center for Biotechnology, Bielefeld University.
The quality of the reads was checked using the FastQC toolkit (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). The FastQC analysis identified biased nucleotide content in the first 15 and last 10 bases. We therefore trimmed the first 15 and the last 10 bases of every read together with any other low-quality regions (Phred < 15).
To maximise data quality and the inclusion of sequences representing genuine transcripts, we used a three-step assembly strategy. First, we assembled the quality-filtered reads de novo using Trinity version r2013-02-25  and obtained additional assemblies based on different k-mer sizes using Velvet version 1.1.07 and Oasis version 0.2.08 [32, 33]. These programs were run with all odd k-mer values between 21 and 121 and default parameters, creating 51 additional de novo assemblies. In the second step, the contigs from all 52 assemblies combined were screened for likely protein-coding regions (CDS). All possible open reading frames (ORFs) were extracted using the TransDecoder tool included in the Trinity package. The translated protein sequences of all ORFs were mapped to the zebra finch reference protein set using blat . The results were screened for hits that covered both the ORF and the reference protein by 100 % without any gaps. If more than one hit was found for a given reference protein, one was randomly chosen. These ORFs were then used as a training set to create the hexamer score used by TransDecoder. Additionally, all ORFs were searched against the Pfam-A database using the hmmscan tool . All sequences lacking a likely CDS were discarded. In the final step, all predicted CDS sequences were translated to protein sequences and clustered using cd-hit version 4.6  with 95 % global sequence identity (parameter -G 1 -c 0.95), keeping the longest sequence of each cluster in the final data set.
Screening the contigs for CDS and selecting representative sequences from each cluster should improve overall data quality, but it might also lead to the loss of some transcripts. To estimate the extent of this possible loss, we compared both the reduced and complete set of contigs with 15,431 zebra finch Ref Seq proteins (http://www.ncbi.nlm.nih.gov/bioproject/PRJNA32405) using BLAT . We then counted how many zebra finch proteins aligned to our contigs with 80% coverage and a maximum of 5 % gaps.
BLAST mapping, sequence annotation and comparative genomics
The final data set was uploaded to the SAMS system  and an automatic functional annotation was performed using a best blast hit strategy against various databases including SwissProt , KEGG  and KOG . Additionally, the translated protein sequences were blasted against the chicken and zebra finch RefSeq proteins with known genomic locations. Since buzzards are known to have a high prevalence and often high infection intensities of the malaria-like parasite Leucocytozoon buteonis [11, 41] translated protein sequences were also blasted against the genomes of three human malarial species, P. falciparum, P. vivax and P. knowlesi. The blast results were filtered with an e-value cutoff of 10-5. Each transcript was assigned to a chicken and zebra finch chromosome on the basis of the top blast hit.
Mining for molecular markers
We searched for microsatellites containing six or more repeat units using sputnik with the default parameters. For SNP detection, reads were mapped to the transcripts separately for each pool using the “alignReads.pl” script from the Trinity distribution within Bowtie2 . SNPs were called using samtools . Any SNPs with Phred quality scores below 15 were discarded.
Assigning transcripts to the four pools
As the libraries had been normalised, it was not possible to analyse quantitative differences in transcript abundance among the different pools. We therefore focused on tissue-specific patterns of transcript presence and absence. To assign a transcript as ‘present’ in one of the four pools, we counted the percentage of bases covered by reads from this sample, using samtools depth  on the “alignReads.pl” mappings. If at least 80 % of bases were covered by at least one read, the transcript was considered to be present in the sample.
Transcripts found in different tissues
We compared the overall presence / absence patterns of transcripts corresponding to known genes in the zebra finch among the four pools by annotating transcripts with respect to the RefSeq database at NCBI. Given that some of the pools were generated from different sample sizes of individuals, we expect there to be some level of error associated with certain comparisons. As the buzzard colour morphs differ in fitness, we were particularly interested in differences involving transcripts related to melanogenesis. We therefore extracted 86 proteins from the zebra finch melanin pathway from the KEGG database (http://www.kegg.jp/kegg/pathway.html) and used BLASTp to match our transcripts to these genes. Based on the premise that the colour morphs are defined by the extent to which they grow dark and light feathers, we then compared transcripts associated with melanogenesis between dorsal (dark) and ventral (light) feathers, as a proxy for morph, in order to identify putative candidate genes that might be involved in morph-specific fitness differences. As fitness-relevant genes are expected to be expressed not only in feathers, we then cross-referenced these transcripts with the organs pool to derive two subsets of transcripts from the melanogenesis pathway, one occurring in only ventral feathers and organs, and the other occurring in only dorsal feathers and organs. These were then queried for GO slim categories and involvement in biological pathways, according to KEGG.
To verify patterns identified from the transcriptomic data, we performed qPCR analyses for seven of the differentially expressed genes in 20 samples (see Additional file 1: Table S1 for details). The genes that we tested were ADORA2A, FZD3, FZD4, LIMK1, RAC1, STK4 and WNT4. Additionally, as a control that is unrelated to melanogenesis, we quantified the expression of the gene Clock . cDNA from single feathers was synthesized with peqGold cDNA-Synthesis Kit H Plus. The qPCR was performed in a CFX Connect Real Time PCR Detection System (BioRad) using 96-well plates. The 10μl reaction volume contained 2 μl EvaGreen (qPCR-MixII, my-Budget), 1.25 μM of each primer and 0.1 μl of feather cDNA. Each qPCR reaction was performed in duplicate. Cq scores for each reaction were divided by the mean Cq score of the respective sample for Clock. Relative expression of separate genes relative to Clock was analysed with mixed-effects models incorporating sample, plate and gene identity as random factors and morph and body region of the feather as fixed factors. Factor significance was estimated by comparing models with likelihood ratio tests after exclusion of a fixed factor.
Total RNA yields
We characterised the common buzzard transcriptome by creating four pools of RNA samples: blood, organs, ventral feathers and dorsal feathers. Organ samples (lung, liver, skin, heart muscle, breast muscle and brain) were obtained from a single dead chick to achieve as complete a transcriptome as possible without sacrificing individuals. Average RNA yields were highest from the organs (1089 ng/μl) followed by ventral and dorsal feathers (802 ng/μl and 761 ng/μl, respectively) and lowest from blood (286 ng/μl). Samples were pooled in roughly equimolar amounts into tissue-specific libraries that were then normalised prior to sequencing.
Sequence data and transcriptome assembly
Estimating data loss
Although filtering and clustering should improve overall data quality by enriching the pool of sequences for genuine transcripts, it is also possible that some transcripts might be inadvertently lost. To estimate the extent of this possible loss, we aligned 15,431 reference zebra finch proteins to the initially assembled sequences and to the final dataset. We then counted the numbers of reference proteins with a best bidirectional hit covering at least 80 % of the reference protein with at most 5 % gaps. In the initially assembled sequences, 6,231 reference proteins could be found, whereas 5,989 proteins remained in the final dataset. This suggests that the rate of data loss due to filtering and clustering is low, at around 3.9 %.
Number and overall percentages of transcripts showing sequence homology to Plasmodium species
Number of matched transcripts
Percentage of matched transcripts
Molecular marker discovery
Number of microsatellite markers found in transcripts with top BLAST matches to either vertebrates or blood parasites of the genus Plasmodium
Species group transcripts matched to
Repeat motif length
No. of repeats
% of all microsatellites
Searching the dataset for Single Nucleotide Polymorphisms (SNPs) revealed 96,966 SNPs residing in transcripts matched to vertebrates and 15,072 SNPs in transcripts matched to Plasmodium. After we filtered for a minimum Phred quality score of 15, there remained 64,907 and 10,067 SNPs respectively. Of all 36,722 transcripts matching to vertebrate sequences, 17,332 contained SNPs. The distribution of SNPs across contigs was fairly even, with just 303 transcripts containing >20 SNPs. On average, transcripts containing more than 20 SNPs were three times as long as those containing fewer SNPs.
Differential expression patterns and melanin pathway characterisation
Numbers of hits to known genes of transcripts obtained from different tissues. A transcript was assigned as being associated with melanogenesis if it showed sequence homology to one of 86 proteins from the melanin pathway of the zebra finch (KEGG database)
Unique to pool
Unique to pool [%]
Associated with melanogenesis
% of all genes associated with melanogenesis
Feathers (Ventral + Dorsal)
To characterise genes involved in melanogenesis, we identified a total of 2,648 transcripts from all four pools showing homology to genes from the zebra finch melanin pathway. These transcripts were annotated by reference to the zebra finch RefSeq database, resulting in 778 hits to known genes (there was considerable redundancy in these data due to different transcripts having their best hits to different parts of the same gene). Of these, 586 (75.3 %) were from ventral feathers and 497 (63.9 %) were from dorsal feathers, while the combined feather pool accounted for 639 (82.1 %) hits. 53 of these were only found in dorsal but not in ventral feathers, whereas the opposite was true for 142 hits. Organs and blood contained 420 and 184 melanogenesis-related transcripts respectively (Table 3).
Transcripts associated with melanogenesis that occur in either organs and ventral feathers or organs and dorsal feathers (see Methods for details). A transcript was assigned as being associated with melanogenesis if it showed sequence homology to one of 86 proteins from the melanin pathway of the zebra finch (KEGG database)
GO slim categories
Best BLAST score against Gallus gallus
Adenosine A2a Receptor
ADP-Ribosylation Factor 1
0.00E + 00
Large Tumor Suppressor Kinase 1
LIM domain kinase 1
0.00E + 00
Mitogen-activated protein kinase-activated protein kinase 2
0.00E + 00
NIMA-Related Kinase 2
P21 Protein (Cdc42/Rac)-Activated Kinase 2
0.00E + 00
Phospholipase C-Like 1
0.00E + 00
Polo-Like Kinase 2
Ras-related protein Rab-11A
RAB8A, Member RAS Oncogene Family
Ras-Related C3 Botulinum Toxin Substrate 1
Ras-Related Protein Ral-A
SAR1 Homolog A
Serine/threonine-protein kinase 4
Tyrosine-protein kinase SYK
Tau Tubulin Kinase 2
Vav 3 Guanine Nucleotide Exchange Factor
Assembly and functional annotation
We de novo assembled transcriptomes of the common buzzard and its blood parasite L. buteonis, identified species-specific molecular markers, and generated a list of candidate genes that could provide a link between plumage morph and fitness through impacts on various biochemical pathways, several of which are involved in immunity. We thereby provide a starting point for future molecular studies of the Accipitriformes, an ecologically distinctive and interesting group.
The newly assembled buzzard sequences obtained from a single sequencing run cover 67.3 % and 72.3 % of all protein sequences described in chicken and zebra finch respectively. The actual number of genes present in accipitriform species is unknown, but it is reasonable to assume based on comparison to these model avian species that a considerable fraction of common buzzard genes are represented in our dataset. Recently published data on peregrine and saker falcons  indicate that similar numbers of genes are also present in falconiforms. We therefore corroborate earlier studies suggesting that sampling from just one [48, 49] or a small number of different tissues [49–52] can provide reasonable coverage of the transcriptome. This is supported by the broad chromosomal distribution of genes matched to the buzzard transcripts in chicken and zebra finch and the strong positive correlations obtained between chromosome length and the number of matching transcripts in both species.
The distribution of zebra finch homologues across the three different tissue pools offers a number of insights that could be important for future studies of bird transcriptomes. First, blood contained a very small proportion of unique genes (only 9.3 % of genes detected in blood were unique to this pool), despite bird erythrocytes being nucleated and hence presumably transcriptionally active. As the blood samples yielded the lowest overall concentration of RNA, this could suggest either a technical artefact or limited transcriptional activity. The latter appears reasonably likely as most cells in the blood are erythrocytes. These produce mainly haemoglobins, one of the major transcripts we expected to reduce by normalising the RNA samples.
Second, we found a surprisingly large number of transcripts and genes that were unique to feathers in comparison to the organs pool, which comprised five different tissues (lung, liver, brain, muscle and skin). This may reflect the rapid growth rate of developing feathers, which in turn requires high transcriptional activity. In comparison, the internal organs will have been experiencing slower growth rates given the advanced age of the chick from which they were sampled. A similar, if slightly lower, number of expressed genes was found in growing feathers of Ruffs, Philomachus pugnax [51, 53].
An important caveat, however, to comparisons involving different pools is that the blood, feathers and organs pools were constructed from different sample sizes of individuals, ranging from one to 30. This could affect the diversity of transcripts, as one would expect to be able to capture more transcripts from a larger sample size of individuals due to inter-individual variation in gene expression. This could partly explain the observation that more zebra finch homologues were detected in feathers than in organs. However, there does not appear to be a clear relationship between the number of individuals in the pools and transcript diversity as the blood pool, which was based on the largest number of individuals, had the lowest diversity of transcripts.
In addition to generating a near-complete common buzzard transcriptome, we also obtained the most complete genomic resource to date for a Leucocytozoon species, a widespread group of avian malaria-like blood parasites. The closest relatives with well annotated genomes are three Plasmodium species, with 5,512, 5,506 and 5,161 annotated genes in P. falciparum, P. vivax and P. knowlesi respectively (NCBI database). Assuming that the number of genes is similar in L. buteonis, the 2,190–2,217 genes we found represent approximately 40–43 % of the full gene set. However, the recognition of species-specific genes will strongly depend on the available genomic and transcriptomic resources to BLAST against and how closely related they are to the target organism. Since L.buteonis belongs to a genus evolutionarily well separated from the reference Plasmodium genomes, the transcriptomic resource we provide probably contains more Leucocytozoon-specific genes than we can currently identify.
Most of the transcripts matching Plasmodium that were assigned to a single pool were found in the organ pool rather than in blood (73.1 % versus 26.1 %). One explanation for such a pattern is that fewer genes might be expressed during life cycle stages occurring in the host’s blood than during those inside the liver. This is unlikely, however, as over 60 % of Plasmodium genes are active during the intraerythrocytic stage . Alternatively, the infection of the buzzard chick that we sampled may have been quite recent, with most of the parasites still developing and dividing inside the liver and only few merozoites having already been released into the blood stream. In other transcriptomic studies of birds, very few endoparasite transcripts have so far been found, presumably due to the fact that the main organs or tissues harbouring the parasites have not been sampled [55, 56].
Ours is the first study to our knowledge to have jointly characterised the transcriptomes of a vertebrate host and its parasite, in our case using only a single MiSeq lane (but see ). This has positive implications for the development of parasite genomic resources across a variety of taxa, and may also be beneficial for exploring host-parasite interactions. As most of the transcripts were derived from the organs pool, we do not yet know which of the organs contained the most transcripts, although it seems likely that this would be the liver. If so, future efforts to characterise malaria-like parasites should be able to achieve much greater sequence coverage by focusing on this organ. Finally, it is unclear if our success in obtaining parasite transcripts could be due to our having normalised the libraries. This would make good sense if the parasite transcripts are expressed at low levels relative to the host, because normalisation improves the extent to which rare transcripts are represented in the final sequencing libraries. Future work should make further refinements to our protocol in order to maximise the representation of both parasite and host transcripts.
By sequencing more than 40 % of all expected Leucocytozoon genes, we also offer the first transcriptomic basis for population genetic and phylogenetic studies of this parasite genus, whose relatedness to other hemosporidian parasites has been controversial . Additionally, dual host and parasite transcriptomes should allow detailed studies of the genetics of host-parasite coevolution. Nonetheless, sequence similarity between the designated Leucocytozoon transcripts and the Plasmodium genome showed a wide variation with a mean of 66 % which is substantially lower than the 80–83 % similarity of mitochondrial sequences of L.buteonis to Plasmodium parasites . This may hint toward different mutation rates between mitochondrial and nuclear genes [57, 58] but could also be the result of error-prone alignment of underrepresented parasite reads.
Candidate genes for fitness differences
We attempted to identify genes potentially linking plumage melanisation to the fitness differences observed among the three colour morphs. For the transcriptome analysis, a direct comparison between individuals of different morphs was not possible due to constraints on the number of libraries we could produce. We therefore used the two different feather types as tentative proxies to gain a first impression of potentially important differences. While dorsal and ventral feathers differ not only in their melanisation but also in the body regions from which they were obtained, restricting comparisons to genes associated with the melanin pathway should be effective in focusing our analysis on genes relevant to colour differences.
For this analysis, we compared ventral (light) and dorsal (dark) feathers for the differential recovery of transcripts involved not only in melanogenesis in feathers, but also in other pathways that are expressed in other tissues. As the libraries had been normalised to maximise the overall representation of different transcripts, it would not have been appropriate to carry out a quantitative analysis of differential gene expression . We therefore focused on tissue-specific patterns of transcript presence / absence. This approach is highly conservative in the sense that it should detect only the most extreme differences in expression, but we nevertheless recovered some interesting patterns. In particular, the greatest expression differences were observed for transcripts with GO annotation terms relating to immunity. Six immune-related transcripts were represented only in the darker dorsal feathers whereas one was found only in the lighter ventral feathers. This is consistent with the fact that dark buzzards tend to better suppress their L. buteonis infections than light ones , and provides the first genetic support for a putative link between melanisation and the immune response in vertebrates [19, 60].
Melanogenesis-related transcripts present in the organs pool as well as in either ventral or dorsal feathers are involved in a total of 56 different biological pathways. While it is too early to speculate on their role in generating fitness differences, we highlight four of these pathways and their components as potentially interesting candidates for further study. Each of these pathways is represented by several genes differentially expressed in dorsal and ventral feathers. Two of the pathways are involved in cell motility, including white blood cell migration during immune reactions (regulation of the actin cytoskeleton and focal adhesion). The central pathway connecting each of the three others is that for MAPK signalling, which is part of a conserved cascade involved in stress responses and regulation of gene expression. It connects to the WNT signalling pathway, which in turn regulates gene transcription and the cell cycle. Interestingly, both the MAPK and WNT signalling pathway also involve the cAMP response element binding protein (CREB), a transcription factor recently linked to differences in dispersal timing of the three buzzard morphs in our study population . CREB is also involved in the regulation of neuropeptides (reviewed in [61, 62]) and therefore provides a potential link to behavioural differences between the morphs . Consequently, although we can only speculate on the role of these pathways based on currently available data, the patterns that emerge are plausibly related to relevant biological processes, suggesting that our candidate gene set provides an interesting avenue for further study.
To further explore patterns of differential expression in relation to both feather and morph, we conducted a qPCR analysis of seven genes selected from Table 4. We found significant differences in expression levels between dorsal and ventral feathers at four genes, including two immune genes (ADORA2A and STK4). qPCR analysis suggests that both of these genes are significantly upregulated in dorsal feathers, consistent with the fact that transcripts associated with immune system processes constituted only 1.8 % of genes represented in both ventral feathers and organs as compared to 12.3 % of genes represented in both dorsal feathers and organs. Furthermore, we found a significant interaction between feather type and morph. This interaction should be interpreted with caution due to the relatively small sample sizes involved. However, the selected genes appear to be expressed at a higher level in dorsal feathers in both intermediate and light buzzards, whereas the reverse appears to be the case for dark birds. Higher levels of expression in dorsal feathers might be expected as these feathers are on average darker than equivalent ventral feathers, particularly in the light and intermediate morphs (Fig. 1). However, in dark birds the breast feathers are comparatively darker.
A single MiSeq run yielded enough sequence data to characterise a large proportion of the transcriptome of the common buzzard and its malaria-like Leucocytozoon parasite. Comparable to a recent study based on fur seals that died of natural causes , we also show that transcripts from a single dead animal represent 56.8 % of all the genes documented in all four libraries combined. This is a promising result for species of conservation concern in which non-destructive and/or non-invasive sampling is desirable. We not only identify several putative candidate genes for fitness differences relating to melanisation in the common buzzard, but we also provide a resource that should facilitate studies of other accipitriform species and the malaria-like Leucocytozoon parasites of birds in general.
All animal handling was performed with permission from the local authority Kreis Gütersloh, permit nr: 4.5.2-723-Bussard in accordance to German federal and state laws.
Raw sequence reads have been deposited at the European Nucleotide Archive (accession no. PRJEB5722) and can be found at http://www.ebi.ac.uk/ena/data/view/PRJEB5722.
The authors thank Thomas Grünkorn for his continued help in the field. MP holds a Marie Curie FP7-International Outgoing Grant within the 7th European Community Framework Programme (PIOF-GA-2010-275049). NC was supported by a grant of the Volkswagen Foundation (I/84 196). OK holds a Heisenberg Professorship of the German Science Foundation (DFG, KR 2089/2-1). JH is supported by a Marie Curie FP7-Reintegration-Grant within the 7th European Community Framework Programme (PCIG-GA-2011-303618).
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Stapley J, Reger J, Feulner PGD, Smadja C, Galindo J, Ekblom R, et al. Adaptation genomics: the next generation. Trends Ecol Evol. 2010;25(12):705–12.View ArticlePubMedGoogle Scholar
- Ekblom R, Galindo J. Applications of next generation sequencing in molecular ecology of non-model organisms. Heredity. 2011;107(1):1–15.PubMed CentralView ArticlePubMedGoogle Scholar
- Vera JC, Wheat CW, Fescemyer HW, Frilander MJ, Crawford DL, Hanski I, et al. Rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing. Mol Ecol. 2008;17(7):1636–47.View ArticlePubMedGoogle Scholar
- Emerson KJ, Merz CR, Catchen JM, Hohenlohe PA, Cresko WA, Bradshaw WE, et al. Resolving postglacial phylogeography using high-throughput sequencing. Proc Natl Acad Sci U S A. 2010;107(37):16196–200.PubMed CentralView ArticlePubMedGoogle Scholar
- Hohenlohe PA, Amish SJ, Catchen JM, Allendorf FW, Luikart G. Next-generation RAD sequencing identifies thousands of SNPs for assessing hybridization between rainbow and westslope cutthroat trout. Mol Ecol Resour. 2011;11:117–22.View ArticlePubMedGoogle Scholar
- Hoffman JI, Simpson F, David P, Rijks JM, Kuiken T, Thorne MA, et al. High-throughput sequencing reveals inbreeding depression in a natural population. Proc Natl Acad Sci. 2014;111(10):3775–80.PubMed CentralView ArticlePubMedGoogle Scholar
- Slate J, Gratten J, Beraldi D, Stapley J, Hale M, Pemberton JM. Gene mapping in the wild with SNPs: guidelines and future directions. Genetica. 2009;136(1):97–107.View ArticlePubMedGoogle Scholar
- Birol I, Jackman SD, Nielsen CB, Qian JQ, Varhol R, Stazyk G, et al. De novo transcriptome assembly with ABySS. Bioinformatics. 2009;25(21):2872–7.View ArticlePubMedGoogle Scholar
- Ferguson-Lees J, Christie DA. Raptors of the world. London: Christopher Helm; 2001.Google Scholar
- IUCN. IUCN Red List of Threatened Species Version 2013.2. www.iucnredlist.org. 2013.
- Chakarov N, Boerner M, Krüger O. Fitness in common buzzards at the cross-point of opposite melanin-parasite interactions. Funct Ecol. 2008;22(6):1062–9.View ArticleGoogle Scholar
- Krüger O, Lindström J, Amos W. Maladaptive mate choice maintained by heterozygote advantage. Evolution. 2001;55(6):1207–14.View ArticlePubMedGoogle Scholar
- Galvan I, Gangoso L, Grande JM, Negro JJ, Rodriguez A, Figuerola J, et al. Antioxidant Machinery Differs between Melanic and Light Nestlings of Two Polymorphic Raptors. PLoS One. 2010;5(10):e13369.PubMed CentralView ArticlePubMedGoogle Scholar
- Karell P, Ahola K, Karstinen T, Kolunen H, Siitari H, Brommer JE. Blood parasites mediate morph-specific maintenance costs in a colour polymorphic wild bird. J Evol Biol. 2011;24(8):1783–92.View ArticlePubMedGoogle Scholar
- Lei BN, Amar A, Koeslag A, Gous TA, Tate GJ. Differential Haemoparasite Intensity between Black Sparrowhawk (Accipiter melanoleucus) Morphs Suggests an Adaptive Function for Polymorphism. PLoS One. 2013;8(12):e81607.PubMed CentralView ArticlePubMedGoogle Scholar
- Fox HM, Vevers G. The nature of animal colours. New York: Macmillan; 1960.Google Scholar
- Seiji M, Fitzpatrick TB, Birbeck MSC. The melanosome - a distinctive subcellular particle of mammalian melanocytes and the site of melanogenesis. J Invest Dermatol. 1961;36(4):243–52.PubMedGoogle Scholar
- Mundy NI. A window on the genetics of evolution: MC1R and plumage colouration in birds. Proc R Soc B-Biol Sci. 2005;272(1573):1633–40.View ArticleGoogle Scholar
- Ducrest AL, Keller L, Roulin A. Pleiotropy in the melanocortin system, coloration and behavioural syndromes. Trends Ecol Evol. 2008;23(9):502–10.View ArticlePubMedGoogle Scholar
- Chakarov N, Jonker RM, Boerner M, Hoffman JI, Krüger O. Variation at phenological candidate genes correlates with timing of dispersal and plumage morph in a sedentary bird of prey. Mol Ecol. 2013;22(21):5430–40.View ArticlePubMedGoogle Scholar
- Boerner M, Krüger O. Aggression and fitness differences between plumage morphs in the common buzzard (Buteo buteo). Behav Ecol. 2009;20(1):180–5.View ArticleGoogle Scholar
- Boerner M, Hoffman JI, Amos W, Chakarov N, Krüger O. Testing heterozygosity-fitness-correlations and inbreeding in colour morphs of different fitness in the common buzzard. J Evol Biol. 2013;26:2233–43.View ArticlePubMedGoogle Scholar
- Korpimäki E, Tolonen P, Bennett GF. Blood parasites, sexual selection and reproductive success of European kestrels. Ecoscience. 1995;2(4):335–43.Google Scholar
- Figuerola J, Muñoz E, Gutiérrez R, Ferrer D. Blood parasites, leucocytes and plumage brightness in the Cirl Bunting, Emberiza cirlus. Funct Ecol. 1999;13(5):594–601.View ArticleGoogle Scholar
- Møller AP, Nielsen JT. Malaria and risk of predation: A comparative study of birds. Ecology. 2007;88(4):871–81.View ArticlePubMedGoogle Scholar
- Valkiūnas G. Avian malarial parasites and other haemosporidia. Boca Raton, Florida, USA: CRC Press; 2005.Google Scholar
- Perkins SL, Schall JJ. A molecular phylogeny of malarial parasites recovered from cytochrome b gene sequences. J Parasitol. 2002;88(5):972–8.View ArticlePubMedGoogle Scholar
- Zhan XJ, Pan SK, Wang JY, Dixon A, He J, Muller MG, et al. Peregrine and saker falcon genome sequences provide insights into evolution of a predatory lifestyle. Nature Genet. 2013;45(5):563–U142.View ArticlePubMedGoogle Scholar
- Hackett SJ, Kimball RT, Reddy S, Bowie RCK, Braun EL, Braun MJ, et al. A phylogenomic study of birds reveals their evolutionary history. Science. 2008;320(5884):1763–8.View ArticlePubMedGoogle Scholar
- McCormack JE, Harvey MG, Faircloth BC, Crawford NG, Glenn TC, Brumfield RT. A Phylogeny of Birds Based on Over 1,500 Loci Collected by Target Enrichment and High-Throughput Sequencing. PLoS One. 2013;8(1):e54848.PubMed CentralView ArticlePubMedGoogle Scholar
- Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29(7):644–U130.PubMed CentralView ArticlePubMedGoogle Scholar
- Zerbino DR, Birney E. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18(5):821–9.PubMed CentralView ArticlePubMedGoogle Scholar
- Schulz MH, Zerbino DR, Vingron M, Birney E. Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics. 2012;28(8):1086–92.PubMed CentralView ArticlePubMedGoogle Scholar
- Kent WJ. BLAT - The BLAST-like alignment tool. Genome Res. 2002;12(4):656–64.PubMed CentralView ArticlePubMedGoogle Scholar
- Eddy SR. A new generation of homology search tools based on probabilistic inference. Genome Inform. 2009;23(1):205-211.Google Scholar
- Li WZ, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22(13):1658–9.View ArticlePubMedGoogle Scholar
- Bekel T, Henckel K, Kuster H, Meyer F, Runte VM, Neuweger H, et al. The Sequence Analysis and Management System - SAMS-2.0: Data management and sequence analysis adapted to changing requirements from traditional sanger sequencing to ultrafast sequencing technologies. J Biotechnol. 2009;140(1-2):3–12.View ArticlePubMedGoogle Scholar
- Bairoch A, Bougueleret L, Altairac S, Amendolia V, Auchincloss A, Puy GA, et al. The universal protein resource (UniProt). Nucleic Acids Res. 2007;35:D193–7.View ArticleGoogle Scholar
- Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 1999;27(1):29–34.PubMed CentralView ArticlePubMedGoogle Scholar
- Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, et al. The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 2003;4:41.PubMed CentralView ArticlePubMedGoogle Scholar
- Chakarov N, Linke B, Boerner M, Goesmann A, Kruger O, Hoffman JI. Apparent vector-mediated parent-to-offspring transmission in an avian malaria-like parasite. Mol Ecol. 2015;24(6):1355–63.View ArticlePubMedGoogle Scholar
- Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357–U354.PubMed CentralView ArticlePubMedGoogle Scholar
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.PubMed CentralView ArticlePubMedGoogle Scholar
- Edwards SV, Gasper J, Garrigan D, Martindale D, Koop BF. A 39-kb sequence around a blackbird Mhc class II gene: Ghost of selection past and songbird genome architecture. Mol Biol Evol. 2000;17(9):1384–95.View ArticlePubMedGoogle Scholar
- Westerdahl H, Wittzell H, von Schantz T. Mhc diversity in two passerine birds: no evidence far a minimal essential Mhc. Immunogenetics. 2000;52(1-2):92–100.View ArticlePubMedGoogle Scholar
- Miller HC, Bowker-Wright G, Kharkrang M, Ramstad K. Characterisation of class II B MHC genes from a ratite bird, the little spotted kiwi (Apteryx owenii). Immunogenetics. 2011;63(4):223–33.View ArticlePubMedGoogle Scholar
- Ekblom R, Stapley J, Ball AD, Birkhead T, Burke T, Slate J. Genetic mapping of the major histocompatibility complex in the zebra finch (Taeniopygia guttata). Immunogenetics. 2011;63(8):523–30.View ArticlePubMedGoogle Scholar
- Hoffman JI. Gene discovery in the Antarctic fur seal (Arctocephalus gazella) skin transcriptome. Mol Ecol Resour. 2011;11(4):703–10.View ArticlePubMedGoogle Scholar
- Hoffman JI, Thorne MAS, Trathan PN, Forcada J. Transcriptome of the dead: characterisation of immune genes and marker development from necropsy samples in a free-ranging marine mammal. BMC Genomics. 2013;14:52.PubMed CentralView ArticlePubMedGoogle Scholar
- Ekblom R, Balakrishnan CN, Burke T, Slate J. Digital gene expression analysis of the zebra finch genome. BMC Genomics. 2010;11:219.PubMed CentralView ArticlePubMedGoogle Scholar
- Ekblom R, Wennekes P, Horsburgh GJ, Burke T. Characterization of the house sparrow (Passer domesticus) transcriptome: a resource for molecular ecology and immunogenetics. Mol Ecol Resour. 2014;14(3):636–46.View ArticlePubMedGoogle Scholar
- Santure AW, Gratten J, Mossman JA, Sheldon BC, Slate J. Characterisation of the transcriptome of a wild great tit Parus major population by next generation sequencing. BMC Genomics. 2011;12:283.PubMed CentralView ArticlePubMedGoogle Scholar
- Ekblom R, Farrell LL, Lank DB, Burke T. Gene expression divergence and nucleotide differentiation between males of different color morphs and mating strategies in the ruff. Ecol Evol. 2012;2(10):2485–505.PubMed CentralView ArticlePubMedGoogle Scholar
- Bozdech Z, Llinas M, Pulliam BL, Wong ED, Zhu JC, DeRisi JL. The transcriptome of the intraerythrocytic developmental cycle of Plasmodium falciparum. PLoS Biol. 2003;1(1):85–100.View ArticleGoogle Scholar
- Chu JH, Lin RC, Yeh CF, Hsu YC, Li SH. Characterization of the transcriptome of an ecologically important avian species, the Vinous-throated Parrotbill Paradoxornis webbianus bulomachus (Paradoxornithidae; Aves). BMC Genomics. 2012;13:149.PubMed CentralView ArticlePubMedGoogle Scholar
- Wang BA, Ekblom R, Castoe TA, Jones EP, Kozma R, Bongcam-Rudloff E, et al. Transcriptome sequencing of black grouse (Tetrao tetrix) for immune gene discovery and microsatellite development. Open Biol. 2012;2:120054.PubMed CentralView ArticlePubMedGoogle Scholar
- Bensch S, Hellgren O, Križanauskienė A, Palinauskas V, Valkiūnas G, Outlaw D, et al. How can we determine the molecular clock of malaria parasites? Trends Parasitol. 2013;29(8):363–9.View ArticlePubMedGoogle Scholar
- Wolfe KH, Li W-H, Sharp PM. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proc Natl Acad Sci. 1987;84(24):9054–8.PubMed CentralView ArticlePubMedGoogle Scholar
- Vijay N, Poelstra JW, Künstner A, Wolf JBW. Challenges and strategies in transcriptome assembly and differential gene expression quantification. A comprehensive in silico assessment of RNA-seq experiments. Mol Ecol. 2013;22(3):620–34.View ArticlePubMedGoogle Scholar
- Mackintosh JA. The antimicrobial properties of melanocytes, melanosomes and melanin and the evolution of black skin. J Theor Biol. 2001;211(2):101–13.View ArticlePubMedGoogle Scholar
- Goodman RH. Regulation of neuropeptide gene-expression. Annu Rev Neurosci. 1990;13:111–27.View ArticlePubMedGoogle Scholar
- Montminy MR, Gonzalez GA, Yamamoto KK. Regulation of CAMP-inducible genes by CREB. Trends Neurosci. 1990;13(5):184–8.View ArticlePubMedGoogle Scholar
- Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, et al. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28(12):1647–9.PubMed CentralView ArticlePubMedGoogle Scholar
- Videvall E, Cornwallis CK, Palinauskas V, Valkiunas G, Hellgren O. The avian transcriptome response to malaria infection. Mol Biol Evol. 2015; 32(5):1255–1267.PubMed CentralView ArticlePubMedGoogle Scholar