The evolution of microtubule associated proteins – a reference proteomic perspective
BMC Genomics volume 23, Article number: 266 (2022)
Microtubule associated proteins (MAPs), defined as proteins that bind microtubules but are not molecular motors or severing enzymes, play a key role in regulating microtubule stability in neurons. Existing studies of the evolutionary relationships between these proteins are limited to genomic data from a small number of species. We therefore used a large collection of publicly available reference-quality eukaryotic proteomes to carry out a phylogenetic analysis of microtubule associated proteins in both vertebrates and invertebrates. Complete or near-complete reference quality proteomes were obtained from Uniprot. Microtubule associated proteins were identified using InterProScan and aligned using MUSCLE, and phylogenetic trees were then constructed using the WAG model. We identified 889 proteins with tubulin binding domains, of which 663 were in eukaryotes, including 168 vertebrates and 64 invertebrates. The vertebrate proteins separated into three families, resembling human MAP2, MAP4 and MAPT, respectively, while invertebrate MAPs clustered separately. We found significant variation in the number of microtubule associated proteins and the number of microtubule binding domains between taxa, with fish and mollusks having an unexpectedly high number of MAPs and binding domains, respectively. Our findings represent a novel analysis of the evolution of microtubule associated proteins based on publicly available proteomics data sets. We were able to confirm the phylogeny of MAPs identified based on more limited genomic analyses and, in addition, derived several novel insights on the structure and function of MAPs.
Microtubule associated proteins (MAPs), defined as proteins that bind microtubules but are not molecular motors or severing enzymes, play a key role in regulating microtubule stability. This group of proteins is of particular interest because they play a critical role in maintaining cytoskeletal stability in neurons and other cells with complex three-dimensional structures. Mutations in microtubule associated proteins, such as doublecortin (DCX), MAP2, MAP1A, MAP1B, and tau, lead to varying degrees of central nervous system malformation and cognitive impairment in humans and in animal models [1,2,3,4,5,6,7,8,9]. One member of the microtubule associated protein family, tau, is also of particular interest because it has the unique ability to form toxic soluble oligomers, propagate and cause neurodegeneration in Alzheimer disease and related dementias [10,11,12,13]. It is the only microtubule associated protein capable of this toxic gain-of-function and appears to do so predominantly, but not exclusively, in humans [14,15,16,17,18,19,20,21,22,23,24].
The vertebrate microtubule associated protein family includes three related proteins, tau (MAPT), MAP2 and MAP4, which are thought, based on genomic studies, to be the product of two duplication events ancestral to the vertebrate lineage. These conclusions are, however, based on analysis of a small number of predominantly vertebrate genomes. The largest analysis to date includes 43 species, with only 11 invertebrates, and does not include the critical cephalopods, which represent an independently evolved clade with a large and complex central nervous system and behavioral repertoire. Publicly available databases, including Uniprot, now include complete or near-complete proteomes for more than one thousand distinct species, making protein-based searches for specific domains practical. Although less well curated than nucleic acid-based databases such as GenBank, they provide an under-utilized resource for comparative evolutionary studies at the protein level, which we seek to exploit in our reported work.
We therefore used a large collection of publicly available reference-quality eukaryotic proteomes to carry out a phylogenetic analysis of microtubule associated proteins in both vertebrates and invertebrates. Our analysis is based only on the assumption that a microtubule associated protein will contain a tubulin binding domain, without any other a priori assumptions about protein or gene structure and includes all available complete or near complete eukaryotic reference proteomes in Uniprot.
Materials and methods
We downloaded all reference proteomes annotated as complete or near complete in the Uniprot repository (Release 2020_3, 17 June 2020). We chose to use reference proteomes rather than specific mass spectrometry-based studies due to the lack of large-scale multi-tissue mass spectrometry studies across the tree of life (see Discussion). Since our particular interest was in the potential role of microtubule associated proteins in the central nervous system, we focused our analysis on animals, excluding all other kingdoms. In order to be able to correlate our findings with characteristics of individual species, we also excluded metaproteomes and any protein not assigned to a single, distinct species. Data on brain size and number of neurons were obtained from published sources (Supplemental Table S1 in Additional file 1) [26,27,28,29,30]. Taxon membership was determined by manual curation of the species list. Since brain-specific proteomes were not available for most species we sought to study, we were not able to characterize the anatomic specificity or localization of individual proteins.
Identification and alignment of microtubule associated proteins
InterProScan was used with default settings to map all Pfam protein domains in the proteomes downloaded from Uniprot. Microtubule binding proteins were identified using the Unix fgrep utility to select all proteins containing the Pfam tubulin binding domain (PF00418). We then aligned the identified proteins using the multiple sequence comparison by log-expectation (MUSCLE) algorithm within MEGA. We selected this algorithm based on benchmarking studies showing that it outperforms ClustalW, an alternate multiple sequence alignment algorithm, in both speed and accuracy. A maximum likelihood tree was created from this alignment within MEGA using the WAG model, gamma distributed with invariant sites, and partial deletion settings according to published guidelines. Protein searches against reference genomes were done using UCSC genome browser BLAT against the Genome Reference Consortium Zebrafish Build 11 (danRer11), or using BLAST against the Ensembl octopus reference genome for O. bimaculoides (Genome assembly: PRJNA270931), since the UCSC genome browser does not contain a suitable reference genome for this species.
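The domain-filtering step described above can be sketched in a few lines of Python. This is an illustration only, not the authors' script (their code is in the Additional files): it assumes InterProScan's tab-separated output, in which the first column is the protein accession, the fourth the member database and the fifth the signature accession, and it writes the Pfam tubulin binding domain accession as PF00418. The function name and the three example rows are hypothetical.

```python
import csv
import io
from collections import Counter

# Assumption: InterProScan TSV output, where column 1 is the protein accession,
# column 4 the member database (e.g. "Pfam") and column 5 the signature accession.
TUBULIN_BINDING = "PF00418"  # Pfam tubulin binding domain (accession form assumed)


def count_tubulin_binding_hits(tsv_handle):
    """Return {protein accession: number of tubulin binding domain hits}."""
    hits = Counter()
    for row in csv.reader(tsv_handle, delimiter="\t"):
        if len(row) < 5:
            continue  # skip blank or malformed lines
        accession, database, signature = row[0], row[3], row[4]
        if database == "Pfam" and signature == TUBULIN_BINDING:
            hits[accession] += 1
    return dict(hits)


# Hypothetical excerpt standing in for a real InterProScan result file.
example = io.StringIO(
    "protA\tmd5a\t400\tPfam\tPF00418\tTubulin-binding\t10\t40\t1e-9\tT\t01-01-2020\n"
    "protA\tmd5a\t400\tPfam\tPF00418\tTubulin-binding\t50\t80\t1e-8\tT\t01-01-2020\n"
    "protB\tmd5b\t300\tPfam\tPF00069\tPkinase\t5\t250\t1e-30\tT\t01-01-2020\n"
)
maps = count_tubulin_binding_hits(example)
print(maps)
```

Keeping a per-protein hit count, rather than a simple yes/no match as with fgrep, also yields the number of binding domains per protein, which is the quantity compared between taxa later in the text.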
Downstream data analysis and plotting were done using MATLAB R2021b (Mathworks, Natick, MA). A p-value cutoff of 0.05 was used for statistical significance. All code used for analysis is in Additional files 2 and 3.
The number of MAPs varies significantly between taxa
Of the 1478 species and 785,763,547 proteins initially investigated, we identified 889 proteins with tubulin binding domains. Of these MAPs, 663 belonged to a population of eukaryotic species which included 168 vertebrates and 64 invertebrates. The complete list of species and proteins is in Tables S2 and S3 (Additional file 1). This widely diverse group of eukaryotic species included mammals, birds, fish, reptiles, arthropods, nematodes, and mollusks. The median number of proteins per species varied significantly between taxa (mammals: 32100, birds: 13100, fish: 31700, reptiles: 19800, arthropods: 16600, nematodes: 30000, mollusks: 23750; ANOVA: p = 1.7 × 10⁻¹⁴, Fig. 1A). Unexpectedly, fish had a significantly higher median number of MAPs per species than all other taxonomic groups, with a median of 5 ± 1.49 proteins per species (Fig. 1B). This proliferation of MAPs does not appear to be a product of poorly annotated or fragmentary proteomes, as the identified proteins are not closely related (Fig. 2). In addition, when we examined zebrafish (D. rerio) proteins in more detail, we found that each identified MAP maps, with 100% sequence identity, to a distinct gene in the zebrafish reference genome. Mammals, birds, mollusks, and reptiles had a median of 3 ± 0.73, 3 ± 0.75, 2.5 ± 0.82, and 3 ± 0.41 MAPs per species, respectively. Arthropods and nematodes had a median of 1 ± 0.88 and 1 ± 0.89, respectively.
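As a rough illustration of the between-taxa comparison, the sketch below computes per-taxon medians and a one-way ANOVA F statistic using only the Python standard library. The published analysis was done in MATLAB; the function name and the counts here are invented for the example and are not taken from Table S2. Converting F to a p-value requires the F distribution, which is not computed here.

```python
from statistics import mean, median


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict of {label: [values]}."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Variation of individual observations around their own group mean.
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical MAP counts per species, grouped by taxon (illustrative only).
maps_per_species = {
    "fish": [5, 6, 4, 5],
    "mammals": [3, 3, 4, 2],
    "arthropods": [1, 1, 2, 0],
}
medians = {taxon: median(counts) for taxon, counts in maps_per_species.items()}
f_stat, df_b, df_w = one_way_anova_f(maps_per_species)
print(medians, f_stat, df_b, df_w)
```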
The number of MAPs does not vary significantly with proteome complexity or brain size
In order to rule out the possibility that variations in the number of MAPs were due to either differences in brain complexity or the size and completeness of individual proteomes, we then examined the relationship between the total number of MAPs per species and both neuron number and total protein number. Neuron counts were derived from published sources as described in Methods above and in Additional file 1. Number of neurons explained less than 1% of variance in MAP number (R² = 0.004, Fig. 1C), suggesting that MAP number is not a simple function of brain size or complexity. Similarly, the number of MAPs per species did not vary significantly with total protein number (R² = 0.12, Fig. 1D).
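For readers unfamiliar with the statistic, the fraction of variance explained by a simple linear fit of one variable on another equals the squared Pearson correlation. A minimal sketch follows, with a hypothetical function name and invented data rather than the values behind Fig. 1.

```python
def r_squared(x, y):
    """Squared Pearson correlation: variance in y explained by a linear fit on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical inputs: log10 neuron counts and MAP counts for four species.
log_neurons = [1.0, 2.0, 3.0, 4.0]
map_counts = [1, -1, -1, 1]          # no linear trend at all
perfect_fit = [2.0, 4.0, 6.0, 8.0]   # exactly linear in log_neurons

print(r_squared(log_neurons, map_counts))
print(r_squared(log_neurons, perfect_fit))
```

A value near 0, as reported for neuron number, means that a straight line through the data explains almost none of the variation in MAP count.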
Three microtubule associated protein families are shared by all vertebrates
We then aligned all MAPs identified by InterProScan using MEGA to generate a phylogenetic tree of all eukaryotic MAPs. This revealed three broad families corresponding to the human MAP2, MAP4 and MAPT (tau) proteins (Fig. 2). Birds, reptiles and mammals all showed the same set of three MAPs, with members of each family resembling each other more than they resembled other MAPs from the same species (Fig. 2). For example, the commonly studied zebra finch (T. guttata) and the North American crow (C. brachyrhynchos) both have close relatives of human MAPT, MAP2 and MAP4 (Fig. 2, outer). Interestingly, however, the increased number of MAPs in fish appears to result from duplications within families, with the zebrafish (D. rerio), for example, having two proteins clustering with mammalian tau (Fig. 2, red) and two with MAP4 (Fig. 2, blue). When we examined selected invertebrate species, the fruit fly (D. melanogaster) and black ant (L. niger) each showed one protein related to, but independent of, all three vertebrate clusters. Interestingly, the California two-spot octopus (O. bimaculoides) showed three closely related proteins, all clustering with other invertebrates (Fig. 2, black). We then conducted a BLAST search to map these proteins back to their original genes and found that all three are fragments of a single gene annotated as LOC106878541, with Gene ID: 106878541. It is not clear, based on the available data, whether these represent different isoforms of a single gene or different fragments of a single protein in this relatively understudied species.
There is significant variation in the total number of microtubule binding domains per protein between taxa
The number of microtubule binding regions per MAP is thought to be indicative of how well a protein binds to microtubules. The average number of microtubule binding regions per MAP varied between taxa (arthropods: 4, birds: 3, nematodes: 3, fish: 4, mammals: 4, mollusks: 5, reptiles: 3) (Fig. 3, top; ANOVA: p = 2 × 10⁻⁶). Interestingly, mollusks have the highest average number of microtubule binding regions, with some mollusks possessing 7–10 (Fig. 3a). The Pacific giant oyster (C. gigas), for example, has one microtubule associated protein with ten distinct microtubule binding domains (Fig. 3, bottom).
In the current work, we constructed a systematic, unbiased phylogeny of vertebrate tubulin-binding microtubule associated proteins. As previously suspected based on genomic analyses, we found three distinct families of vertebrate microtubule associated proteins. While most vertebrate taxa (birds, mammals, reptiles) had one member of each family, fish showed additional expansions within each family and mollusks showed a remarkable expansion of microtubule binding repeats within proteins. We found no relationship between number of neurons or total number of proteins and number of microtubule associated proteins. Our findings show that the tau protein, which leads to neurodegeneration in Alzheimer disease and other neurodegenerative diseases of aging, is strongly evolutionarily conserved within the vertebrate lineage. The evolutionary origins of neurodegenerative disease, and neurodegenerative tauopathies specifically, thus remain unclear.
Our findings are broadly consistent with the phylogeny constructed by Sündermann et al., which is based on gene-level data on microtubule associated proteins. Our work has several unique aspects which differ significantly from that of Sündermann et al. and lead to novel conclusions. First, we included a larger number of species (232 in our analysis versus 102 in Sündermann et al.) and a far larger number of invertebrates (64 versus 8). Our study includes mollusks, which were entirely absent from the previous analysis. This phylum is particularly important since it contains the cephalopods, which, despite being invertebrates, have a complex central nervous system similar to that seen in mammals and birds. Second, our work focuses on protein-level data, potentially making it more sensitive to evolutionary relationships since amino acid sequences are more likely to be evolutionarily conserved than nucleic acid sequences. Finally, the proliferation of microtubule binding proteins in fish and the unexpectedly high number of microtubule binding repeats in mollusks have not been previously reported.
Our findings are also consistent with the existing data on microtubule associated proteins in well studied model species, showing one MAP in C. elegans, two in D. melanogaster, and mouse homologues of all three proteins seen in humans. In addition, limited studies of MAPs in birds have shown that the domestic chicken (G. gallus domesticus) has five microtubule binding repeats, and we similarly identified five repeats in this species and in other related bird species. Specifically, Uniprot A0A3Q2UDR8_CHICK has five distinct microtubule binding repeats. We also attempted to validate our data in a recent, large proteomic data set including 100 species across the tree of life. Unfortunately, this dataset uses immortalized cell lines for all fifteen included eukaryotic species, most of which are of non-neuronal derivation. Examination of the data reveals that known brain-specific microtubule associated proteins (e.g., MAPT, MAP2) in humans or model species such as mice are not represented, making comparison to our reference proteomic data impossible. This speaks to the need for systematic mass spectrometry-based studies of proteins from individual tissues, and particularly the nervous system, across the tree of life.
Although the broad availability of eukaryotic proteomes allowed us to examine a wide range of species across multiple taxa, adding validity to our results, we are limited by the organs and systems examined in each species. Microtubule associated proteins are best known for their role in neuronal development and function, but they are also expressed in other organs. The lack of brain-specific proteomes makes it difficult for us to determine whether the increased number of MAPs in fish or the high number of microtubule binding domains (MTBDs) in mollusks reflects an increase in brain-specific microtubule stability or a greater need for microtubule stability in other organs, perhaps due to these species’ unique aquatic environment.
Given the enormous number of proteins studied, we were also not able to evaluate the evidence for each individual protein identified, instead relying on the curating efforts of Uniprot staff and contributors. Because we focused specifically on tubulin binding domains (our only a priori assumption for the purpose of this analysis), we did not identify the known human MAPs MAP1A, MAP1B and MAP1C, which do not contain the classic tubulin binding domains studied in our current work. The microtubule binding domains in this family of proteins are less well annotated and show a lower degree of evolutionary conservation, making them difficult to identify in our analysis. We therefore cannot exclude the possibility that some of our studied species, particularly the less well-annotated invertebrates, have additional families of MAPs unique to them, which were not seen in our analysis. Ultimately, our analysis benefits from the breadth of data available in reference proteomes but is limited by the fact that these reference proteomes are derived from multiple studies with different tissues of origin, study objectives, and methods. This speaks to the urgent need for methodical proteomic studies of brain tissue across the tree of life, with a particular emphasis on understudied non-mammalian species with complex central nervous systems, such as birds and cephalopods.
The tau protein, which is the only microtubule associated protein capable of forming toxic aggregates in the aging brain, appears, based on our data, to be strongly evolutionarily conserved. It therefore remains unclear why it undergoes a toxic gain-of-function predominantly, but not exclusively, in humans [14,15,16,17,18,19,20,21,22,23,24]. Based on existing data showing early tau pathology in shorter lived animals such as cats, it is possible that the combination of tau, or a tau homologue, a large complex central nervous system and long lifespan are necessary to develop toxicity. Clarifying this question will require rigorous studies of aging, and particularly brain aging, in long-lived non-mammalian species (e.g., parrots).
Our findings represent a novel analysis of the evolution of microtubule associated proteins based on publicly available proteomics data sets. We were able to confirm the phylogeny of MAPs identified based on more limited genomic analyses and, in addition, derived several novel insights on the structure and function of MAPs. Our study is limited by the lack of brain-specific proteomes and limited proteome annotation for many species, underscoring the importance of in-depth, tissue-specific proteomic studies in non-model species. Future studies taking advantage of newly developed CRISPR protocols for non-model organisms, such as cephalopods, will be necessary to characterize the functions of individual MAPs and to understand the evolution of the apparent partial redundancy seen between mammalian MAPs [7,8,9,36,37,38].
Availability of data and materials
No novel datasets were generated as part of this study. All data used is publicly available via Uniprot (http://www.uniprot.org; RRID:SCR_002380) under the appropriate species identifiers. Code is included with supplementary materials.
Teng J, Takei Y, Harada A, Nakata T, Chen J, Hirokawa N. Synergistic effects of MAP2 and MAP1B knockout in neuronal migration, dendritic outgrowth, and microtubule organization. J Cell Biol. 2001;155(1):65–76.
Walters GB, Gustafsson O, Sveinbjornsson G, Eiriksdottir VK, Agustsdottir AB, Jonsdottir GA, et al. MAP1B mutations cause intellectual disability and extensive white matter deficit. Nat Commun. 2018;9(1):3456.
Harada A, Teng J, Takei Y, Oguchi K, Hirokawa N. MAP2 is required for dendrite elongation, PKA anchoring in dendrites, and proper PKA signal transduction. J Cell Biol. 2002;158(3):541–9.
Biswas S, Kalil K. The microtubule-associated protein tau mediates the Organization of Microtubules and Their Dynamic Exploration of actin-rich Lamellipodia and Filopodia of cortical growth cones. J Neurosci. 2018;38(2):291–307.
Caceres A, Kosik KS. Inhibition of neurite polarity by tau antisense oligonucleotides in primary cerebellar neurons. Nature. 1990;343(6257):461–3.
Sapir T, Frotscher M, Levy T, Mandelkow EM, Reiner O. Tau's role in the developing brain: implications for intellectual disability. Hum Mol Genet. 2012;21(8):1681–92.
Liu G, Thangavel R, Rysted J, Kim Y, Francis MB, Adams E, et al. Loss of tau and Fyn reduces compensatory effects of MAP2 for tau and reveals a Fyn-independent effect of tau on calcium. J Neurosci Res. 2019;97(11):1393–413.
Takei Y, Teng J, Harada A, Hirokawa N. Defects in axonal elongation and neuronal migration in mice with disrupted tau and map1b genes. J Cell Biol. 2000;150(5):989–1000.
Ma QL, Zuo X, Yang F, Ubeda OJ, Gant DJ, Alaverdyan M, et al. Loss of MAP function leads to hippocampal synapse loss and deficits in the Morris water maze with aging. J Neurosci. 2014;34(21):7124–36.
Castillo-Carranza DL, Gerson JE, Sengupta U, Guerrero-Munoz MJ, Lasagna-Reeves CA, Kayed R. Specific targeting of tau oligomers in Htau mice prevents cognitive impairment and tau toxicity following injection with brain-derived tau oligomeric seeds. J Alzheimers Dis. 2014;40(Suppl 1):S97–S111.
Swanson E, Breckenridge L, McMahon L, Som S, McConnell I, Bloom GS. Extracellular tau oligomers induce invasion of endogenous tau into the Somatodendritic compartment and axonal transport dysfunction. J Alzheimers Dis. 2017;58(3):803–20.
Ash PEA, Lei S, Shattuck J, Boudeau S, Carlomagno Y, Medalla M, et al. TIA1 potentiates tau phase separation and promotes generation of toxic oligomeric tau. Proc Natl Acad Sci U S A. 2021;118(9):e2014188118. https://doi.org/10.1073/pnas.2014188118.
Usenovic M, Niroomand S, Drolet RE, Yao L, Gaspar RC, Hatcher NG, et al. Internalized tau oligomers cause Neurodegeneration by inducing accumulation of pathogenic tau in human neurons derived from induced pluripotent stem cells. J Neurosci. 2015;35(42):14234–50.
Fiock KL, Smith JD, Crary JF, Hefti MM. beta-amyloid and tau pathology in the aging feline brain. J Comp Neurol. 2020;528(1):108–13.
Schultz C, Hubbard GB, Rub U, Braak E, Braak H. Age-related progression of tau pathology in brains of baboons. Neurobiol Aging. 2000;21(6):905–12.
Lemere CA, Beierschmitt A, Iglesias M, Spooner ET, Bloom JK, Leverone JF, et al. Alzheimer's disease abeta vaccine reduces central nervous system abeta levels in a non-human primate, the Caribbean vervet. Am J Pathol. 2004;165(1):283–97.
Elfenbein HA, Rosen RF, Stephens SL, Switzer RC, Smith Y, Pare J, et al. Cerebral beta-amyloid angiopathy in aged squirrel monkeys. Histol Histopathol. 2007;22(2):155–67.
Rosen RF, Farberg AS, Gearing M, Dooyema J, Long PM, Anderson DC, et al. Tauopathy with paired helical filaments in an aged chimpanzee. J Comp Neurol. 2008;509(3):259–70.
Perez SE, Raghanti MA, Hof PR, Kramer L, Ikonomovic MD, Lacor PN, et al. Alzheimer's disease pathology in the neocortex and hippocampus of the western lowland gorilla (gorilla gorilla gorilla). J Comp Neurol. 2013;521(18):4318–38.
Banik A, Brown RE, Bamburg J, Lahiri DK, Khurana D, Friedland RP, et al. Translation of pre-clinical studies into successful clinical trials for Alzheimer's disease: what are the roadblocks and how can they be overcome? J Alzheimers Dis. 2015;47(4):815–43.
Drummond E, Wisniewski T. Alzheimer's disease: experimental models and reality. Acta Neuropathol. 2017;133(2):155–75.
Head E, Moffat K, Das P, Sarsoza F, Poon WW, Landsberg G, et al. Beta-amyloid deposition and tau phosphorylation in clinically characterized aged cats. Neurobiol Aging. 2005;26(5):749–63.
Gunn-Moore DA, McVee J, Bradshaw JM, Pearson GR, Head E, Gunn-Moore FJ. Ageing changes in cat brains demonstrated by beta-amyloid and AT8-immunoreactive phosphorylated tau deposits. J Feline Med Surg. 2006;8(4):234–42.
Chambers JK, Tokuda T, Uchida K, Ishii R, Tatebe H, Takahashi E, et al. The domestic cat as a natural animal model of Alzheimer's disease. Acta Neuropathol Commun. 2015;3:78.
Sündermann F, Fernandez MP, Morgan RO. An evolutionary roadmap to the microtubule-associated protein MAP tau. BMC Genomics. 2016;17:264.
Tower DB. Structural and functional organization of mammalian cerebral cortex; the correlation of neurone density with brain size; cortical neurone density in the fin whale (Balaenoptera physalus L.) with a note on the cortical neurone density in the Indian elephant. J Comp Neurol. 1954;101(1):19–51.
Salajková V. Pravidla buněčného škálování mozku u psů: Efekt domestikace a miniaturizace psích plemen. Prague: Charles University; 2020.
Hinsch K, Zupanc GK. Generation and long-term persistence of new neurons in the adult zebrafish brain: a quantitative analysis. Neuroscience. 2007;146(2):679–96.
Herculano-Houzel S, Catania K, Manger PR, Kaas JH. Mammalian brains are made of these: a dataset of the numbers and densities of neuronal and nonneuronal cells in the brain of Glires, Primates, Scandentia, Eulipotyphlans, Afrotherians and artiodactyls, and their relationship with body mass. Brain Behav Evol. 2015;86(3–4):145–63.
Herculano-Houzel S. Longevity and sexual maturity vary across species with number of cortical neurons, and humans are no exception. J Comp Neurol. 2019;527(10):1689–705.
Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 2018;35(6):1547–9.
Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–7.
Hall BG. Building phylogenetic trees from molecular data with MEGA. Mol Biol Evol. 2013;30(5):1229–35.
Yoshida H, Goedert M. Molecular cloning and functional characterization of chicken brain tau: isoforms with up to five tandem repeats. Biochemistry. 2002;41(51):15203–11.
Muller JB, Geyer PE, Colaco AR, Treit PV, Strauss MT, Oroshi M, et al. The proteome landscape of the kingdoms of life. Nature. 2020;582(7813):592–6.
Crawford K, Diaz Quiroz JF, Koenig KM, Ahuja N, Albertin CB, Rosenthal JJC. Highly efficient knockout of a squid pigmentation gene. Curr Biol. 2020;30(17):3484–3490 e3484.
Harada A, Oguchi K, Okabe S, Kuno J, Terada S, Ohshima T, et al. Altered microtubule organization in small-calibre axons of mice lacking tau protein. Nature. 1994;369(6480):488–91.
Ikegami S, Harada A, Hirokawa N. Muscle weakness, hyperactivity, and impairment in fear conditioning in tau-deficient mice. Neurosci Lett. 2000;279(3):129–32.
This research was supported in part through computational resources provided by The University of Iowa, Iowa City, Iowa, United States of America.
The work reported above was funded by grants from the National Institutes of Health (K23NS109284), the Roy J. Carver Foundation, and the Carver College of Medicine at the University of Iowa (all to MMH).
Ethics approval and consent to participate
All human or animal data/tissue used in this study is publicly available and de-identified. All methods were performed in accordance with the relevant guidelines and regulations.
Consent for publication
The authors declare that they have no competing interests.
Table S1. References used for brain size and number of neurons. Table S2. Total number of MAPs and proteome size by species. Table S3. Microtubule associated proteins by species, taxon and binding regions.
Linux bash script used for InterProScan analysis. Code used to run InterProScan on University of Iowa Argon HPC.
Statistical analysis and graph generation. Code used for statistical analysis and generation of graphs.
Cite this article
Gottschalk, A.C., Hefti, M.M. The evolution of microtubule associated proteins – a reference proteomic perspective. BMC Genomics 23, 266 (2022). https://doi.org/10.1186/s12864-022-08502-y