Structural and functional annotation of the MADS-box transcription factor family in grapevine
© Grimplet et al. 2016
Received: 21 August 2015
Accepted: 14 January 2016
Published: 27 January 2016
MADS-box genes encode transcription factors that are involved in developmental control and signal transduction in eukaryotes. In plants, they are associated with numerous developmental processes, most notably those related to reproductive development: flowering induction, specification of inflorescence and flower meristems, establishment of flower organ identity, as well as regulation of fruit, seed and embryo development. Genomic analyses of MADS-box genes in different plant species are providing relevant new information on the function and evolution of this transcription factor family. We have performed a true genome-wide analysis of the complete set of MADS-box genes in grapevine (Vitis vinifera), analyzed their expression patterns and established their phylogenetic relationships (including MIKC* and type I MADS-box genes) with genes from 16 other plant species. This study was integrated with previous work on the family in grapevine.
A total of 90 MADS-box genes were detected in the grapevine reference genome by completing current gene annotations with a genome-wide analysis based on sequence similarity. We performed a thorough curation of all gene models and combined the results with gene expression information, including RNAseq data, to clarify the expression of newly identified genes and improve their functional characterization. Curated data were uploaded to the ORCAE database for grapevine in the frame of the grapevine genome curation effort. This approach resulted in the identification of 30 additional MADS box genes. Among them, ten new MIKCC genes were identified, including a potential new group of short proteins similar to the SVP protein subfamily. The MIKC* subgroup contains six genes in grapevine that can be grouped in the S (4 genes) and P (2 genes) clades, showing less redundancy than that observed in Arabidopsis thaliana. The expression pattern of these genes in grapevine is compatible with a role in male gametophyte development. Most of the newly identified genes belong to the type I MADS-box genes and were classified as members of the Mα and Mγ subclasses. Our analyses indicate that only a few type I genes in grapevine have homologs in other species and that species-specific clades appeared in both the Mα and Mγ subclasses. On the other hand, as deduced from the phylogenetic analysis with other plant species, genes that can be crucial for development of the central cell, endosperm and embryo seem to be conserved in plants.
The genome analysis of MADS-box genes in grapevine, the characterization of their pattern of expression and the phylogenetic analysis with other plant species allowed the identification of new MADS-box genes not yet described in other plant species as well as basic characterization of their possible role, particularly in the case of type I and MIKC* genes.
The MADS-box family of transcription factors is present in all eukaryotic genomes analyzed so far, although with a higher number of gene members in plant genomes than in other kingdoms. Plant MADS-box genes were initially identified as regulators of flower development, but later work showed that they control all major aspects of the life of land plants. This family of transcription factors is defined by the presence of a conserved domain, the MADS-box, in the N-terminal region, involved in DNA binding and dimerization with other MADS-box proteins. An ancestral MADS-box gene duplication predating the divergence of plants and animals separated the two main lineages, type I and type II [2, 3], but the presence of around 100 genes in the genomes of angiosperm species suggests that the family has considerably expanded in plants. Type II group genes include the MEF2-like genes of animals and yeast and the MIKC-type genes found only in plants. MIKC-type genes received this name because, apart from the MADS (M) domain, they contain three additional domains: the weakly conserved Intervening (I) domain, the conserved Keratin-like (K) domain and the highly variable C-terminal (C) domain, where the latter usually contains conserved subfamily-specific sequence motifs. The I domain is responsible for specificity in the formation of DNA-binding dimers, the K domain mediates dimerization and the C domain functions in transcriptional activation and formation of higher order protein complexes. MIKC-type genes have been further divided into two subgroups, MIKCC and MIKC*, based on divergence at the I and K domains and on exon-intron structure [7, 8]. Type I group genes show a simpler gene structure. They are shorter, generally consist of a single exon and lack the K domain.
MIKCC-type genes were initially identified as floral organ identity genes in Antirrhinum majus and Arabidopsis thaliana. Further genetic and molecular analyses grouped their biological functions in flower organogenesis into five classes: A, B, C, D and E, which are required, in different combinations, to specify the identity of sepals (A + E), petals (A + B + E), stamens (B + C + E), carpels (C + E) and ovules (D + E). In Arabidopsis, the genes belonging to these functional classes are APETALA1 (AP1) in class A, PISTILLATA (PI) and APETALA3 (AP3) in class B, AGAMOUS (AG) in class C, SEEDSTICK/AGAMOUS-LIKE 11 (STK/AGL11) and SHATTERPROOF (SHP) in class D and SEPALLATA (SEP1, SEP2, SEP3, SEP4) genes in class E. MIKCC genes in the AG and APETALA1/FRUITFULL (AP1/FUL) subfamilies also participate in fruit and seed development [12–14]. Other MIKCC genes were later identified as involved in different regulatory networks controlling flowering time and flower initiation: FLOWERING LOCUS C (FLC), SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) and SHORT VEGETATIVE PHASE (SVP) are involved in the regulation of flowering transition by the integration of signals from different flowering time regulatory pathways [15–18]. These genes function as either positive (SOC1, AGL24) or negative regulators (FLC, SVP) of flower meristem identity genes together with other subfamilies such as AGL15 [19–21], AGL12 and AGL17 [19, 22–25].
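The combinatorial logic of the five functional classes can be written as a simple lookup. The following is only an illustrative sketch of the class combinations listed above; the names `ORGAN_BY_CLASSES` and `organ_identity` are hypothetical and are not part of any published tool.

```python
# Illustrative lookup of the A/B/C/D/E class combinations listed in the text.
# The mapping itself comes from the paragraph above; the function name and
# data layout are assumptions made for this sketch.
ORGAN_BY_CLASSES = {
    frozenset("AE"): "sepal",
    frozenset("ABE"): "petal",
    frozenset("BCE"): "stamen",
    frozenset("CE"): "carpel",
    frozenset("DE"): "ovule",
}


def organ_identity(active_classes):
    """Return the organ specified by a set of active classes, or None
    when the combination is not one of the five listed in the text."""
    return ORGAN_BY_CLASSES.get(frozenset(active_classes))


if __name__ == "__main__":
    for combo in ("AE", "ABE", "BCE", "CE", "DE", "AB"):
        print(combo, "->", organ_identity(combo))
```

A combination that is not listed, such as A + B without E, returns `None`, which reflects that every listed combination requires class E.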
The MIKC* subgroup (or Mδ group) is small in all plant species examined so far, ranging from two genes in the basal eudicot (Eschscholzia) and the basal angiosperm (Aristolochia) to six genes in Arabidopsis thaliana. The structure of MIKC* genes is very similar to that of MIKCC genes, but the K-domain is poorly conserved in its last part and the gene structure shows an exon duplication in its 5’ region. Phylogenetic analyses of MIKC* genes from a broad variety of vascular plants confirm the existence of two clades (S and P) previously determined in Arabidopsis and rice. In Arabidopsis, MIKC* regulatory function depends on the formation of heterodimers between proteins of the S (AGL66 and AGL104) and P clades (AGL30, AGL65, AGL94). With the exception of AGL67 (S), which seems to be involved in late embryo development, these genes are crucial for development of the Arabidopsis male gametophyte [8, 30]. MIKC* genes seem to have retained a conserved and essential role in gametophyte development during the evolution of land plants.
Type I genes are very variable in number among flowering plants, ranging from 11 to 229 members. In Arabidopsis, this group has 61 members distributed in three subclasses: Mα (25 genes), Mβ (20 genes) and Mγ (16 genes). Contrary to the evolution of MIKC-type genes, which is mostly related to genome duplications, type I MADS-box genes seem to have been predominantly duplicated via segmental duplications. This idea is supported by their proximity in the Arabidopsis genome and by their phylogenetic analyses in different species, since species-specific clusters of type I genes have been found in many species [4, 31]. Expression of Arabidopsis type I genes in the central cell, antipodal cells and chalazal endosperm of the embryo sac indicates that they play an important role in female gametophyte and early seed development [32, 33]. Type I proteins interact predominantly with each other. Mα-type proteins preferentially form heterodimers with Mβ or Mγ-type proteins, whereas interactions within the same subclass are rare.
The genome of the PN40024 grapevine was first established from an 8X assembly and later updated with a 12X assembly available at https://urgi.versailles.inra.fr/Species/Vitis/Data-Sequences, from which a gene annotation (VCOST) was performed and is available at the ORCAE platform. The MIKCC-type genes were previously analyzed within the 8X version and recently the whole MADS-box gene family was analyzed on an early annotation version (v0) of the 12X assembly. Here, we report a thorough, unbiased identification and analysis of the grapevine MADS-box gene family. This work does not represent a new characterization of an advanced annotation version, but we have detected new genes and curated all their sequences. This work also represents a direct application of the published grapevine nomenclature recommendations to the annotation of a complex gene family.
Our analysis has completed the set of MIKCC-type genes with the discovery of 10 new genes, mainly belonging to the SVP, AGL17, BSister (BS) and AGL6 subfamilies, and has characterized the complete set of MIKC* and type I MADS-box genes of grapevine.
Phylogenetic analyses of MIKC* and type I MADS-box genes within a wide range of plant species have made it possible to infer homologies for key genes that could be involved in gametophyte development. Furthermore, in silico analysis of gene expression corroborates the expression patterns described for MIKCC-type genes and additionally supports the results of the phylogenetic analysis of MIKC* and type I MADS-box genes.
MADS-box gene identification, annotation and mapping within the grapevine genome
Gene identification and structural annotation
Grapevine MADS-box genes
Gene models were curated using the data collected from gene structure comparisons as well as the available RNAseq data from our laboratory (, V. Grbic, P. Carbonell personal communication) to validate the exons that are actually expressed. These data, which included unpublished work, were particularly valuable in the present context since they include expression information from reproductive organs, which are likely to show expression of MADS-box genes. These data also allowed us to evaluate the expression of newly detected genes, not represented in microarray data, by repeating the bioinformatics analysis of the original RNAseq data with an updated GFF file. The gene structure previously described was confirmed for 38 genes. One of them, previously allocated to the unknown chromosome, was discarded because it already existed on chromosome 15. The structures of 15 other genes were curated. Data relative to the detection of the MADS-box genes in older genome annotations or gene-sets are summarized in Additional file 2.
To clarify gene nomenclature, we built a phylogenetic tree of the MADS-box protein coding genes in grapevine and Arabidopsis (Additional file 3) as recommended by the Super-Nomenclature Committee for Grape Gene Annotation (sNCGGa). Within type I and type II MADS-box genes, 42 grapevine genes correspond to MIKCC genes (previously we had identified 32 of them), 6 to MIKC* genes (or Mδ-type), 23 to Mα-type I and 19 to Mγ-type I genes. No Mβ-type I genes seem to be present in grapevine (Additional file 3). For MIKCC genes, when the gene had already been described and the symbol fit the recommendations of the sNCGGa, the same symbol was conserved. For genes that had not been described before and had an Arabidopsis ortholog, the Arabidopsis gene symbol was used. For the rest of the genes, the symbol was composed of the subfamily symbol and a number or a letter to differentiate the members. For the MIKCC genes previously identified, symbols were kept with minor corrections (e.g. VviSOC1.1 is now called VviSOC1a). Symbols used in  were not kept because they were attributed according to their respective chromosome position, which no longer made sense with the inclusion of 37 additional genes, and because that system was not functional (e.g. the symbol VviAGL15a indicates that the gene is a member of the subfamily AGL15 of the MIKCC while MADS25 gives no information). The new VviSVP-like sequences were designated VviSVPS1 to 5. Following the recommendations of the sNCGGa, we named the MIKC*-subgroup (or Mδ) MADSD, the Mα-type I subclass MADS1A and the Mγ-type I subclass MADS1G. Within the MADSD we distinguished three clades, two corresponding to the previously described S and P clades and a third one. Consequently, gene clades were designated MADSD1, 2 and 3 and the individual genes were distinguished with a randomly attributed letter. Within the MADS1A we distinguished three clades that were designated MADS1A1, MADS1A2 and MADS1A3.
Within the MADS1G we also distinguished three clades, designated MADS1G1, MADS1G2 and MADS1G3. Additional nomenclature details will be presented within the phylogenetic analysis.
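The class counts and the symbol scheme described above can be checked with a short script. This is only a sketch: the counts are those stated in the text, while the helper `compose_symbol` and the dictionary keys are hypothetical names introduced for illustration.

```python
# Gene counts per class as stated in the text (42 + 6 + 23 + 19).
CLASS_COUNTS = {
    "MIKCC": 42,
    "MADSD": 6,    # MIKC* subgroup (Mdelta)
    "MADS1A": 23,  # Malpha-type I subclass
    "MADS1G": 19,  # Mgamma-type I subclass
}

TOTAL_GENES = sum(CLASS_COUNTS.values())
TYPE_I_GENES = CLASS_COUNTS["MADS1A"] + CLASS_COUNTS["MADS1G"]


def compose_symbol(subclass, clade, member):
    """Hypothetical helper: build a symbol as subclass prefix + clade
    number + member letter, e.g. ("MADS1A", 3, "d") -> "MADS1A3d"."""
    return f"{subclass}{clade}{member}"


if __name__ == "__main__":
    print("total MADS-box genes:", TOTAL_GENES)
    print("type I genes:", TYPE_I_GENES)
    print(compose_symbol("MADSD", 2, "b"))
```

The totals agree with the 90 MADS-box genes and 42 type I genes reported elsewhere in the article.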
Chromosomal location of grapevine MADS-box genes
Gene structure and phylogenetic analysis of MADS-box genes in grapevine
Update on MIKCC MADS-box genes
The results of the orthology analyses indicated that SVPS sequences were not present in the monocot species considered. In addition, for some genes such as VviSVP1, VviPI, VviAGL15a (except in monocots) and VviSEP1 (not in monocots), only one gene was detected in all the species. Interestingly, the new grapevine genes found in the subfamilies BS (VviBS3), AGL17 (VviAGL17c, VviAGL17d), AGL6 (VviAGL6b) and TM8 (VviTM8b) did not show orthology with other species.
Phylogenetic analysis of MIKC* and type I MADS-box genes in plants
The MIKC* MADS-box genes
Type I MADS-box genes
Type I genes were characterized by the presence of a single exon. Only MADS1A3d had three exons and MADS1G1d had two; however, we cannot exclude that this is due to incorrect genome assembly. Within type I genes, it was hard to find clear orthologies between Arabidopsis and grapevine genes, most notably in the Mγ subclass (Fig. 3). Regarding Mβ genes, our analysis revealed that this subclass seems to be absent in grapevine.
Analysis of the Mα-type I genes
Three clades could be distinguished in the Mα subclass based on phylogeny, Mα1, Mα2 and Mα3. These clades showed similarity to the previously defined groups .
The Mα2 clade grouped all nine grapevine genes in a single subclade with no gene from the other analyzed species (Fig. 7). Genes from other species were detected in this clade but no other species showed this level of duplication: Medicago had three genes and strawberry and soybean had two (Additional file 4). Consistent with those results, the homology detected with genes from other species was weak (Fig. 3). MADS1A2f, MADS1A2g and MADS1A2h were absent in other species and the rest of the MADS1A2 genes only shared homology with soybean, Medicago, poplar and papaya. A possible ortholog of MADS1A2b could only be detected in poplar, which might be a false positive.
Within the Mα3 clade, grapevine MADS1A3 genes were generally distant from each other. A group of three genes (MADS1A3a, MADS1A3c and MADS1A3d) seemed to share similarity with seven eucalyptus genes and a few others from Rosaceae (Fig. 7). The other Mα3 genes were much more distant in the phylogenetic analysis and appeared unrelated. Two other genes (MADS1A3g and MADS1A3f) seem to be related to a group of nine tomato genes (Additional file 4). MADS1A3e was poorly related to any other Mα gene except one from orange trees. Finally, MADS1A3b did not group within the Mα subclass in the species analysis (Fig. 6). In summary, within the Mα3 clade, little homology was detected between grapevine genes and the genes of other species appearing in this clade, including Arabidopsis AGL57, AGL64, AGL85, AGL58 and AGL59. MADS1A3a and MADS1A3e were the only members for which one-to-one orthologs could be detected in species other than Arabidopsis (Fig. 3).
Analysis of the Mγ-type I genes
Within the Mγ2 clade there was a group of seven grapevine genes, five on chromosome 5 (MADS1G2c, MADS1G2d, MADS1G2e, MADS1G2f and MADS1G2g) and two on chromosome 2 (MADS1G2a and MADS1G2b) (Fig. 8). These genes had homologs in all the species analyzed but only MADS1G2c showed orthology with papaya and monocot-specific genes (Fig. 3). The closest genes in the phylogenetic tree correspond to a group of apple genes (Fig. 8), which suggests that the duplication occurred recently in both of these fruit species. The Mγ2 clade also included the MADS1G2h grapevine gene located on chromosome 14, which seemed to belong to a group of genes that duplicated and expanded in legumes (soybean and Medicago). MADS1G2h was identified as a one-to-one ortholog of AGL80, one of the best characterized Arabidopsis type I genes in this subclass, although AGL80 appeared in a separate branch in the Mγ2 clade. In this context, both eucalyptus and cabbage contained genes homologous to AGL80, probably due to conservation of this gene function during evolution.
Finally, the Mγ3 clade only contained a single grapevine gene (MADS1G3h) that was associated with a single gene from melon (Fig. 8). No orthologs could be detected for this Mγ3 grapevine gene.
Overall, the large clades for Mα3 and Mγ1-type I grapevine genes seemed to be mostly specific to grapevine. In general, very little one-to-one orthology was detected for the MIKC* and type I MADS-box genes, with the notable exceptions of MADSD2b (only absent in strawberry) and MADS1G2h (absent in the tested monocots). In addition, MADS1A1a, MADS1A1g, MADSD1a and MADSD1c were present in about 50 % of the species analyzed.
Expression analysis of MADS-box genes
Expression of MIKCC genes
Expression studies of MIKCC genes confirmed previous results for the genes already known [37, 38]. Regarding the new genes found in this study, VviBS3 showed a pattern of expression in tendrils and buds, quite different from the expression pattern of VviBS1 and VviBS2 (berry, seed and flower carpel); VviAGL17d showed a pattern similar to VviAGL17b, restricted to seeds and flowers, whereas VviAGL17c expression was detected in roots, berries, seeds, flowers and rachis; VviAGL6b expression was detected in shoot tip, tendril, bud, berry, inflorescence, flowers and reproductive organs in general, while VviAGL6a expression was more restricted to the reproductive phase.
VviSVPS sequences showed a specific pattern of expression that is different from that of the VviSVP genes in grapevine. VviSVPS4 expression was similar to VviSVP4 and VviSVP5 but it was expressed neither in reproductive organs nor in seeds. VviSVPS5 showed expression in vegetative organs but also in berry and petals. VviSVPS1 expression was detected in leaf and buds but also in berry, flower and pollen. VviSVPS2 showed expression in the vegetative tissues and flowers and VviSVPS3 in shoot tip and flowers. No expression was detected for VviTM8b.
Expression of MIKC* genes
Expression patterns for genes in clades 1 and 2 were partially overlapping. Within clade 1, MADSD1a was expressed in flower and pollen, MADSD1b was expressed in leaf, bud, berry, seed and flower and MADSD1c was expressed in leaf, bud, berry, seed, inflorescence and flower. Within clade 2, MADSD2a showed expression in leaf, root, berry and seed while MADSD2b was expressed in bud, berry, seed, stamen and pollen. Finally, MADSD3a expression was detected in berry and seed.
Expression of type I genes
Most Arabidopsis type I genes show very low expression that is restricted to very specific cells and tissues and to a limited time span, such as the female gametophyte or specific cells within the developing seed. Transcriptome-wide studies conducted in grapevine have probably not yet addressed these tissues, since we could not detect proof of expression for many of these type I MADS-box genes.
Expression of Mα-type I genes
Most of the Mα1 clade genes did not show expression above the background noise in the microarray data, so it could not be clearly stated that they are expressed in any tissue. However, MADS1A1a, MADS1A1c and MADS1A1e seemed to be slightly expressed in a few tissues and were considered as expressed. This means that they were qualified as “putative genes” in their description in the ORCAE database and in Additional file 2, as opposed to “hypothetical” for genes with no proof of expression. The gene that is conserved among different species, MADS1A1g, also seemed not to be expressed, or its expression could not be detected. Regarding the Mα2 clade, three genes showed expression in specific tissues. MADS1A2a was allocated to 11 Plant Ontology terms related to vegetative as well as reproductive tissues (Additional file 2). MADS1A2b was allocated to 7 Plant Ontology terms related to vegetative as well as reproductive tissues. MADS1A2i expression was detected in shoot tips and inflorescences. Both MADS1A2b and MADS1A2c seemed to be weakly expressed and were classified as “putative”. Finally, in the Mα3 clade, only MADS1A3e was clearly expressed, in leaves and berries, whereas MADS1A3f and MADS1A3g were slightly expressed and were classified as “putative” genes. None of MADS1A3a, MADS1A3b and MADS1A3d, all on chromosome 19, or MADS1A3c was expressed. The closest homologous genes in Arabidopsis (AGL57, AGL58, AGL59 and AGL64) were expressed in the embryo and peripheral endosperm.
Expression of Mγ-type I genes
In this subfamily, expression was undetectable for only three genes in the first clade (Mγ1): MADS1G1a, MADS1G1b and MADS1G1e. Among the rest of the genes, MADS1G1c and MADS1G1d seemed weakly expressed. MADS1G1f, MADS1G1g, MADS1G1h and MADS1G1j were expressed in leaves (MADS1G1j in senescing leaves). In addition, MADS1G1j and MADS1G1i were expressed in flowers.
The genes in the Mγ2 clade showed very low expression, which was only clearly visible for MADS1G2a, MADS1G2b and MADS1G2h in seeds. Gene MADS1G3a, in clade Mγ3, did not show clear expression in any tissue.
Coexpression of MADS-box genes and relative expression
Several groups of genes showed a high level of expression correlation with each other (distance threshold <0.15). Based on the “guilt-by-association” concept, genes with similar expression are likely to be involved in the same process.
We identified four groups of consecutive co-expressing genes that might be under the control of the same regulators, such as MADS1G1f and MADS1G1g, or MADS1G1i and MADS1G1j, which also co-expressed with MADS1G2a. These genes were in a paralogous chromosome segment. This was the only occurrence where genes from the same duplicated region (potential paralogs) were co-expressed. VviSVPS2 and VviSVPS3 also co-expressed with MADS1A2d. MADS1G2a and MADS1G2b also co-expressed with MADS1A1a.
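A minimal sketch of how co-expressed pairs can be called with a distance cutoff is given below. The text states only the threshold (<0.15), not the metric, so the use of a correlation distance (1 minus the Pearson correlation) is an assumption of this example, and the gene names and expression values are invented toy data rather than results from this study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.
    Returns 0.0 when either profile has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def coexpressed_pairs(profiles, threshold=0.15):
    """Return gene pairs whose correlation distance (1 - r) is below the
    threshold. The metric is an assumption; only the cutoff is from the text."""
    pairs = []
    for (g1, p1), (g2, p2) in combinations(sorted(profiles.items()), 2):
        if 1.0 - pearson(p1, p2) < threshold:
            pairs.append((g1, g2))
    return pairs


# Toy profiles over four conditions (invented values, hypothetical genes).
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 1.0, 3.5, 0.5],
}

if __name__ == "__main__":
    print(coexpressed_pairs(toy_profiles))
```

With few conditions, a high correlation can arise by chance, which is why pairs supported by only a small number of samples deserve more caution than those supported by many.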
Interestingly, in the case of VviSVP5 co-expressing with VviSVP3 and VviSVP4 co-expressing with VviSVPS4, they are consecutively located in the following order VviSVP3, VviSVPS4 in one chromosome and VviSVP5 and VviSVP4 in another. The 4 genes had similar expression, with the exception that VviSVP5 and VviSVP3 were more abundant in pollen and seed than in other reproductive organs.
Other genes belonging to the same subclasses had identical expression, and therefore could have redundant roles. This was the case for VviSOC1a and VviSOC1b; VviAP1 and VviAGL6b as well as VviBS2 and VviBS1. When genes showed co-expression only on the basis of available RNAseq data, the results were considered less reliable, since they were based on 11 conditions, versus 246 for a gene present in all microarrays. This was the case of MADSD1a, MADS1G2d, MADS1G1c and MADS1G2f on one side; VviSVPS5 and MADS1G1h; MADS1A3f and MADS1G2g; MADS1G2c, VviPI and MADS1A3c, as well as VviFUL1 and MADS1A3i. Finally, amongst genes that did not belong to the same clade and for which enough expression data had been collected, VviSVP2 and VviSOC1c were of interest: besides sharing the same expression pattern, they were strictly expressed in vegetative tissues and not in reproductive tissues. Additionally, VviSEP1, VviAG2, VviAG1 and VviSEP3 shared the same expression patterns, which are strictly limited to reproductive tissues. None of the MIKC* genes were co-expressed with any other MADS-box genes (Fig. 10).
We have performed an exhaustive analysis of MADS-box genes on the 12x grapevine genome based on the isolation of the complete set of genes identified in PN40024. In addition to public functional annotations, we flagged some regions that might represent functional MADS-box genes in other areas of the genome. This amounts to the identification of 90 functional genes, which adds 37 new genes to previous studies. Chromosome localization, gene structure, phylogenetic analyses with other sequenced genome species and expression analysis allowed us to propose an extended characterization of this gene family in grapevine and to draw hypotheses on the function of the as yet undescribed genes.
Grapevine MIKCC-type II genes
A total of 42 MIKCC-type II genes distributed in 13 subfamilies were found in grapevine, a number similar to what has been described in other plant species. These numbers include the addition of ten new genes to the previous descriptions of this group. Interestingly, we did not find putative one-to-one orthologs for these 10 new genes in other plant systems, as were detected for the remaining MIKCC genes in previous analyses. Grapevine shows a notable expansion of the SVP subfamily, as has been described in other woody species [31, 45]. In addition, the new members in the VviSVP subfamily (VviSVPS 1 to 5) showed special features not yet found in any other species. The identification of new genes could result from our more thorough analysis of a 12x genome which, in comparison to automatic annotation methods, permits the isolation of genes with low similarity to other species and not strongly expressed, as automatic methods rely on transcript data (EST, RNAseq) and sequences from other species.
The most significant finding among the new genes was the set of sequences called VviSVPS that appeared in the VviSVP subfamily. The MADS-box of VviSVPS proteins is highly similar to that of VviSVP, and they were named in this way because of this similarity. They also showed similarities with the tomato JOINTLESS, which controls flower abscission. VviSVPS genes are located near VviSVP genes in grapevine. Three of these sequences (VviSVPS1, VviSVPS2 and VviSVPS3) map linked to VviSVP4 on chromosome 3 and showed a singular pattern of expression, partially overlapping with the expression of VviSVP genes but also expressed in other tissues. The same happened with sequences VviSVPS4 and VviSVPS5, which are linked to VviSVP3 on chromosome 15. VviSVPS genes are, however, much shorter than VviSVP and JOINTLESS. The encoded proteins only contain the MADS-box and a short sequence of about 30 amino acids that distinguishes VviSVPS2 and VviSVPS3 on one hand from VviSVPS1, VviSVPS4 and VviSVPS5 on the other hand (Fig. 4). Compared with grapevine VviSVP genes, they seem to lack a large part of the I region but showed homology at the end of the I region and the beginning of the K domain (Fig. 4). The conservation of the small C-terminal non-MADS-box sequence within the two groups indicates that this region is probably of functional interest. In addition, their specific expression pattern also suggests that these genes are probably functional. Short MADS-box genes have also been found in other species. Notably, one of them seems to be associated with dormancy in leafy spurge, whose expression could be in agreement with the expression pattern observed for VviSVPS2, VviSVPS4 and VviSVPS5 in grapevine.
Short MADS-box proteins like the VviSVPS (<100 amino acids) and with sequence similarities to them were also detected in Arabidopsis, Prunus persica, Ricinus communis, Citrus clementina, Populus trichocarpa, Glycine max, Citrus sinensis and Witheringia coccoloboides in their respective predicted automatic annotations. VviSVPS genes had no homologs in the monocot species considered, suggesting that the duplication events giving rise to these new genes were rather recent and that the genes might play a role in dicot-specific developmental mechanisms. Their wide range and diversity of expression suggest that those mechanisms are rather diverse, and establishing their biological function will require further studies. The I and K domains are important for dimerization. The K-domain promotes dimerization by forming amphipathic helices, which interact with those of another K-domain-containing protein. The I domain influences the specificity of DNA-binding dimer formation. In VviSVPS, the I-region is half the usual length, and the K-domain consists of 10–14 amino acids compared to the 70–100 amino acid K-domain in MIKC proteins (Fig. 4), i.e. only 3–4 coils are formed instead of the 20–30 coils in the full-length K-domain alpha-helix. The evidence of their expression and their conservation in other species tends to indicate that they are functional; however, whether or not these proteins form complexes with other MADS-box proteins is unclear. A similar role could also be proposed for VviTM8b, although the truncated protein only contains the MADS-box and the expression of the corresponding gene could not be detected.
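The coil counts quoted above can be approximated with a back-of-the-envelope calculation. This sketch assumes that a "coil" corresponds to one turn of an ideal alpha-helix and uses the textbook value of about 3.6 residues per turn; neither assumption is stated in the article, and the function name is hypothetical.

```python
# Textbook geometry of an ideal alpha-helix: about 3.6 residues per turn.
# Treating one "coil" as one helical turn is an assumption of this sketch.
RESIDUES_PER_TURN = 3.6


def helical_turns(n_residues):
    """Approximate number of alpha-helical turns for a given length."""
    return n_residues / RESIDUES_PER_TURN


if __name__ == "__main__":
    ranges = {
        "VviSVPS K-domain": (10, 14),
        "full-length K-domain": (70, 100),
    }
    for label, (shortest, longest) in ranges.items():
        print(f"{label}: {helical_turns(shortest):.1f} to "
              f"{helical_turns(longest):.1f} turns")
```

Under these assumptions the short K-domain spans roughly 3 to 4 turns and the full-length domain roughly 19 to 28 turns, in line with the orders of magnitude given in the text.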
Regarding VviAGL6b, it showed a decreasing level of expression along inflorescence development but increased expression along tendril development  suggesting a possible involvement in development of this organ. The expression data (Additional file 2) indicate that the VviAGL6a probe set in the Grapegen array might have been compromised since expression was never detected in this platform, while it is clearly expressed in many tissues. Nevertheless its expression seems to be low in tendrils. Thus a new function acquired in this subfamily for tendril development was suggested for VviAGL6b , as was also proposed for other members of the AP1/FUL subfamily in grapevine  where additional subfamily members also show a differential pattern of tendril expression.
Grapevine MIKC*–type II genes
The grapevine genome contained six genes in this group belonging to the two previously defined clades (S and P) [27, 28], that are typical of euphyllophytes (angiosperm, gymnosperms and ferns). Interestingly, clade 2, that included grapevine MADSD2 genes, also included AGL33, an Arabidopsis gene difficult to classify and not assigned to either S or P clades but considered within the MIKC*subgroup, . MADSD2a the closest grapevine gene to AGL33 in the phylogenetic tree did not show homology to it or to genes from other species with the exception of poplar. This grapevine gene was clearly expressed in heterogeneous tissues indicating a possible ubiquitous role. The other grapevine gene in clade 2, MADSD2b, located alone in chromosome 17, corresponds to a single copy in every species except in strawberry and its Arabidopsis ortholog is AGL65. Its abundant and generalized expression is coincident with the expression behavior of AGL65 according to TAIR (https://www.arabidopsis.org/servlets/TairObject?name=AT1G18750&type=locus).
In clade 1 there were three grapevine genes, MADSD1a, MADSD1b and MADSD1c. MADSD1a had a high number of one-to-one orthologs in other species and was expressed in flower and pollen, being a good candidate for having a functional role in pollen development. MADSD1b and MADSD1c are linked on the same chromosome and have similar patterns of ubiquitous expression suggesting that they could be functionally redundant. Arabidopsis genes within this clade (AGL66 and AGL104) seem to function redundantly in pollen development since their loss has significant effects on pollen performance but only when both AGL66 and AGL104 MADS-box containing complexes are reduced . Thus, we could think of a possible equivalent role for the complex (MADSD2b/MADSD2c) / MADSD1a as complexes AGL65/AGL104 or AGL65/AGL66 role in Arabidopsis pollen development. This is supported by their expression pattern since although MADSD1a and MADSD2b/MADSD2c have quite different patterns of expression they share the feature of peaking in pollen.
Arabidopsis and rice MIKC* genes are almost exclusively expressed in pollen [49–51], with the exception of the AGL67 gene, which is expressed in embryos and is a candidate for regulating aspects of late embryo development. However, grapevine MIKC* genes are expressed outside male gametophytes, with the exception of MADSD1a, which was only detected in flower and pollen; this could suggest additional biological functions for these genes. Similarly, expression of P-clade genes was detected in sporophytic tissues in Prunus, in the female gametophyte in lycophytes and in hermaphrodite gametophytes in the fern Ceratopteris, suggesting that MIKC* genes could also function outside the male gametophyte in other systems. The basal eudicot Eschscholzia contains two MIKC* genes highly expressed in pollen, with the one belonging to the P clade also expressed in sporophytic tissues. In Arabidopsis, heterodimers seem to exist only in pollen, suggesting that the role of these genes is also restricted to male gametophyte development in spite of the broader expression of the P-clade gene. This could also be the case in grapevine, where MADSD2b is broadly expressed but its putative S-clade partners show a more restricted expression.
Finally, clade 3 contained a single grapevine gene, MADSD3a, which encodes a truncated protein consisting of the MADS-box domain followed by five amino acids. MADSD3a is expressed in berry and seed. Coexpression analysis (Fig. 10) did not allow the identification of clear potential heterodimers based on gene expression, although MADSD3a had an expression pattern similar to that of MADSD2a, with high expression in seeds, which could suggest a putative role in seed development, as has been found for AGL67 in Arabidopsis. These data indicate that short MADS-box genes also exist in the MIKC* subgroup, as has been observed for the MIKCC genes.
Grapevine type I MADS-box genes
We have identified 42 type I MADS-box genes in grapevine, 23 belonging to the Mα-type I and 19 to the Mγ-type I genes. Our analyses showed that no gene could be classified as Mβ in grapevine, although a reduced clade of two genes might cover their biological roles. These gene numbers were very similar to those described in Arabidopsis for Mα (25 genes) and Mγ (16 genes), although Arabidopsis displays a large group of Mβ genes (20). The number of type I genes is variable among plant species, without clear homologies within subclasses, and this is also the case in grapevine [4, 33]. Our phylogenetic analysis showed the presence of grapevine-specific clades in both the Mα and Mγ subclasses, with only a few genes having clear homologs in other species. The clustering of many of these genes on the chromosomes is also in agreement with their proposed origin through segmental duplications.
The Mα-type I genes
Three clades were identified in grapevine: MADS1A1 and MADS1A2, which are phylogenetically related, and MADS1A3. In Arabidopsis, the Mα-type I genes are separated into two groups according to their expression. The first group contains genes that are distinctly expressed, while the genes in the second group are weakly expressed. In the first group, two clades with differential expression patterns were additionally found. In grapevine, the first clade, MADS1A1, contained the genes showing the highest homology with the first clade of the first group of Arabidopsis (Fig. 7). This conservation was also observed in other plant systems. In Arabidopsis, this clade contains genes with a proposed functional role in central cell development, endosperm development and early embryo sac and seed development (DIA/AGL61, AGL62, AGL23). The expression detected for MADS1A1 genes in grapevine is very low, probably because these genes are expressed in a few specific cells that have never been evaluated in grapevine, and only during short periods of development. No Plant Ontology term was attributed to any of these genes, but expression higher than the background noise was observed for three genes (MADS1A1a, MADS1A1c and MADS1A1e). The expression of MADS1A1a and MADS1A1c in berry and in the first stages of seed development (mainly in the case of MADS1A1a), and of MADS1A1e in seeds, would be in agreement with a role in seed development similar to what has been proposed for their Arabidopsis homologs. Out of the five genes of this clade that clustered on chromosome 5, only two might be expressed. This could be a consequence of segmental duplication, which can result in non-functionalization, as has also been observed in Arabidopsis and other plant systems. This was one of the few groups of grapevine type I genes that showed clear homology with genes from other plant species (Fig. 7), suggesting that they might regulate conserved functions during reproductive development.
Regarding the second clade of grapevine genes, MADS1A2, their expression is generally very low except for two genes that are clearly expressed. Both MADS1A2a and MADS1A2 were expressed in a diversity of tissues: senescing leaves, winter buds, post-véraison berry tissues, seedling and pollen. In general, MADS1A2 genes showed only little homology to genes from a few other species (papaya, soybean, medicago, tomato and poplar). The MADS1A2 clade does not seem to have any counterpart in Arabidopsis and cannot be related to the second group of Arabidopsis Mα-type I genes, composed of genes weakly expressed in the female gametophyte. Interestingly, the Arabidopsis genes belonging to the clade without grapevine counterparts were the ones interacting with Mβ genes, a group of genes also missing in grapevine. Considering that Mβ genes seem to be absent or partially absent in several species (Eucalyptus, soybean, medicago and grapevine), it is tempting to speculate on the existence of alternative regulatory mechanisms for the development of the female gametophyte in plants, including grapevine. The third clade of Mα-type I genes in grapevine, MADS1A3, is phylogenetically related in the species analysis to the second clade of the first group of Mα-type I genes of Arabidopsis but does not show specific gene homologies. These grapevine genes also show reduced homology with other species, with the exception of MADS1A3a (and MADS1A3e to a lesser extent). In the species analysis, MADS1A3 genes are in the same clade as the four Arabidopsis paralogs AGL57, AGL58, AGL59 and AGL64, which are expressed in the embryo and the peripheral endosperm. We did not detect expression for MADS1A3a, MADS1A3b, MADS1A3c and MADS1A3d, but MADS1A3e was expressed in older tissues (senescing leaves, late ripening), and MADS1A3f and MADS1A3g only showed signs of expression slightly over background noise.
The Mγ-type I genes
Three clades were distinguished within this subclass: MADS1G1, MADS1G2 and MADS1G3. The first clade, MADS1G1, is grapevine-specific (Fig. 8). All these grapevine MADS-box genes, except MADS1G1d, were identified for the first time in this work. Thus, there are no microarray data available to analyze their expression. RNAseq experiments detected no expression for MADS1G1a, MADS1G1b, MADS1G1e and only traces for MADS1G1d and MADS1G1e. By contrast, MADS1G1f, MADS1G1g and MADS1G1h were clearly expressed in shoot tips. More detailed studies will be required to address the possible biological function of this group of genes. Genes within the other group, MADS1G1i and MADS1G1j, had already been identified in the grapevine genome. These two genes are linked in the genome and co-expressed with MADS1G2a, appearing to be flower-specific. MADS1G1j is also the only gene displaying homology with genes in other species (apple, MDP0000753870).
Although no gene could be classified as Mβ-type I in grapevine, AGL47 and AGL82, which form a subgroup of Arabidopsis Mβ-type I genes, were located close to the MADS1G1 clade in the phylogenetic tree (Fig. 8). AGL47 was expressed during early megagametogenesis and AGL82 in the central cell. Although these organs and cell types were not analyzed here, the expression of MADS1G1i and MADS1G1j in flowers could suggest that they might fulfill the role of the Mβ-type I subclass in grapevine.
In the second clade (MADS1G2) there was a group of genes, MADS1G2a to MADS1G2g, clustering together and a single gene, MADS1G2h, with homology to the Arabidopsis genes of the AGL34, AGL35, AGL36, AGL37, AGL80, AGL86, AGL90 and AGL92 expression group, all involved in endosperm development. Mγ-type I genes are preferentially expressed in the developing seed in Arabidopsis, whereas AGL80 is predominantly expressed in the central cell. Multiple MADS1G2 genes have AGL80 as their best match, with MADS1G2h having the highest homology (homology percentage of 65 %, compared to 50 % for the others). MADS1G2h also had one-to-one orthologs in all dicot species analyzed, but none in monocots. AGL80, together with AGL61 (DIANA), is involved in central cell formation in the Arabidopsis embryo sac. Interestingly, no grapevine homologs were identified for AGL37 (PHERES1), which has been functionally characterized and is involved, together with AGL62, in endosperm development. The group of grapevine MADS1G2a to MADS1G2g genes had an expression pattern restricted to seed in several cases, although other genes were expressed at the earliest reproductive stages (flower and inflorescences). In addition, the putative AGL80 homolog, MADS1G2h, was expressed in post-harvest berry and seed.
The third clade (MADS1G3) contained a single Mγ gene, MADS1G3a, which belonged to a clade with the Arabidopsis genes AGL95, AGL96 and AGL48. MADS1G3a did not share homology with any gene in the analyzed species and was slightly expressed in seeds. MADS1G3a might be the gene fulfilling the role of these Arabidopsis genes, whose biological roles were described as redundant in embryo development.
In summary, grapevine type I genes include a few conserved members that could be crucial for embryo and endosperm development, in parallel to the roles of Arabidopsis AGL80, the putative ortholog of MADS1G2h, and of AGL61 and AGL62, putative orthologs of MADS1A1 clade genes. Processes related to central cell, embryo and endosperm development could be conserved under the same regulatory networks. Regarding the development of the female gametophyte, which in Arabidopsis is controlled by the interaction between the second group of Arabidopsis Mα-type I genes and the Mβ-type I genes, grapevine genes are less conserved, since no homologs of those Mα genes were identified and Mβ genes might be absent or only represented by two members in grapevine.
Our identification of MADS-box proteins in the grapevine genome revealed that, for this specific gene family at least, automatic approaches were limited in gene prediction, since most type I genes had not been identified previously. Our genomic analysis of MADS-box genes in grapevine allowed the discovery of genes belonging to the BS, AGL17, AGL6 and TM8 subfamilies that had no homologs in other plant species. In addition, five sequences related to the VviSVP subfamily, named VviSVPS, could represent a new type of MADS-box gene not yet characterized in other plant systems.
Characterization of the MIKC* subgroup confirms the proposed existence of S- and P-clade genes, although the grapevine genome seems to have less redundancy in the P clade, with only two members. Expression of these grapevine genes was detected in the male gametophyte but also in other tissues, which supports additional roles outside pollen development.
We have extensively described grapevine type I genes for the first time. We identified two subclades of Mα-type I and three subclades of Mγ-type I genes, but no genes could be clearly classified as Mβ in grapevine. Phylogenetic analysis among species showed that only a few type I genes have clear homologs in other plant species and that grapevine-specific clades appeared in both the Mα and Mγ subclasses. By comparison with the type I genes of Arabidopsis and other species, we observed conservation of genes that could be crucial for the development of the central cell, embryo and endosperm. Further functional analysis will be required to understand the biological role of this complex gene family in grapevine.
Identification of MADS-box genes
Genes previously identified as MADS-box genes were blasted (blastp and tblastn) against the grapevine genome 12x.2 (https://urgi.versailles.inra.fr/Species/Vitis/Data-Sequences/Genome-sequences), the previously published non-redundant list of genes and the COST annotation gene set available at the ORCAE website (http://bioinformatics.psb.ugent.be/orcae/). Results from the different analyses were manually cross-checked to identify the potential loci of known genes in the 12x.2 genome as well as potential new loci. The UGENE software was used to localize each gene model on the grapevine genome and to test its structure.
The coding DNA sequences (CDS) were blasted (blastx) against the NCBI public database to compare their structures with those of known MADS-box genes in other species and with the NCBI predictions for the grapevine genes. When discrepancies were observed, gene models were corrected using the UGENE software. Loci giving rise to genes that were not functional were eliminated from the list (list and position in Additional file 1). A GFF (General Feature Format) file with the MADS-box genes was created and loaded into the IGV software, and the RNAseq data available in the laboratory (shoot tip, leaf, flower, inflorescence and seed tissues) were used to double-check the exon structure of the genes. Final models were uploaded to the Vitis vinifera ORCAE database [36, 39]. Protein domains were directly retrieved from the post-upload analysis automatically performed in the ORCAE database with InterPro, PANTHER, COILS and Phobius. COILS is a software program that compares a sequence to a database of known parallel two-stranded coiled-coils and derives a similarity score. Protein domain models were drawn with DomainDraw.
Sequence alignment and phylogenetic analysis
Previously described Arabidopsis MADS-box genes were retrieved from the TAIR database. Phylogenetic analyses were conducted using MEGA6. Multiple sequence alignments were inferred using MUSCLE. The evolutionary history was inferred using the Maximum Likelihood method based on the JTT matrix-based model. The bootstrap consensus tree inferred from 100 replicates was taken to represent the evolutionary history of the taxa analyzed. Branches corresponding to partitions reproduced in less than 30 % of bootstrap replicates were collapsed. Initial trees for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with the superior log likelihood value. The coding data were translated assuming a Standard genetic code table. All positions with less than 95 % site coverage were eliminated. Genes were named following the previously proposed nomenclature, based on distance homology with Arabidopsis genes.
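The site-coverage criterion mentioned above can be illustrated with a minimal Python sketch. It is not the MEGA6 implementation: the function name, the toy alignment and the set of symbols treated as missing data (`-`, `?`, `X`) are assumptions made for illustration only.

```python
def filter_by_site_coverage(alignment, min_coverage=0.95, missing="-?X"):
    """Keep the alignment columns in which at least `min_coverage` of the
    sequences carry an unambiguous residue.

    `alignment` maps sequence names to aligned strings of equal length.
    The symbols listed in `missing` are an assumption of this sketch.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all aligned sequences must have the same length")
    kept = [
        i for i in range(length)
        if sum(seq[i] not in missing for seq in seqs) / len(seqs) >= min_coverage
    ]
    return {name: "".join(alignment[name][i] for i in kept) for name in names}


# Hypothetical toy alignment (not real MADS-box sequences).
toy_alignment = {
    "geneA": "MGRGK-IE",
    "geneB": "MGRGKVIE",
    "geneC": "MGR-KVIE",
    "geneD": "MGRGKVIE",
}
filtered = filter_by_site_coverage(toy_alignment, min_coverage=0.95)
```

With only four sequences, a single gap already lowers the coverage of a column to 75 %, so both gapped columns are dropped at the 95 % cutoff.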
Comparison to other species
For sequence comparison, the MADS-box genes from 16 plant species (Arabidopsis thaliana, Brassica rapa, Carica papaya, Eucalyptus grandis, Citrus sinensis, Malus domestica, Prunus persica, Fragaria vesca, Glycine max, Medicago truncatula, Cucumis melo, Populus trichocarpa, Solanum lycopersicum, Zea mays, Sorghum bicolor, Oryza sativa) were retrieved from http://planttfdb.cbi.pku.edu.cn in the “M-type (type 1 and delta)” and “MIKCC” sections. Orthologous genes in the genomes of the 16 species were identified following a previously described approach. Each pair of predicted gene sets was aligned with the BLASTx algorithm, and only alignments with an e-value lower than 1e−20 and sequence homology higher than 40 % were retained; the two genes of a retained alignment were considered homologs. The percentage cutoff reduced the weight of homology restricted to the MADS-box. Two genes, A from the Vitis genome GV and B from genome GX, were considered orthologs if B was the best match for gene A in GX and A was the best match for B in GV; otherwise the genes were considered homologs. A phylogenetic tree was constructed with the “M-type” genes from these species using the same parameters as before.
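The homolog/ortholog decision rule described above can be sketched in Python as follows. This is an illustrative reading of the text, not the script used in the study: the `Hit` record, the gene identifiers and the toy values are hypothetical, and ranking the "best match" by bit score is an assumption, since the ranking criterion is not stated.

```python
from collections import namedtuple

# One BLAST alignment: query gene, subject gene, e-value, percent identity, bit score.
Hit = namedtuple("Hit", "query subject evalue identity bitscore")


def passes(hit, max_evalue=1e-20, min_identity=40.0):
    """Retention filter: e-value lower than 1e-20 and homology higher than 40 %."""
    return hit.evalue < max_evalue and hit.identity > min_identity


def best_match(hits):
    """Best retained subject for each query (ranked by bit score: an assumption)."""
    best = {}
    for hit in filter(passes, hits):
        if hit.query not in best or hit.bitscore > best[hit.query].bitscore:
            best[hit.query] = hit
    return {query: hit.subject for query, hit in best.items()}


def classify_pairs(vitis_vs_other, other_vs_vitis):
    """Label each retained (Vitis gene, other gene) pair as ortholog or homolog."""
    forward = best_match(vitis_vs_other)
    reverse = best_match(other_vs_vitis)
    labels = {}
    for hit in filter(passes, vitis_vs_other):
        reciprocal = (forward[hit.query] == hit.subject
                      and reverse.get(hit.subject) == hit.query)
        labels[(hit.query, hit.subject)] = "ortholog" if reciprocal else "homolog"
    return labels


# Hypothetical toy hits between Vitis (Vvi*) and another genome (Ath*).
vitis_vs_other = [
    Hit("VviA", "AthX", 1e-60, 65.0, 250),
    Hit("VviA", "AthY", 1e-30, 50.0, 120),
    Hit("VviB", "AthX", 1e-40, 52.0, 160),
    Hit("VviB", "AthY", 1e-10, 45.0, 60),   # fails the e-value cutoff
]
other_vs_vitis = [
    Hit("AthX", "VviA", 1e-58, 65.0, 245),
    Hit("AthX", "VviB", 1e-38, 52.0, 155),
    Hit("AthY", "VviA", 1e-28, 50.0, 118),
]
labels = classify_pairs(vitis_vs_other, other_vs_vitis)
```

In this toy case only the VviA/AthX pair is a reciprocal best match; the other retained pairs remain homologs, and the VviB/AthY alignment is discarded by the e-value cutoff.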
Expression data were retrieved from three different microarray platforms (Affymetrix GeneChip, 16 k probesets; GrapeGen, 21 k probesets; Vitis NimbleGen array, 29 k probesets) and from our in-house RNAseq projects. Data normalization was performed on all the arrays of each platform (RMA normalization). After retrieving the values for the probesets corresponding to each gene, the values for the 3 or 4 replicates of the same condition were averaged to obtain a total of 256 conditions (organ, cultivar, treatment, platform). Based on expression data, a plant ontology ID was attributed to each gene if its expression intensity in a tissue was above a defined threshold of absolute intensity for each platform (orange and red values in Fig. 9): a normalized signal intensity higher than 8 (log2 value) for microarray data, and at least 10 reads for RNAseq data. For the coexpression analysis, a preliminary cluster analysis was performed in order to minimize the weight of conditions that were assayed numerous times without adding pertinent information. Non-redundant arrays were obtained by averaging, for each gene, the probeset values of arrays separated by a distance lower than 0.05 in a hierarchical clustering of MADS-box expression (Pearson correlation, average linkage). To compare the relative expression of the genes, a second cluster analysis was performed on the non-redundant arrays with the same parameters. Genes were considered to have the same profile when the distance between them was lower than 0.15. For the cluster analysis, low-intensity values in the microarray data (green values in Fig. 9), where variability is mainly caused by noise, were smoothed to a basal log2 value of 5.
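The thresholds and the clustering step described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the pipeline used in the study: the function names and toy profiles are hypothetical, the distance is taken as 1 minus the Pearson correlation, and smoothing is modeled as raising every value below the basal log2 value of 5 up to 5.

```python
from itertools import combinations
from math import sqrt


def is_expressed(value, platform):
    """Expression call: log2 signal > 8 for microarrays, >= 10 reads for RNAseq."""
    return value > 8.0 if platform == "microarray" else value >= 10


def smooth(profile, floor=5.0):
    """Raise noisy low-intensity log2 values to the basal value (assumed floor)."""
    return [max(value, floor) for value in profile]


def pearson_distance(x, y):
    """Distance defined as 1 - Pearson correlation (assumption of this sketch)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 1.0  # flat profile: treated as uncorrelated (assumption)
    return 1.0 - cov / (sd_x * sd_y)


def average_linkage_groups(profiles, cutoff=0.15):
    """Agglomerative clustering (average linkage), stopped at `cutoff`."""
    dist = {frozenset(pair): pearson_distance(profiles[pair[0]], profiles[pair[1]])
            for pair in combinations(profiles, 2)}

    def linkage(c1, c2):
        return sum(dist[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [[gene] for gene in profiles]
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        if linkage(clusters[i], clusters[j]) >= cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical log2 microarray profiles over four conditions.
raw = {
    "geneA": [4.2, 6.0, 9.0, 12.0],
    "geneB": [3.0, 6.5, 9.5, 12.5],
    "geneC": [12.0, 9.0, 6.0, 4.0],
}
smoothed = {gene: smooth(values) for gene, values in raw.items()}
groups = average_linkage_groups(smoothed, cutoff=0.15)
```

Here geneA and geneB fall in the same group because their smoothed profiles are almost perfectly correlated, whereas the opposite trend of geneC keeps it apart.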
The authors would like to thank Rafael Torres-Perez for bioinformatics analyses, and Pablo Carbonell, Carolina Royo and Vojislava Grbic for RNAseq data. J.G. was supported by the Ramon y Cajal program (RYC-2011-07791) and the AGL2014-59171-R project from the Spanish MINECO. JMZ was supported by the BIO2014-59324-R project from the Spanish MINECO. The authors would also like to thank COST action FA1106 “Quality fruit”.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Gramzow L, Theissen G. A hitchhiker’s guide to the MADS world of plants. Genome Biol. 2010;11(6):214.
- Alvarez-Buylla ER, Pelaz S, Liljegren SJ, Gold SE, Burgeff C, Ditta GS, et al. An ancestral MADS-box gene duplication occurred before the divergence of plants and animals. Proc Natl Acad Sci U S A. 2000;97(10):5328–33.
- De Bodt S, Raes J, Van de Peer Y, Theissen G. And then there were many: MADS goes genomic. Trends Plant Sci. 2003;8(10):475–83.
- Gramzow L, Theißen G. Phylogenomics of MADS-Box genes in plants - two opposing life styles in one gene family. Biol (Basel). 2013;2(3):1150–64.
- Theissen G, Kim JT, Saedler H. Classification and phylogeny of the MADS-box multigene family suggest defined roles of MADS-box gene subfamilies in the morphological evolution of eukaryotes. J Mol Evol. 1996;43(5):484–516.
- Kaufmann K, Melzer R, Theissen G. MIKC-type MADS-domain proteins: structural modularity, protein interactions and network evolution in land plants. Gene. 2005;347(2):183–98.
- Henschel K, Kofuji R, Hasebe M, Saedler H, Munster T, Theissen G. Two ancient classes of MIKC-type MADS-box genes are present in the moss Physcomitrella patens. Mol Biol Evol. 2002;19(6):801–14.
- Verelst W, Twell D, de Folter S, Immink R, Saedler H, Munster T. MADS-complexes regulate transcriptome dynamics during pollen maturation. Genome Biol. 2007;8(11):R249.
- Bowman JL, Smyth DR, Meyerowitz EM. Genetic interactions among floral homeotic genes of Arabidopsis. Development. 1991;112(1):1–20.
- Colombo L, Franken J, Koetje E, van Went J, Dons HJ, Angenent GC, et al. The petunia MADS box gene FBP11 determines ovule identity. Plant Cell. 1995;7(11):1859–68.
- Theissen G. Development of floral organ identity: stories from the MADS house. Curr Opin Plant Biol. 2001;4(1):75–85.
- Immink RGH, Ferrario S, Busscher-Lange J, Kooiker M, Busscher M, Angenent GC. Analysis of the petunia MADS-box transcription factor family. Mol Genet Genomics. 2003;268(5):598–606.
- Rijpkema AS, Royaert S, Zethof J, van der Weerden G, Gerats T, Vandenbussche M. Analysis of the Petunia TM6 MADS box gene reveals functional divergence within the DEF/AP3 lineage. Plant Cell. 2006;18(8):1819–32.
- Smaczniak C, Immink RG, Muino JM, Blanvillain R, Busscher M, Busscher-Lange J, et al. Characterization of MADS-domain transcription factor complexes in Arabidopsis flower development. Proc Natl Acad Sci U S A. 2012;109(5):1560–5.
- Li D, Liu C, Shen L, Wu Y, Chen H, Robertson M, et al. A repressor complex governs the integration of flowering signals in Arabidopsis. Dev Cell. 2008;15(1):110–20.
- Liu C, Chen H, Er HL, Soo HM, Kumar PP, Han JH, et al. Direct interaction of AGL24 and SOC1 integrates flowering signals in Arabidopsis. Development. 2008;135(8):1481–91.
- Lee J, Lee I. Regulation and function of SOC1, a flowering pathway integrator. J Exp Bot. 2010;61(9):2247–54.
- Amasino R. Seasonal and developmental timing of flowering. Plant J. 2010;61(6):1001–13.
- Alvarez-Buylla ER, Liljegren SJ, Pelaz S, Gold SE, Burgeff C, Ditta GS, et al. MADS-box gene evolution beyond flowers: expression in pollen, endosperm, guard cells, roots and trichomes. Plant J. 2000;24(4):457–66.
- Lehti-Shiu MD, Adamczyk BJ, Fernandez DE. Expression of MADS-box genes during the embryonic phase in Arabidopsis. Plant Mol Biol. 2005;58(1):89–107.
- Adamczyk BJ, Lehti-Shiu MD, Fernandez DE. The MADS domain factors AGL15 and AGL18 act redundantly as repressors of the floral transition in Arabidopsis. Plant J. 2007;50(6):1007–19.
- Rounsley SD, Ditta GS, Yanofsky MF. Diverse roles for MADS box genes in Arabidopsis development. Plant Cell. 1995;7(8):1259–69.
- Burgeff C, Liljegren SJ, Tapia-Lopez R, Yanofsky MF, Alvarez-Buylla ER. MADS-box gene expression in lateral primordia, meristems and differentiated tissues of Arabidopsis thaliana roots. Planta. 2002;214(3):365–72.
- Tapia-Lopez R, Garcia-Ponce B, Dubrovsky JG, Garay-Arroyo A, Perez-Ruiz RV, Kim SH, et al. An AGAMOUS-related MADS-box gene, XAL1 (AGL12), regulates root meristem cell proliferation and flowering transition in Arabidopsis. Plant Physiol. 2008;146(3):1182–92.
- Han P, Garcia-Ponce B, Fonseca-Salazar G, Alvarez-Buylla ER, Yu H. AGAMOUS-LIKE 17, a novel flowering promoter, acts in a FT-independent photoperiod pathway. Plant J. 2008;55(2):253–65.
- Parenicova L, de Folter S, Kieffer M, Horner DS, Favalli C, Busscher J, et al. Molecular and phylogenetic analyses of the complete MADS-box transcription factor family in Arabidopsis: new openings to the MADS world. Plant Cell. 2003;15(7):1538–51.
- Kwantes M, Liebsch D, Verelst W. How MIKC* MADS-box genes originated and evidence for their conserved function throughout the evolution of vascular plant gametophytes. Mol Biol Evol. 2012;29(1):293–302.
- Nam J, Kim J, Lee S, An G, Ma H, Nei M. Type I MADS-box genes have experienced faster birth-and-death evolution than type II MADS-box genes in angiosperms. Proc Natl Acad Sci U S A. 2004;101(7):1910–5.
- de Folter S, Busscher J, Colombo L, Losa A, Angenent GC. Transcript profiling of transcription factor genes during silique development in Arabidopsis. Plant Mol Biol. 2004;56(3):351–66.
- Adamczyk BJ, Fernandez DE. MIKC* MADS domain heterodimers are required for pollen maturation and tube growth in Arabidopsis. Plant Physiol. 2009;149(4):1713–23.
- Wells CE, Vendramin E, Jimenez Tarodo S, Verde I, Bielenberg DG. A genome-wide analysis of MADS-box genes in peach [Prunus persica (L.) Batsch]. BMC Plant Biol. 2015;15:41.
- Bemer M, Heijmans K, Airoldi C, Davies B, Angenent GC. An atlas of type I MADS box gene expression during female gametophyte and seed development in Arabidopsis. Plant Physiol. 2010;154(1):287–300.
- Masiero S, Colombo L, Grini PE, Schnittger A, Kater MM. The emerging importance of type I MADS box transcription factors for plant reproduction. Plant Cell. 2011;23(3):865–72.
- de Folter S, Immink RG, Kieffer M, Parenicova L, Henz SR, Weigel D, et al. Comprehensive interaction map of the Arabidopsis MADS Box transcription factors. Plant Cell. 2005;17(5):1424–33.
- Jaillon O, Aury JM, Noel B, Policriti A, Clepet C, Casagrande A, et al. The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature. 2007;449(7161):463–7.
- Sterck L, Billiau K, Abeel T, Rouze P, Van de Peer Y. ORCAE: online resource for community annotation of eukaryotes. Nat Methods. 2012;9(11):1041.
- Diaz-Riquelme J, Lijavetzky D, Martinez-Zapater JM, Carmona MJ. Genome-wide analysis of MIKCC-type MADS box genes in grapevine. Plant Physiol. 2009;149(1):354–69.
- Wang L, Yin X, Cheng C, Wang H, Guo R, Xu X, et al. Evolutionary and expression analysis of a MADS-box gene superfamily involved in ovule development of seeded and seedless grapevines. Mol Genet Genomics. 2015;290(3):825–46.
- Grimplet J, Adam-Blondon A-F, Bert P-F, Bitz O, Cantu D, Davies C, et al. The grapevine gene nomenclature system. BMC Genomics. 2014;15(1):1077.
- Grimplet J, Van Hemert J, Carbonell-Bejerano P, Diaz-Riquelme J, Dickerson J, Fennell A, et al. Comparative analysis of grapevine whole-genome gene predictions, functional annotation, categorization and integration of the predicted gene sequences. BMC Res Notes. 2012;5:213.
- Royo C, Carbonell-Bejerano P, Torres-Perez R, Nebish A, Martinez O, Rey M, et al. Developmental, transcriptome, and genetic alterations associated with parthenocarpy in the grapevine seedless somatic variant Corinto bianco. J Exp Bot. 2015;67(1):259–73.
- Velasco R, Zharkikh A, Troggio M, Cartwright DA, Cestaro A, Pruss D, et al. A high quality draft consensus sequence of the genome of a heterozygous grapevine variety. PLoS One. 2007;2(12):e1326.
- Vilella AJ, Severin J, Ureta-Vidal A, Heng L, Durbin R, Birney E. EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates. Genome Res. 2009;19(2):327–35.
- Fasoli M, Dal Santo S, Zenoni S, Tornielli GB, Farina L, Zamboni A, et al. The grapevine expression atlas reveals a deep transcriptome shift driving the entire plant into a maturation program. Plant Cell. 2012;24(9):3489–505.
- Leseberg CH, Li A, Kang H, Duvall M, Mao L. Genome-wide analysis of the MADS-box gene family in Populus trichocarpa. Gene. 2006;378:84–94.
- Mao L, Begum D, Chuang HW, Budiman MA, Szymkowiak EJ, Irish EE, et al. JOINTLESS is a MADS-box gene controlling tomato flower abscission zone development. Nature. 2000;406(6798):910–3.
- Horvath DP, Chao WS, Suttle JC, Thimmapuram J, Anderson JV. Transcriptome analysis identifies novel responses and potential regulatory genes involved in seasonal dormancy transitions of leafy spurge (Euphorbia esula L.). BMC Genomics. 2008;9:536.
- Diaz-Riquelme J, Martinez-Zapater JM, Carmona MJ. Transcriptional analysis of tendril and inflorescence development in grapevine (Vitis vinifera L.). PLoS One. 2014;9(3):e92339.
- Kofuji R, Sumikawa N, Yamasaki M, Kondo K, Ueda K, Ito M, et al. Evolution and divergence of the MADS-box gene family based on genome-wide expression analyses. Mol Biol Evol. 2003;20(12):1963–77.
- Honys D, Twell D. Transcriptome analysis of haploid male gametophyte development in Arabidopsis. Genome Biol. 2004;5(11):R85.
- Liu C, Teo ZW, Bi Y, Song S, Xi W, Yang X, et al. A conserved genetic pathway determines inflorescence architecture in Arabidopsis and rice. Dev Cell. 2013;24(6):612–22.
- Bemer M, Wolters-Arts M, Grossniklaus U, Angenent GC. The MADS domain protein DIANA acts together with AGAMOUS-LIKE80 to specify the central cell in Arabidopsis ovules. Plant Cell. 2008;20(8):2088–101.
- Kang IH, Steffen JG, Portereiko MF, Lloyd A, Drews GN. The AGL62 MADS domain protein regulates cellularization during endosperm development in Arabidopsis. Plant Cell. 2008;20(3):635–47.
- Colombo M, Masiero S, Vanzulli S, Lardelli P, Kater MM, Colombo L. AGL23, a type I MADS-box gene that controls female gametophyte and embryo development in Arabidopsis. Plant J. 2008;54(6):1037–48.
- Portereiko MF, Lloyd A, Steffen JG, Punwani JA, Otsuga D, Drews GN. AGL80 is required for central cell and endosperm development in Arabidopsis. Plant Cell. 2006;18(8):1862–72.
- Steffen JG, Kang IH, Portereiko MF, Lloyd A, Drews GN. AGL61 interacts with AGL80 and is required for central cell development in Arabidopsis. Plant Physiol. 2008;148(1):259–68.
- Okonechnikov K, Golosova O, Fursov M, team U. Unipro UGENE: a unified bioinformatics toolkit. Bioinformatics. 2012;28(8):1166–7.
- Kall L, Krogh A, Sonnhammer EL. A combined transmembrane topology and signal peptide prediction method. J Mol Biol. 2004;338(5):1027–36.
- Lupas A, Van Dyke M, Stock J. Predicting coiled coils from protein sequences. Science. 1991;252(5009):1162–4.
- Fink JL, Hamilton N. DomainDraw: a macromolecular feature drawing program. In Silico Biol. 2007;7(2):145–50.
- Lamesch P, Berardini TZ, Li D, Swarbreck D, Wilks C, Sasidharan R, et al. The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res. 2012;40(Database issue):D1202–10.
- Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: Molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 2013;30(12):2725–9.
- Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–7.
- Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Bioinformatics. 1992;8(3):275–82.
- Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985;39(4):783–91.