Identification, systematic evolution and expression analyses of the AAAP gene family in Capsicum annuum

Background The amino acid/auxin permease (AAAP) family represents a class of proteins that transport amino acids across cell membranes. Members of this family are widely distributed in different organisms and participate in processes such as growth, development and the stress response in plants. However, a systematic, comprehensive analysis of AAAP genes in the pepper (Capsicum annuum) genome has not been reported. Results In this study, we performed systematic bioinformatics analyses to identify AAAP family genes in the C. annuum ‘Zunla-1’ genome and to determine their number, distribution, structure, duplications and expression patterns in different tissues and under stress. A total of 53 CaAAAP genes were identified in the ‘Zunla-1’ pepper genome and could be divided into eight subgroups. Significant differences in gene structure and protein conserved domains were observed among the subgroups. Except for CaGAT1, CaATL4, and CaVAAT1, the CaAAAP genes were unevenly distributed on 11 of 12 chromosomes. In total, 33.96% (18/53) of the CaAAAP genes were a result of duplication events, including three pairs of genes arising from segmental duplication and 12 genes arising from tandem duplication. Analyses of evolutionary patterns showed that segmental duplication of AAAPs in pepper occurred before tandem duplication. Expression profiling of the CaAAAP genes using transcriptomic data showed distinct expression patterns in various tissues and in response to different stress treatments, which further suggests that the functions of CaAAAP genes have diverged. Conclusions This study of CaAAAP genes provides a theoretical basis for exploring the roles of AAAP family members in C. annuum. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07765-1.

The first amino acid transporter protein (AtAAP1/NAT2) isolated from plants belongs to the AAP family. There are eight members in Arabidopsis, and the AtAAPs transport neutral, acidic and cationic amino acids with different specificities and affinities [13,14]. AtAAP1 is highly expressed in Arabidopsis cotyledons and the endosperm, and mediates uptake of amino acids by the developing embryo or root cells [15-17]. AtAAP2 is localized to the plasma membrane and the phloem, and the aap2 mutant exhibits altered xylem-phloem transfer of amino acids, which affects metabolism and results in increased seed yield and oil content in Arabidopsis [18]. AtAAP3 is exclusively expressed in roots, AtAAP4 is primarily expressed in source leaves, stems, and flowers, and AtAAP5 has been observed in all tissues [19]. In the aap6 mutant, the amino acid content of the Arabidopsis sieve elements was reduced, but aphid herbivores feeding on the leaves were not affected [20]. AtAAP8 participates in early seed development in Arabidopsis [21]. OsAAP3 and OsAAP5 regulate tiller number and grain yield in rice [22,23], and overexpression of OsAAP6 increases grain protein content and improves rice nutritional quality [24]. In addition, there are reports of AAP subfamily members in other species, including StAAP1 [25], PvAAP1 [26], PtAAP11 [27], VfAAP1 and VfAAP3 [28].
AtLHT1 localizes on the surface of roots in young seedlings and in pollen and mediates uptake of amino acids from the root to the mesophyll cells through the xylem [29,30]. Under conditions of nitrogen deficiency in particular, overexpression of AtLHT1 can increase the efficiency of nitrogen utilization [30]. AtLHT2 localizes to the tapetum of Arabidopsis anthers [31]. AtLHT6 is expressed in buds, flowers, and roots; AtLHT4 expression is increased in developed buds compared to mature flowers; and expression of AtLHT5 peaks in flowers [32,33]. OsLHT6 is specifically expressed in new shoot meristems [7], and PgLHT plays an important role in the growth and development of the ginseng root system [34]. The GAT subfamily mainly transports γ-aminobutyric acid (GABA) and GABA-related compounds; the highest expression of AtGAT1 is observed in flowers and under conditions of elevated GABA [35]. AtANT1 is expressed in all organs, with the highest abundance in flowers and cauline leaves, and mediates transport of aromatic and neutral amino acids, arginine, indole-3-acetic acid, and 2,4-dichlorophenoxyacetic acid [36]. AtAUX1 is a high-affinity transporter of indoleacetic acid (IAA), and AtAUX1 and AtLAX3 (a homolog of AtAUX1) are mainly expressed in roots and promote lateral root formation [37,38]. The expression of OsAUX subfamily members is also tissue-specific: OsAUX4 is preferentially expressed in new shoot meristems, and OsAUX2 and OsAUX5 are specifically expressed in young roots, which suggests a role in the formation and development of root systems [7]. MtLAX2, a functional homolog of AtAUX1, is required for nodule organogenesis [39]. The ProT subfamily is responsible for transporting proline, glycinebetaine (GB) and GABA. AtProT1 is expressed in the phloem or phloem parenchyma cells, which indicates a role in the long-distance transport of proline [40].
By contrast, AtProT2 is only expressed in root epidermis and cortical cells, and AtProT3 is more highly expressed in leaf epidermal cells [40]. HvProT2 is constitutively expressed in both leaves and roots, and heterologous expression experiments have shown that the affinity of HvProT2 is highest for glycinebetaine [41]. AtAVT3 and AtAVT4 encode amino acid efflux proteins located in the vacuolar membrane, where they mediate transport of alanine and proline [42].
Pepper is an annual or perennial plant that belongs to the Solanaceae family; it is an important vegetable crop in China, which ranks first in the world in terms of planting area and output (http://www.fao.org/faostat/en/). The pepper Zunla-1 (C. annuum L.) genome contains 34,476 protein-coding loci on 12 different chromosomes. Although the roles of many AAAPs in plants have been well characterized, members of the AAAP gene family in pepper have not been studied. We used bioinformatics to identify the AAAP gene family members in pepper and systematically analyzed the chromosome distribution, gene structure, evolutionary characteristics, and expression patterns of AAAP genes to provide a theoretical basis for exploring the roles of AAAPs in pepper.

Identification of AAAP genes in pepper
To explore the AAAP protein family in pepper, we searched the Pepper Genome Database2 (http://peppersequence.genomics.cn/page/) for the AAAP domain (PF01490); the HMM profile was used as a query and each putative AAAP protein sequence was verified by SMART, CDD and Pfam analyses. A total of 53 AAAP genes were identified and renamed in pepper according to their affinities within gene subfamilies; CaGAT1, CaATL4 and CaVAAT1 were not anchored to chromosomes (Table 1). Gene lengths ranged from 669 bp (CaLHT4) to 2532 bp (CaAAP4), and molecular weights ranged from 24.43 kDa (CaLHT4) to 93.22 kDa (CaAAP4). The isoelectric points (pIs) of CaAAAP proteins ranged from 4.27 (CaVAAT5) to 10.06 (CaANT5); the majority of proteins (83%) had pIs greater than 7.0, which indicates that AAAP proteins in pepper may represent a class of basic proteins.
We studied the exon/intron arrangement of the coding sequences of CaAAAP genes in their genome sequences and found that 13.21% (7/53) of pepper AAAP genes contained a single exon, 3.77% (2/53) had a single intron, and 83.02% (44/53) had 2 to 14 introns (Fig. 1). Prediction of TM regions showed that most CaAAAPs (77.36%) had 8-11 TM regions. Similar numbers of TM regions were found within several subfamilies (e.g., 10 TMs in the AUX subfamily and 11 TMs in the ANT and ATLa subfamilies; Table 1 and Additional file 1: Figure S1). Thus, members of the same subfamily have a conserved structure. Conserved domains of pepper AAAP proteins were analyzed with the MEME server and a total of 20 conserved motifs were identified (Fig. 1, Additional file 3: Table S1). Motifs 1 (44/53), 2 (42/53), and 7 (49/53) were widespread among members of the CaAAAP family. Some subfamilies included several specific motifs. For example, the LHT and GAT subfamilies contained motifs 3, 12, 13, and 14, whereas motif 5 was only found in the LHT, AAP, GAT, and ProT subfamilies. Motifs 9, 10, and 17 were only present in the AUX subfamily; motifs 15 and 18 were only present in the ANT subfamily; motifs 16 and 19 were only present in the ATLa subfamily. Similar numbers of motifs were found among members of the ProT and AUX subfamilies (Fig. 1), which suggests that the structures of these subfamilies are highly conserved.

Phylogenetic and structural analyses of AAAP proteins in pepper
To further understand the homology between the AAAP gene families of pepper and other plant species (Table 2), we constructed an unrooted phylogenetic tree of full-length AAAPs from pepper, potato, rice and Arabidopsis (Fig. 2). We found that the CaAAAP, StAAAP, OsAAAP and AtAAAP genes were divided into eight distinct subfamilies, which indicates that the AAAP gene family has eight subfamilies in angiosperms. In pepper, the LHT subfamily was the largest (26.42%; 14 genes), whereas the GAT subfamily comprised only two genes. The numbers of genes in the GAT, ProT, AUX and ANT subgroups were the same as or similar to those in potato, rice, and Arabidopsis.

Chromosomal location and duplication analyses
We used Mapchart 2.30 to map the chromosomal locations of AAAP genes in the pepper genome (Fig. 3). Except for CaGAT1, CaATL4 and CaVAAT1, the remaining 50 genes were unevenly distributed on 11 of 12 chromosomes; no genes were mapped to chromosome 1 (Fig. 3, Table 1). Most of the genes were mapped to the bottom of chromosomes 2, 5, 7 and 8, whereas the genes on chromosome 11 were mostly mapped to the top. A total of 58.5% (31/53) of genes were mapped to chromosomes 2, 3, 4 and 5, which contained 8, 6, 11 and 6 genes, respectively. Only one gene was located on chromosome 9, and two to four genes were mapped to each of the remaining chromosomes (Fig. 3).
To identify the duplication events of AAAP genes in pepper, we analyzed the 53 full-length AAAP protein sequences using MCScanX. According to the defined criterion (separation by five or fewer genes and more than 50% similarity at the protein level), 33.96% (18 of 53) of the genes originated from duplication events (Fig. 3). Twelve genes (22.64%) were arranged in tandem duplications and organized into four groups. Two groups of tandem duplicate genes were identified on chromosome 2; chromosomes 5 and 7 each contained one group (Fig. 3). Three segmental duplication blocks were located on chromosomes 2, 4 and 12, representing 11.32% of all CaAAAP genes (6/53) (Fig. 3, Additional file 2: Figure S2). Furthermore, high sequence similarity occurred in duplicated genes: CaAAP1 and CaAAP3, which originated via tandem duplication, were [...] (Table 3). In general, Ka/Ks ratios less than 1 indicate purifying selection, and Ka/Ks ratios greater than 1 indicate positive selection [43]. The Ka/Ks ratios of all seven paralog pairs were < 1.0, which indicates that CaAAAP genes evolved under purifying selection (Table 3). We also estimated the dates of duplication events of paralog pairs using the formula T = Ks/2λ (assuming a clock-like rate (λ) of 6.96 × 10⁻⁹ synonymous substitutions per site per year [44]); duplication events were estimated to have occurred 8.53 to 68.69 million years ago (Mya), with an average duplication time of 43.61 Mya. We estimate that the duplication of two AAAP paralog pairs in pepper occurred 58.87 to 54 Mya and that of the other five paralogous gene pairs occurred 40.96 to 8.53 Mya (Table 3).
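The Ka/Ks interpretation and the dating formula T = Ks/2λ described above can be illustrated with a minimal Python sketch. The function names are hypothetical and the Ka and Ks inputs are made-up values, not entries from Table 3; only the rate λ = 6.96 × 10⁻⁹ comes from the text.

```python
# Illustrative sketch only: function names and input values are hypothetical.
LAMBDA = 6.96e-9  # synonymous substitutions per site per year (rate quoted in the text)


def selection_mode(ka: float, ks: float) -> str:
    """Classify a paralog pair by its Ka/Ks ratio."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    ratio = ka / ks
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


def duplication_time_mya(ks: float, rate: float = LAMBDA) -> float:
    """Apply T = Ks / (2 * lambda) and convert years to million years (Mya)."""
    return ks / (2 * rate) / 1e6


# Made-up example pair: Ka = 0.03, Ks = 0.60
print(selection_mode(0.03, 0.60))
print(round(duplication_time_mya(0.60), 2))
```

Under this rate, a Ks of roughly 0.119 corresponds to a duplication time of about 8.53 Mya, which follows from the formula alone.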

Expression patterns of CaAAAP genes in various tissues
We investigated the expression profiles of all CaAAAP genes in roots, stems, leaves, floral buds, flowers and different developmental stages of fruits (Fig. 4, Additional file 4: Table S2). Forty-eight (90.6%) of the CaAAAP genes were detected in at least one tissue (RPKM ≥1), and 19 (35.8%) genes were detected in all tissues tested (RPKM ≥1). In particular, approximately half of the CaAAAP genes showed low expression in fruits. By contrast, approximately 50% of the CaAAAP genes showed high expression in flowers and buds (RPKM ≥10). The CaAAAP genes clustered into three distinct clades based on expression patterns (Fig. 4). Seven genes (CaAAP2, CaAAP3, CaAAP5, CaAAP9, CaATL6, CaATL7, and CaVAAT8) in group I were expressed at relatively high levels in all tissues. Apart from several genes that exhibited relatively high expression in specific organs (such as CaLHT3, CaLHT5, CaLHT8, CaVAAT1 and CaVAAT6 in buds; CaATL4 in fruits; CaLHT9 and CaGAT2 in roots; CaLHT12 in roots, stems and leaves), the genes in group II were expressed at relatively low levels in all tested tissues. Group III comprised 20 genes that were expressed at relatively high levels in most organs.
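The detection counts above (genes with RPKM ≥1 in at least one tissue versus in all tissues) amount to a simple threshold test per gene. The sketch below shows one way such counts could be derived; the function name, gene labels and RPKM values are hypothetical toy data, not values from Table S2.

```python
def summarize_detection(rpkm, threshold=1.0):
    """Return (genes detected in >= 1 tissue, genes detected in all tissues).

    `rpkm` maps gene name -> {tissue name -> RPKM value}.
    """
    detected_any = sorted(
        gene for gene, tissues in rpkm.items()
        if any(value >= threshold for value in tissues.values())
    )
    detected_all = sorted(
        gene for gene, tissues in rpkm.items()
        if tissues and all(value >= threshold for value in tissues.values())
    )
    return detected_any, detected_all


# Toy data (hypothetical): three genes, three tissues
toy = {
    "geneA": {"root": 12.0, "leaf": 3.5, "flower": 20.1},
    "geneB": {"root": 0.2, "leaf": 0.0, "flower": 4.0},
    "geneC": {"root": 0.1, "leaf": 0.3, "flower": 0.0},
}
any_tissue, all_tissues = summarize_detection(toy)
print(len(any_tissue), len(all_tissues))
```

Raising `threshold` to 10 would give the analogous count for the "high expression" cutoff used in the text.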

Differential expression profiling of CaAAAP genes in response to hormones and abiotic stress
To study whether CaAAAPs are involved in responses to hormones and abiotic stresses in pepper, we investigated the expression levels of the CaAAAPs in the roots and leaves of 40-day-old seedlings in response to cold, heat, salt, osmotic, oxidative, ABA, IAA, GA3, JA and SA treatment (Fig. 5, Additional file 5: Table S3). With the exception of CaLHT2, CaLHT5, CaLHT7, CaLHT8, CaLHT13, and CaAAP10, the AAAP genes were induced by at least one of the treatments as compared with the control (Fig. 5). Interestingly, the responses of some AAAP genes to abiotic or hormone stress differed greatly between leaves and roots. For instance, CaAAP4, CaLHT9, CaLHT10, CaATL3, CaATL6, CaATL7, CaAUX3, and CaVAAT7 were upregulated under cold, heat, osmotic, oxidative and salt stress in the roots, but downregulated in the leaves. In addition, 28, 10, 20, and 18 CaAAAP genes were upregulated by ABA, GA3, IAA, and JA treatment, respectively, in the roots but downregulated in the leaves. Conversely, 4, 5, and 7 CaAAAP genes were upregulated in the leaves but downregulated in the roots under cold, IAA and salt treatment, respectively. The highest number of CaAAAP genes (33) was upregulated in both leaves and roots in response to SA. Several stress-responsive cis-elements were present in the promoter regions of these members, such as ABRE, ARE, LTR, MBS, TGACG-motif, CGTCA-motif, TCA-element, GARE-motif, AuxRR-core, and TC-rich repeats (Additional file 6: Table S4). Among the 53 AAAP genes, the CaAAP7 promoter had none of these stress-responsive elements, whereas the CaVAAT2 promoter had the most (14 elements). These results revealed that a number of CaAAAP genes might be involved in regulating abiotic and hormone stress responses.

Discussion
The AAAP gene family, which contains eight subfamilies, encodes integral TM proteins that play a pivotal role in various aspects of normal plant growth and development. This gene family has been identified in many plants, including Arabidopsis [6], rice [7], maize [8], poplar [9], potato [10], moso bamboo [11] and Medicago truncatula [12]. Although the role of AAAP genes in plants has been previously suggested, a systematic study of the AAAP gene family in pepper has not been performed. We identified 53 CaAAAP genes in C. annuum 'Zunla-1' in this work. The number of CaAAAPs identified was similar to those in potato [10] and moso bamboo [11]. In addition, AAAP proteins account for 0.13 to 0.18% of the total proteins in many plant species studied (Table 2), and the percentage of CaAAAPs identified in the present study was 0.15%. Thus, the number of AAAP genes in most plants appears to be similar, regardless of genome size. Consistent with other plants, the pepper AAAP gene family can be divided into eight subfamilies (Fig. 2 and Table 2). Although the clade patterns are consistent with previous results from other plants, the number of AAAP genes within several subfamilies differs significantly (Table 2), which indicates that the expansion of each subfamily occurred after the split of dicots and monocots. Except for CaGAT1, CaVAAT1 and CaATL4, the remaining 50 genes were unevenly distributed on 11 of 12 chromosomes, and most of the genes were mapped to chromosomes 2, 3, 4 and 5 (Fig. 3). Meanwhile, four groups of tandem duplicate genes were identified on chromosomes 2, 5 and 7, and segmental duplication blocks were located on chromosomes 2, 4 and 12 (Fig. 3). In addition, gene structure analysis indicated that members of the same subgroup had the same or similar numbers and types of exons/introns, TM regions, and motif compositions (Fig. 1, Table 1), which suggests that those groups have been relatively conserved during evolution.
Gene duplication is generally considered a major source of gene family expansion and functional diversity during evolution [45]. Previous studies showed that 50% (29/58) of AAAP genes are duplicated genes in rice [7], and that duplicated genes represent 32.69% (17/52) of AAAP genes in potato [10] and 30.43% (14/46) in Arabidopsis [6]. In the present study, 33.96% of AAAP genes (18/53) in pepper were duplicated genes: 12 genes (22.64%) were involved in tandem duplication and 6 genes (11.32%) in segmental duplication. These results suggest that tandem gene duplication is the main cause of expansion of the CaAAAP gene family; similar results have been reported in potato and Arabidopsis [6,10]. The two paralog pairs involved in segmental duplication (CaANT1 and CaANT4, and CaAAP5 and CaAAP6) arose 54 to 58.87 Mya, and the five pairs involved in tandem duplication (CaANT1 and CaANT2, CaANT2 and CaANT3, CaANT1 and CaANT3, CaLHT1 and CaLHT3, and CaAAP1 and CaAAP3) arose 41.43 to 8.53 Mya (Table 3). This indicates that the segmental duplication of AAAPs in pepper occurred before tandem duplication. The pepper/potato separation occurred approximately 36 Mya [46]; the duplication of most AAAP paralog pairs occurred before the separation of pepper and potato, and only two paralogous pairs were duplicated after the pepper/potato split. The Ka/Ks ratios of the seven paralog pairs were < 1 (Table 3), which indicates that these paralog pairs evolved under purifying selection. Similar results have been reported in moso bamboo [11] and poplar [9], which have no paralog pairs in the AAAP family that underwent positive selection.
Gene duplication often causes changes in gene expression patterns, although the original functions of these genes may be retained [45]. Comparative analysis of the expression patterns of duplicated CaAAAP genes revealed that CaANT2 and CaANT3 (tandemly duplicated genes) exhibited similar expression patterns across developmental stages and stresses, which indicates that they may have overlapping functions (Figs. 4 and 5). However, most duplicated CaAAAP genes exhibited distinct expression patterns, such as CaAAP5 and CaAAP6 (segmentally duplicated genes), as well as CaAAP1 and CaAAP3, and CaLHT1 and CaLHT3 (tandemly duplicated genes) (Figs. 4 and 5). These results indicate that the expression and function of duplicated genes have diverged under selection pressure, which may contribute to adaptation to diverse environments.
Gene expression patterns are usually closely linked to plant growth and development, and comparative expression analyses of gene families can provide useful information for establishing their putative functions [47]. In this study, the expression profiles of CaAAAP genes differed across organs and stages, consistent with the results of studies in other species such as potato [10]. Approximately 50% of the CaAAAP genes were expressed at relatively high levels in flowers and buds. Twenty-four and 19 CaAAAP genes showed relatively high expression levels in the roots and leaves, respectively. Similarly, 19 StAATs in potato were expressed at relatively high levels in the leaves [10]. Our data showed that CaAAP5, an ortholog of StAAP1 and AtAAP6, was highly expressed in flowers, roots, leaves, and stems (Fig. 4). AtAAP6 is responsible for the long-distance transport of amino acids [20]. StAAP1, which is highly expressed in leaves, stems, stolons and young tubers, is also responsible for the long-distance transport of amino acids [25]. Therefore, CaAAP5 might be involved in the long-distance transport of amino acids in pepper. In Arabidopsis, AtAUX1 and AtLAX3 are highly expressed in roots [37,48]. AUX subfamily genes are also mainly expressed in the roots of rice and potato [7,10]. In this study, AUX subfamily genes exhibited relatively high expression in roots, which indicates that CaAUXs might be involved in root growth and development. CaATL4 was expressed at a high level only in F-Dev-8 and F-Dev-9, suggesting that CaATL4 could play an important role in late fruit development. Taken together, these results indicate that CaAAAPs may play an important role in the growth and development of pepper.
It has been reported that AAAPs are regulated by low temperature, high salt, and/or drought stress treatments in many plants [40,49]. Under the treatments tested, 47 genes were regulated by at least one treatment as compared with the control, and the expression of 48 genes was detected in the tissue analysis (Figs. 4 and 5). It has been reported that HvProT and AtProT2 are strongly induced by salt stress [49,50]. Similarly, we found that CaProT1, which is closely related to AtProT2, was specifically upregulated by cold, heat, salt, osmotic, oxidative, IAA, GA3, JA and SA treatment in leaves. By contrast, AtAAP6 was found to be downregulated by salt stress [50]. CaAAP5, which is orthologous to AtAAP6, was downregulated under salt stress in leaves, but upregulated in roots. In moso bamboo, the AAP subfamily gene PeAAAP9 has a low expression level in leaves, but it is strongly induced by drought, cold and salt stress treatment [11]. Similarly, CaAAP6 was highly expressed in the roots under all ten treatments, even though its expression in roots was low in the tissue expression data, suggesting that CaAAP6 may take part in abiotic stress signaling pathways. With respect to the ten treatments, the expression of most CaAAAP genes was induced in the leaves or roots, suggesting that CaAAAPs may play different roles in stress responses in pepper.

Conclusions
Overall, 53 AAAP gene family members were identified in the 'Zunla-1' pepper genome and could be divided into eight subgroups. Throughout their evolutionary history, CaAAAPs were highly conserved and expanded slowly. CaAAAP genes exhibit tissue-specific expression and may coordinate to regulate growth and development in pepper.

Data retrieval and identification of gene families
All pepper protein sequences were obtained from the Pepper Genome Database2 (http://peppersequence.genomics.cn/page/). The HMM profile for the AAAP domain (PF01490), downloaded from the Pfam database (http://pfam.xfam.org) [51], was used to identify potential AAAP genes from the pepper genome with HMMER 3.2.1 (http://hmmer.janelia.org/), with an E-value of 10⁻² [47]. BLAST analyses were also performed using the rice and Arabidopsis AAAPs as queries against the pepper genome with an E-value threshold of 10⁻¹⁰. The sequences of the rice and Arabidopsis AAAP family were obtained from JGI (https://phytozome.jgi.doe.gov/pz/portal.html). After merging all of the putative pepper AAAP sequences, the candidate protein sequences were further verified for the presence of conserved domains with the online tools Conserved Domain Database (http://www.ncbi.nlm.nih.gov/cdd/), SMART (http://smart.embl-heidelberg.de/), and Pfam (http://pfam.xfam.org/). The results were integrated and redundant genes were discarded. Molecular weights and pIs of the proteins encoded by the identified genes were predicted with the online ExPASy server (http://web.expasy.org/protparam/).
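As a rough illustration of the E-value filtering step, the sketch below parses a HMMER tabular hit file and keeps targets at or below a cutoff. It assumes the table was written by `hmmsearch` with its `--tblout` option, in which the first column is the target name and the fifth is the full-sequence E-value; the function name and the example rows are hypothetical and do not come from the study.

```python
def filter_tblout(text, max_evalue=1e-2):
    """Return (target, E-value) pairs from hmmsearch --tblout text that pass the cutoff."""
    hits = []
    for line in text.splitlines():
        # Skip blank lines and the '#' comment/header lines HMMER writes
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target = fields[0]          # column 1: target sequence name
        evalue = float(fields[4])   # column 5: full-sequence E-value
        if evalue <= max_evalue:
            hits.append((target, evalue))
    return hits


# Made-up rows (truncated to the leading columns) for illustration only
example = """\
# target name  accession  query name  accession  E-value  score  bias
geneA  -  Aa_trans  PF01490  1.2e-80  270.1  20.3
geneB  -  Aa_trans  PF01490  0.5  5.0  0.1
"""
print(filter_tblout(example))
```

Candidates passing this filter would still need the domain verification with CDD, SMART and Pfam described above.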

Phylogenetic tree, gene structure and conserved motif analyses of CaAAAP genes
Multiple sequence alignments of AAAP amino acid sequences from Arabidopsis, rice, potato and pepper were performed with ClustalW. We built the phylogenetic tree using the neighbor-joining method in MEGA7 [52] with 1000 bootstrap replications, a Poisson model, and partial deletion of gaps. We determined the exon/intron organization of CaAAAP genes by aligning the coding sequences with genomic sequences using the Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/) [53]. Conserved motifs were generated with MEME (http://meme-suite.org/tools/meme) with the following parameters: zero or one motif occurrence per sequence, motif widths between 10 and 100, and a maximum of 20 motifs. Motifs were visualized with TBtools [54].
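For readers unfamiliar with the Poisson model used for the neighbor-joining distances, the correction is d = -ln(1 - p), where p is the proportion of differing amino acid sites between two aligned sequences. The sketch below is a simplified illustration, not MEGA's implementation: it skips gapped positions pair by pair, whereas MEGA's partial deletion removes sites according to a coverage cutoff across all sequences. The function name and example sequences are hypothetical.

```python
import math


def poisson_distance(seq_a, seq_b, gap="-"):
    """Poisson-corrected distance d = -ln(1 - p) between two aligned protein sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # simplification: ignore any column with a gap in this pair
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = differing / compared
    if p >= 1.0:
        raise ValueError("distance is undefined when all sites differ")
    return -math.log(1.0 - p)


# Made-up aligned fragments: 5 comparable sites, 1 difference, so p = 0.2
print(round(poisson_distance("MKTAL-", "MRTALV"), 4))
```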

Chromosomal location and syntenic analyses
The physical positions of the CaAAAP genes were obtained from the pepper annotation file deposited in the Sol Genomics database, mapped to 12 chromosomes, and visualized with Mapchart v.2.32 [55]. For syntenic analyses of CaAAAP genes, we used MCScanX [56] with the default settings to identify gene pairs of segmental and tandem duplications within the pepper genome.

Expression patterns of CaAAAP genes in various tissues and different stresses
To study the expression patterns of AAAP genes in the pepper plant, we downloaded transcriptome sequencing data from the NCBI (https://www.ncbi.nlm.nih.gov/geo/; accession no. GSE45037) [46]. These data covered a wide range of developmental stages of pepper: roots, stems and leaves from plants at the full-bloom stage; unopened flower buds (buds) and fully open flowers (flowers) from mature plants; fruits with lengths of 0-1, 1-3, 3-4, and 4-5 cm (F-Dev-1, F-Dev-2, F-Dev-3 and F-Dev-4, respectively); mature green fruit (F-Dev-5); fruit turning red (F-Dev-6); and fruit 3, 5, and 7 days after turning red (F-Dev-7, F-Dev-8, and F-Dev-9, respectively). A heat map representing the digital expression profile of CaAAAP genes was created in R 3.6.3 using log-transformed values.
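The log transformation applied before drawing the heat map can be sketched as follows. The text does not state the base or whether a pseudocount was used, so log2(RPKM + 1) is an assumption made here purely for illustration, and the Python function is a stand-in rather than the R code used in the study. Gene and tissue names are hypothetical.

```python
import math


def log_transform(rpkm, pseudocount=1.0):
    """Return log2(RPKM + pseudocount) for every gene/tissue value.

    The base-2 logarithm and the pseudocount of 1 are assumptions; the
    pseudocount keeps zero-expression values defined.
    """
    return {
        gene: {tissue: math.log2(value + pseudocount) for tissue, value in tissues.items()}
        for gene, tissues in rpkm.items()
    }


# Made-up values for one hypothetical gene
print(log_transform({"geneA": {"root": 0.0, "leaf": 7.0}}))
```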