Identification of stable QTLs and candidate genes involved in anaerobic germination tolerance in rice via high-density genetic mapping and RNA-Seq

Background Anaerobic germination tolerance is an important trait for direct-seeded rice varieties. Understanding the genetic basis of anaerobic germination is a key for breeding direct-seeded rice varieties. Results In this study, a recombinant inbred line (RIL) population derived from a cross between YZX and 02428 exhibited obvious coleoptile phenotypic differences. Mapping analysis using a high-density bin map indicated that a total of 25 loci were detected across two cropping seasons, including 10 previously detected loci and a total of 13 stable loci. Analysis of the 13 stable loci demonstrated that the more elite alleles that were pyramided in an individual, the higher the values of these traits were in the two cropping seasons. Furthermore, some anaerobic germination-tolerant recombinant inbred lines, namely G9, G10, G16, and G151, were identified. A total of 84 differentially expressed genes were obtained from the 13 stable loci via genome-wide expression analysis of the two parents at three key periods. Among them, Os06g0110200, Os07g0638300, Os07g0638400, Os09g0532900, Os09g0531701 and Os12g0539751 constitute the best candidates associated with anaerobic germination. Conclusions Both the anaerobic germination-tolerant recombinant inbred lines and the loci identified in this study will provide new genetic resources for improving the anaerobic germination tolerance of rice using molecular breeding strategies, as well as will broaden our understanding of the genetic control of germination tolerance under anaerobic conditions. Electronic supplementary material The online version of this article (10.1186/s12864-019-5741-y) contains supplementary material, which is available to authorized users.


Background
Direct-seeded rice (DSR), classified as wet DSR, dry DSR, or water DSR, is becoming increasingly popular across the world due to its cost efficiency and convenience [1]. Compared with wet and dry DSR, water DSR (whereby seeds are broadcasted in standing water) is more advantageous as it is less labor and time intensive, and restrains weed growth. However, rice (Oryza sativa L.) is extremely sensitive to anoxia during germination and early growth of the embryo [2][3][4]. When using the water DSR method, the rice seeds are completely submerged and suffer hypoxia or anoxia, leading to poor or no germination, seedling death, and poor crop establishment [2,3]. Anaerobic germination (AG) is the main limiting factor for the large-scale adoption of water DSR. Revealing the biochemical and molecular mechanisms of AG tolerance under flooding conditions and breeding rice varieties with superior AG-tolerant traits could effectively resolve the limitations of the water DSR method.
Insufficient energy supply under oxygen-deficient conditions caused by submergence is a major bottleneck for seed germination and seedling survival [5]. α-Amylase is a critical enzyme in the mobilization of starchy reservation in rice. Of the α-amylase members, the expression of RAmy3D is positively correlated with coleoptile elongation and seedling survival, particularly in tolerant genotypes [3,6], indicating that RAmy3D may be active during the anaerobic mobilization of rice. However, the sequence variations of RAmy3D that are associated with the AG tolerance in different rice varieties are still unknown. Moreover, the maintenance of higher α-amylase activity has been widely reported as a significant step in carbohydrate catabolism under submergence [7,8]. High expression of CIPK15, which encodes protein kinases, activates energy and the stressor receptor SnRK1A, resulting in a series of downstream cascade reactions, such as the induced expression of the transcription factor MYBS1, thereby enhancing the expression of α-amylases genes [9].
Under hypoxic conditions, ATP is produced through fermentative metabolism using ethanol and available carbohydrates [10]. Two enzymes related to alcohol fermentation metabolism, namely pyruvate decarboxylase (PDC) and alcohol dehydrogenase (ADH), are induced by submergence, and the enzyme activity of tolerant material was found to be higher than intolerant material after 12 h of imbibition [3]. Overexpression of PDC (pdc1) in the rice cultivar "Taipei 309" improved anoxia tolerance as a result of an increase in alcohol metabolism [11]. A point mutation in the Adh1 gene reduces ethanol dehydrogenase activity and leads to a lack of ATP, thereby inhibiting the elongation of the coleoptile sheath of the reduced adh activity (rad) mutant under submerged condition [12].
Recently, the trehalose-6-phosphate (T6P) phosphatase gene (OsTPP7) that is involved in T6P metabolism was cloned [13]. Under anaerobic stress, OsTPP7 increased the turnover of T6P, thus enhancing starch mobilization and driving the growth kinetics of the germinating embryo and the elongation of the coleoptile, which consequently enhanced AG tolerance. However, the exact mechanism of tolerance to AG is still not fully understood and warrants further investigation [14]. Rice has a very complex molecular regulatory network underpinning AG tolerance and metabolization under anaerobic fermentation. This regulatory network involves numerous factors relating to hemoglobin, expansin, reactive oxygen species, pyrophosphate, nitrite, ferrous ions, nitric oxide, pH reduction (acidification), and lipid peroxidation [15,16], which are only understood to a limited extent.
To determine the genetic basis of AG in rice, quantitative trait locus (QTL) mapping and genome-wide association study (GWAS) were recently used to identify some QTLs associated with AG. Several studies assigned the elongation of the coleoptile as an indicator trait of the tolerance phenotype in QTL mapping [17][18][19][20][21]. There are other studies assigned the seeding survival rate as an indicator of the tolerance phenotype [4,[22][23][24][25]. Some valuable QTLs were obtained by QTL mapping with the two kind of indicator. GWAS is a popular and highly efficient strategy for dissecting complex traits, but few reports regarding the excavation of AG tolerance loci via GWAS are available. A GWAS using 36,901 single nucleotide polymorphisms (SNPs), and identified 88 loci associated with AG tolerance [26]. Through GWAS of 5291 SNPs in 432 indica varieties, a total of 15 AG tolerance loci were detected by another report [27].
Although numerous AG tolerance loci have been identified, only one QTL (qAG-9-2) has been fine mapped and cloned as OsTPP7. Therefore, a large gap between the identification of AG tolerance genes and the breeding of DSR rice varieties still exists. This gap is mainly due to the majority of reported loci being identified based on low-density markers, and few reliable and stable AG tolerance loci have been screened for gene cloning. It is thus imperative that ultra-high density markers are used to evaluate stable AG tolerance loci for further investigation.
In this study, a high-density genetic map consisting of 2711 bin markers obtained via the sequencing-based genotyping of 192 recombinant inbred lines (RILs) derived from a cross between the japonica cultivar 02428 and indica cultivar YZX was used for QTL mapping. The aims of the study were to identify stable QTLs for AG tolerance and screen candidate genes in the early and late cropping seasons (ES and LS). These QTLs will provide a foundation for further investigation of the molecular mechanisms underlying anoxia tolerance and will provide target loci for improving the varieties via molecular breeding. We also performed transcriptome expression profiling and identified differentially expressed genes located in the AG tolerance-related QTL regions, providing valuable information for candidate gene verification and the dissection of gene regulatory networks affecting rice AG tolerance.

Phenotypic performance of the parents and RIL population for AG tolerance
The faster the rice coleoptile elongates, the sooner the seedling is able to escape the anoxic environment, which improves the chances of survival. Therefore, rice coleoptiles are a classical organ used for studies on AG tolerance [28]. In this study, we investigated the dynamic phenotypic changes in the coleoptiles in two parents under anaerobic conditions at ES. Considerable distinct variations in the coleoptile traits between YZX and 02428 were observed (Fig. 1).
Since YZX germinates faster than 02428, the coleoptile of YZX can be observed on the first day, while the coleoptile of 02428 will be seen on the second day. Although the coleoptile of YZX emerged earlier than that of 02428, the coleoptile length (CL), coleoptile surface area (CSA), coleoptile volume (CV), and coleoptile diameter (CD) of 02428 were significantly greater than that of YZX at the end point (6th day). From the 3rd day, CL, CSA, and CV in 02428 increased significantly faster than in YZX. However, the growth of CD in 02428 exceeded that of YZX from the 2nd day. CL, CSA, and CV growth in both YZX and 02428 occurred mainly from the 2nd to the 5th day. In particular, coleoptile development in YZX mainly occurred from the 2nd to the 3rd day, while coleoptile growth in 02428 was relatively uniform between the 2nd and 5th day. The CD of the two parents differed slightly once the coleoptile had broken through the seed coat (from the 2nd to 6th day) and differed significantly from CL, CSA, and CV. There were significant or highly significant differences in the four traits between YZX and 02428 at ES and LS (Table 1), indicating large genetic differences between the two parents, which is conducive for QTL mapping. The four coleoptile traits in the RIL population exhibited different degrees of transgressive segregation (Additional file 1: Table S1). Specifically, the variation in CV was the highest of all the traits across the two cropping seasons. The absolute skewness and kurtosis values of each trait in the RIL population were very close to 0, indicating that all of these traits were normally distributed and were controlled by minor multiple genes. In addition, the correlation among the four coleoptile traits in the RIL population was analyzed (Additional file 2: Table S2). With the exception of CL and CD at LS, all the other traits were extremely significantly correlated.

Identification of QTLs
The genetic linkage map had an average distance of 0.86 cM between adjacent bin markers and an average physical distance between the markers of 137.68 Kb. Using the inclusive composite interval mapping (ICIM) method, a total of 32 QTLs affecting CL, CSA, CV, and CD were detected across the two cropping seasons and were distributed on all chromosomes, except chromosome 11. In all QTLs, some showed a negative additive effect and the other showed a positive additive effect, indicating that both parents contributed favorable alleles. Seven QTLs were associated with CL and each accounted for 4.99%~10.30% of the phenotypic variation. Five QTLs were identified for CSA and each accounted for 5.93%~10.35% of the phenotypic variation. Seven QTLs were related to CV and each contributed 5.19%~12.83% of the phenotypic variation. Thirteen QTLs were associated with CD and each explained 4.28%~9.81% of the phenotypic variation. The QTLs that overlapped based on physical position were classified as the same loci. Ultimately, a total of 25 loci were obtained. Notably, one of the 25 loci were repeatedly detected across the two seasons, that is, qCD-7 (ES) and qCD-7 (LS). Two loci were detected by different traits, that is, qCSA-3, qCV-3 and qCL-3-1; qCL-12 and qCSA-12-1, and they were detected at LS. In addition, two loci were detected by different traits across the two seasons, that is, qCSA-6 (ES), qCSA-6 (LS), and qCV-6-1 (LS); qCV-9 (ES) and qCD-9-2 (LS) ( Table 2 and Fig. 2).

Effects analysis of the stable QTLs
Three types of QTLs were defined as stable QTLs, including overlapped QTLs associated with multiple traits in a single environment; QTLs repeatedly detected in the two environments; and the QTLs that could be co-located with previous reports. In this study, a total of 13 loci consisting of 20 stable QTLs were obtained (Table 3).
To further clarify the effects of the stable QTLs, we summarized the phenotypic differences between the two alleles of each locus in the RIL population (Additional file 3: Table S3). We first divided the individuals of the RIL population into 02428 types and YZX types according to the marker genotypes of each identified locus and then compared the differences between the two genotypes for the corresponding traits.  Among the 13 loci, eight loci showed stable and reliable effects for all four traits. For these eight loci, the mean value of the corresponding traits of the individuals with elite alleles was greater than the individuals with nonelite alleles in the RIL population, and some even reached statistical significance. This indicated that these eight loci, although partially affected by the environment, were reliable and responsible for the phenotypic variation. The other five loci were associated with CD or CV, and the mean value of the individuals that harbored elite alleles was superior to the mean value of the nonelite alleles. For the effect of a single locus, using the absolute value of the difference between two alleles as the standard, loci3 consisting of qCL-3-1 (LS), qCSA-3 (LS), and qCV-3 (LS) had the largest absolute value of CL (0.30 cm) at LS. Loci8 consisting of qCV-6-1 (LS), qCSA-6 (ES), and qCSA-6 (LS) had the largest absolute value of CSA (0.07 cm 2 ) at ES. Loci10 consisting of Fig. 2 The QTL position on the high-density bin map of the YZX × 02428 RIL population. Square brackets indicate that different QTLs are located at the same physical location on the chromosome. The red text indicates the QTLs that are mapped in the two cropping seasons qCV-9 (ES) and qCD-9-2 (LS) had the largest absolute value of CV (1.65 mm 3 ) and CD (0.04 mm) at ES and LS, respectively. These results suggested that all these loci could be efficiently used for molecular breeding.
To confirm this, the elite alleles were used to test the effectiveness of pyramid breeding. Without considering the interactive effects among these 13 stable loci and environmental influences, the more elite alleles that were pyramided in an individual, the higher the values of these traits were in the two cropping seasons (Fig. 3). This result indicated that the pyramiding of favorable alleles could improve AG tolerance. Practically, AG tolerance could be modified via a combination of different numbers of favorable alleles in the varieties. In the RIL population, most individuals harbored four to seven favorable alleles, few harbored zero favorable alleles, and some individuals harbored up to 10 favorable alleles. Therefore, we also obtained accessions (G9, G10, G16, and G151) that carried more than eight elite alleles and exhibited superior AG tolerance ability (Additional file 4: Table S4) and could serve as favorable alleles donor parents in the breeding procedure.

Identification of candidate genes within the 13 stable loci via gene expression profiling
Considering that coleoptile growth in the two parents occurred mainly from the 2nd to the 5th day, and the differences between the two parents were significant, RNA-Seq analysis was performed on the whole germinating seeds, including the embryo and coleoptile, on days two, three, and four under anaerobic conditions. Through differential expression profiling analysis (FDR ≤ 0.05, absolute value fold change ≥ 1.5) of the two parents at different sampling points, the DEGs were identified for the corresponding days. A total of 10,053 DEGs were identified during the three periods (Fig. 4).
Such a large number of DEGs could not be responsible for the variation in anaerobic germination ability between YZX and 02428. Thus, functional annotations of the DEGs (YZX-LO versus 02428-LO) on the 2nd, 3rd, and 4th day were analyzed and GO enrichment was performed. We screened all GO terms with P < 0.05 (Additional file 5: Figure S1). Notably, we found that "response to stress," "response to abiotic stimulus," and "response to stimulus" were highly significantly enriched for all three sampling points. Another study conducted a large-scale transcriptomic analysis of the developing coleoptile and mature coleoptile under different durations of anaerobic conditions [29]. In their study, GO analysis of up-regulated and down-regulated DEGs (aerobic vs. anaerobic) indicated that the "response to stress" and "response to stimulus" terms were significantly enriched, which corroborates the present study.
Of the 13 stable loci, three were newly identified in this study and 10 overlapped with previous reports whose candidate genes responsible for traits were unknown. In order to further explain the mechanism of AG tolerance, we combined the expression profile data to analyze the genes located in these 13 stable loci intervals. A total of 84 candidate DEGs were obtained (Additional file 6: Table S6). The qCD-4-1 (LS) locus contains only one gene, while the qCL-3-1 (LS), qCSA-3 (LS), and qCV-3 (LS) loci contain 20 genes.
To further narrow down the scope of candidate genes, the fragments per kilo base of exon per million fragments mapped value of 84 genes and the phenotypic value of the four traits (CL, CSA, CV and CD) at three phases were used to calculate the Pearson correlation coefficients in R software. Taking p value<0.05 and the  absolute value of correlation coefficient>0.9 as the threshold; based on the method [30], the correlation network was constructed with genes and traits as nodes and correlation coefficients as edge. As shown in Fig. 5, the visualized network was divided into three distinct subnetworks. Genes associated with CL, CSA and CV mainly fell into one cluster, mainly including six highly correlated genes. Another cluster consisted of genes associated with CD, mainly including eight highly correlated genes. The 3rd subnetwork is composed of the remaining genes. Based on the gene annotation (http://rice.plantbiology. msu.edu/), gene differential expression multiples between two parents and correlation with traits. Twelve highly promising candidate genes were identified, including Os02g0271900, Os06g0109600, Os06g0110000, Os06g01 10200, Os07g0638300, Os07g0638400, Os07g0639400, Os09g0531701, Os09g0532900, Os10g0566800, Os12g 0539751 and Os12g0626500 (Table 4). Notably, the expression levels of Os07g0638400 and Os09g0532900 differ greatly between parents, and they are also highly correlated with traits. We further validated the expression levels of these 12 genes using qRT-PCR in YZX and 02428 seeds after 4 days anaerobic treatment. Which gave similar results to those of RNA-seq analysissuggesting that our results were reliable (Fig. 6).

Additional AG tolerance indicators should be developed for the evaluation of AG tolerance
To date, two main indicators have been used to identify the AG tolerance capacity of rice seeds. One indicator is the seedling survival ratio after 21 days of submergence under 10 or 20 cm water head, which is a standardized method developed by International Rice Research Institute [4,23]. Using this method, some germplasms with good AG tolerance have been screened, and several QTLs were identified [4,[23][24][25]. Notably, OsTPP7 in qAG-9-2 has been cloned and functionally validated [13], indicating that this method is feasible. However, the survival ratio method is labor-and time-intensive and laborious. One alternative method is to use a phenotype associated with coleoptile elongation as an indicator. The QTLs obtained using coleoptile elongation can be overlapped with those obtained using the survival rate [26,27]. In this study, coleoptile traits were validated using QTL mapping (Table 2). However, only seven QTLs were identified for CL, while 25 QTLs were identified for CV, CSA, and CD. Our QTL mapping results indicate that the use of CL as the only indicator is relatively inefficient. Therefore, it is necessary to develop more indicators. Such as days to emergence of first leaf or root, the seedling survival rate that via anaerobic change to aerobic treatment, seedling quality of seedling formed under anaerobic conditions.

Pyramiding of favorable QTLs can improve AG tolerance in rice
Marker-assisted backcrossing (MABC) has been shown to be an effective strategy. For example, SUB1 has been introgressed into different global varieties of rice. The swarna-Sub1 and BR11-Sub1 varieties, which were derived from MABC breeding, have shown excellent performance with consistent yield advantages during flash  [31]. AG tolerance is essential for germination and seedling establishment under anaerobic conditions and is a key trait for DSR varieties. Using the MABC strategy, the QTL containing OsTPP7 was also transferred into Ciherang-Sub1, which improved the near isogenic lines (NILs) harboring the SUB1 gene [32]. Phenotypic evaluation showed that the introgression of AG1 increased AG tolerance compared to Ciherang-Sub1. However, qAG-9-2 functions well under moderate stress conditions but does not tolerate high stress. Multiple QTLs or genes that combine qAG-9-2 should enhance AG tolerance.
In this study, 13 stable loci associated with AG tolerance were obtained, and each QTL accounted for 4.84%~12.65% of the corresponding phenotypic variation. Moreover, we tested the effectiveness of pyramid breeding using 13 stable loci, and all phenotypes indicated a tendency of increasing phenotypic value with increasing number of favorable loci. Although the RILs carried 10 favorable loci did not exhibit ideal phenotype due to only five RILs harbor 10 favorable loci result in poor representativeness, the RILs contains 7, 8, 9 or more favorable loci have better phenotypic value than that of less than 3 favorable alleles. As a result of transgressive segregation in the RIL population, we obtained some materials that carried multiple favorable alleles and possessed AG tolerance capacity beyond the parents. The identification of stable loci will provide a foundation for the further investigation of the molecular mechanisms of AG tolerance and also provides a material basis for DSR breeding.

Stable QTLs provide a relevant method for screening promising candidate genes involved in AG tolerance
Generally, increasing marker density is an effective means of improving QTL mapping resolution [33]. In this study, a high-density bin-map for the RIL population was used for QTL mapping in rice for AG tolerance, which significantly increased the QTL mapping resolution compared with traditional markers. However,  there were a large number of candidate genes within the identified QTLs intervals. The differential expression profile strategy is a useful approach for studying the genes associated with a given trait [34] and can effectively reduce the number of candidate genes. In our study, we sequenced the transcriptome at three key stages of AG of the two parents, obtained high-quality differential expression profiles, and then mapped the DEGs to 13 target genetic loci and finally obtained only 84 candidate genes, which suggests that the combination of high-density QTL mapping and RNA-Seq constitutes an effective strategy for identifying candidate genes for QTL mapping. Among the 12 highly promising candidate genes, we found two late embryogenesis abundant (LEA) protein-related genes, Os06g0110200 and Os12g0626500. Notably, Os06g0110200, which was differentially expressed by up to 730-fold between 02428 and YZX, showed extremely low expression in YZX. LEA proteins, which are named for their abundant expression during the later stages of embryo development in plant seeds, are a group of stress-responsive proteins that are induced by environmental stress. The transgenic overexpression of LEAs indicated that these proteins play an important role in the tolerance of rice to drought, salt, and abscisic acid [35,36]. The OsLEA4 protein encoded by Os06g0110200 enhanced the tolerance towards high salinity, heat, freezing, and UV radiation in E. coli recombinants [37]. Furthermore, microarray profiling showed that the expression level of Os06g0110200 was sharply reduced after treatment for 3 h under anaerobic conditions [38].
In addition, we found four peroxidase-related genes, of which the expression of Os07g0638300 was increased by up to 10-fold, Os07g0638400 is highly correlated with CL,CSA and CV. Peroxidase activity in germinating seeds incubated under anoxic conditions for 5 d was substantially lower in the tolerant genotypes than in the intolerant genotypes [3]. Transcript profiling of anoxia-grown rice seedlings revealed that a catalase-coding transcript was strongly down-regulated under anoxia [39]. Reactive oxygen species (ROS) are considered as messengers or transmitters of environmental cues during seed germination [40]. Hydrogen peroxide (H 2 O 2 ) is a ROS, and the accumulation of H 2 O 2 has been reported in the early stage of germination imbibition in different plant species [41][42][43]. The exogenous application of H 2 O 2 improved seed germination in many plants, including barley, pea, almond, and rice [44][45][46][47]. However, the excessive production of ROS by germinating seeds has often been considered as a negative factor that inhibits the germination process. Therefore, antioxidative mechanisms are regarded as being of particular importance for germination success via the conversion of H 2 O 2 to oxygen and water [48,49]. Based on known results, we speculated that the large accumulation of H 2 O 2 promoted seed germination in the early stage of seed germination, while excessive H 2 O 2 inhibited seed germination in the middle and late stages of seed germination. Thus, the AG tolerance materials exhibited a tendency of shift in gene expression from high to low, while the nonresistant material exhibited the opposite. Therefore, peroxidase can affect the AG ability of rice seeds by regulating the content of H 2 O 2 in the seeds.
A study reveals two crucial transcription factors, MYBS1 and MYBGA, present in rice (Oryza sativa) and barley (Hordeum vulgare), that function to integrate diverse nutrient starvation and gibberellin (GA) signaling pathways during germination of cereal grains. Sugar represses but sugar starvation induces MYBS1 synthesis and its nuclear translocation. GA antagonizes sugar repression by enhancing conuclear transport of the GA-inducible MYBGA with MYBS1 and the formation of a stable bipartite MYB-DNA complex to activate the α-amylase gene [50]. In the region containing two QTLs on chromosomes 2 and 9, we identified two genes, namely Os02g0271900 and Os09g0532900, that encode MYB family transcription factors, the functions of which may be similar to those of the homologous genes OsMYBS1 and MYBGA [50].
Os06g0110000 and Os06g0109600, which encode cytochrome P450 and adenylate kinase, respectively, are associated with CL and showed lower expression levels in the tolerant material 02428. The most common reaction catalyzed by cytochrome P450 is a monooxygenase reaction, for example, the insertion of one atom of oxygen into the aliphatic position of an organic substrate (RH), while the other oxygen atom is reduced to water. Os06g0109600 encodes a protein that catalyzes the interconversion of adenine nucleotides (ATP, ADP, and AMP). Microarray profiling shows that the expression level of Os06g0109600 was sharply reduced after treatment for 3 h under anaerobic conditions, while the expression of Os06g0110000 was increased rapidly after 3 h of aerobic treatment [51]. This result indicates that these two genes play a negative role in AG tolerance, as the AG-tolerant materials maintained lower expression levels of these genes. Os09g0531701 encodes a glycosyl transferase family 8 protein that catalyzes the transferal of saccharide moieties from an activated nucleotide sugar (also known as the "glycosyl donor") to a nucleophilic glycosyl acceptor molecule. The result of glycosyl transfer can be a carbohydrate, glycoside, oligosaccharide, or a polysaccharide. The relationship between the activity of glycosyl transferase and AG tolerance is unknown. In the present study, the expression of Os09g0531701 was significantly reduced by up to 142-fold in the AG-tolerant material 02428, indicating that anaerobic conditions limit the activity of glycosyl transferase. In addition, Os12g0539751 exhibited highly variable expression levels among the parents. All of the above candidate genes provide a basis for the molecular cloning and development of target gene markers.

Conclusions
In this study, a biparental population with a high-density genetic bin map was used to identify QTLs for rice resistance to AG. CL, CSA, CV, and CD were used as indicators for our phenotypic identification. A total of 25 loci were detected across two cropping seasons. Including the 10 loci that overlapped with previous reports, a total of 13 were defined as stable loci. After verification, pyramiding of the 13 favorable loci could improve AG, which confirmed the accuracy of our mapping results. We sequenced the transcriptome at three key stages of AG of the two parents, obtained high-quality differential expression profiles, mapped the genes into 13 target genetic loci, and ultimately obtained 12 promising candidate genes. These results improve our understanding of the genetic basis of rice seed AG.

Plant material
The mapping population consisted of 192 RILs derived by single-seed descent from an inter-subspecific cross of the indica YZX and the japonica 02428 [52]. YZX and 02428 were bred by Hunan Academy of Agricultural Sciences and Jiangsu Academy of Agricultural Sciences, respectively. The RILs and parents were grown in a paddy field at the South China Agricultural University, Guangzhou, China (at approximately 113°east longitude and approximately 23°north latitude) at the early cropping season (ES) and late cropping season (LS) in 2016. Each RIL or parent was planted in a block designed of 6 column × 6 row, with spacing of 20 cm among the plants. Crop management, and disease and insect pest control were performed as locally recommended. All the materials come from the germplasm resource bank of the National Engineering Research Center of Plant Space Breeding. Considering that seed maturity affects AG, six individual plants in the middle of each block were harvested independently at the 35th day after heading in ES and the 40th day after heading in LS. The harvested seeds were dried in a heated air dryer at 42°C for 5 d and then stored at − 20°C.

Evaluation of tolerance to anaerobic conditions during germination
From the six independently harvested plants per season, three individuals were selected per line and 20 healthy seeds were selected for each. Seeds were placed in an oven at 50°C for 7 d to break dormancy, following which they were surface-sterilized with 20% diluted bleach (6-7% NaClO) for 20 min and then thoroughly rinsed with sterile water. Five seeds were placed at bottom of one centrifuge tube (10 cm), which were filled with distilled water to create an oxygen-free environment. All centrifuge tubes were immediately placed in a chamber with an 8 h light (200 μmol m − 2 s − 1 )/16 h dark cycle at 30°C. After 6 d, a WinRHIZO (Regent Instruments Inc., Québec, Canada) root image analysis system was used to measure the coleoptile length (CL), coleoptile surface area (CSA), coleoptile volume (CV), and coleoptile diameter (CD). Twelve replicates (centrifuge tubes) were performed per line. Statistical analysis was performed with SAS (Statistical Analysis System, version 8.01) and Microsoft Excel.

QTL analyses
We used 192 RILs derived from the 02428/YZX cross for mapping the QTLs involved in rice seed AG. Using the Genotyping-By-Sequencing approach, 85,742 high-quality SNPs were validated for recombinant event determination. Using a sliding window approach, a high density bin map with an average physical length of 137.68 kb was obtained. Which contained 2711 bin markers that were evenly distributed across 12 chromosomes. The number of markers on each chromosome ranged from 162 to 311. (Chen et al. 2016). QTL Ici-Mapping v4.1 [53] software was used for QTL analyses. The significance threshold value of the logarithm of odds (LOD) scores for QTL detection was 2.5; if the LOD ≥ 2.5, the site was considered to constitute a QTL for the trait, and the additive effect of each QTL and the contribution rate to the trait were calculated.

RNA sequencing (RNA-Seq) and data analysis
The total RNA of each sample was homogenized using a mortar and pestle with liquid nitrogen and then purified using the Plant Total RNA Purification Kit (ComWin Biotech Company) following the manufacturer's instructions. RNA-Seq library construction and sequencing followed a previous protocol [52]. Quality control was confirmed within the Illumina HiSeq software and all reads passing the filtering specifications were mapped onto the reference genome IRGSP-1.0.
After the expression level of each transcript and gene was calculated, differential expression analysis was conducted using edgeR [54]. The false discovery rate (FDR) was used to determine the p-value threshold in multiple tests, and for the analysis, a threshold of FDR ≤ 0.05 and an absolute value of fold change ≥ 1.5 were used to assess the significance of the gene expression. The differentially expressed genes (DEGs) were used for GO (Gene Ontology) enrichment, and the groups with FDR ≤ 0.05 were considered to be significantly enriched.