Sudha K. Nair, Vijay Chaikam, Manje Gowda, Vemuri Hindu, Albrecht E. Melchinger, Prasanna M. Boddupalli*
a International Maize and Wheat Improvement Center (CIMMYT), ICRISAT Campus, Patancheru, Greater Hyderabad 502324, India
b International Maize and Wheat Improvement Center (CIMMYT), ICRAF Campus, UN Avenue, Gigiri, P.O. Box 1041-00621, Nairobi, Kenya
c Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, D-70593 Stuttgart, Germany
A B S T R A C T
Derivation of doubled haploid lines in maize through in vivo induction is routinely practiced in multi-national commercial maize breeding programs [1,2] and is gaining increasing popularity in the maize breeding programs of the developing world. Large-scale haploid induction in maize for doubled haploid (DH) line production is based on an in vivo method [3,4], which involves crossing the source germplasm from which new inbred lines are desired with pollen from maize genotypes called haploid inducers [5,6]. Pollen from haploid inducers can induce the formation of seeds with a haploid embryo and a normal triploid endosperm at a frequency about 10- to 100-fold higher than the natural occurrence of haploids, which is about 0.1% [7]. The first haploid inducer, Stock 6, with a haploid induction rate (HIR) of ~3%, was identified in 1959 [7]. Efforts to improve the HIR and adaptation to specific environments led to the development of several Stock 6-based haploid inducers and derivatives with high HIR. Now, haploid inducers with >6% HIR are available in temperate [8,9] (https://plant-breeding.uni-hohenheim.de/en/84531#jfmulticontent_c167370-2) and in tropical [10,11] genetic backgrounds.
Considerable progress has been made in recent years towards understanding the genetics of haploid induction in maize. QTL conditioning haploid induction in maternal haploid inducers were identified in bi-parental populations derived by crossing haploid inducers with non-inducers [12-14] and in an association mapping population consisting of a large set of inducer lines and non-inducer lines using a novel genome-wide association mapping strategy [15]. The most significant QTL identified in these studies was on chromosome 1, and it was subsequently fine mapped [16,17]. The gene underlying this QTL, which has a strong effect on haploid induction rate, was identified as a sperm-specific phospholipase [18-20]. However, the mechanism by which this protein triggers haploid induction needs to be studied further. Another significant QTL on chromosome 8 [14] was also fine mapped [3].
Other than haploid inducers, HIR was noted to be influenced by two factors: a) the environment in which the induction crosses were conducted; and b) the maternal genotype (referred to as inducibility). The influence of environment on HIR is quite contradictory,with some studies indicating no influence [21,22] and others indicating significant influence [8,23,24]. In contrast, many studies provide evidence for the influence of maternal genotype on the HIR[8,9,22,23,25,26]. Given the same inducer, some maternal genotypes respond more favorably to haploid induction than the others.
QTL mapping and association mapping approaches can be used to identify the maternal QTL or genomic regions influencing HIR. A QTL mapping study earlier identified maternal QTL influencing the haploid induction in a population derived from two inbreds that have contrasting HIR when crossed to the same inducer [26]. However, by using QTL mapping with bi-parental populations, only limited variation is explored. Hence, many loci important for controlling the trait may not segregate and consequently,cannot be detected[27,28]. On the contrary, genome-wide association studies(GWAS)allow sampling of a wider genetic diversity present in a population of diverse inbred lines and can also potentially map the alleles/loci influencing the trait more precisely due to the large number of historical recombinations accumulated in many different lineages included in the mapping population.
Though GWAS is a very powerful strategy to identify genetic variants, identification of common alleles with smaller effect sizes requires larger sample sizes. Genomic prediction (GP) has emerged as a powerful breeding tool employing molecular markers for traits with a complex genetic basis. GP is carried out using SNPs across the genome, without statistical tests for significant markers to be included in the biometric model. Prediction accuracy is highly dependent on trait architecture, apart from many other factors [29,30]. Meta-analysis of GWAS is a statistical synthesis of GWAS results from multiple studies to increase the detection power and to reduce false positives [29]. Though a large number of GWAS meta-analysis studies have been reported in humans and other animals [29], meta-analysis studies are relatively rare in plants.
In this study,we crossed a large number of diverse tropical inbreds organized in two association mapping panels (AMP)with the pollen from tropicalized haploid inducers and assessed the HIR in each of these inbreds. Our objectives were to (1) study the variation for haploid inducibility among tropical inbred lines,(2)identify the best tropical inbreds that respond favorably to haploid induction, and (3) identify genomic regions in the maternal parent influencing HIR using GWAS separately and jointly across both AMPs.Genomic predictions were carried out within each AMP using trait associated as well as random markers.
The maternal genotypes were organized into two association mapping panels. The first Association Mapping Panel (AMP1)comprised 442 inbred lines adapted to the tropics and subtropics in a wide range of environments including Latin America, sub-Saharan Africa and Asia. Of these, 271 are CIMMYT Maize Lines (CMLs); 149 were a subset of the Improved Maize for African Soils(IMAS) association mapping panel;and 22 were a subset of the Drought Tolerant Maize for Africa (DTMA) association panel. The second Association Mapping Panel (AMP2) comprised 230 breeding lines from CIMMYT, 188 of which were part of the DTMA panel and the rest were CMLs.Seeds for CMLs were procured from CIMMYT maize gene bank. Seeds for DTMA and IMAS lines were obtained from CIMMYT's Maize Molecular Breeding Program based in Mexico. Two different tropicalized haploid inducers Tropically Adapted Inducer Lines 8 and 9 (TAIL8 and TAIL9)were used for haploid induction crosses. TAIL8 and TAIL9 inducers were developed at CIMMYT in collaboration with the University of Hohenheim by crossing temperate inducer hybrids RWS × UH400 and RWS × RWK to CML494, respectively. TAIL8 has higher HIR than TAIL9 but exhibits similar plant vigor [31]. Both inducers are equipped with the R1-nj marker for discrimination of haploid and diploid kernels in the induced progeny based on color marker expression[5].
Haploid induction crosses on AMP1 and AMP2 inbreds were conducted in the summer of 2012 and 2015, respectively, at the Agua Fría experimental station in Mexico (20.26°N,97.38°W; ~110 m a.s.l.). TAIL9 was used for haploid induction in AMP1 and TAIL8 was used for AMP2.Both AMP inbred lines were grown in two replications with each replication consisting of 38 plants planted in two rows 4.5 m long. To achieve flowering synchrony with inbred lines of different maturities, haploid inducers were stagger-planted four times at a five-day interval. The maternal genotypes were detasseled before their silks emerged. Pollen was collected from 10 to 15 haploid inducer plants, bulked and used for pollinating the silks of inbred plants. Harvested ears from each inbred in each replication were shelled and maintained separately in a cold room at 8-12°C till used for planting.
HIR of AMP1 and AMP2 inbreds was evaluated in the summer of 2014 and in the winter cycle of 2016, respectively, at the Agua Fría experimental station. HIR was assessed by the gold standard haploid/diploid classification based on plant characteristics such as plant vigor, leaf width, erectness and paleness [4,5,16,31,32]. Haploids show distinctly poor vigor with narrow, erect and pale leaves compared to diploids, and hence can be accurately distinguished from diploids. For AMP1, the experimental design was a partially replicated alpha lattice design, where 39% of the inbreds planted were included in both replications. For AMP2, the experimental design was also an alpha lattice with two replications, each having 500 seeds. To accommodate large numbers of plants in minimal space, induced seeds were planted at a spacing of 75 cm × 10 cm on the beds. The plant density using this design was 266,666 per hectare. Three weeks after planting, the total number of surviving plants and the numbers of haploid and diploid plants were recorded for each entry based on the plant characteristics described earlier. Diploid plants were removed afterwards, but any doubtful plants were left until flowering, by which time ploidy can be established accurately. HIR was calculated as [(number of true haploids / number of surviving plants) × 100]. Non-germinated seeds and plants that died before HIR evaluation were not considered in HIR determination.
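The HIR formula stated above can be written as a small function. This is a minimal sketch: the function name and the example counts are hypothetical and are not values from the study.

```python
def haploid_induction_rate(n_haploid: int, n_surviving: int) -> float:
    """Return HIR (%) = true haploids / surviving plants * 100.

    Non-germinated seeds and plants that died before scoring are excluded
    simply by not being part of n_surviving.
    """
    if n_surviving <= 0:
        raise ValueError("no surviving plants; HIR is undefined")
    if not 0 <= n_haploid <= n_surviving:
        raise ValueError("haploid count must lie between 0 and the surviving count")
    return 100.0 * n_haploid / n_surviving


# Hypothetical entry: 450 plants alive at scoring, 36 of them true haploids.
print(haploid_induction_rate(36, 450))  # 8.0
```

Using surviving plants rather than seeds planted as the denominator keeps germination failure from being counted as a lack of inducibility.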
DNA of all inbred lines was extracted from leaf samples of 3-4-week-old seedlings by using the standard CIMMYT laboratory protocol [33]. Genotyping was carried out by the genotyping by sequencing (GBS) platform [34] at the Institute of Genomic Diversity, Ithaca, USA. The original data set consisted of 955,690 SNPs across all chromosomes, which included partially imputed data based on an algorithm that searches for the closest neighbor in small SNP windows across the entire maize database (approximately 22,000 Zea samples), allowing for a 5% mismatch [35]. For GWAS, filter criteria of call rate (CR) ≥0.7 and minor allele frequency (MAF) ≥0.03 were used, yielding 324,625 SNPs for AMP1 and 333,397 SNPs for AMP2. For calculating a principal component analysis and kinship matrix, high quality SNPs with filtering criteria of CR ≥0.9 and MAF ≥0.1 were used, and a random sample of 122,303 and 129,800 SNPs was chosen from AMP1 and AMP2, respectively. For linkage disequilibrium (LD) analysis, we used filtering criteria of CR ≥0.9 and MAF ≥0.3, which generated 31,456 and 36,816 SNPs in AMP1 and AMP2, respectively.
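The call-rate and MAF filters described above can be illustrated as follows. This is a sketch under stated assumptions: genotypes are taken to be coded as allele dosages (0, 1, 2) in a lines × SNPs array with `NaN` for missing calls, and the function name `filter_snps` and the toy matrix are hypothetical. The study itself filtered the GBS data with its own tools.

```python
import numpy as np


def filter_snps(geno, min_call_rate=0.7, min_maf=0.03):
    """Return a boolean mask of SNPs (columns) passing call-rate and MAF filters.

    geno: lines x SNPs array of allele dosages (0/1/2), NaN = missing call.
    """
    geno = np.asarray(geno, dtype=float)
    called = ~np.isnan(geno)
    call_rate = called.mean(axis=0)           # fraction of lines with a call
    n_called = called.sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        freq = np.nansum(geno, axis=0) / (2.0 * n_called)  # counted-allele frequency
    maf = np.minimum(freq, 1.0 - freq)        # minor allele frequency
    return (call_rate >= min_call_rate) & (maf >= min_maf)


# Toy panel: 4 inbred lines x 3 SNPs (hypothetical data).
toy = [[0, 2, 0],
       [2, 2, np.nan],
       [0, 2, np.nan],
       [2, 2, 0]]
print(filter_snps(toy).tolist())  # [True, False, False]
```

In the toy panel the second SNP fails because it is monomorphic (MAF 0) and the third fails because only half of the lines were called.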
The PCA method, described by Price et al. [36], was implemented in SNP & Variation Suite (SVS) V_8.6.0 (SVS, Golden Helix, Inc., Bozeman, MT, USA, www.goldenhelix.com). A two-dimensional plot of the first two principal components was created to visualize the possible population stratification among the samples. A kinship matrix was computed as the GBLUP genomic relationship matrix [37] following overall normalization as executed in SVS V_8.6.0. The extent of genome-wide LD was based on adjacent pairwise r² values between high quality SNPs from the GBS SNP set and physical distances between these SNPs [37]. Nonlinear models with r² as responses (y) and pairwise physical distances (x) as predictors were fitted to the genome-wide LD data using the ‘nlin' function in R [38]. Average pairwise distances at which LD decayed to r² = 0.2 and r² = 0.1 were calculated based on the model given by [39].
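The adjacent pairwise r² values that feed the LD decay analysis can be sketched as below. This is an illustration only: it assumes r² is computed as the squared Pearson correlation of allele dosages between neighboring SNPs, which is one common definition, and the function name `adjacent_r2` and the toy data are hypothetical. The decay model of [39] is not reproduced here.

```python
import numpy as np


def adjacent_r2(geno, positions):
    """Return (distance_bp, r2) pairs for each pair of neighboring SNPs.

    geno: lines x SNPs dosage array (NaN = missing), columns sorted by position.
    positions: physical positions (bp) of the SNPs on one chromosome.
    """
    geno = np.asarray(geno, dtype=float)
    pairs = []
    for j in range(geno.shape[1] - 1):
        a, b = geno[:, j], geno[:, j + 1]
        ok = ~np.isnan(a) & ~np.isnan(b)      # lines called at both SNPs
        if ok.sum() < 2 or a[ok].std() == 0 or b[ok].std() == 0:
            continue                           # r2 undefined for this pair
        r = np.corrcoef(a[ok], b[ok])[0, 1]
        pairs.append((int(positions[j + 1] - positions[j]), float(r * r)))
    return pairs


# Toy data: SNP1 and SNP2 are in complete LD, SNP2 and SNP3 are independent.
toy_geno = [[0, 0, 0],
            [0, 0, 2],
            [2, 2, 0],
            [2, 2, 2]]
print(adjacent_r2(toy_geno, [1000, 4000, 9000]))
```

The resulting (distance, r²) pairs are the x and y values to which a nonlinear decay curve would then be fitted.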
GWAS was carried out on the HIR phenotypes based on the single locus mixed model analysis correcting for both population structure and kinship (EMMAX) [40] as implemented in SVS V_8.6.0 [41]. In the linear models, the first 10 principal components were used as covariates. Manhattan plots were created using the -lg(P-values) of all SNPs used in the analysis, and Q-Q plots were obtained by plotting the observed -lg(P-values) against the expected -lg(P-values) to determine genomic inflation, if any. Independent tests of association between each SNP and HIR were performed, and significance of association was declared based on a Bonferroni-corrected P ≤0.05.
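The Bonferroni rule used to declare significance can be illustrated with a short calculation. This is a sketch, not the SVS implementation: the helper names are hypothetical, and the number of tests is assumed to equal the 324,625 SNPs reported for the AMP1 GWAS set.

```python
import math


def bonferroni_adjust(p_values, n_tests=None):
    """Bonferroni-corrected P-values: min(1, p * number of tests)."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


def bonferroni_threshold(alpha, n_tests):
    """Raw P-value below which a SNP is significant after correction."""
    return alpha / n_tests


n_snps_amp1 = 324_625                       # SNP count reported for AMP1
raw_cutoff = bonferroni_threshold(0.05, n_snps_amp1)
neg_lg_cutoff = -math.log10(raw_cutoff)     # height of the Manhattan-plot line
print(f"raw P cutoff = {raw_cutoff:.3g}, -lg(P) cutoff = {neg_lg_cutoff:.2f}")
```

Under this assumption, a SNP must reach a -lg(P) of roughly 6.8 to cross the threshold line on the Manhattan plot.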
A meta-analysis was performed on the association analysis outputs from the two GWAS by employing a random-effects model combining the effect sizes of each common SNP analyzed in the two studies, using SVS V_8.6.0. Variance components for random effects, one for each marker, were computed using the DerSimonian-Laird approach [41]. Genomic control correction [42] was applied to each GWAS before the meta-analysis and to the overall results after meta-analysis. A total of 312,010 SNPs that were common to the two GWAS were used in the meta-analysis. Heterogeneity of effect sizes across the two studies was assessed using Cochran's Q and I² statistics. Selected SNPs from the meta-analysis were plotted on a forest plot to visualize the overall effect size estimate and heterogeneity between the two GWAS.
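The random-effects pooling described above can be sketched for a single SNP as follows. This is a minimal illustration of the standard DerSimonian-Laird estimator with Cochran's Q and I²; the function name and the example effect sizes and standard errors are hypothetical, and the genomic control step is omitted.

```python
import math


def dersimonian_laird(effects, std_errors):
    """Random-effects meta-analysis of one SNP across studies.

    Returns the pooled effect, its standard error, Cochran's Q,
    I2 (percent) and the between-study variance tau2.
    """
    w = [1.0 / se ** 2 for se in std_errors]            # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * b for wi, b in zip(w, effects)) / sw
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0     # DL moment estimator
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * b for wi, b in zip(w_star, effects)) / sum(w_star)
    return {"effect": pooled, "se": math.sqrt(1.0 / sum(w_star)),
            "Q": q, "I2": i2, "tau2": tau2}


# Hypothetical SNP with similar effects in both panels: no heterogeneity.
homog = dersimonian_laird([-0.2, -0.3], [0.1, 0.1])
# Hypothetical SNP with opposite, precisely estimated effects: strong heterogeneity.
heter = dersimonian_laird([0.2, -0.3], [0.05, 0.05])
print(homog["I2"], round(heter["I2"], 1))
```

When Q does not exceed its degrees of freedom, tau2 and I² are both truncated at zero and the random-effects estimate coincides with the fixed-effects one.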
Genomic best linear unbiased prediction (GBLUP) was carried out using the standard method employing the genomic relationship matrix generated using overall normalization [37]. The prediction was carried out within AMPs, with and without feature selection, as implemented in SVS v8.6 (Golden Helix). A 10-fold cross-validation scheme with 10 iterations was used to generate the training and validation sets and assess the prediction accuracy for different scenarios. For GP with feature selection, the 27 significant SNPs from AMP1, the two significant SNPs from AMP2 and the 19 most significant SNPs from the meta-analysis were pre-selected. In the case of GP without feature selection (‘Random'), 3000 SNPs distributed randomly in the genome were selected. The average value of the correlations between the phenotype and the genomic estimated breeding values was defined as genomic prediction ability. For all the genomic prediction analyses, the training dataset and validation dataset were independent.
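The cross-validation scheme above can be sketched on simulated data. This is an illustration under several assumptions and not the SVS procedure: the relationship matrix uses the VanRaden form, the shrinkage ratio is fixed from an assumed heritability `h2` instead of being estimated, and all function names and the simulated panel are hypothetical.

```python
import numpy as np


def vanraden_grm(markers):
    """Genomic relationship matrix G = ZZ' / (2 * sum p(1-p)) from 0/1/2 dosages."""
    m = np.asarray(markers, dtype=float)
    p = m.mean(axis=0) / 2.0                  # allele frequency per SNP
    z = m - 2.0 * p                           # centred dosages
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


def gblup_cv_ability(y, grm, h2=0.5, k=10, repeats=10, seed=0):
    """Mean correlation between GEBV and phenotype over repeated k-fold CV."""
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n = y.size
    lam = (1.0 - h2) / h2                     # assumed residual/genetic variance ratio
    cors = []
    for _ in range(repeats):
        for test in np.array_split(rng.permutation(n), k):
            train = np.setdiff1d(np.arange(n), test)
            mu = y[train].mean()
            lhs = grm[np.ix_(train, train)] + lam * np.eye(train.size)
            alpha = np.linalg.solve(lhs, y[train] - mu)
            gebv = mu + grm[np.ix_(test, train)] @ alpha   # predict held-out lines
            cors.append(np.corrcoef(gebv, y[test])[0, 1])
    return float(np.mean(cors))


# Simulated panel (hypothetical): 150 inbred lines, 100 homozygous SNPs.
rng = np.random.default_rng(1)
markers = 2.0 * rng.integers(0, 2, size=(150, 100))
signal = markers @ rng.normal(size=100)
y = signal + rng.normal(scale=signal.std(), size=150)      # heritability near 0.5
ability = gblup_cv_ability(y, vanraden_grm(markers), h2=0.5)
print(round(ability, 2))
```

Phenotypes of the held-out fold are never used when solving for `alpha`, which is what keeps the training and validation sets independent within each fold.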
Most of the genotypes in AMP1 showed HIR in the range of 4%-10%, while the majority of the genotypes in AMP2 showed HIR in the range of 4%-8% (Fig. 1). In both populations, a few genotypes showed very low HIR (<2%) or very high HIR (>14%) (Fig. 1). The fifteen best inbred lines for HIR from each AMP are presented in Table S1. Among the white and yellow inbred lines, the latter showed higher mean HIR compared to the former in both panels (Fig. S1-A). Inbreds adapted to lowland tropical environments tended to show higher HIR compared to sub-tropical inbreds in both panels (Fig. S1-B). No clear pattern emerged when we compared the HIR of inbreds classified under the CIMMYT heterotic groups A and B (Fig. S1-C). In AMP1, heterotic group A showed higher mean HIR compared to heterotic group B. The opposite trend was observed in AMP2. AMP2 consists of inbreds developed from two important source germplasm, namely DTP and LPS. The inbreds derived from DTP showed higher mean HIR compared to LPS (Fig. S1-D). HIR assessed in different inbreds in both panels showed high heritability (>0.8) (Fig. 1). Among the agronomic traits assessed in the inbred lines, only plant height and ear aspect were correlated with HIR (Table S2). Days to silking, ear height and ear rot scores were not correlated with HIR.
Principal component analysis revealed a moderate population structure in the AMP1 and AMP2 datasets (Fig. 2). About 10 eigenvectors were required to explain about 50% of the variance in both panels. For AMP1, the first two eigenvectors separated two general clusters of tropical and sub-tropical lines. The tropical lines from Latin America, Africa, and Asia were not separated from each other (Fig. 2a). For AMP2, the first two principal components explained 9.8% and 5.7% of variation and did not clearly separate the sub-tropical and tropical lines (Fig. 2b). The genome-wide LD decay, plotted as LD (r²) between adjacent pairs of markers versus distance in kb, showed that average LD decay was 21.27 kb at r² = 0.1 and 7.38 kb at r² = 0.2 for AMP1 (Fig. 3a) and 33.05 kb at r² = 0.1 and 11.47 kb at r² = 0.2 for AMP2 (Fig. 3b). The LD in AMP1, which was representative of the CIMMYT elite and breeding lines, seemed to decline faster than in AMP2, which was primarily composed of inbred lines with better adaptation to the African tropics/sub-tropics.
Fig. 1 - Phenotypic distribution and components of variance for HIR in two AMPs. HIR, haploid induction rate; AMP, association mapping panel.
The single locus mixed model analysis on AMP1 identified 27 SNPs significantly associated with HIR at a Bonferroni-corrected P <0.05 (Table 1, Fig. 4a) on all the chromosomes. SNP S6_156591426 on chromosome 6 showed the strongest association with the least P-value (1.47E-11) and Bonferroni-corrected P-value (4.98E-06). Phenotypic variance explained by individual significant SNPs ranged from 6.30% to 10.10%. From the analysis of AMP2, two SNPs were identified as significantly associated with HIR at this threshold, and both were located around 128.5 Mb on chromosome 6 (B73 AGP V2) (Table 1, Fig. 4b). Favorable alleles from 20 of the 27 SNPs identified in AMP1 were low in frequency in the panel, and the two SNPs identified in AMP2 were also low in frequency among the lines in that panel. The SNP with the most significant effect on HIR identified from AMP1 explained 10% of the phenotypic variance, and in AMP2 the most significant SNP identified, S6_128560933, explained 17% of the variation. Multiple regression analysis including all significant SNPs from the analyses explained 42% and 53% of the phenotypic variation for the trait in AMP1 and AMP2, respectively (data not shown).
Fig. 2 - Two-dimensional PC plot based on the first two principal components of AMP1 (A) and AMP2 (B) used in GWAS. The color codes for the germplasm groups are shown within the figures.
Fig. 3 - Linkage disequilibrium (LD) plot representing the average genome-wide LD decay in AMP1 (A) and AMP2 (B) with genome-wide markers. The values on the Y-axis represent the squared correlation coefficient r², and the X-axis represents the physical distance in kilobases (kb).
Meta-analysis was conducted using the GWAS results obtained from AMP1 and AMP2, after applying genomic control corrections. Fifty-two SNPs were found to have significant effects in both GWAS and the meta-analysis (Table S3, Fig. 5). These SNPs had an I² value of zero, showing absence of heterogeneity between the SNP effects in both GWAS. The SNP with the highest strength of association was S5_205925729, on chromosome 5, with effect sizes of 0.23 and -0.31 in the AMP1 and AMP2 analysis, respectively, and a combined effect size of -0.28 in the meta-analysis. Six SNPs identified in the meta-analysis were found to be in chromosomal bins previously reported as having QTL for this trait (Table S3, Fig. S2).
GP within the panels using SNPs associated with HIR and random SNPs in 10-fold cross-validation analysis revealed moderate predictive ability. Average prediction ability based on random SNPs was 0.41 and 0.33 in AMP1 and AMP2, respectively (Fig. 6). Further, inclusion of HIR-associated SNPs in the prediction model improved the mean accuracies to 0.49 and 0.44 in AMP1 and AMP2, respectively. Overall, the prediction ability was improved by including the HIR-associated significant SNPs with the random SNPs.
Table 1 - Chromosomal position and SNPs significantly associated with HIR detected by GWAS in two association mapping panels.
In vivo haploid induction based on maternal haploid inducers is the backbone of doubled haploid line development in maize [14]. The ability of the haploid inducer to induce haploids and the ability of the source germplasm to be induced for haploids can be determined based on the HIR obtained in a specific inducer and source germplasm combination. Determining the HIR typically involves crossing the source germplasm as female parent with a haploid inducer as male parent and differentiating the haploids from the diploids based on R1-nj color marker expression in the seeds (induced progeny). However, R1-nj color marker expression can be inhibited in some of the tropical inbreds [43] and, even when expressed, can potentially lead to a significant proportion of false positives and false negatives [9,31,44]. Hence, we did not rely on R1-nj marker expression for determination of HIR in this study. The experiments in this study used a more reliable and accurate method based on plant traits for HIR assessment, even though it is resource-intensive and requires a large field area planted under trials. The method of haploid/diploid identification based on plant traits was used as a gold standard for determination of HIR in several previous studies [4,16,31,32,44].
Fig. 4 - Highly significant SNPs for maternal genetic influence on HIR identified from the MLM (Q + K) model, represented in Manhattan plots with the individual SNPs on the X-axis and the -lg(P-value) of each SNP on the Y-axis for AMP1 (A) and AMP2 (B). The threshold line represents Bonferroni P <0.05.
Fig. 5 - Manhattan plot showing SNP associations with maternal genetic influence on HIR in the meta-analysis (top panel) and the individual GWAS of AMP1 and AMP2. The red dots are SNPs identified in the meta-analysis; the blue dots are from the AMP1 GWAS; and the green dots are from the AMP2 GWAS.
Fig. 6 - Box plot showing genomic prediction ability from 10-fold cross-validation using 3000 random SNPs and HIR-associated SNPs identified in AMP1 and AMP2. The boxes represent the first and third quartiles, and the median is represented by a short black line within the box. The lines extending from the boxes to the horizontal bars represent the distance to the maximum and minimum observations.
This is the first large-scale study conducted using tropical maize germplasm to elucidate the influence of the seed parents on the in vivo haploid induction rates, besides identification of putative genomic regions in the maternal parents influencing HIR. Analysis of HIR in the inbreds comprising the two AMPs used in this study confirmed significant influence of the maternal genotype, as was also observed in earlier studies [9,22,23]. In addition, the analysis also revealed huge variation for haploid inducibility in tropical inbred lines with a few lines showing very high HIR and a few lines showing very low HIR,while majority showed moderate HIR.Average HIR observed in each AMP was in a similar range reported earlier for the respective haploid inducers[31].
HIR data generated in this study on elite tropical inbred lines is valuable for tropical maize breeding programs in constitution/selection of relevant source populations for DH line development, as well as optimal utilization of resources in haploid induction. For example, in populations developed from inbred lines that respond poorly to haploid induction,more plants can be planted and pollinated with inducer pollen to obtain the minimum number of haploids required for production of a desired number of DH lines. The observation that white maize germplasm and lowland tropical germplasm respond more favorably for haploid induction compared to yellow maize germplasm and subtropical germplasm respectively indicates a need for careful planning of the induction nurseries when using different germplasm groups.This study also indicated that HIR is correlated with plant height and ear aspect of the inbred lines,indicating that vigorous plants with big ears and good grain filling may show higher inducibility than less vigorous plants with poorly filled ears.These results are corroborated by the observations that HIR was higher in agronomically improved lines than those of unimproved lines[45]and that HIR was higher in tropical single-crosses derived from elite lines than in landraces and open-pollinated varieties [9]. Days to silking did not show any correlation with HIR indicating that maturity group of the germplasm does not influence haploid inducibility.
The present study also reports inbred lines that respond favorably to haploid induction showing about twice the average HIR of all the inbreds assessed in the respective panel. Similar to previous studies, this study also revealed that the response to haploid induction is highly heritable with significant genetic variance [23]. A recent diallel analysis for HIR trait using temperate germplasm indicated that HIR is mostly controlled by additive genetic factors [24]. Predominant role of general combining ability rather than specific combining ability was also observed for HIR in both temperate and tropical germplasm [23,24]. Together, these results indicate that haploid inducibility in maize germplasm can be improved through selection and genotypes showing favorable response to haploid induction can be effectively used for improving the haploid inducibility. It was also proposed that the germplasm with high inducibility can be effectively used in germplasm enhancement projects to cross with unadapted germplasm to improve haploid inducibility and reduce haploid/diploid misclassification rates [24]. Thus, in addition to improving the HIR in the haploid inducers,it is also possible to improve the haploid inducibility of the elite or nonelite germplasm for enhancing the efficiency of DH line production.
Previously, genetic analyses of haploid induction mainly focused on haploid inducers and led to identification of several QTL responsible for haploid induction through linkage mapping [13,14] and association mapping [15] approaches. The gene underlying the major QTL identified in these studies encodes a pollen-specific phospholipase, and a frame-shift mutation in this gene in haploid inducers is critical for conditioning haploid induction [18-20]. Other QTL identified in haploid inducers may be modifiers of this gene function [14,15]. To understand the genetic architecture of the haploid induction trait fully, identification of maternal genetic modifiers is also important, as haploid induction is significantly influenced by the female parent as described earlier. Using a linkage mapping population, Wu et al. [26] identified QTL on chromosomes 1 and 3 in the maternal genome having considerable effect on HIR. In this study, the GWAS of the two panels identified a total of 29 SNPs significantly associated with HIR. Individual significantly associated SNPs explained phenotypic variation as high as 17% in AMP2 and 10% in AMP1. All the significant SNPs from both GWAS together explained 42% and 53% of the phenotypic variation for the trait in AMP1 and AMP2, respectively. These results indicate that maternal genome influence on haploid induction could be controlled by genes with moderate to large effects. This is a unique observation, as most of the traits for which GWAS have been conducted in maize, apart from biochemical traits and resistance to some diseases, are controlled by large numbers of small-effect loci which typically explain <9% of the total phenotypic variation [46]. The significant associations detected from the two panels were not common, probably owing to different genomic regions/loci being responsible for maternal effects leading to haploid induction.
The two panels were composed of different sets of tropical/sub-tropical germplasm, with AMP1 being predominantly CIMMYT elite breeding lines developed for different tropical environments,including Latin America,Africa and Asia,and AMP2 composed of lines predominantly adapted to African tropics/sub-tropics.In addition, AMP1 and AMP2 were induced for haploids in different years;hence,different environments could also have influenced the HIR differently in each set of inbred lines.Several candidate genes and the functional domains in those genes were also identified in the study.However,we could not specifically associate their reported functions with the maternal genetic influence on HIR.
To increase the power of QTL detection and identify common alleles with small effects, a meta-analysis was conducted on the two GWAS. In this study, heterogeneity in genetic effects was allowed between the two panels of inbreds, primarily due to the use of different haploid inducers, and hence a random-effects model was adopted in the meta-analysis to overcome this limitation [47]. Some of the SNPs identified from the individual GWAS and the meta-analysis on chromosomes 1 and 3 fell within the physical co-ordinates of the marker intervals of the QTL identified by Wu et al. [26]. On chromosome 3, Wu et al. [26] reported a candidate gene, Centromere protein c1 (Cen pc1), estimated to be located between positions 217,200,846 and 217,549,349 base pairs. CENPC proteins play an important role in DNA binding reactions and hence Cen pc1 may be related to haploid induction [48,49]. In our study, one of the SNPs identified through meta-analysis was found to be located within the reported physical co-ordinates of Cen pc1. Apart from these, several novel genomic regions influencing haploid inducibility were identified through GWAS followed by meta-analysis. Genetic analysis of maternal parent influence on in vivo haploid induction in the present study revealed that the genomic regions controlling maternal haploid induction in the haploid inducers are not co-localized with the genomic regions in the maternal parents influencing haploid induction rate.
Even though considerable progress has been made on understanding the genetic basis of haploid induction, the biological mechanism of in vivo maternal haploid induction is still elusive. Two possible mechanisms discussed are parthenogenetic induction of the egg cell into embryo without fertilization, and normal fertilization but elimination of inducer chromosomes from the developing embryo. Single fertilization and parthenogenetic induction of haploids were reported in several studies [50-53]. Elimination of paternal chromosomes was also observed in other studies [54-57].Since experimental evidence exists for both proposed mechanisms,the maternal genomic regions identified in this study influencing HIR could be involved with either of the abovementioned mechanisms of haploid induction.These maternal modifiers may possibly be involved in influencing pollen competitive ability,fertilization and seed development.
The efficiency of maize breeding programs can be enhanced through GP. Numerous studies in maize have shown the importance of trait heritability, training population size, number of markers used and training population design, along with trait architecture, in improving the prediction ability [30,58-61]. The potential of GP has been assessed for simple as well as complex traits in maize [60] and also implemented for various traits in CIMMYT maize breeding programs [61-63]. The prediction ability obtained in this study is high enough to warrant GP for HIR in source germplasm in practical breeding programs. The mean prediction abilities of 0.41 and 0.33 in AMP1 and AMP2, respectively, are in accordance with previous studies of moderately complex traits such as haploid male fertility [64], ear rot [64], maize lethal necrosis [65,66] and northern corn leaf blight [67]. It has been suggested in many GP studies in maize that a population with a broad genetic base, like that of association mapping panels compared to breeding populations, shows lower prediction ability [61,66]. The observed differences in the prediction ability in the two panels studied here could be due to their differences in sample size, genetic variance, trait heritability, population structure and LD estimates. Inclusion of HIR-associated markers in the prediction model led to a substantial increase in the prediction ability in AMP1 and AMP2 (0.49 and 0.44, respectively), although this could be marginally over-estimated as GWAS and GP were carried out within the same panels. The significant SNPs associated with the trait explained 42% and 53% of its variation, indicating that the prediction ability could be attributable to the moderate to large effect QTL detected in this study and a few other small-effect QTL distributed across the genome.
Overall, this study indicated that the maternal parent's influence on in vivo haploid induction is a highly heritable trait with significant genetic variation. Therefore, this trait can be potentially improved through selection in elite breeding materials. The results from the GWAS and GP indicated that maternal influence on haploid induction is a less complex trait controlled by a few genomic regions with moderate to large effect size along with several small effect ones. Inbreds identified with very high and very low HIR can be possibly used in further genetic studies to validate the genomic regions influencing the maternal parent influence on in vivo haploid induction in maize.
Declaration of competing interest
Authors declare that there are no conflicts of interest.
Acknowledgments
The authors thank Raman Babu for his inputs on the experimental design; Leocadio Martinez and Luis Antonio Lopez for their assistance in the field experiments and data collection; Juan Burgueño for data analysis; and Rafael Venado and Guadalupe Berber for their support in sample collection and DNA extraction. The Bill and Melinda Gates Foundation (BMGF), United States of America, supported this work through the projects “A Double Haploid Facility for Strengthening Maize Breeding Programs in Africa” (grant ID: OPP1028335) and “Stress Tolerant Maize for Africa (STMA)” (grant ID: OPP1134248). Additional support came from the CGIAR Research Program on Maize (MAIZE) and the MasAgro Project funded by the Secretariat of Agriculture, Livestock, Rural Development, Fisheries and Food (SAGARPA), Government of Mexico. MAIZE receives W1 & W2 support from the Governments of Australia, Belgium, Canada, China, France, India, Japan, the Republic of Korea, Mexico, the Netherlands, New Zealand, Norway, Sweden, Switzerland, the United Kingdom, the United States of America, and the World Bank.
Appendix A. Supplementary data
Supplementary data for this article can be found online at https://doi.org/10.1016/j.cj.2019.09.008.
R E F E R E N C E S
[1] W. Schmidt, Hybrid maize breeding at KWS SAAT AG, Proceedings of the Annual Meeting of the Austrian Seed Association, Gumpenstein, Austria, Nov. 25-27, 2003, pp. 1-6 (in German).
[2] T. Cupka, Making the most of double haploid breeding in line development, 49th Illinois Corn Breeders School Proceedings, Urbana-Champaign, Illinois, USA, 2013, pp. 1-27.
[3] C. Liu, W. Li, Y. Zhong, X. Dong, H. Hu, X. Tian, L. Wang, B. Chen, C. Chen, A. Melchinger, S. Chen, Fine mapping of qhir8 affecting in vivo haploid induction in maize, Theor. Appl. Genet. 128 (2015) 2507-2515.
[4] A.E. Melchinger, W. Schipprack, T. Würschum, S. Chen, F. Technow, Rapid and accurate identification of in vivo-induced haploid seeds based on oil content in maize, Sci. Rep. 3 (2013) 2129.
[5] B.M. Prasanna, Doubled haploid technology in maize breeding: an overview, in: B.M. Prasanna, V. Chaikam, G. Mahuku (Eds.), Doubled Haploid Technology in Maize Breeding: Theory and Practice, CIMMYT, Mexico, D.F., Mexico, 2012, pp. 1-8.
[6] V. Prigge, A.E. Melchinger, Production of haploids and doubled haploids in maize, in: V.M. Loyola-Vargas, N. Ochoa-Alejo (Eds.), Plant Cell Culture Protocols, Methods in Molecular Biology (Methods and Protocols), vol. 877, Humana Press, Totowa, NJ, USA, 2012, pp. 161-172.
[7] E.H. Coe Jr., A line of maize with high haploid frequency, Am. Nat. 93 (1959) 381-382.
[8] F.K. Röber, G.A. Gordillo, H.H. Geiger, In vivo haploid induction in maize - performance of new inducers and significance of doubled haploid lines in hybrid breeding, Maydica 50 (2005) 275-283.
[9] V. Prigge, C. Sánchez, B.S. Dhillon, W. Schipprack, J.L. Araus, M. Bänziger, A.E. Melchinger, Doubled haploids in tropical maize: I. Effects of inducers and source germplasm on in vivo haploid induction rates, Crop Sci. 51 (2011) 1498-1506.
[10] V. Prigge, W. Schipprack, G. Mahuku, G.N. Atlin, A.E. Melchinger, Development of in vivo haploid inducers for tropical maize breeding programs, Euphytica 185 (2012) 481-490.
[11] V. Chaikam, S. Nair, L. Martinez, L. Lopez, H.F. Utz, A. Melchinger, P. Boddupalli, Marker-assisted breeding of improved maternal haploid inducers in maize for the tropical/subtropical regions, Front. Plant Sci. 9 (2018) 1527.
[12] S. Deimling, F. Röber, H.H. Geiger, Methodology and genetics of in vivo haploid induction in maize, Vortr. Pflanzenzüchtg 38 (1997) 203-204 (in German).
[13] P. Barret, M. Brinkmann, M. Beckert, A major locus expressed in the male gametophyte with incomplete penetrance is responsible for in situ gynogenesis in maize, Theor. Appl. Genet. 117 (2008) 581-594.
[14] V. Prigge, X. Xu, L. Li, R. Babu, S. Chen, G.N. Atlin, A.E. Melchinger, New insights into the genetics of in vivo induction of maternal haploids, the backbone of doubled haploid technology in maize, Genetics 190 (2012) 781-793.
[15] H. Hu, T.A. Schrag, R. Peis, S. Unterseer, W. Schipprack, S. Chen, J. Lai, J. Yan, B.M. Prasanna, S.K. Nair, V. Chaikam, V. Rotarenco, O.A. Shatskaya, A. Zavalishina, S. Scholten, C.C. Schön, A.E. Melchinger, The genetic basis of haploid induction in maize identified with a novel genome-wide association method, Genetics 202 (2016) 1267-1276.
[16] X. Dong, X. Xu, J. Miao, L. Li, D. Zhang, X. Mi, C. Liu, X. Tian, A.E. Melchinger, S. Chen, Fine mapping of qhir1 influencing in vivo haploid induction in maize, Theor. Appl. Genet. 126 (2013) 1713-1720.
[17] S.K. Nair, W. Molenaar, A.E. Melchinger, P.M. Boddupalli, L. Martinez, L.A. Lopez, V. Chaikam, Dissection of a major QTL qhir1 conferring maternal haploid induction ability in maize, Theor. Appl. Genet. 130 (2017) 1113-1122.
[18] T. Kelliher, D. Starr, L. Richbourg, S. Chintamanani, B. Delzer, M.L. Nuccio, J. Green, Z. Chen, J. McCuiston, W. Wang, T. Liebler, P. Bullock, B. Martin, MATRILINEAL, a sperm-specific phospholipase, triggers maize haploid induction, Nature 542 (2017) 105-109.
[19] L.M. Gilles, A. Khaled, J. Laffaire, S. Chaignon, G. Gendrot, J. Laplaige, H. Bergès, G. Beydon, V. Bayle, P. Barret, J. Comadran, J. Martinant, P.M. Rogowsky, T. Widiez, Loss of pollen-specific phospholipase NOT LIKE DAD triggers gynogenesis in maize, EMBO J. 36 (2017) 707-717.
[20] C. Liu, X. Li, D. Meng, Y. Zhong, C. Chen, X. Dong, X. Xu, B. Chen, W. Li, L. Li, X. Tian, H. Zhao, W. Song, H. Luo, Q. Zhang, J. Lai, W. Jin, J. Yan, S. Chen, A 4-bp insertion at ZmPLA1 encoding a putative phospholipase A generates haploid induction in maize, Mol. Plant 10 (2017) 520-522.
[21] M.A. Aman, K.R. Sarkar, Selection for haploidy inducing potential in maize, Indian J. Genet. Plant Breed. 38 (1978) 452-457.
[22] J. Eder, S. Chalyk, In vivo haploid induction in maize, Theor. Appl. Genet. 104 (2002) 703-708.
[23] A.Z. Kebede, B.S. Dhillon, W. Schipprack, J.L. Araus, M. Bänziger, K. Semagn, G. Alvarado, A.E. Melchinger, Effect of source germplasm and season on the in vivo haploid induction rate in tropical maize, Euphytica 180 (2011) 219-226.
[24] G.N. de la Fuente, U.K. Frei, B. Trampe, D. Nettleton, W. Zhang, T. Lübberstedt, A diallel analysis of a maize donor population response to in vivo maternal haploid induction: I. Inducibility, Crop Sci. 58 (2018) 1830-1837.
[25] T.N. Satarova, V.Y. Cherchel, Inheritance of matroclinal haploidy in maize, Cytol. Genet. 44 (2010) 155-159.
[26] P. Wu, H. Li, J. Ren, S. Chen, Mapping of maternal QTLs for in vivo haploid induction rate in maize (Zea mays L.), Euphytica 196 (2014) 413-421.
[27] B.A. Olukolu, A. Negeri, R. Dhawan, B.P. Venkata, P. Sharma, A. Garg, E. Gachomo, S. Marla, K. Chu, A. Hasan, A connected set of genes associated with programmed cell death implicated in controlling the hypersensitive response in maize, Genetics 193 (2013) 609-620.
[28] A. Korte, A. Farlow, The advantages and limitations of trait analysis with GWAS: a review, Plant Methods 9 (2013) 29.
[29] E. Evangelou, J.P.A. Ioannidis, Meta-analysis methods for genome-wide association studies and beyond, Nat. Rev. Genet. 14 (2013) 379.
[30] P. Schopp, D. Müller, F. Technow, A.E. Melchinger, Accuracy of genomic prediction in synthetic populations depending on the number of parents, relatedness, and ancestral linkage disequilibrium, Genetics 205 (2017) 441-454.
[31] V. Chaikam, L. Martinez, A.E. Melchinger, W. Schipprack, P.M. Boddupalli, Development and validation of red root marker-based haploid inducers in maize, Crop Sci. 56 (2016) 1678-1688.
[32] V. Chaikam, L.A. Lopez, L. Martinez, J. Burgueño, P.M. Boddupalli, Identification of in vivo induced maternal haploids in maize using seedling traits, Euphytica 213 (2017) 177.
[33] CIMMYT, Laboratory Protocols: CIMMYT Applied Molecular Genetics Laboratory, 3rd edition, CIMMYT, Mexico, D.F., Mexico, 2005.
[34] R.J. Elshire, J.C. Glaubitz, Q. Sun, J.A. Poland, K. Kawamoto, E.S. Buckler, S.E. Mitchell, A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species, PLoS One 6 (2011) e19379.
[35] M.C. Romay, M.J. Millard, J.C. Glaubitz, J.A. Peiffer, K.L. Swarts, T.M. Casstevens, R.J. Elshire, C.B. Acharya, S.E. Mitchell, S.A. Flint-Garcia, Comprehensive genotyping of the USA national maize inbred seed bank, Genome Biol. 14 (2013) R55.
[36] A.L. Price, N.J. Patterson, R.M. Plenge, M.E. Weinblatt, N.A. Shadick, D. Reich, Principal components analysis corrects for stratification in genome-wide association studies, Nat. Genet. 38 (2006) 904.
[37] P.M. VanRaden, Efficient methods to compute genomic predictions, J. Dairy Sci. 91 (2008) 4414-4423.
[38] The R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2013.
[39] W.G. Hill, B.S. Weir, Variances and covariances of squared linkage disequilibria in finite populations, Theor. Popul. Biol. 33 (1988) 54-78.
[40] H.M. Kang, Efficient Mixed-Model Association eXpedited (EMMAX), University of California, Los Angeles, CA, USA, 2010.
[41] V. Hindu, N. Palacios-Rojas, R. Babu, W.B. Suwarno, Z. Rashid, R. Usha, G.R. Saykhedkar, S.K. Nair, Identification and validation of genomic regions influencing kernel zinc and iron in maize, Theor. Appl. Genet. 131 (2018) 1-15.
[42] G. Zheng, B. Freidlin, J.L. Gastwirth, Robust genomic control for association studies, Am. J. Hum. Genet. 78 (2006) 350-356.
[43] V. Chaikam, S.K. Nair, R. Babu, L. Martinez, J. Tejomurtula, P.M. Boddupalli, Analysis of effectiveness of R1-nj anthocyanin marker for in vivo haploid identification in maize and molecular markers for predicting the inhibition of R1-nj expression, Theor. Appl. Genet. 128 (2015) 159-171.
[44] A.E. Melchinger, W. Schipprack, H. Friedrich Utz, V. Mirdita, In vivo haploid induction in maize: identification of haploid seeds by their oil content, Crop Sci. 54 (2014) 1497-1504.
[45] S.S. Chase, Monoploids and monoploid-derivatives of maize (Zea mays L.), Bot. Rev. 35 (1969) 117-168.
[46] J.G. Wallace, P.J. Bradbury, N. Zhang, Y. Gibon, M. Stitt, E.S. Buckler, Association mapping across numerous traits reveals patterns of functional variation in maize, PLoS Genet. 10 (2014) e1004845.
[47] F. Begum, D. Ghosh, G.C. Tseng, E. Feingold, Comprehensive literature review and statistical considerations for GWAS meta-analysis, Nucleic Acids Res. 40 (2012) 3777-3784.
[48] Y. Du, C.N. Topp, R.K. Dawe, DNA binding of centromere protein C (CENPC) is stabilized by single-stranded RNA, PLoS Genet. 6 (2010) e1000835.
[49] E. Screpanti, A. de Antoni, G.M. Alushin, A. Petrovic, T. Melis, E. Nogales, A. Musacchio, Direct binding of Cenp-C to the Mis12 complex joins the inner and outer kinetochore, Curr. Biol. 21 (2011) 391-398.
[50] K.R. Sarkar, E.H. Coe Jr., A genetic analysis of the origin of maternal haploids in maize, Genetics 54 (1966) 453.
[51] M. Swapna, K.R. Sarkar, Anomalous fertilization in haploidy inducer lines in maize (Zea mays L.), Maydica 56 (2012) 221-225.
[52] N.K. Enaleeva, V.S. Tyrnov, L.P. Selivanova, A.N. Zavalishina, Single fertilization and the problem of haploidy induction in maize, Doklady Biol. Sci. 353 (1996) 225-226.
[53] X. Tian, Y. Qin, B. Chen, C. Liu, L. Wang, X. Li, X. Dong, L. Liu, S. Chen, Hetero-fertilization together with failed egg-sperm cell fusion supports single fertilization involved in in vivo haploid induction in maize, J. Exp. Bot. 69 (2018) 4689-4701.
[54] Z. Zhang, F. Qiu, Y. Liu, K. Ma, Z. Li, S. Xu, Chromosome elimination and in vivo haploid production induced by Stock 6-derived inducer line in maize (Zea mays L.), Plant Cell Rep. 27 (2008) 1851-1860.
[55] X. Li, D. Meng, S. Chen, H. Luo, Q. Zhang, W. Jin, J. Yan, Single nucleus sequencing reveals spermatid chromosome fragmentation as a possible cause of maize haploid induction, Nat. Commun. 8 (2017) 991.
[56] L. Li, X. Xu, W. Jin, S. Chen, Morphological and molecular evidences for DNA introgression in haploid induction via a high oil inducer CAUHOI in maize, Planta 230 (2009) 367-376.
[57] F. Qiu, Y. Liang, Y. Li, Y. Liu, L. Wang, Y. Zheng, Morphological, cellular and molecular evidences of chromosome random elimination in vivo upon haploid induction in maize, Curr. Plant Biol. 1 (2014) 83-90.
[58] E. Combs, R. Bernardo, Accuracy of genomewide selection for different traits with constant population size, heritability, and number of markers, Plant Genome 6 (2013) 1-7.
[59] C. Riedelsheimer, J.B. Endelman, M. Stange, M.E. Sorrells, J.L. Jannink, A.E. Melchinger, Genomic predictability of interconnected bi-parental maize populations, Genetics 194 (2013) 493-503.
[60] J. Crossa, P. Pérez-Rodríguez, J. Cuevas, O. Montesinos-López, D. Jarquín, G. de los Campos, J. Burgueño, J.M. Camacho-González, S. Pérez-Elizalde, Y. Beyene, Genomic selection in plant breeding: methods, models, and perspectives, Trends Plant Sci. 22 (2017) 961-975.
[61] X. Zhang, P. Pérez-Rodríguez, J. Burgueño, M. Olsen, E. Buckler, G. Atlin, B.M. Prasanna, M. Vargas, F. San Vicente, J. Crossa, Rapid cycling genomic selection in a multi-parental tropical maize population, G3-Genes Genomes Genet. 7 (2017) 2315-2326.
[62] Y. Beyene, K. Semagn, S. Mugo, A. Tarekegne, R. Babu, B. Meisel, P. Sehabiague, D. Makumbi, C. Magorokosho, S. Oikeh, Genetic gains in grain yield through genomic selection in eight bi-parental maize populations under drought stress, Crop Sci. 55 (2015) 154-163.
[63] B.S. Vivek, G.K. Krishna, V. Vengadessan, R. Babu, P.H. Zaidi, L.Q. Kha, S.S. Mandal, P. Grudloyma, S. Takalkar, K. Krothapalli, Use of genomic estimated breeding values results in rapid genetic gains for drought tolerance in maize, Plant Genome 10 (2016) 160070.
[64] H. Ma, G. Li, T. Würschum, Y. Zhang, D. Zheng, X. Yang, J. Li, W. Liu, J. Yan, S. Chen, Genome-wide association study of haploid male fertility in maize (Zea mays L.), Front. Plant Sci. 9 (2018) 974.
[65] M. Gowda, Y. Beyene, D. Makumbi, K. Semagn, M.S. Olsen, J.M. Bright, B. Das, S. Mugo, L.M. Suresh, B.M. Prasanna, Discovery and validation of genomic regions associated with resistance to maize lethal necrosis in four biparental populations, Mol. Breed. 38 (2018) 66.
[66] M. Gowda, B. Das, D. Makumbi, R. Babu, K. Semagn, G. Mahuku, M.S. Olsen, J.M. Bright, Y. Beyene, B.M. Prasanna, Genome-wide association and genomic prediction of resistance to maize lethal necrosis disease in tropical maize germplasm, Theor. Appl. Genet. 128 (2015) 1957-1968.
[67] F. Technow, A. Bürger, A.E. Melchinger, Genomic prediction of northern corn leaf blight resistance in maize with combined or separated training sets for heterotic groups, G3-Genes Genomes Genet. 3 (2013) 197-203.