Yuhui Hong,Yong Xio,N Song,Shousong Zhu,Rui Zho,Ke Li,Mengting Geng,Xiohui Yu,Honggng Wng,Wei Xi,*,Yinhu Chen,*
a Hainan Key Laboratory for Sustainable Utilization of Tropical Bioresource,College of Tropical Crops,Hainan University,Haikou 570228,Hainan,China
b Coconut Research Institute,Chinese Academy of Tropical Agricultural Sciences,Wenchang 571339,Hainan,China
c School of Bioengineering,Dalian University of Technology,Dalian 116024,Liaoning,China
Keywords:Cassava Ethylene response factor Expression profile Co-expression analysis Pathogen
ABSTRACT Cassava,Manihot esculenta Crantz(Me),is a major dietary source of calories for over 700 million people in tropical regions.The production of cassava is constantly threatened by cassava bacterial blight (CBB),caused by Xanthomonas axonopodis pv. manihotis (Xam).The gene resources for CBB-resistant breeding of cassava are limited.In model plant species,ethylene response factors play important roles in response to pathogen infection.In this study,cassava ethylene response factors(MeERFs)were identified and characterized as the first step in studying their potential for CBB-resistant breeding of cassava.In the cassava genome 155 MeERFs were identified,of which 23 were induced by Xam infection.The promoter regions of 204 genes harbored GCC-box that had the potential to interact with MeERFs.Using 37 transcriptomes derived from Xam infection treatment,four gene co-expression modules for the MeERFs and GCC-box containing genes were constructed.Six MeERFs were associated with two GCC-box containing genes:transcription initiation factor TFIIE subunit beta (MeTFIIE),and histone-lysine N-methyltransferase ASHR1 (MeASHR1).Dual-luciferase reporter assays showed that MeERF10 and MeERF58 positively regulated MeTFIIE; MeERF137 negatively regulated MeTFIIE; MeERF10 and MeERF137 positively regulated MeASHR1;and MeERF35 negatively regulated MeASHR1.The four MeERFs may mediate pathogen response by regulating the expression of the two GCC-box containing genes.
Cassava (2n=36,Manihot esculentaCrantz) originated in Brazil and is widely cultivated in the tropics.It is not only a staple source of dietary calories for over 700 million people but also provides proteins,vitamins,and micronutrients [1–4].Its storage roots can be used for bioenergy [5].More and more studies focus on yield improvement of cassava by identification and characterization of genes involved in yield development and disease resistance.However,only a few such genes have been identified for cassava breeding,although the full cassava genome sequence was released in 2012 [6].
Cassava bacterial blight(CBB)caused byXanthomonas axonopodispv.manihotis(Xam) is the most destructive disease in cassava production [7,8].Xamis classified as a vascular and foliar pathogenic species,and initially adheres to the host surface before entering the host through epidermal cell stomata and epidermal wounds during infection [9].The first visible symptom of CBB appears as a water-soaked lesion occurring a few days to a week after infection,followed by angular leaf spots,blight,wilt,exudation,and dieback symptoms [9].The disease leads to 50%–75%yield losses,depending on the environmental conditions [10,11].CBB was one of the main reasons for famine in Zaire between 1970 and 1975 when the loss of cassava production in central Africa was as high as 80% [12,13].Identification and characterization of genes responding toXaminfection of cassava will provide genetic resources for breeding cassava cultivars with higher resistance to CBB.
In model plant species,ethylene response factors(ERFs),which belong to the superfamily APELATA 2/ERF (AP2/ERF),are involved in many signaling processes,including responses toXaminfection.TheERFsfunction as integrators of hormonal pathways involved in ethylene response [14] and brassinosteroid response [15] and are modulated by the salicylic acid(SA) and jasmonic acid (JA)signaling pathways[16,17].They play important roles in disease defense via transcriptional regulation of jasmonate (JA)/ethylene (ET)-responsive defense genes.For example,AtERF1promotes the expression of the ET-inducible defense genesPDF1.2andBASIC CHITINASE(b-CHI,also called PR3),which can inhibit microbial growth[18,19].Overexpression of theERFgenes in rice(OsERF83),tobacco (BrERF11),andArabidopsis(AtERF11) increased disease resistance [17,20,21].Phosphorylation of AtERF6 can activate defense genes inArabidopsis[22].However,the roles ofERFgenes in cassava disease resistance await discovery.
The ERFs act as regulators of hormonal pathways and participate in plant resistance to bacterial blight.The riceOsEREBP1acts as a downstream component of a signal transduction pathway in response of rice to infection byXanthomonas oryzaepv.oryzae(Xoo),andOsEREBP1-overexpressing plants showed reduced susceptibility toXoo[23].The cottonERFgeneGhERF-IIb3is a positive regulator of the jasmonic acid (JA) pathway and increased cotton defense response to bacterial infection via JA accumulation [24].MeRAV1andMeRAV2,belonging to the AP2/ERF gene family,increased cassava disease resistance against CBB by transcriptional regulation of the genes involved in melatonin biosynthesis [25].Thus,ERFgenes mediate disease resistance by regulating the expression of downstream defense genes.ERFgenes can bind to promoters containing the GCC-boxciselement,which is found in the promoter region of many genes involved in pathogenesis,such as those encoding β-1,3-glucanase,chitinase and osmotic proteins[26,27].These findings suggest the large-scale identification and characterization ofERFgenes and their potential targets in staple crops including cassava.
The objective of this study was to identify and characterizes cassava ethylene response factor genes (MeERFs) and their potential target genes as the first step in investigating their potential for CBB-resistance breeding of cassava.We performed a genomewide screening ofMeERFs and their potential targets,GCC-box cis-element containing genes.The co-expression ofMeERFs and their potential targets were analyzed by weighted gene coexpression network analysis (WGCNA) and further verified by dual-luciferase reporting system assays.Our results provide important information and experimental resources for further identification of the genes involved in disease resistance for cassava breeding.
Manihot esculentaSouth China 8 (SC8) and Kasetsart University 50 (KU50,with high starch content) were used.Tissue-cultured seedlings of SC8 were subcultured on MS medium at 25 °C under a 16 h(06:00 to 22:00)light/8 h dark cycle in a greenhouse at Hainan University (Haikou,Hainan,China) for three months.ForXaminfection assays,SC8 seedlings were inoculated withXanthomonas axonopodispv.Manihotis11 (Xam11) [28].The third to fourth leaves were sampled at 0 5,8,24,and 48 h after inoculation.As an intermediate in the biosynthesis of the plant hormone ethylene,1-aminocyclopropane-1-carboxylic acid (ACC) is widely used as a proxy for ethylene,given that nearly all plant tissues readily convert it to ethylene[29].Three sets of 18 seedling were treated with salicylic acid (SA,100 μmol L-1),jasmonic acid (JA,100 μmol L-1),and ACC (100 μmol L-1) treatment separately.The third to fourth leaves from the top were sampled at 0,0.5,1,2,4,and 6 h after treatment,using three seedlings for each sample as biological replicates.KU50 was grown in the experimental field of Hainan University.All samples were quick-frozen in liquid nitrogen for subsequent RNA isolation.
The assembled genomic sequences,predicted amino acid sequences,and transcript sequences of all cassava genes were obtained from Phytozome 12.1 Conserved domains of cassava protein-coding genes were identified with HMMER [30] based on PFAM databases (http://pfam.sanger.ac.uk) [31].PutativeMeERFs inM.esculentawere identified by protein–protein BLAST using the 139Arabidopsis ERF(AtERF) amino acid sequences (The Arabidopsis Information Resource,TAIR 10 database,E-value cutoff 10-10).The sequences of AP2/ERF domains in theMeERFs were further predicted by alignment with Conserved Domain Database(CDD) (http://www.ncbi.nlm.nih.gov/cdd).Multiple sequence alignments for the AP2/ERF domains of theMeERFs were used to construct a phylogenetic tree using MEGA 7.0[32]with 1000 bootstrap replicates.
The gene structures ofMeERFs were based on gene models downloaded from the Phytozome 12.1 website described above and conserved motifs predicted by MEME(http://meme-suite.org/-tools/meme)were visualized on the phylogenetic tree with ITOL 4(https://itol.embl.de/) [33].All the protein-coding genes from cassava were aligned using BLAST against itself,which was locally formatted as a subject database by the BLAST software,with a cutoff of 1e-10.The BLAST results were processed with MCScanX[34]and duplication events in all identifiedMeERFs were identified by putative homologous chromosomal region determination.
Ten Sequence Read Archives(SRAs)for cassava root and leaf tissues and 37 SRAs for cassava leaves infected withXamwere retrieved from the National Center for Biotechnology Information(Table S1).The hista2 software [35]was used with default parameters to map RNA-seq reads and StringTie software [36] with default parameters was used to assemble transcripts and compute reads per kb per million reads (RPKM) values.Differentially expressed genes in mock-andXam-infected samples were identified with the DESeq2 [37] R package using a false discovery rate<0.001 and at least a twofold expression difference.Heat maps ofMeERFs were constructed based on log2-transformed RPKM values and visualized with TBtools [38].Normalization of RPKM values was performed by the min–max method.
The 2-kb upstream sequences (putative promoter regions) of 33,033 genes in the cassava genome were extracted and analyzed with TSSP The conserved GCC-box sequence (TAAGAGCCGCC),an ethylene-response element or GCC-box motif,in promoter regions was used as a marker to identifyMeERF-targeted gene candidates possibly involved in plant disease resistance [39].Gene Ontology assignment of the GCC-box genes was performed with Blast2GO[40].Enrichment analysis was performed with Omicshare (http://www.omicshare.com/tools/Home/Soft/gogsea).
Co-expression modules were generated for theMeERFs and GCC-box genes based on the above 37 selected transcriptomes(Table S1).The RPKM values for theMeERFs and GCC-box genes were processed to remove genes showing low or modest expression variation (coefficient of variation (CV) across the 37 transcriptomes <1) or unexpressed (RPKM ≤1).The Weighted Gene Co-Expression Network Analysis (WGCNA) [41] package installed in the R environment was used to identify coexpression modules for selected genes with min–max normalized and log2transformed FPKM values.Modules were defined as clusters of highly interconnected genes,and genes within the same cluster had have high correlation coefficients with each one another.A pair ofMeERFand GCC-box genes with a weight ≥0.15 was assigned as associated.The gene modules were visualized with Cytoscape [42].
Reverse transcription-quantitative real-time PCR(RT-qPCR)was used to identify gene expression patterns of the sixMeERFs and two GCC-box genes underXam11infection,SA,JA,and ACC treatments.Primers used for RT-qPCR are listed in Table S2.RT-qPCR was performed in 96-well optical plates with a standard SYBR Premix Ex TaqTM kit(TaKaRa,Japan).A final volume of 20 μL containing 10 μL 2×SYBRII,7 μL ddH2O,1 μL of each gene-specific primer(10 μmol L-1),and 1 μL diluted cDNA was used,following the supplier’s protocol.All real-time qPCR reactions were performed with a cycling setting as follows:95 °C for 5 s,55 °C for 15 s,and 68 °C for 20 s,40 cycles.The procedure was ended with a melt-curve ramping from 60 to 95 °C for 20 min to check the PCR specificity.All RT-qPCRs were performed in technical triplicate for each sample.Elongation factor 1 alpha(EF1α) [43] was validated as one of two most stably expressed genes across different tissues and developmental stages in cassava.The relative expression levels of the selectedMeERFs were normalized to that ofEF1α in this study.Relative expression levels of genes were calculated using the 2-ΔΔCTmethod [44].
Dual-luciferase reporter gene assays were performed to verify the interaction betweenMeERFs and their candidate targets,GCCbox genes.The full-length coding sequences of six selectedMeERFs were amplified and cloned into apGreenII 62-SK vector whose expression was driven by the 35S promoter.The double-reporter vector was modified from thepGreenII 0800-LUC reporter vector[45] and contains a GAL4-LUC and an internal control REN.The 2 kb upstream sequences of transcription initiation factor TFIIE subunit beta (MeTFIIE) and histone-lysine N-methyltransferase ASHR1 (MeASHR1) were cloned into thepGreenII ss0800-LUC vector.The GCC-box sequences(GCCGCC)in the promoters ofMeTFIIEandMeASHR1were mutated to TCCTCA by Thermo Fisher(Beijing,China) and cloned into thepGreenII 0800-LUC vector.The vector with the mutated GCC-box promoter was combined with thepGreenII 62-SK with insertion of the full-length coding sequence of theMeERFi.All primers used for vector construction are listed in Table S3.
Mesophyll protoplasts were prepared from 7 to 8-week-oldin vitro-grown SC8 seedlings and used for dual-luciferase reporter gene assays.The third to fourth leaves from the top of the seedlings were sliced into 1-mm-wide pieces and incubated in enzyme solution containing 1.6% cellulose R-10 (Yakult,Japan),0.8% macerozyme R-10 (Yakult) and 9% mannitol in the dark at 25 °C for 16 h.Protoplast transfection was performed following Wu et al.[46].The plasmid ofpGreenII 62-SK-MeERFwas combined withpGreenII 0800-LUC-proMeTFIIEorpGreenII 0800-LUCproMeASHR1to co-transform into cassava mesophyll protoplasts using the polyethylene glycol (PEG) method.The original plasmid ofpGreenII 62-SK combined withpGreenII 0800-LUC-proMeTFIIEorpGreenII 0800-LUC-proMeASHR1was used as a control transfection.A final PEG4000 concentration of 25%and transfection time of 10 min were applied.Protoplast transfection was performed in 96-well plates.The activities of LUC and REN luciferase were measured with the dual-luciferase reporter assays system (Promega,Madison,Wisconsin,USA) as described by the manufacturer’s instructions.Twelve plasmid combinations were tested:MeERF10/MeTFIIE,MeERF35/MeTFIIE,MeERF58/MeTFIIE,MeERF108/MeTFIIE,MeERF137/MeTFIIE,MeERF142/MeTFIIE,MeERF10/MeASHR1,MeERF35/MeASHR1,MeERF58/MeASHR1,MeERF108/MeASHR1,MeERF137/MeASHR1,andMeERF142/MeASHR1.At least five transient assay repeats were performed for each combination of the vectors.One-way analysis of variance was calculated with SPSS 17.0 (SPSS,Chicago,Illinois,USA) and multiple comparisons were performed using Duncan multiple-range tests.
A total of 189 protein-coding genes with AP2 domains were identified by HMMER using the Pfam database.The phylogenetic tree of the AP2/ERF gene family in cassava indicated that it was composed of three subfamilies with 27MeAP2,7MeRAV,and 155MeERF(Fig.S1).All 155MeERFs showed high similarity withAtERFs(E-value <10-10).The identifiedMeERFs were named fromMeERF01toMeERF155according to their positions in the assembled genome of cassava (Table S4).The deduced MeERF proteins varied in lengths from 77 amino acids (aa) (MeERF66) to 474 aa(MeERF112),with a mean length of 254.5 aa.The predicted isoelectric points (pIs) and protein molecular weight (MWs) of the MeERFs ranged from 4.51 (MeERF83) to 10.14 (MeERF59),and from 13.33 (MeERF103) to 52.71 (MeERF112) kD,respectively(Table S4).
MeERF01 to MeERF153were located on chromosomes and showed an uneven distribution pattern,whereasMeERF154andMeERF155were located on scaffolds.The numbers ofMeERFs on one chromosome (Chr.) varied from 3 (Chr.7/Chr.10) to 16 (Chr.1),with a mean of 8.5MeERFs per chromosome (Fig.S2).MeERFstended to cluster on chromosomes,such asMeERF04toMeERF11on Chr.1,MeERF48toMeERF54on Chr.5,MeERF61toMeERF67on Chr.6,andMeERF121toMeERF130on Chr.15.Most of these clusters were in regions harboring tandem duplicate genes.Fourteen sets of tandem duplicatedMeERFs were detected:MeERF04/05/06,MeERF08/09,MeERF20/21,MeERF37/38/39,MeERF52/53/54,MeERF62/63/64,MeERF65/66,MeERF92/93,MeERF102/103/104,MeERF114/115,MeERF116/117,MeERF123/124/125,MeERF142/143,MeERF150/151.All tandem duplicateMeERFregions overlapped with MeERF clusters on chromosomes (Fig.S2).
Homologous-blocks analysis showed that 124MeERFs were located in homologous segments,which contained homologousMeERFs(Table S5).Thirty-six pairs ofMeERFs,containing 55 uniqueMeERFs,were identified as duplicated genes (e-value <1e-40) and were distributed over all chromosomes but 9(Fig.1).Among these chromosomes,Chr.1 contained the most duplicated gene pairs(10) with Chr.5 (4),Chr.2 (3),and Chr.12 (3),followed by Chr.2 (8),Chr.15 (8),Chr.18 (7),and Chr.12 (6).FortyMeERFs had one duplicate gene and 13 had two duplicate genes.MeERF14 andMeERF127 had three duplicate genes:MeERF18/49/117andMeERF78/100/106.
Fig.1.Chromosomal distribution of duplicated MeAP2/ERFs pairs as determined by MCScanX.Duplicated gene pairs are linked by red lines.
In the phylogenetic tree constructed from the amino acid sequences of the AP2/ERF domain,theMeERFs could be grouped into ten clusters (Fig.2).The largest cluster,IX,contained 29MeERFs and the smallest,VII,the fewest,contained fiveMeERFmembers.ERFs could be divided into CBF/DREB and ERF subgroups using the 14th and 19th amino acids in the AP2/ERF domains.Based on this characteristic,55MeERFs in clusters I,II,III,and IV were assigned to the CBF/DREB subgroup and the other 100MeERFs in clusters V,VI,VII,VIII,IX,and X to the ERF subgroup.
Comparison of the gene structures of the 155MeERFs showed that 120(77.4%)of the 155 had a unique exon without intron,represented in all the clusters including all the members in clusters I,II,and IV.There were 34MeERFs with two exons,and one gene(MeERF83) with three exons (Fig.2).The MeERF proteins shared a common structural feature:a highly conserved DNA binding domain,the ERF domain,at the N terminus(Fig.2).MEME analysis identified three types of conserved motifs within the AP2/ERF domains of the 155 MeERFs.Members in the same subgroup(CBF/DREB or ERF) had the same motifs (Fig.2).
Ten transcriptomes for leaf tissues (3) and root tissues (7)derived from cassava cultivar Argentina 7 (Arg7),wild subspecies(W14) and Kasetsart University 50 (KU50) were used forMeERFs expression level analysis (Table S1).In KU50,Arg7 and W14,respectively 35%(54/155),37% (58/155)and 40% (62/155)MeERFs were expressed (with RPKM value >10) in leaf or root tissues.Thirty-fiveMeERFs (MeERF09,MeERF10,MeERF12,MeERF23,MeERF31,MeERF33,MeERF35,MeERF41,MeERF47,MeERF58,MeERF60,MeERF63,MeERF68,MeERF70,MeERF86,MeERF87,MeERF88,MeERF99,MeERF 100,MeERF108,MeERF109,MeERF111,MeERF115,MeERF116,MeERF117,MeERF119,MeERF121,MeERF127,MeERF128,MeERF132,MeERF141,MeERF142,MeERF145,MeERF146,andMeERF153) were highly expressed (with RPKM value >10) in the roots and leaves of all three accessions.Of the 155MeERFs,65% (101/155),63% (97/155) and 60% (93/155) showed low or no expression in tissues of KU50,Arg7,or W14,respectively(Fig.S3).
Fig.2.Phylogenetic tree,gene structure,and conserved motifs of 155 MeERFs,constructed from the amino acid sequences of conserved AP2/ERF domains.The neighbor-joining tree was constructed using MEGA 7.0 and 1000 bootstrap replicates were used to assess tree reliability.The outermost circle shows the conserved motifs of the 155 MeERF genes.The middle circle shows the gene structure of the 155 MeERF genes.The upstream,downstream,and coding sequence(CDS) sequences of genes are indicated by different colors.
SomeMeERFs also showed differential expression patterns in different cassava accessions.MeERF18andMeERF98displayed higher expression abundance in KU50 leaves than in Arg7 and W14 leaves.In contrast,MeERF42,MeERF61,andMeERF118showed higher transcription abundance in W14 leaves (RPKM value >10)than in KU50 and Arg7 leaves (Fig.S3).Analysis of the expression pattern based on the evolutionary relationship of theMeERFs showed differentiation among the different clusters.In particular,most of theMeERFgenes (20/25) in clusters I,II,and VII showed high expression abundance.Only a few(6/20)MeERFgenes in clusters IV and V showed high expression,while the others (15) in these clusters showed low or no expression.In contrast,theMeERFs in clusters III,VI,VIII,IX,and X showed different expression levels among the accessions.
The expression patterns of the 155MeERFs underXaminfection were investigated based on transcriptome data generated from several time points ofXaminfection (Table S1).Comparison ofMeERFs between highly virulentXam668and mock treatment showed that the expressions of 23 genes were significantly induced afterXam668infection (Fig.3).Of the 23MeERFs,six belong to the CBF/DREB subfamily and half of these belong to subgroup III.The remaining 17 genes belong to the ERF subfamily,among which subgroup IX shared the most members(8),followed by subgroups X (5) and VIII (4).
A total of 204 genes were identified as harboring a GCC-box ciselement in their promoter (Table S5).In the Gene Ontology cell composition category,48% of the genes were assigned to cell and cell composition.In the molecular function category,40% of the genes functioned in cell binding and catalytic processes(Fig.S4A).GO enrichment analysis showed that these genes were enriched in primary metabolic process,followed by catalytic activity and organelle-related terms (Fig.S4B).
To further analyze the relationship between GCC-box genes and the selectedMeERFgenes,co-expression analyses were performed based on 37 transcriptomes generated fromXam-infected cassava leaf samples (Table S1).The RPKM values of the 155MeERFs and 204 GCC-box genes in all transcriptomes were calculated and filtered by CV and RPKM value.A total of 359 genes were used for co-expression analysis,while 49 genes were removed due to low expression or low expression variation.The gene co-expression modules for theMeERFs and GCC-box genes are shown in Fig.4.Four gene co-expression modules were constructed with 98(blue),73(brown),40(gray)and 99(turquoise)genes(Table S7).Based on enrichment analysis,nine genes were characterized as being involved in pathogen-response pathways.Two GCC-box genes(MeTFIIEandMeASHR1),which were associated with pathogenic responses,were used for identifying co-expressedMeERFgenes.SixMeERFs,namelyMeERF10,MeERF35,MeERF58,MeERF142,MeERF108,andMeERF137,showed correlated expression patterns withMeTFIIEandMeASHR1(weight value ≥0.15).
Fig.4.The network of MeERFs and GCC-box genes in the blue and turquoise module.Circles and triangles represent MeERFs and GCC-box containing genes,respectively.The hub genes in the blue and turquoise modules are marked with names.Eight GCC-box containing genes involved in pathogen-response pathways are also marked with names.A line connecting two nodes represents an edge.
To further investigate the relationship between the selectedMeERFand GCC-box genes,the relative expression levels of the six selectedMeERFgenes and two GCC-box genes (MeTFIIEandMeASHR1) under infection withXam11were measured by RTqPCR (Fig.5).The expression level ofMeERF10was significantly up-regulated 10-fold at 5 h after infection,while the expressions ofMeERF35andMeERF142were slightly up-regulated.The expressions ofMeERF10,MeERF35,andMeERF142were down-regulated at 24 h after infection,but significantly up-regulated at 48 h.The expression levels ofMeASHR1,MeTFIIE,MeERF108andMeERF137were first down-regulated at 5 h after infection,then upregulated at 8 h after infection,and subsequently downregulated at 24 h after infection.All these genes showed negative or positive correlations in expression pattern (Pearson correlation coefficient,P<0.05):the expression patterns ofMeERF108,MeERF137,MeTFIIEandMeASHR1were similar,butMeTFIIEandMeASHR1showed contrasting expression patterns with these four genes.
GCC-box motif screening,co-expression module analysis and RT-qPCR assay suggested that the sixMeERFs,namelyMeERF10,MeERF35,MeERF58,MeERF108,MeERF137,MeERF142,might directly target the two GCC-box genes,MeTFIIEandMeASHR1.Dual-luciferase gene reporter assays were further performed to investigate binding of the six MeERF proteins to the GCC-box motifs of the two GCC-box genes.First,the wild-type promoter sequences ofMeTFIIEandMeASHR1were cloned into thepGreenII 0800-luc vectors for the assays.As shown in Fig.6A,the luciferase ratio of the interaction between MeERF10/MeERF58 andMeTFIIEwas nearly four times that of the control group.The luciferase ratio of the interaction between MeERF10 andMeASHR1was nearly twice that of the control group,whereas the interaction between MeERF35 andMeASHR1was markedly lower than that of the control group.The interaction between MeERF137 andMeTFIIEwas markedly lower than that of the control group(Fig.6A).The results suggested that four of the six MeERF proteins,namely MeERF10,MeERF35,MeERF58 and MeERF137,interacted with the promoters ofMeTFIIEandMeASHR1.
To further investigate whether the GCC-box motif is involved in interaction of MeERF proteins with the promoters ofMeTFIIEandMeASHR1genes,point mutations that converted the wild type GCC-box sequence GCCGCC to a mutant sequence TCCTCA were generated in thepGreenII 0800-luc vectors for dual-luciferase gene reporter assays.The mutations in theMeTFIIEpromoter reduced the interaction efficiency of MeERF10 and MeERF58 to the level of the control group(Fig.6B).The mutations inMeASHR1promoter reduced the interaction efficiency of MeERF10 and MeERF35,but increased the interaction with MeERF137 (Fig.6C).
Fig.5.Gene expression patterns by RT-qPCR for six MeERF genes,MeTFIIE and MeASHR1 at 0,5,8,2,and 48 h after Xam11 infection.The expression patterns of the six MeERFs and two GCC-box genes were further examined under SA,JA,and ACC treatments.The expression of MeERF10,MeERF137,and MeERF142 were induced early under SA,JA,and ACC treatments (Fig.S5).The expression of MeERF108, MeTFIIE,and MeASHR1 showed contrast trend under SA (Fig.S5A) and JA treatment (Fig.S5B). MeTFIIE and MeASHR1 shared similar expression patterns under ACC treatment (Fig.S5C).Most of the selected MeERFs were up-regulated by SA,JA,and ACC treatments at several timepoints(Fig.S5).One exception was MeERF58, which seemed to be down-regulated at different time points after treatment by application of all three hormones.
We report the identification and characterization of 155MeERFs in the cassava genome.Of them,54% (84/155) ofMeERFs were highly expressed in roots,suggesting that most are associated with storage root development and yield formation of cassava.The expression of 23MeERFs was induced by infection of the pathogenicXam668strain,suggesting that these genes are involved in response to infection by pathogenicXamstrains.
The 204 genes harboring GCC-box cis-elements in their promoters may beMeERFs-targeted genes.GO analysis showed that 69 of the 204 GCC-box containing genes were involved in response to stimuli (Fig.S3A,Table S5).Further co-expression module construction using the 37 transcriptomes derived fromXaminfection treatment identified eight GCC-box containing genes associated with pathogen response pathways,among which at least two GCC-box containing genes,MeTFIIEandMeASHR1were associated with both sixMeERFgenes and pathogenic responsive pathways.In general,TFIIE acts together with RNA polymerase II and other transcription factors to promote gene transcription in eukaryotic cells [47].The Resistance genesSNC1,RPP4,andLAZ5are all regulated by histone lysine methylation [48].InArabidopsis thaliana,histone methyl transferases SET DOMAIN GROUP8 (SDG8) and SDG25 regulate pep1-,flg22-,and effector-triggered immunity as well as systemic acquired resistance [49].The protein–DNA interaction assays further showed that expression of the GCC-box containing genes was controlled by at least fourMeERFgenes,namelyMeERF10,MeERF35,MeERF58andMeERF137,possibly via binding to the promoters ofMeTFIIEandMeASHR1.In particular,MeERF10andMeERF58positively regulatedMeTFIIE;MeERF137negatively regulatedMeTFIIE;MeERF10andMeERF137positively regulatedMeASHR1;andMeERF35negatively regulatedMeASHR1.These results show that the cassava genome also has a conservedMeERFgene regulatory network involved in response to theXaminfection.
There are considerably variations in the gene numbers of AP2/ERF gene family in plants,as 155MeERFs were identified in this study for cassava,which is more than the 98 in soybean[50],122 inArabidopsis[51] and 139 in rice [51],while it is fewer than the 171 in foxtail millet [52] and 248 in Chinese cabbage [53].All ERF proteins contain a conserved AP2/ERF domain,whose differences in the 14th and 19th amino acids result in the separation of the CBF/DREB subgroup that tends to bind with the A/GCCGAC element,and the ERF subgroups relating to the AGCCGCC element (GCC-box) binding [51].Based on this criterion,we grouped the MeERF proteins into a CBF/DREB subgroup of 55 members and an ERF subgroup of 100 members,and the latter group could be selected as candidates responded to diseases.
Expression pattern analyses showed that of the 155MeERFs,63 (40%) showed high expression levels with peak values at the 75 days after initiation of root development,23MeERFs were sustainably induced at 24 and 50 h after infection byXam.Of the 23 inducedMeERFs,17 belonged to ERF subgroup with the most members in the IX subfamily.Overexpression of B-3c subgroup(corresponding to our IX subgroup) memberTaPIEP1in wheat conferred host-enhanced resistance to fungal pathogenBipolaris sorokiniana[54].Tomato ERF family IX subgroup membersSlERF.A1,SlERF.B4,SlERF.C3andSlERF.A3were involved in disease resistance againstBotrytis cinerea[55].These results suggest that the ERF IX subfamily may play an important role in plant resistance.
Several GCC-box genes have been identified [56,57] as diseaseassociated (Table S7),including WRKY 33,osmotin-like protein(OSM34).These GCC-box genes can be used to probe for and verifyMeERFgenes involved in disease response by correlation of their expression.Expression ofPti4(an ERF family IX subgroup gene in tomato) inArabidopsisregulated defense-related gene expression by constitutively expressing several GCC box-containingPRgenes[58].AtERF11bound specifically to the GCC-box of theBTB AND TAZ DOMAIN PROTEIN 4(BT4)promoter to activate its transcription and mediate defense response toPstDC3000 (Pseudomonas syringaepv.tomatoDC3000) [59].
Co-expression analysis is a powerful and practical approach for investigating expression correlation among genes.Gene modules constructed based on gene expression profiles have been successfully used inArabidopsis,rice,tomato and other species to identify and verify key genes in the same pathway [60–62].In this study,we successfully applied the approach to identify candidate target genes ofMeERFsin silicoin the cassava genome.Further experimental assays showed that expression of the sixMeERFs,MeTFIIE,andMeASHR1in the cultivar SC8 under the infection ofXam11was consistent with the results from gene co-expression modules derived by WGCNA,as shown by the finding that the expression patterns ofMeERF108,MeERF137,MeTFIIEandMeASHR1were similar to but distinct from those of the other fourMeERFs (Fig.5).Construction of gene co-expression networks along with functional enrichment analysis is a useful strategy for assigning tightly coexpressed gene sets to co-functional gene clusters [62,63].A genome-wideArabidopsisgene co-expression network has been constructed;and the network clustering results provide a systems-level understanding of the gene modules that coordinate multiple biological processes [64].In the present study,the transcriptome datasets used for the gene co-expression network were obtained from condition-dependent experiments performed with the cassava materials under infection ofXam.The results could thus provide useful clues for further investigating gene regulatory networks involved cassava disease resistance.
ERFs act as activators or repressors to regulate gene transcription.InArabidopsis,a protoplast transactivation assay showed thatAtERF1,AtERF2,andAtERF5in group IX functioned as activators of transcription,whereasAtERF3andAtERF4in group VIII acted as repressors of transcription [65,66].Overexpression ofAtERF4inArabidopsisrevealed thatAtERF4acts as a novel negative regulator of JA-responsive defense gene expression and pathogenic resistance,whileAtERF2acts as a positive regulator of JA-responsive defense genes [67].These results suggest that theAtERFs in group VIII tend to act as repressors,whileAtERFs in group IX function as activators [65,67–69].In this study,the dual-luciferase reporter assay indicated thatMeERF10(group IX) andMeERF58(group II) positively regulated target genes,whileMeERF35(group VIII) negatively regulatedMeASHR1.These results suggested that theMeERFs also act to positively or negatively regulate downstream genes.Whether theMeERF genesin the same groups show conserved transcriptional activity awaits investigation.
We identified 155MeERFs and 204 GCC-box genes in the cassava genome and found 23MeERFs and 8 GCC-box genes induced byXam.We constructed four gene co-expression modules for theMeERFs and GCC-box genes based on 37 transcriptomes measured underXaminfection.SixMeERFs were significantly associated with two GCC-box genes,MeTFIIEandMeASHR1,and both were induced byXaminfection.Dual-luciferase gene reporter assays indicated thatMeERF10andMeERF58positively regulatedMeTFIIE,MeERF137negatively regulatedMeTFIIE,MeERF10,andMeERF137positively regulatedMeASHR1,andMeERF35negatively regulatedMeASHR1.The combination of comparative gene expression analysis and co-expression analysis ofMeERFs and their candidate interacting genes could provide clues to a gene network mediating pathogen response.
Declaration of competing interest
Authors declare that there are no conflicts of interest.
CRediT authorship contribution statement
Yong Xiao,Wei Xia,and Yinhua Chendesigned the experiments;Yuhui Hong,Yong Xiao,Na Song,Ke Li,Mengting Geng,and Wei Xiacurated data;Yuhui Hong,Na Song,Shousong Zhu,Rui Zhao,and Xiaohui Yuconducted all the lab experiments;Yong Xiao,Wei Xia,and Yinhua Chenperformed statistical analysis;Shousong Zhu,Rui Zhao,Ke Li,Xiaohui Yu,and Honggang Wangprepared the plant materials;Yuhui Hong and Na Songvalidated the promoter-ERF protein interactions;Yuhui Hong,Yong Xiao,and Wei Xiadrafted the manuscript;Yinhua Chenreviewed and edited the manuscript.
Acknowledgments
Thanks to Prof.De Ye of Hainan University,who made crucial suggestions and grammar corrections.This work was supported by the Natural Science Foundation of Hainan Province(2018CXTD330 and 318QN204),Key R&D Program of Hainan Province (ZDYF2019063),China Agriculture Research System (CARS-11-hncyh),and the National Natural Science Foundation of China(31560497).
Appendix A.Supplementary data
Supplementary data for this article can be found online at https://doi.org/10.1016/j.cj.2020.10.017.