        Assembly and phylogenomic analysis of cotton mitochondrial genomes provide insights into the history of cotton evolution

        2023-12-25 09:49:54
        The Crop Journal, 2023, Issue 6

        Ynli Fn, Yukn Wn, Hjun Lu, Jun Li, Dlr Aktr,b,d, Fn Liu, Tin Zo,Xinxin Sn, Xiobo Li, Jms Wln, Tinzn Zn, Jinpin Hu, Ronui Pn,b,*

        a College of Agriculture and Biotechnology & ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou 310058/311215, Zhejiang, China

        b Zhejiang Laboratory, Hangzhou 311121, Zhejiang, China

        c Key Laboratory of Growth Regulation and Translational Research of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou 310024, Zhejiang, China

        d Department of Genetics and Plant Breeding, Sylhet Agricultural University, Sylhet 3100, Bangladesh

        e State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 450000, Henan, China

        f Institute of Biology, Westlake Institute for Advanced Study, Hangzhou 310024, Zhejiang, China

        g College of Life Science, Zhejiang University, Hangzhou 310058, Zhejiang, China

        h Michigan State University, Department of Energy Plant Research Laboratory and Plant Biology Department, East Lansing, MI 48824, USA

        Keywords: Cotton evolution; Cotton phylogeny; Lint fiber cotton; Mitochondrial genome; Mitochondrial genes

        ABSTRACT Cotton is a major crop that provides the most important renewable textile fibers in the world. Studies of the taxonomy and evolution of cotton species have received wide attention, not only because of cotton's economic value but also because Gossypium is an ideal model system for studying the origin, evolution, and cultivation of polyploid species. Previous studies suggested the involvement of mitochondrial genome editing sites and copy number, as well as mitochondrial functions, in cotton fiber elongation. However, with only a few mitogenomes assembled in the cotton genus Gossypium, our knowledge about their roles in cotton evolution and speciation is still scarce. To close this gap, we assembled 20 mitogenomes from 15 cotton species spanning all the cotton clades (A–G, K, and AD genomes) and 5 cotton relatives using short and long sequencing reads. Systematic analyses uncovered a high level of mitochondrial gene sequence conservation, abundant sequence repeats and many insertions of foreign sequences, as well as extensive structural variations in cotton mitogenomes. The sequence repeats and foreign sequences caused significant mitogenome size inflation in Gossypium and its close relative Kokia in general, while there is no significant difference between the lint and fuzz cotton mitogenomes in terms of gene content, RNA editing, and gene expression level. Interestingly, we further revealed the specific presence and expression of two novel mitochondrial open reading frames (ORFs) in lint-fiber cotton species. Finally, these structural features and novel ORFs help us gain valuable insights into the history of cotton evolution and polyploidization and the origin of species producing long lint fibers from a mitogenomic perspective.

        1.Introduction

        As a major crop in many parts of the world, cotton is the most important source of renewable textile fibers. The cotton genus, Gossypium L., belongs to the Gossypieae tribe under the Malvoideae subfamily within the Malvaceae family [1]. Gossypium contains approximately 45 diploid species falling into eight clades (A to G, and K) as well as 5 tetraploid species (AD) that originated from the hybridization of A and D species [2,3]. Only the species containing A and AD genomes produce the commercially desirable lint fibers that are long and easily detached, while other species only have fuzz fibers [4]. Two domesticated allotetraploid cottons, G. hirsutum (AD1) and G. barbadense (AD2), are the major producers of renewable textile fibers. Gossypium is not only a major crop, but also an ideal model system to study the origin, evolution, and cultivation of polyploid species.

        Several lines of evidence indicate the importance of mitochondria in fiber elongation. After the initiation stage, cotton fiber cells elongate rapidly, and a higher energy input is required for this growth. The energy factory of the cell, the mitochondrion, plays a crucial role by providing more ATP during this period [5,6]. The mitochondrion is a semi-autonomous organelle that functions through the cooperation of the nuclear genome and its own genome, the mitogenome. A plant mitogenome typically encodes 30–40 protein-coding genes involved in the respiratory electron transport chain, ATP synthesis, cytochrome c maturation, the assembly of mitochondrial ribosomes, etc. [7]. Evidence indicates that the mitogenome can significantly affect fiber development. The copy number and expression level of mitochondrial genes are upregulated during fiber development [8]. Two RNA editing sites on the atp1 subunit were found to be crucial for ATP synthase to produce sufficient ATP for fiber cell elongation in upland cotton AD1 [6]. Different mutants of pentatricopeptide repeat (PPR) proteins, which alter RNA sequence, turnover, processing, or translation, exhibited abnormal fiber development [9,10]. Similarly, nucleus-encoded mitochondrial ATP synthase and complex I subunits are also known to be related to fiber growth [5,11].

        In contrast to the importance of mitochondria, studies on cotton mitogenomes are rare. To date, at least 12 nuclear genomes have been sequenced and continue to be updated with new technologies (CottonGen: https://www.cottongen.org) [12]; the known cotton plastid genomes (plastomes) have covered all the genome types [13]. However, only the mitogenomes of one A species (G. arboreum, A2), five D species (G. thurberi, D1; G. harknessii, D2-2; G. davidsonii, D3-d; G. raimondii, D5; G. trilobum, D8) and two AD species (G. hirsutum, AD1; G. barbadense, AD2) have been assembled [14–17]. The mitogenomes of other cottons, as well as those of their close relatives, remain unknown. The scarcity of assembled mitogenomes hinders the understanding of cotton speciation and evolution from a mitogenomic perspective.

        Plant mitochondrial genes are often too conserved in sequence for phylogenetic analysis. However, since plant mitochondrial genomes are maternally transmitted and often acquire structural changes, they may provide valuable information for mapping the routes of natural hybridization and speciation. Moreover, mitogenomic structural changes can not only cause functional disruption of mitogenome-encoded proteins, such as those related to cytoplasmic male sterility (CMS) [18], but are also hypothesized to have the potential to create completely novel gain-of-function protein-coding ORFs, as the mitogenome generates prokaryotic-type polycistronic transcripts. This hypothesis remains to be experimentally proved.

        An important question in cotton evolution is the origin of the allotetraploid AD genomes, which gave rise to the domesticated cotton species. Which of the currently existing 13 D species and 2 A species are the closest to the allotetraploid progenitors used to be under debate [2]. In general, the D-genome donor was believed to be D5, while the A-genome donor had been proposed to be either A1 or A2 [19]. A more recent analysis of the nuclear genomes indicates that an extinct A0 species, presumably the common ancestor of A1 and A2, was the A parent during AD natural hybridization [20]. Analysis of organelle genomes such as the mitochondrial genome may be an alternative approach to address this question. Another question in cotton evolution is the phylogenetic position of the B clade, which consists of African species [21]. In the phylogenetic tree generated using nuclear markers, B clustered with the African and Asian species that contain A and F genomes [2]. However, in the phylogenetic tree of the plastid genomes, B clustered with the Australian species that contain C, G, and K genomes [2,13]. Mitogenomic analysis could potentially help to clarify this issue as well.

        Based mainly on short sequencing reads, eight mitogenomes, including A2, D1, D2-2, D3-d, D5, D8, AD1, and AD2 from the A, D, and AD clades, have been previously assembled [14–17], whereas the mitogenomes from the other 6 clades remain uncharacterized. In this study, we updated five of the eight known mitogenomes with long sequencing reads and also assembled mitogenomes of all the other cotton clades and their relatives in the subfamily Malvoideae. Hence, we assembled a dataset of 20 mitogenomes, which covered all the cotton clades (A–G, K, and AD) as well as 5 cotton relatives. Systematic analysis of this dataset uncovered a high level of gene sequence conservation, extensive sequence repeats, many insertions of foreign sequences, considerable structural variations, and some novel genes. Our discovery of the structural features and novel genes of cotton mitogenomes has helped to provide insights into cotton evolution from a mitogenomic perspective.

        2.Materials and methods

        2.1.Plant materials

        Whole genome sequencing data containing raw long and short reads of 19 species from subfamily Malvoideae were downloaded from NCBI SRA (https://www.ncbi.nlm.nih.gov/sra/). The mitogenomes from 10 species assembled with both long and short reads are Gossypium herbaceum (A1), G. arboreum (A2), G. hirsutum (AD1), G. barbadense (AD2), G. tomentosum (AD3), G. mustelinum (AD4), G. darwinii (AD5), G. raimondii (D5), G. longicalyx (F1) and G. bickii (G1). The other nine mitogenomes were assembled using short reads only, which included G. anomalum (B1), G. sturtianum (C1), G. thurberi (D1), G. somalense (E2), and 5 close relatives of cotton: two from Gossypieae (Kokia drynarioides and Thespesia populneoides), one from Hibisceae (Abelmoschus esculentus), and two from Malveae (Malva sylvestris and Sida trichopoda). Accession numbers are listed in Table S1.

        Cotton tissues used for RT-PCR analysis were obtained from The National Wild Cotton Nursery, Sanya, Hainan, China.

        2.2.Genome assembly and gene annotation

        For the long reads, de novo assembly was performed using FLYE v2.8.3 [22]. Mitochondrial contigs were identified from total contigs by BLASTN v2.10.0+ [23] against the mitogenome of G. hirsutum (GenBank: NC_027406) and connected manually in GENEIOUS R10 (Biomatters, Inc.), following the procedure in our previous publication [24]. The repeats within the mitogenomes were resolved based on their short-read coverage, and the plastid insertions were resolved by their locations and directions on the plastome. Duplications in cotton mitogenomes are usually identical or have close to 100% sequence similarity. In general, after resolving all the repeats and plastid insertions, a circular chromosome was generated. However, in some species, the repeats at both ends could not be joined, resulting in a linear chromosome (Table S2).
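        The contig-selection step can be illustrated with a minimal Python sketch. It assumes the BLASTN results were written in the standard 12-column tabular format (-outfmt 6); the identity and aligned-length cutoffs, the function name, and the example hits are hypothetical, since the text does not state how contigs were filtered. Overlapping hits are simply summed, which is a simplification.

```python
from collections import defaultdict

def select_mito_contigs(blast_tab_lines, min_identity=90.0, min_total_aligned=1000):
    """Return contigs whose summed BLASTN alignment length passes the cutoffs.

    Columns follow BLAST tabular output: qseqid, sseqid, pident, length, ...
    Both cutoffs are illustrative assumptions, not values from the study.
    """
    aligned = defaultdict(int)
    for line in blast_tab_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        contig, pident, length = fields[0], float(fields[2]), int(fields[3])
        if pident >= min_identity:
            aligned[contig] += length  # overlapping hits are not merged here
    return sorted(c for c, total in aligned.items() if total >= min_total_aligned)

# Hypothetical hits against the G. hirsutum reference (NC_027406)
example_hits = [
    "contig_1\tNC_027406\t99.5\t5000\t20\t5\t1\t5000\t100\t5099\t0.0\t9000",
    "contig_1\tNC_027406\t98.0\t3000\t50\t10\t6000\t9000\t7000\t9999\t0.0\t5200",
    "contig_2\tNC_027406\t85.0\t4000\t500\t100\t1\t4000\t200\t4199\t0.0\t3000",
    "contig_3\tNC_027406\t97.0\t400\t10\t2\t1\t400\t300\t699\t1e-150\t650",
]
print(select_mito_contigs(example_hits))
```

        Contigs retained by such a filter would still need the manual joining and repeat resolution described above.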

        For the short reads, low-quality bases were filtered through TRIMMOMATIC v0.36 [25]. De novo assemblies were performed using SPAdes v3.13.1 [26], and completed in a similar way as described above for the long reads. Mitochondrial protein-coding and rRNA genes were annotated by their sequence similarities with the known mitochondrial genes, and tRNAs were predicted using tRNAscan-SE v2.0 [27]. Plastid insertions were determined by BLASTN against plastomes of Malvoideae (identity > 90%), and repeats were determined by sequence similarity among the repeat sequences (identity > 95%). Plastomes of Gossypium thurberi, Kokia drynarioides, Malva sylvestris, Sida trichopoda and Thespesia populneoides were assembled and annotated in a similar way.

        Both long and short reads were used for G1, AD3, and AD4, while only short reads were used for the other species. To prevent potential mis-assemblies, short sequencing reads of A1, A2, AD1, AD2, D1, D5, B1, F1, and Kokia from previous studies were mapped back to the mitogenomes (Table S1). All the mitogenomes assembled in this study were fully covered by these reads, which confirmed the correctness of our assemblies.

        2.3.Mitogenome synteny analysis

        The homologous sequences between mitogenomes were searched using BLASTN with default parameters, after which hits longer than 300 bp were plotted in R (https://www.r-project.org). The Circos plot was drawn using CIRCOS v0.69 [28]. The boxplot was plotted using ggplot2 [29] in R.

        2.4.Phylogenetic analysis

        To reconstruct the phylogeny of Malvoideae, three datasets that include mitochondrial CDS, plastid genes (CDS and introns), and nuclear 45S (18S, 5.8S, and 25S rRNAs and the spacer regions) were prepared. The plastomes used were from G. tomentosum (GenBank: NC_016690), G. hirsutum (NC_007944), G. longicalyx (NC_023216), G. raimondii (NC_016668), G. sturtianum (NC_023218), G. barbadense (NC_008641), G. herbaceum (NC_016692), G. darwinii (NC_016670), G. mustelinum (NC_016711), G. arboreum (NC_016712), G. anomalum (NC_023213), G. bickii (NC_023214), G. somalense (NC_018110), Abelmoschus esculentus (NC_035234), Hibiscus cannabinus (NC_045873), Corchorus olitorius (NC_044468, outgroup), and Bombax ceiba (NC_037494, outgroup). Besides the mitogenomes assembled in this study, the mitogenomes of Hibiscus cannabinus (NC_035549) and Bombax ceiba (NC_038052, outgroup), and the 45S dataset of Hibiscus syriacus (KM117267) and Theobroma grandiflorum (JQ228378, outgroup) were also used. The sequences were aligned with the AUTO mode through MAFFT v7 [30]. Then the matrices were concatenated to build ML trees using IQ-TREE v1.6.12 [31] under the following parameters: -bb 1000 -m GTR+G4+F -me 0.0001 -runs 10. FigTree v1.4.2 (https://github.com/rambaut/figtree) was used to display the trees.
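        The concatenation of per-gene alignments into one matrix can be sketched as follows. This is an illustrative Python example, not the procedure used in the study: the function name and the toy alignments are hypothetical, and padding taxa that lack a gene with gap characters is an assumption.

```python
def concatenate_alignments(alignments):
    """Join per-gene alignments (dicts of taxon -> aligned sequence) into one matrix.

    Taxa missing from a gene alignment are padded with gaps ('-'); this padding
    rule is an assumption made for illustration.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError("all sequences in one alignment must have equal length")
        width = widths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}

# Toy alignments for two hypothetical genes
gene1 = {"A2": "ATGC", "D5": "ATGA"}
gene2 = {"A2": "GG-", "D5": "GGT", "B1": "GCT"}
supermatrix = concatenate_alignments([gene1, gene2])
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

        The resulting matrix could then be written to FASTA or PHYLIP and passed to a tree-building program.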

        2.5.Searching for sdh3 homologs in nuclear genomes

        We downloaded from the CottonGen website [12] nuclear-encoded peptide sequences from A1 and A2 [20], AD1 and AD2 [32], B1 [33], D1 [34], D5 and K12 [35], E1 [35], F1 [36], and Gossypioides kirkii [37]. Then we used the mitochondrial sdh3 peptide sequence from A1 as the query to search for homologs in the nuclear peptides by BLASTP. Except for B1 and Gossypioides, all the species contained at least one copy of nuclear-encoded SDH3. Sequences from the nuclear SDH3 and mitochondrial sdh3 genes were used to build the gene family tree using IQ-TREE.

        2.6.Analysis of the expression level of mitochondrial genes using transcriptomic data

        Since organelle transcripts generally do not contain a polyA tail [38], transcriptomes enriched with oligo-dT would introduce a bias toward nuclear-encoded genes. To avoid this bias, we first used rRNA-depletion RNA-seq data, i.e., SRR5468501, SRR5468504, SRR5468507, SRR5468510, SRR5468513 and SRR5468516 from Zhao et al. [39], which were obtained from sequencing using random primers, to check the expression of rps7 and the two new ORFs. Since most publicly available transcriptomic data were obtained from RNA-seq of polyA-enriched transcripts, this type of data was used to check the expression level of the mitochondrial genes in various organs and developmental stages. Although these data could not represent the real expression level of the mitochondrial genes, they still reflect the relative value and variation trend when comparing different organs of the same species. Low-quality bases were trimmed by TRIMMOMATIC, then the clean reads were mapped to mitogenomes using HISAT2 v2.1.0 [40]. HTSeq-Count v0.11.2 [41] was used to count the reads mapped to the genic regions with the parameters "--nonunique all -m intersection-nonempty". We used a modified FPKM, termed mFPKM (mitochondrial fragments per kilobase of exon model per million fragments), to calculate the mitochondrial expression level.
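        The exact mFPKM equation is not reproduced in this text. The following Python sketch shows one FPKM-style normalization that is consistent with the verbal definition, under the explicit assumption that the "million fragments" term refers to the total number of fragments assigned to mitochondrial genes. The function name, gene names, counts, and lengths are all hypothetical.

```python
def mfpkm(counts, lengths_bp):
    """FPKM-style value per gene: fragments per kilobase per million fragments.

    ASSUMPTION: the normalizing total is the number of fragments assigned to
    mitochondrial genes; the original definition may use a different total.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no fragments were assigned to any gene")
    # count / (length_kb * total_millions) == count * 1e9 / (length_bp * total)
    return {gene: counts[gene] * 1e9 / (lengths_bp[gene] * total) for gene in counts}

# Hypothetical read counts (e.g. from a counting tool) and gene lengths
counts = {"gene_a": 100, "gene_b": 300}
lengths_bp = {"gene_a": 1000, "gene_b": 3000}
values = mfpkm(counts, lengths_bp)
print(values)
```

        In this toy case the longer gene collects three times as many fragments but ends up with the same normalized value, which is the purpose of the per-kilobase term.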

        2.7.RNA isolation and RT-PCR analysis

        Leaf and fiber (0, 10, and 20 DPA) materials from A2, AD1, and AD2 and leaves from A1, D1, and D5 were used to check the expression of orf102 and orf154 in vivo. RNA was extracted using an Omega RNA isolation kit and reverse-transcribed into cDNA using random primers. Semi-quantitative RT-PCR of transcripts containing the two new mitochondrial genes was conducted, with primers designed using GENEIOUS (Table S2).

        3.Results

        3.1.Assembly of mitogenomes from 20 species and gene sequence analysis

        To obtain insights into cotton evolution from the mitogenomic perspective, we assembled 20 mitogenomes from the Malvoideae subfamily (Fig. 1A). Long reads are beneficial for plant mitogenome assembly, because plant mitogenomes tend to contain sequence repeats that may hamper correct genome assembly. We therefore first used third-generation long reads to reassemble the mitogenomes of AD1, AD2, A2, D5, and D1 to replace the previously published data that had been assembled using short reads. These 5 updated mitogenomes display clearly different synteny from their previously published versions (Fig. 1B), which could be affected by the uncertain architecture known for plant mitogenomes. Interestingly, an additional 9 kb fragment was uncovered in A and AD mitogenomes (blue arrows in Fig. 1B). We also assembled mitogenomes from 10 additional Gossypium species, including A1, B1, C1, E2, F1, G1, K8, AD3, AD4, and AD5, along with 5 cotton relatives: 2 close relatives from the cotton tribe Gossypieae (Kokia drynarioides and Thespesia populneoides), 1 species from tribe Hibisceae (Abelmoschus esculentus), and 2 species from tribe Malveae (Malva sylvestris and Sida trichopoda) (Fig. 1A), using either long or short sequencing reads (Table S1). This dataset containing mitogenomes from 20 cotton species and close relatives allowed us to perform a systematic mitogenomic analysis.

        A total of 36 known genes were identified in the mitogenomes assembled in this study, with an average nucleotide sequence identity of 99.3% among the species. Most of these genes are present in all mitogenomes, except that at least one of three genes, rpl2 (ribosomal protein L2), rps7 (ribosomal protein S7), and sdh3 (succinate dehydrogenase 3), is absent in four of the cotton relatives (Fig. S1A; Table S3). In all the 15 cotton mitogenomes, 25 out of the 36 genes are completely identical in sequence, and the remaining 11 genes only differ by a few base pairs (Table S3). For example, an A-to-T transversion caused a nonsynonymous mutation in the cox3 gene in F, A, and AD (Fig. S1B). The sdh3 gene contains a 44-bp fragment in C1 that is completely different from those in other cotton species (Fig. S1C) and a premature termination codon in E2, whereas at least one copy of SDH3 can be found in all the Gossypium nuclear genomes (Fig. S1D). This finding suggests that a duplicated copy of sdh3 was likely transferred to the nuclear genome while the mitochondrial copy might have lost its function.

        Fig. 1. Assembly and analysis of cotton mitogenomes. (A) Malvoideae mitogenomes assembled in this study. The length of each bar on the right indicates the mitogenome size, with blue and red regions representing the proportions of sequence repeats and plastid-like sequences, respectively. The mitogenome of Hibiscus cannabinus (GenBank: NC_035549; Hibisceae) used here is a previously published high-quality assembly. (B) Synteny comparison of A2, AD1, AD2, D1, and D5 mitogenomes assembled in this study and previous studies. The grey ribbons connect homologous sequences. Blue arrows indicate the position of the 9 kb fragment absent from previous assemblies. (C) Distribution of the 38 sequences specific to Gossypium and Kokia. Dots indicate the 38 specific sequences, with their size representing the sequence length. Magenta dots highlight Seq19 and Seq20, which are only found in F1, A, and AD species. (D) Presence of the 9 kb fragment and Repeat633 in F1, A, and AD mitogenomes with reference to the position of the rps7 and nad4L genes. Rearrangements that likely happened between copies of Repeat633 and resulted in structural differences are also depicted.

        To determine if the mitogenomic sequences can be used to resolve the relationships among cotton species, we generated a maximum likelihood (ML) phylogenetic tree of cotton mitogenomes using mitochondrial gene sequences. The 36 mitochondrial genes were concatenated into one 31,167-column matrix, which, however, contains only 210 parsimony-informative sites (PIS) because of the highly conserved sequences. Due to the insufficient number of PIS, this tree can correctly distinguish the tribes but not the species within the cotton tribe (Fig. S2).
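        For readers unfamiliar with the term, a column is usually called parsimony-informative when at least two different character states each occur in at least two sequences; 210 such columns out of 31,167 is well under 1% of the matrix. A minimal Python sketch of the count is given below. It is an illustration only: the function name and toy alignment are hypothetical, and ignoring gaps and ambiguous bases is an assumption.

```python
from collections import Counter

def count_parsimony_informative(seqs):
    """Count alignment columns with >= 2 states that each occur in >= 2 sequences.

    Only unambiguous bases (A, C, G, T) are considered; gaps and ambiguity
    codes are ignored, which is an assumption made for this sketch.
    """
    informative = 0
    for column in zip(*seqs):
        states = Counter(base for base in (c.upper() for c in column) if base in "ACGT")
        if sum(1 for n in states.values() if n >= 2) >= 2:
            informative += 1
    return informative

# Toy alignment of four sequences: columns 2 and 3 are informative
toy = ["AAGT", "AAGC", "ATCC", "ATCA"]
print(count_parsimony_informative(toy))
```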

        3.2.Extensive sequence repeats and insertions of foreign sequences in cotton mitogenomes

        Despite the high gene sequence conservation across the 20 mitogenomes, a remarkable size expansion of the mitogenome was discovered in Gossypium and its close relative Kokia, which averaged 643 kb in length in contrast to the average size of 531 kb in other species (Fig. 1A; Table S3). To trace the origins of the expansion, we searched the mitogenomes for sequence repeats and sequences likely transferred from the plastid genome. Compared with those of other species, Gossypium and Kokia mitogenomes contain many more repeats (ca. 11.2% vs. 4.3% of the mitogenome) but few sequences potentially from the plastid genome (Fig. 1A; Table S4). However, the mitogenome size expansion (ca. 110 kb) cannot be fully explained by the presence of repeats, which take up ca. 70 kb of each mitogenome (Tables S3 and S4). To explore other possible contributors to this expansion, we searched for sequences (> 500 bp) that are specifically present in the mitogenomes from Gossypium and Kokia and identified 38 such sequences (Seq01–Seq38) (Fig. 1C; Table S5). A search against the NCBI nucleotide database with these 38 sequences (E-value: 1e-10, word size: 28, alignment length: > 100 bp for BLAST) revealed that most of them show similarities with mitogenome sequences from distantly related taxa, especially Fabales and Solanales (Table S6), and therefore may have been derived from these genomes. Hence, the expansion of cotton mitogenome size is mainly contributed by sequence repeats and insertions from mitochondrial genomes of distantly related species.
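        The size comparison above can be checked with a few lines of arithmetic. The Python sketch below uses only the averages quoted in the text (643 kb, 531 kb, 11.2%, 4.3%); because it multiplies group averages rather than per-species values, it is only a rough consistency check, and the variable names are illustrative.

```python
# Averages quoted in the text
gossypium_kokia_kb = 643   # mean mitogenome size in Gossypium and Kokia
other_species_kb = 531     # mean mitogenome size in the other species
repeat_frac_gk = 0.112     # repeat fraction in Gossypium and Kokia
repeat_frac_other = 0.043  # repeat fraction in the other species

expansion_kb = gossypium_kokia_kb - other_species_kb
repeats_gk_kb = repeat_frac_gk * gossypium_kokia_kb
repeats_other_kb = repeat_frac_other * other_species_kb

print(f"size expansion: {expansion_kb} kb")
print(f"repeats in Gossypium/Kokia: {repeats_gk_kb:.1f} kb")
print(f"repeats in other species: {repeats_other_kb:.1f} kb")
```

        The expansion of about 112 kb exceeds the roughly 72 kb of repeat sequence, in line with the statement that repeats alone cannot account for the size difference.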

        Interestingly, Seq19 and Seq20, which are present in the aforementioned 9 kb fragment found in Gossypium and Kokia (Figs. 1B, S3A), are specifically present in F, A, and AD cottons (Fig. 1C). The 9 kb fragment is located downstream of nad4L in F1 and AD2 but downstream of rps7 in A1, A2, AD1, and AD3–AD5. A repeat sequence (Repeat633) is always present immediately downstream of rps7 and nad4L and upstream of the 9 kb fragment in these species, indicating that sequence rearrangements via Repeat633 may have occurred twice during A and AD evolution (Fig. 1D). To test the correct structure at Repeat633 and potential heteroplasmy (i.e., coexistence of different mitogenome topologies in one individual), we searched the four structures (Fig. 1D) against long sequencing reads from multiple studies and counted the reads that span each structure. In F1 and AD2, the 9 kb fragment is located mostly downstream of nad4L, while in A and the other ADs, the 9 kb fragment is located mainly downstream of rps7 (Fig. S3B). The result revealed that, although different structures coexist in the mitogenomes, our assemblies at Repeat633 represent the main structure of each species.
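        One way to count reads that "cover through" a repeat is to require each read alignment to extend some distance into the unique sequence on both sides of the repeat. The Python sketch below illustrates that idea only; the function name, the coordinates, and the 500-bp flank requirement are hypothetical and are not taken from the study.

```python
def count_spanning_reads(alignments, repeat_start, repeat_end, flank=500):
    """Count read alignments that cover the repeat plus `flank` bp on each side.

    `alignments` holds (start, end) reference coordinates of long reads mapped
    to one candidate structure. The flank size is an illustrative assumption.
    """
    return sum(
        1
        for start, end in alignments
        if start <= repeat_start - flank and end >= repeat_end + flank
    )

# Hypothetical long-read alignments on one candidate structure
reads = [(0, 5000), (1500, 4000), (900, 3200), (2100, 6000)]
n_spanning = count_spanning_reads(reads, repeat_start=2000, repeat_end=2600)
print(n_spanning)
```

        Repeating such a count for each of the four candidate structures would give the relative read support for each topology in a sample.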

        3.3.Structural variations in cotton mitogenomes support A0 to be the A-genome donor of AD and B to be closely related to F

        Further analysis of the structure of the Malvoideae mitogenomes revealed significant differences, which may be attributed to sequence rearrangements and insertions. Using the C1 cotton G. sturtianum as an example, homologous sequences between its mitogenome and those of the cotton relatives Abelmoschus and Malva are short and dispersed (Fig. 2A). Within the same cotton clade, mitogenomic structures are highly similar, except that AD2 and AD3 have some structural rearrangements compared with the other AD genomes (Fig. S4). Among different clades within Gossypieae, the mitogenomic structure differs significantly (Fig. 2B); examples are given below. These findings led us to reason that, using mitogenomic structure as an indicator, we might be able to infer the origin of the allotetraploid AD genomes.

        By comparing AD1 with A2 or D5, two existing diploid species most closely related to the allotetraploid progenitors, we found that the mitogenomic structure of AD1 is more similar to A2 than to D5 (Fig. 2C). Also, as described earlier in this report, the 9 kb fragment (Fig. 1B, C) and the A-to-T mutation in cox3 (Fig. S1B) are specifically present in F, A and AD. These observations indicate that A is likely the maternal parent of AD, as organelles are inherited maternally in cotton. Furthermore, despite the high sequence similarities between A and AD mitogenomes, a 373-bp insertion was detected in the intron region of the rps3 gene in A1 and A2, but not in the same position in AD or the other cotton species (Fig. 2D). Based on the parsimony principle, we predicted that this insertion in rps3 occurred after A-D hybridization and prior to the separation of the A species (Fig. 2E). Therefore, our findings support that A0, the common ancestor of A1 and A2, is the A-genome donor as well as the maternal parent of AD during the hybridization of the two genomes (Fig. 2E).

        Another dilemma in cotton evolution is the position of the B clade, which phylogenetic trees generated from plastid and nuclear genomes in previous studies placed at different positions in the phylogeny. To see if this issue could be clarified with our own datasets, we constructed the plastid and nuclear trees using combined plastid gene sequences and the nuclear 45S rRNA gene, respectively. B is close to F in the nuclear tree (Fig. S5A) but grouped with C, G, and K in the plastid gene tree (Fig. S5B), which is consistent with results from previous studies. On the mitochondrial tree, B is also close to F, but with poor support (Fig. S2). To determine the evolutionary position of B using mitogenomes, we compared the mitogenomic structure of B1 with those of F1, C1, G1 and K8, and found that B1 shares the longest colinear blocks with F1 (Fig. 2F). We suspect that B1 is closer to the African and Asian species F than to the Australian species C, G, and K in cotton phylogeny, which is supported by data from both the nuclear and mitochondrial genomes. It is likely that the B1 plastid has a different origin from those in other closely related species due to a plastid capture event during B speciation.

        3.4.Two novel mitochondrial genes are specifically present and expressed in cotton species with lint fibers

        We next examined whether mitogenomes have obtained any new features during the evolution of A and AD cotton species, which contain lint fibers. We noticed that the 36 mitochondrial genes found in the mitogenomes of this study are very well conserved in all the fifteen newly assembled cotton mitogenomes. Mitochondrial RNA editing sites were previously found to be needed to support fiber growth, prompting us to speculate that, despite their highly conserved gene sequences, some RNA editing sites may differ among the mitogenomes. To this end, we compared the mitochondrial RNA editing sites in three representative species, A2, AD1, and D5, using previously published transcriptomic data of 0-day post-anthesis (DPA) ovules and leaves (Table S7). A total of 450–460 editing sites were uncovered in mitochondrial RNA from each of the three species, with no noticeable site difference between tissue types or species (Table S8), suggesting that differences in RNA editing may not be involved in the speciation of lint-fiber cottons. Furthermore, a modified FPKM (fragments per kilobase of exon model per million mapped fragments) value, or mFPKM, was used to indicate the expression level of mitochondrial genes. No obvious expression difference was found in the mitochondrial genes between the lint- and fuzz-fiber cottons (Fig. S6).

        Fig. 2. Structural variation in cotton mitogenomes provides important clues to cotton evolution. (A) Mitogenomic synteny between the three Malvoideae tribes. Homologous regions are linked by solid lines. Numbers shown are in kb. (B) Mitogenomic synteny among Gossypieae species. (C) Mitogenomic synteny among an allotetraploid species (AD1) and two diploid species (A2, D5) closely related to the allotetraploid progenitors. (D) Schematics of sequence comparison of rps3 from various species. The two long black bars indicate the 373-bp insertion specifically found in the intron of A1 and A2. The shorter black bars/lines indicate divergent bases. In the rps3 gene structure shown on the top, gray boxes indicate exons and the line indicates the intron. (E) A hypothetical sequence of events occurring during AD hybridization. A0 is a hypothetical ancestor of AD as the A-genome donor. (F) Boxplots of the length of hits (> 1000 bp; red dots) resulting from BLAST searches of the B1 mitogenome against those of F1, C1, G1, and K8. F1 receives fewer but on average longer hits than the other species.

        Next, we extended our search for differences beyond the known protein-coding regions of the transcripts, and discovered interesting differences in the rps7 transcript between the lint- and fuzz-fiber cottons. In the fuzz fiber-containing species D5, the rps7 transcript is similar to those in Arabidopsis and rice (Figs. 3A, S7), but in the lint fiber-containing species A2 and AD1, it contains a C-terminal 1700-bp extension (Fig. 3A). This transcript extension in A2 and AD1 comes from sequence in the region containing Repeat633 and the 9 kb fragment identified earlier in this work (Fig. 3B) and harbors two open reading frames (ORFs) that can be translated within the polycistronic transcript. The 102-aa orf102 is a part of Repeat633, whereas the 154-aa orf154 is derived from Seq19 within the 9 kb fragment (Fig. 3B). BLAST searches against NCBI nr databases did not identify sequences with significant similarities to these two ORFs from any eukaryotic or prokaryotic genomes, indicating that orf102 and orf154 are likely novel mitochondrial genes created during cotton evolution.
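        The text does not state how the two ORFs were identified. As an illustration of what scanning a transcript for ORFs involves, the Python sketch below searches the three forward reading frames for ATG-to-stop stretches using the standard stop codons. The function name, the minimum length, and the toy sequence are hypothetical, and the sketch ignores RNA editing, which can change codons in plant mitochondrial transcripts.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_aa=100):
    """Return (start, end, length_in_aa) for forward-strand ORFs from ATG to a stop.

    Coordinates are 0-based and `end` is exclusive of nothing beyond the stop
    codon (the stop is included). The length counts the start codon but not
    the stop. RNA editing is not modeled; min_aa is an illustrative cutoff.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                aa_len = (i - start) // 3
                if aa_len >= min_aa:
                    orfs.append((start, i + 3, aa_len))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)

# Toy transcript: one 5-codon ORF starting at position 2
toy_transcript = "CC" + "ATG" + "GCT" * 4 + "TAA" + "G"
print(find_orfs(toy_transcript, min_aa=5))
```

        With a cutoff near 100 codons, a scan of this kind would report reading frames of the size class of orf102 and orf154, although candidate ORFs still require expression evidence such as the RT-PCR described in this section.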

        Fig. 3. Two novel mitochondrial genes are formed and specifically expressed in A and AD cottons. (A) Mappings of the leaf and ovule RNA-Seq data onto the mitogenomes identified an expansion of the rps7 transcript in A2 and AD1. Blue regions indicate transcription. (B) Two new ORFs are located in the extended transcript: orf102 in Repeat633 and orf154 in Seq19. The grey region indicates transcription. (C) A heatmap showing the expression level (log2 mFPKM) of the mitochondrial genes in the leaf tissue of D5, A2, and AD1. (D) Gene structures of rps7, nad4L, and the two new ORFs in A1, D1 and D5. Primer pairs in rps7 (F1 & R1), orf102 (F2 & R2), and orf154 (F3 & R3) were used for RT-PCR analysis. (E) RT-PCR analysis of rps7, orf102, and orf154 transcripts in the leaf tissue of A1, D1 and D5. DNA marker sizes are 5000, 3000, 2000, 1000, 750, 500, 250, and 100 bp, respectively.

        To determine if these two new mitochondrial genes are expressed, we first searched the published transcriptomic data of A2, AD1, and D5 using mFPKM (Table S7). The expression level of orf102 is much higher in the lint-fiber cottons A2 and AD1 than in D5, and the expression of orf154 is relatively high in A2 and AD1 but undetectable in D5 (Fig. 3C). We then performed RT-PCR analysis of orf102 and orf154 in D5 and two additional cotton species, A1 with lint fibers and D1 with fuzz fibers. These two genes are highly expressed in A1, but no transcripts can be detected in D1 or D5 (Fig. 3D, E). Therefore, among the species analyzed, orf102 and orf154 seem to be specifically expressed in lint-fiber cottons.

        To further verify the specific expression of orf102 and orf154 in lint-fiber-containing species, we analyzed their transcripts in additional cotton tissues and lint-fiber cotton species. The orf154-containing 9 kb fragment, which is specific to F, A and AD species, is downstream of rps7 in A and most AD species but downstream of nad4L in F1 and AD2. The orf102-containing Repeat633 has one copy downstream of nad4L in all the A, D, and AD species, and an additional copy downstream of rps7 in A and AD species but not in D species (Fig. 3D). Further analysis of the previously published transcriptomic data found that both orf102 and orf154 are expressed in various tissues in the lint-fiber cottons A1, AD1 and AD2 but exhibit extremely low or no expression in the fuzz-fiber cottons D5 and F1 (Fig. 4A). RT-PCR analysis showed that both the leaf and fiber (0, 10, and 20 DPA) tissues of A2, AD1, and AD2 show high expression of orf102 and orf154, as well as the extended transcripts containing rps7-orf102 and rps7-orf154 fusions, with the expected absence of the rps7-orf154 fusion transcript in AD2 (Fig. 4B, C). Taken together, we conclude that orf102 and orf154 are specifically expressed in lint-fiber cottons, irrespective of their location as a C-terminal extension of rps7 or nad4L. Our findings strongly suggest a potential role for these two newly evolved mitochondrial genes in lint-fiber cotton speciation, a hypothesis worth following up with functional analysis in the future.

        4. Discussion

        In recent years, cotton nuclear genomes have been extensively studied, providing useful information for understanding cotton evolution and opportunities for genetic breeding. However, many important questions remain, including those regarding the natural hybridization of the allotetraploids and the speciation of the lint-fiber cottons. In this study, we use a new approach to address these questions by systematically analyzing the mitogenomes of the cotton genus Gossypium and its relatives.

        4.1. Cotton phylogeny based on mitogenomes

        Fig. 4. Expression of rps7, orf102, and orf154 in different cotton species and tissues. (A) Heatmaps showing the expression level (log2 mFPKM) of rps7, orf102, and orf154 in different cotton species and tissues. (B) Gene structures of rps7, nad4L, orf102, and orf154 in A2, AD1, and AD2. Primer pairs for the amplification of rps7, orf102, orf154, and their fusions in RT-PCR analysis are listed on the right. (C) RT-PCR analysis of rps7, orf102, and orf154 in the leaf and fiber tissues of A2, AD1, and AD2. DNA marker sizes are 5000, 3000, 2000, 1000, 750, 500, 250, and 100 bp, respectively.

        In studies of plant evolution, nuclear and plastid genomes have been widely used but mitochondrial genomes have not, because plant mitochondrial genes tend to be highly conserved and lack the sequence changes required for phylogenetic analysis. Our mitogenomic analysis of cotton and its relatives clearly showed that these mitogenomes have evolved slowly in gene sequence but rapidly in genome structure. We propose that the extensive structural changes may be due to the presence of a large number of sequence repeats, which can promote sequence rearrangements, as well as to foreign sequence insertions. Although the high conservation in gene sequence makes it difficult to construct a gene-based phylogenetic tree like those built from plastid and nuclear genes [13,42,43], we were able to take advantage of the dynamic mitogenomic structural variations to infer phylogenetic relationships among cotton species. We obtained evidence for the A-genome origin of the allotetraploid AD cottons, supporting the view recently proposed from the study of nuclear genomes that an extinct A0 species is the A-donor of the allotetraploids [20]. Our findings have also helped to clarify the position of the B clade in cotton phylogeny, supporting the nuclear genome-based conclusion that B is closer to F and A [2] but contradicting the conclusion based on plastid genes, which grouped B1 with C, G, and K [13,43]. We hypothesize that the plastid in B1 might have been obtained via a plastid capture event during B speciation [44]. We believe that analysis of mitogenomic structure can be widely used in plant evolution studies, especially those aiming to understand the origin of natural polyploids.

        4.2. Two mitochondrial ORFs specifically present and expressed in lint-fiber cottons

        Our systematic comparison of the mitogenomes from cotton species with and without lint fibers found no consistent distinctions in the number, sequence, and RNA editing sites of known mitochondrial genes. Instead, we discovered two novel ORFs, orf102 and orf154, which are derived from Repeat633 and a 9 kb fragment, respectively (sequences specifically inserted in F, A, and AD mitogenomes), and are uniquely transcribed in the lint-fiber cottons A and AD. Given that Repeat633 is a conserved sequence in all cottons as well as in some cotton relatives, we speculate that Repeat633 was located downstream of nad4L in a common ancestor of cotton, a position that was passed on to C, D, E, G, and K. However, Repeat633 was duplicated in the ancestor of A, B, and F, and a second Repeat633 was inserted downstream of rps7, after which the copy downstream of nad4L was lost in B (Fig. 5). The location of “Repeat633 + 9 kb fragment” downstream of rps7 in A but downstream of nad4L in F could be explained by a possible rearrangement of the two Repeat633 sequences during A speciation (Fig. 5). During natural hybridization and polyploidization, AD species inherited the A-type orientation, with the exception of AD2, which obtained the F-type orientation, possibly due to another rearrangement of the two Repeat633 sequences during AD2 speciation (Fig. 5).
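        The arrangement described above can be written down as a small table of structural character states, which makes it easier to see which genome types share the same configuration. The following Python sketch is illustrative only and is not the analysis used in this study: the names PROFILES and group_by_profile are hypothetical, AD1 stands in for the AD species with the A-type arrangement, and the Repeat633 copy downstream of rps7 in F is an assumption inferred from the proposed duplication in the ancestor of A, B, and F.

```python
from collections import defaultdict

# Character states per genome type, read from the text above:
#   (Repeat633 copy downstream of nad4L,
#    Repeat633 copy downstream of rps7,
#    gene that the 9 kb fragment follows, or None if the fragment is absent)
# Assumption: F is given an rps7-side Repeat633 copy because the proposed
# scenario places the duplication in the ancestor of A, B and F.
# AD1 stands in for the AD species with the A-type arrangement.
PROFILES = {
    "A":   (True,  True,  "rps7"),
    "AD1": (True,  True,  "rps7"),
    "AD2": (True,  True,  "nad4L"),
    "F":   (True,  True,  "nad4L"),
    "B":   (False, True,  None),
    "C":   (True,  False, None),
    "D":   (True,  False, None),
    "E":   (True,  False, None),
    "G":   (True,  False, None),
    "K":   (True,  False, None),
}


def group_by_profile(profiles):
    """Group genome types that share an identical structural profile."""
    groups = defaultdict(list)
    for name, profile in profiles.items():
        groups[profile].append(name)
    return {profile: sorted(names) for profile, names in groups.items()}


if __name__ == "__main__":
    for profile, names in group_by_profile(PROFILES).items():
        print(profile, names)
```

        In this sketch AD2 and F fall into the same group even though the text attributes the AD2 arrangement to an independent rearrangement, so an identical profile alone should not be read as evidence of close relatedness.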

        Fig. 5. Hypothetical evolutionary history of orf102 and orf154 in cottons. Repeat633 was located downstream of nad4L in a common ancestor of cotton, a position that was passed on to C, D, E, G, and K. Repeat633 duplicated in the ancestor of A, B, and F, and a second Repeat633 inserted downstream of rps7, after which the copy downstream of nad4L was lost in B. A rearrangement of the two Repeat633 sequences during A speciation resulted in the location of “Repeat633 + 9 kb fragment” downstream of rps7 in A but downstream of nad4L in F, after which the transcription of rps7 was extended. During natural hybridization and polyploidization, AD species inherited the A-type orientation with the exception of AD2, which had obtained the F-type orientation due to another rearrangement of the two Repeat633 sequences during AD2 speciation. Grey peaks indicate transcription levels; black arrows indicate the direction of mitogenome changes during cotton evolution; and red arrows indicate sequence rearrangements.

        4.3. Can new genes be created in the mitogenome?

        It is well known that, in comparison to those of animals and yeasts, the mitogenomes of seed plants are greatly inflated in size [7]. The large plant mitogenomes contain many more sequence repeats, foreign DNA insertions, and sequences of unknown origin, yet the number of plant mitochondrial genes stays intriguingly conserved [45,46]. Transfer of genes from the mitogenome to the nuclear genome through horizontal gene transfer, and disruption of mitochondrial genes such as those resulting in cytoplasmic male sterility (CMS), have been well documented [18]. Plant CMS, including that in cottons, can be caused by new ORFs or chimeric ORFs formed through the combination of modified native genes [17,47,48]. However, completely novel mitogenomic ORFs that confer a gain of function have never been reported.

        It is conceivable that plant mitogenomes, which change frequently in structure through recombination, sequence duplication, and foreign sequence insertion, and which contain polycistronic transcripts, can foster the formation of new protein-coding genes. A “Mitochondrial Fostering” theory was proposed, which posited that the organelle genome plays an integral role in the arrival and development of orphan genes (genes with no homologs in other lineages) that can then be transferred to the nuclear genome [49]. Consistent with this theory, a high correlation was discovered between the amount of mitochondrial DNA transferred to the nuclear genome and the number of orphan genes in land plants, suggesting a role for the mitochondrial genome in the evolution of nuclear orphan genes in land plants [49]. This exciting possibility needs to be further tested, which may lead to new findings about the role of the mitogenome in plant evolution and speciation and to new strategies in cotton breeding.

        CRediT authorship contribution statement

        Yanlei Feng: Methodology, Investigation, Data curation, Visualization, Writing - original draft, Writing - review & editing. Yukang Wang: Investigation, Data curation. Hejun Lu: Resources, Investigation, Data curation. Jun Li: Investigation. Delara Akhter: Investigation. Fang Liu: Resources. Ting Zhao: Investigation. Xingxing Shen: Investigation. Xiaobo Li: Investigation, Funding acquisition, Writing - original draft. James Whelan: Investigation, Writing - original draft. Tianzhen Zhang: Resources, Investigation, Writing - original draft. Jianping Hu: Investigation, Writing - original draft, Writing - review & editing. Ronghui Pan: Conceptualization, Funding acquisition, Supervision, Writing - original draft, Writing - review & editing.

        Declaration of competing interest

        The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

        Acknowledgments

        This work was supported by the Zhejiang Natural Science Foundation Outstanding Youth Grant (LR20C020002), the Zhejiang Provincial Natural Science Foundation of China (LZ23C020002), the National Natural Science Foundation of China (32200231), the National Key Research and Development Program of China (2022YFD1401600), the Leading Innovative and Entrepreneur Team Introduction Program of Zhejiang (2019R01002), the Key Research Project of Zhejiang Lab (2021PE0AC04), and the U.S. National Science Foundation (MCB 2148206).

        Appendix A. Supplementary data

        Supplementary data for this article can be found online at https://doi.org/10.1016/j.cj.2023.05.004.
