XU Hanzhi ,LIU Huiru ,ZHANG Hua ,and HE Maoxian
1) CAS Key Laboratory of Tropical Marine Bio-Resources and Ecology, Guangdong Provincial Key Laboratory of Applied Marine Biology,South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China
2) University of Chinese Academy of Sciences, Beijing 100049,China
3) Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou 511458,China
Abstract The pearl oyster Pinctada fucata martensii is an economically valuable shellfish that is cultured for seawater pearl production,which mainly depends on oyster growth.However,the growth mechanisms of the pearl oyster are still poorly understood.In this study,oysters were grouped with relative growth rate,including fast-growing (FG) group and slow-growing (SG) group.Oxford Nanopore Technologies (ONT) long-read sequencing was applied to investigate the molecular mechanisms involved in the growth of this species.Five alternative splicing (AS) types were analyzed in both FG and SG groups,which include alternative 3’ splice site,alternative 5’ splice site,exon skipping,intron retention,and mutually exclusive exon.Transcriptome analysis showed that four of five different AS events (excluding mutually exclusive exons) occurred more frequently in FG than in SG oysters,and the five main AS types exhibited different characteristics.The AS events that were detected may be involved in growth,and the difference in expression of AS events between FG and SG oysters may be involved in the mechanism underlying the difference in growth.Fifty differentially expressed genes (DEGs) were identified between the FG and SG oysters.The results showed that 40 genes were significantly up-regulated in FG oysters,while 10 genes were significantly down-regulated in SG oyster.Several genes related to nutrient metabolism,shell formation,and immunity were more highly expressed in FG oysters than in SG oysters.In summary,FG oysters exhibited higher metabolic and biomineralization activities and had a more powerful immune system than SG oysters.These results provide insight into the growth of P.f.martensii that can be used to improve breeding programs.
Key words Pinctada fucata martensii;differential growth;Oxford Nanopore Technologies sequencing;differentially expressed genes
The pearl oysterPinctada fucata martensiiis a shellfish species with a high economic and ecological value.It is widely cultured in the South China Sea,particularly in Guangdong,Guangxi,and Hainan provinces mainly for pearl production (Shiet al.,2009;Zhanget al.,2020).To address the problems of slow growth and mass mortality of pearl oysters that negatively impact the industry,researchers are developing breeding programs to improve brood stock quality by selecting species with rapid growth and high disease resistance capability (Denget al.,2009).Individual growth and size play crucial roles in pearl yield,while fast-growing (FG) oysters produce larger and thicker pearls than their slow-growing (SG) counterparts (Fuet al.,2012).Hence,understanding the growth mechanisms of this species is an essential step for improving the pearl industry.
With the development of multi-omics analysis and sequencing of theP.f.martensiigenome (Adzigbliet al.,2019),researchers have identified numerous genes related to growth.For example,Haoet al.(2019) used metabolomic and transcriptomic approaches to assess the metabolic and transcript differences between FG and SGP.f.martensiigroups.They found that collagen and other extracellular matrix-related proteins,as well as integrin-related genes,gluconolactonase,fructose-1,6-diphosphatase,and glutathione S-transferase genes are involved in growth regulation.Guan and He (2017) constructed two RNA-Seq libraries in diploid and triploidP.f.martensiiand identified differentially expressed genes (DEGs) between them.They reported that 11 genes were involved in growth,and all of them were upregulated in triploid oysters.Shi and He(2014) also used RNA-Seq to evaluate differences in gene expression between large and small oysters.They found that calmodulin,collagen,cysteine sulfonic acid decarboxylase,glutathione S-transferase,L-rhamnonate dehydratase,regucalcin,sea star regeneration-associated protease,sodium-and chloride-dependent glycine transporter 1,solute carrier family 23 member 2,somatostatin receptor,cytochrome P450,nacre protein,and shematrin-5 genes were upregulated in small oysters,compared with large ones.In general,these genes were involved in shell and cell growth.Liuet al.(2020) found that many receptor genes,including neuropeptide Y receptor,thyrotropin-releasing hormone receptor,γ-aminobutyric acid type-B receptor-like protein,somatostatin receptor type 2-like protein,and dopamine D2 receptor-like protein,were involved in neuroendocrine regulation,which is critical for growth and reproduction.In addition to endogenous factors,growth is influenced by changing environmental conditions such as food availability,salinity,temperature,predation,and culture conditions(stocking density and biofouling cleaning procedures) (Adzigbliet al.,2019).All of these studies show that growth in pearl oysters is an extremely complex process.Nevertheless,the mechanism of growth in oysters remains poorly understood.
Oxford Nanopore Technology (ONT) sequencing is currently one of the most powerful methods for the rapid generation of long-read sequences.Using this technology,an entire cDNA can be sequenced from end to end (Bianet al.,2020).ONT is cost-effective for generating extremely long reads,and can be used to characterize the transcriptome and quantify transcript expression.Compared with nextgeneration sequencing,ONT sequencing analysis of gene structure,such as alternative splicing (AS) and fusion genes,is more accurate.Additionally,ONT sequencing can not only identify AS events but also distinguish different AS types in full-length transcripts (Cuiet al.,2020).To gain more information about growth mechanisms inP.f.martensii,we performed differential transcriptome analysis on FG and SG oysters using ONT sequencing.The results of this study will improve our understanding of growth mechanisms inP.f.martensii.
The shell lengths of 320 10-month-oldP.f.martensiiindividuals obtained from a Shenzhen population were measured,numbered,and then cultured in Dapeng Bay at the Daya Bay Marine Biology Research Station of the Chinese Academy of Sciences (Shenzhen,Guangdong,China).The water temperature,pH,and salinity were set at 28℃± 1℃,8.0 ± 0.3,and 30.0 ± 0.5,respectively.After 4 months of culture,we measured the shell length of 303 live oysters.The 15 oysters with the highest relative growth rate were selected as the FG group,and the 15 oysters with the lowest relative growth rate were selected as the SG group.The adductor muscle tissue of each oyster was collected,frozen in liquid nitrogen,and stored at -80℃.
Three oysters from each group were used for library construction.The cDNA libraries were constructed following the protocol of the Ligation Sequencing Kit (SQKLSK109,UK),which included four steps:1) using the strand-switching protocol and preparing full-length cDNAs from polyadenylated RNA;2) amplifying the cDNAs by PCR and adding barcodes during the PCR step;3) pooling the PCR products and attaching sequencing adapters;and 4) priming the flow cell and loading the cDNA library into the flow cell.The libraries were then sequenced using the Nanopore PromethION platform (Oxford,UK).The quality of the libraries was assessed using the Agilent Bioanalyzer 2100 system (Agilent Technologies,Santa Clara,CA,USA).
Nanopore sequencing raw data were analyzed using the Guppy data processing toolkit in MinKNOW2.2 (Oxford,UK).Short reads,low-quality reads,and reads with adaptors were filtered out to obtain clean data.According to the principle of cDNA sequencing,a primer sequence identified at both ends of a read was indicative of a full-length sequence.We used minmap2 to map the read itself and obtained overlap information between reads.Finally,consistent sequences were obtained using Racon software.Highquality isoforms were mapped to the reference genome ofP.f.martensiiusing the Genomic Mapping and Alignment Program (GMAP).
Transcripts were validated against known reference transcript annotations using the python library MatchAnnot(https://github.com/TomSkelly/MatchAnnot).Coding region sequences and their corresponding amino acid sequences were analyzed using TransDecoder v3.0.0 (https://transdecoder.github.io/) software based on new transcripts obtained from AS.Transcription factors were identified using the animalTFDB 2.0 database.Alternative polyadenylation (APA) analysis was conducted with TAPIS 1.2.1(https://bitbucket.org/compbio/tapis/overview),and AS events,including intron retention,exon skipping,alternative 3’ splice sites,alternative 5’ splice sites,and mutually exclusive exons,were identified using the AStalavista tool 3.0 (Foissac and Sammeth,2007).
The functions of non-redundant transcripts were annotated using the following databases:NCBI non-redundant protein sequences (NR),Clusters of Orthologous Groups of proteins (KOG/COG/eggNOG),Protein family (Pfam),a manually annotated and reviewed protein sequence database (Swiss-Prot),Kyoto Encyclopedia of Genes and Genomes (KEGG),and Gene Ontology (GO).Transcripts were subjected to enrichment analysis using GO implemented in the GOseq R package based on the Wallenius non-central hypergeometric distribution,which can adjust for gene length bias.KEGG is a database resource used to understand biological systems and high-level functions.KOBAS 2.0 software (Xieet al.,2011) was used to identify KEGG pathways enriched with DEGs.
Full-length reads were mapped to the reference transcriptome sequence.Reads with a match quality >5 were further quantified.Expression levels were estimated by reads per gene per 10000 reads mapped.DESeq provided statistical routines for determining differential expression in digital gene expression data using a model based on the negative binomial distribution.The resultingPvalues were adjusted using Benjamini and Hochberg’s approach for controlling the false discovery rate (FDR).Genes with an FDR<0.01 and fold-change ≥ 2 identified by DESeq were assigned as DEGs.
DEGs were selected to validate novel transcripts identifiedviaRNA-Seq.Six random individuals were sampled from the FG or SG groups,respectively.Total RNA was extracted from the adductor muscle tissue of each oyster using TRIzol reagent and subsequently reverse transcribed using a First-strand cDNA Synthesis Kit (Thermo Fisher Scientific,Waltham,MA,USA) following the manufacturer’s instructions.Table 1 lists the primer sequences used for qRT-PCR,which was performed using a Roche Light-Cycler480 instrument (Roche,Basel,Switzerland).Briefly,the reaction system consisted of 5 μL of SYBR Green (Toyobo,Osaka,Japan),0.3 μL of each primer (10 μmol L-1),1 μL of diluted cDNA,and 3.4 μL of ultrapure water to a total volume of 10 μL.PCR included an initial denaturation step at 95℃ for 30 s followed by 40 cycles at 95℃for 5 s,57℃ for 10 s,and 72℃ for 15 s.Each sample was tested in triplicate.The 18S ribosomal RNA and β-actin gene were used as internal controls,and the comparative 2–ΔCTmethod was used to analyze the expression levels of candidate genes.
Table 1 Primer sequences of target genes used for qRT-PCR
As events and qRT-PCR dates between FG and SG were expressed as mean ± S.E.M.Statistical differences were evaluated by t-test and the level of statistically significant difference was set atP≤ 0.05.All statistics were done with SPSS 19.0 (SPSS,Chicago,USA).
Full-length cDNA sequences are important for correct annotation and identification of authentic transcripts from animal tissues.In this study,we completed full-length transcriptome sequencing of six samples (three for FG and three for SG oysters).However,the expression correlation for the FG3 sample was inconsistent with that of FG1 and FG2;therefore,we did not use the FG3 sample data.The sequencing output of clean data for each sample reached 1.99 Gb (Table 2).Full-length sequences were identified by the presence of primer sequences at both ends of the reads.The number of full-length sequences obtained from each sample ranged from 1838192 to 2857612 (Table 3).The full-length sequences were polished to obtain consistent transcript sequences.All consistent transcript sequences were aligned to the reference genome using minimap2 software for redundancy analysis,and ultimately,we obtained 9981 transcript sequences.
Table 2 Oxford Nanopore sequencing information for Pincata fucata martensii
Table 3 Full-length sequence data from Oxford Nanopore sequencing
Alternative polyadenylation regulates gene expression and enhances the complexity of the transcriptome.A total of 3230 genes detected by the TAPIS pipeline had at leastone poly(A) site,and 1996 genes had two or more poly(A)sites (Fig.1).Mature mRNAs are generated by a variety of splicing methods and are translated into different proteins to increase biological complexity and diversity.The variable AS types in each sample were obtained using AStalavista software.In the ONT data,we detected 1274 AS events,the most frequent of AS events were exon skipping(32.94%),alternative 5’ splice sites (26.18%),mutually exclusive exons (18.82%),and alternative 3’ splice sites(16.47%),whereas few intron retentions (5.59%) were identified (Fig.2).The expression of AS events differed between FG and SG oysters.The average numbers of AS events in the FG and SG oysters were 271 and 243,respectively.The most frequent AS events identified in FG oysters were exon skipping (31.73%),followed by alternative 5’ splice sites (27.31%),mutually exclusive exons(19.56%),alternative 3’ splice sites (14.76%),and intron retention (6.64%) (Fig.3);in SG oysters,they were alternative 5’ splice sites (29.75%),exon skipping (28.10%),mutually exclusive exons (23.14%),alternative 3’ splice sites (14.05%),and intron retention (4.96%) (Fig.4).
Fig.1 Results of alternative polyadenylation analysis.
Fig.2 Proportional representation of five types of AS events.
Fig.3 Proportional representation of five types of AS events in FG oysters.
Fig.4 Proportional representation of five types of AS events in SG oysters.
Four of the five different AS events (excluding mutually exclusive exons) occurred more frequently in FG than in SG oysters (Fig.5).In FG oysters,the average numbers of mutually exclusive exon,intron retention,exon skipping,alternative 5’ splice site,and alternative 3’ splice site events were 53,18,86,74,and 40,respectively,whereas in SG oysters,they were 56,12,68,72,and 34.
Fig.5 Differences of AS events between FG and SG oysters(P >0.05).
The newly identified transcript sequences were compared with various databases using BLAST 2.2.26 software to obtain annotation information for each transcript.In total 8186 transcripts were annotated in NR,5386 in KOG,2288 in COG,6780 in eggNOG,5373 in Pfam,4528 in Swiss-Prot,4808 in KEGG,and 4140 in GO.Moreover,8274 transcripts were annotated in all databases.In GO annotation(Fig.6),transcripts were classified into three main GO categories:cellular component (CC),molecular function (MF),and biological process (BP).In the three main categories,cellular process (BP) (3808),binding (MF) (3957),and membrane (CC) (2872) were the most enriched subcategories.Based on NR annotation,species homologous toP.f.martensiiwere predicted by sequence alignment,andCrassostrea gigasandCrassostreavirginicawere the closest matching genomes,followed byMizuhopectenyessoensis(Fig.7).
Fig.6 GO classification of the putative functions of unique transcripts.
Fig.7 Distribution of homologous species annotated in the NR database.
In the transcriptome database,665 genes were not previously annotated in the genome.When we conducted BLAST analyses,644 of these genes were found in the Blastx search against NR,365 in Pfam,340 in KOG,126 in COG,479 in eggNOG,294 Swiss-Prot,322 in KEGG,and 295 in GO.
The transcriptome analyses yielded 50 DEGs.Compared with the SG group,40 DEGs were upregulated,and 10 DEGs were downregulated in the FG oysters.The volcano plot in Fig.8 shows the differences in gene expression levels in pairwise comparisons of SG and FG oysters.GO analysis revealed that the genes upregulated in FG were enriched in polysaccharide catabolic process,transport process,hydrolase activity,and transcription.The genes upregulated in the SG group were involved mainly in carbohydrate binding.In addition,KEGG pathway analysis indicated that DEGs were mainly assigned to metabolism,followed by genetic information processing,cellular processes,environmental information processing,human diseases,and organism systems.
Fig.8 Volcano plot of the transcripts.Each point in the volcano plot represents a transcript,the abscissa represents the logarithm of the difference in expression of a transcript in the two samples,and the ordinate represents the nega-tive logarithm of the statistical significance of the change in the expression of the transcript.Green dots represent downregulated DEGs,red dots represent upregulated DEGs,and black dots represent non-DEGs.
To validate the DEGs,we selected six genes related to growth and immunity with FDR ≤ 0.009 for qRT-PCR analysis.The sodium-dependent multivitamin transporter (Pma.10027563),complement C1q tumor necrosis factor-related protein (Pma.10018660),FMRF-amide neuropeptide-like(Pma.10030629),glutathione S-transferase (Pma.10012984),and fatty acid binding protein (ONT.1444) genes were upregulated in FG oysters.The GATA zinc finger domaincontaining protein 14-like gene (ONT.1460) was upregulated in SG oysters.The qRT-PCR and RNA-Seq results for ONT.1444 and Pma.10012984 were inconsistent (Fig.9).The result indicates that qRT-PCR results support transcriptome sequencing results,so the transcriptome sequencing data can be used for subsequent analysis.
Fig.9 The mRNA expression of six DEGs in the adductor muscle tissue measured using qRT-PCR analysis (n=6).The statistical significance of differences was determined using Student’s t-tests (two-tailed) and is denoted as *P <0.05,P** <0.01,and ***P <0.001.
AS is a complex,regulated process by which multiple transcripts are generated from a single pre-mRNA.AS plays an important role in the regulation of animal growth,development,and physiological metabolism (Gueroussovet al.,2015;Gallego-Paezet al.,2017).Xuet al.(2018) analyzed the AS events in tissue samples from four developmental stages of goats,and they were mainly involved in organ function and development.Hanet al.(2019) found two splicing isomers of the zinc finger-like protein gene in chickens,which significantly affected carcass and growth traits of 4-week-old chickens,such as birth weight,chest width,body oblique length,and subcutaneous fat weight.Heet al.(2020)detected the mRNA expression levels of the spliced variant of CKLF-like MARVEL transmembrane domain-containing 2 (CMTM2) in different tissues.Compared with CMTM2,the spliced variant had a higher expression level in muscle and liver tissues,indicating that it plays an essential role in growth traits.In an identification and analysis of the whole genome of soybean,Shenet al.(2014)found that more than 63% of alternative exons were detected in AS events,and in the same tissues,the number of AS events in the early developmental stage was higher than that in the late developmental stage.Analysis of AS in juvenile and adult human heart tissue showed that the types and amounts of AS differed between young and adult human hearts (Giudiceet al.,2014).These results indicate that the frequency of AS events is related to gene structure and transcriptional activity.In our study,we found that AS events occurred more frequently in FG than in SG oysters,and the five main AS types exhibited different characteristics.Four of the five different AS events (excluding mutually exclusive exon events) occurred more frequently in FG than in SG oysters.In FG oysters,exon skipping was the most frequent AS event type,whereas alternative 5’ splice sites were the most frequent in SG oysters.Variations in AS frequency and AS type are correlated with changes in gene features and transcriptional levels in oysters,and they play a crucial role in growthviaenergy metabolism,nutrient metabolism,and shell formation.Therefore,differences in the expression of AS events between FG and SG may at least partially explain the observed growth differences.
In our study,we found that some DEGs with higher expression levels in the FG group,such as fatty acid binding proteins (FABPs),sodium-dependent multivitamin transporter (SMVT),and complement C1q tumor necrosis factorrelated protein 2 (CTRP2).They are involved in nutrient and energy metabolism,which suggests that these genes might be positively associated with growth.FABPs are members of a polygenic family,and their main function is to combine with fatty acids (FAs) to achieve their absorption and transport (Haunerland and Spener,2004).FABP regulates the oxidative energy supply of FAs and the metabolism of phospholipids and triglycerides through the uptake,transport,esterification,and β-oxidation of FAs (Dutta-Roy,2000).It is well known that FAs,as basic energy substances,are essential constituents of the body and maintain activities essential to life,such as cell membrane fluidity,lipid performance,reproductive development,and lipid metabolism in aquatic animals (Anet al.,2020).Thus,regulation of the synthesis,digestion,and absorption of these FAs plays a key role in animal growth.SMVT facilitates the absorption of biotin,pantothenic acid,and the vitamin-like substance lipoate (Zempleniet al.,2009;Schwantjeet al.,2019),which are involved in a variety of metabolic processes related to growth,including fatty acid synthesis,gluconeogenesis,amino acid catabolism,and the tricarboxylic acid cycle.Therefore,SMVT plays a crucial role in energy metabolism and promoting growth and development.CTRP2 is mainly involved in the regulation of lipid metabolism in adipose and liver tissues.It acts on adipocytes to suppress fat mobilization and restrain excessive triglyceride hydrolysis as well as membrane phospholipid turnover,thereby modulating systemic lipid profiles and maintaining normal metabolic processes in the body (Lei and Wong,2019).As nutrient and energy metabolism play an irreplaceable role in promoting growth,it is not surprising to find that DEGs related to metabolism were expressed at higher levels in FG than in SG oysters.
Generally,genes involved in shell formation are highly expressed in the mantle.However,we found that some DEGs related to shell formation were expressed in the oyster muscle.Pinctada fucatamantle gene 4 (PFMG4) and amorphous calcium carbonate-binding protein (ACCBP)were upregulated in the muscle of FG oysters,and lysinerich matrix protein 4 (KRMP4) was upregulated in the SG group.ACCBP can inhibit calcite formation and induce amorphous calcium carbonate (ACC) formation.ACC is very important in biomineralization processes,and it functions as a precursor of calcium carbonate biominerals (Suet al.,2013).Maet al.(2007) conducted western blot analysis and found that little ACCBP was present in the muscle,which suggested that ACCBP may be secreted into the interstitial fluid and the hemolymph after synthesis of the adductor muscle.Future studies should focus on the secretion mechanism of ACCBP.Wanget al.(2015) reported that PFMG4 could enhance osteoblast differentiation and increase alkaline phosphatase activity and mineral deposition.Our finding that PFMG4 was expressed in muscle tissue suggests that it may be involved in growth in ways other than shell formation.It is also possible that PFMG4 is a secreted protein.Overall,our results showed that FG oysters possessed high biomineralization activity.
Previous studies reported that DEGs related to immunity were mainly upregulated in SG oysters (Haoet al.,2019).However,in this study,we found that genes related to immunity were upregulated in the FG oyster group.In that earlier study,oysters were from a selected line named‘Haixuan NO 1’ and were sorted by size.In the present study,oysters were selected from a Shenzhen population and sorted by relative growth rate.Different genetic backgrounds and sampling methods might explain the contrasting results,at least in part,and these differences further indicate the complexity of the growth mechanism of pearl oysters.Glutathione S-transferase (GSTs) is an ancient enzyme superfamily and is an important component of the antioxidant defense system in organisms (Vaishet al.,2020).GSTs can catalyze the formation of soluble or non-toxic derivatives of harmful substances,thereby reducing their toxicity.GSTs also have glutathione peroxidase activity,which can remove the secondary products of lipid oxidation and play a central role in defense against environmental pollutants (Gestalet al.,2010).Selenoprotein P-like protein,which is similar to selenoprotein P,plays a pivotal role in maintaining the dynamic balance and distribution of selenium in the whole organism,and it can provide enough selenium to maintain the viability and function of the tissues (Fujiiet al.,1997).Selenium is an essential trace element in animals,and it mainly participates in various life activities in the form of selenium protein (Burk and Hill,2015).A decrease in the selenium protein level can block the normal function of tissues and cells and even lead to their deaths (Olsonet al.,2005;Caoet al.,2017;Liaoet al.,2018).The level of selenoprotein P in the body also affects the balance between cellular and humoral immunity (Jacksonet al.,2004).Solute carrier family 22 member 8-like plays an important role in the excretion/detoxification of endogenous and exogenous organic anions.This protein mainly mediates the uptake of several organic compounds related to immunity,such as prostaglandin E2,prostaglandin F (2-alpha),allopurinol,6-mercaptopurine,and L-carnitine (Sweetet al.,2002;Sykeset al.,2004).For invertebrates,the function of FABP is more diverse,particularly in immune defense.For example,in shrimp and crabs,FABP is highly expressed in the major immune organs after immune stimulation.Researchers have reported that recombinant FABP protein can bind to bacteria and inhibit their growth,and can also enhance the phagocytosis of blood cells (Chenget al.,2013;Tanet al.,2015).Therefore,FG oysters appear to possess a more powerful immune system to resist external coercion,so the growth is not hindered by environmental degradation or invasive pathogens.
In this study,we generated the full-length transcriptome ofP.f.martensiiusing ONT sequencing and documented differences in the growth pattern between FG and SG oysters.Four of the five different AS events (excluding mutually exclusive exon events) occurred more frequently in FG than in SG oysters,and the five main AS types exhibited contrasting characteristics.The AS events that were detected may be involved in growth,and differences of AS events between FG and SG oysters may be the reason for the difference of their growths.DEG results showed that FG oysters possessed higher nutrient metabolism,energy metabolism,and biomineralization activity than SG oysters,and the FG group also had a stronger immune system.These results highlighted the complexity and diversity of the differential growth patterns ofP.f.martensiiindividuals.Our findings are valuable for elucidating the mechanisms of growth and guiding molecular breeding inP.f.martensii.
Acknowledgements
We thank International Science Editing (http://www.in ternationalscienceediting.com) for editing this manuscript.This study was supported by the Earmarked Fund for the China Agriculture Research System (No.CARS-49) and the Science and Technology Planning Project of Guangdong Province,China (No.No2020B1212060058).
Journal of Ocean University of China2022年1期