Whole-Genome Data Analysis of Sanjiang Cattle

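2017-10-13 10:37:27 SONG Nana, ZHONG Jincheng, CHAI Zhixin, WANG Qi, HE Shiming, WU Jinbo, JIAN Shanglin, RAN Qiang, MENG Xin, HU Hongchun
Scientia Agricultura Sinica (中国农业科学), 2017, Issue 1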

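SONG Nana1,2, ZHONG Jincheng1,2, CHAI Zhixin1,2, WANG Qi1,2, HE Shiming3, WU Jinbo3, JIAN Shanglin4, RAN Qiang5, MENG Xin5, HU Hongchun4

(1 Key Laboratory of Animal Genetics and Breeding of the State Ethnic Affairs Commission and Ministry of Education, Southwest University for Nationalities, Chengdu 610041; 2 Institute of Tibetan Plateau Research, Southwest University for Nationalities, Chengdu 610041; 3 Animal Husbandry Science Institute of Aba Prefecture, Wenchuan 623000, Sichuan; 4 Animal Husbandry Station of Aba Prefecture, Wenchuan 623000, Sichuan; 5 Animal Husbandry Station of Wenchuan County, Wenchuan 623000, Sichuan)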

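【Objective】To study the population genetic diversity of Sanjiang cattle and to characterize its genetic variation at the genome level. 【Method】Total genomic DNA was extracted from 50 individuals and pooled at equal concentration and equal volume to build a mixed-sample DNA pool. The pooled DNA was randomly sheared with a Covaris S2, fragments of about 500 bp were recovered by electrophoresis, and a DNA library was constructed and sequenced on an Illumina HiSeq 2000. Short reads were aligned to the bovine reference genome (UMD 3.1) with BWA to detect genomic variants in Sanjiang cattle. The resequencing data were analyzed with SAMtools, Picard-tools, GATK and Reseqtools, and SNPs and indels were annotated with the Ensembl, DAVID and dbSNP databases. 【Result】Whole-genome resequencing yielded 77.8 Gb of sequence data at a sequencing depth of 25.32× and a coverage of 99.31%. In total, 778 403 444 reads and 77 840 344 400 bases were obtained, of which 673 670 505 reads (86.55%) and 67 341 451 555 bases (86.51%) mapped to the reference genome (UMD 3.1); 635 242 898 reads (81.61%) and 63 512 636 924 bases (81.59%) mapped as pairs. A total of 20 477 130 SNPs and 1 355 308 indels were identified, of which 2 147 988 SNPs (2.4%) and 90 180 indels (6.7%) were novel. Among the SNPs, 989 686 (4.83%) were homozygous and 19 487 444 (95.17%) were heterozygous, a homozygous/heterozygous ratio of 1:19.7. There were 14 800 438 transitions and 6 680 058 transversions, a transition/transversion (TS/TV) ratio of 2.215. There were 727 splice-site SNPs, 117 SNPs changing a start codon to a non-start codon, 530 SNPs introducing a premature stop codon, and 88 SNPs changing a stop codon to a non-stop codon. A total of 57 621 non-synonymous and 83 797 synonymous variants were detected, a non-synonymous/synonymous ratio of 0.69. The non-synonymous SNPs were distributed over 9 017 genes, of which 567 matched genes reported for important economic traits: 471, 77, 21, 10 and 8 genes were related to meat quality, disease resistance, milk production, growth and reproduction, respectively, with some genes counted under more than one function. Among the indels, 693 180 (51.15%) were deletions and 662 148 (48.85%) were insertions; 161 198 (11.89%) were homozygous and 1 194 110 (88.11%) were heterozygous. Most variants lay in intergenic regions and introns. Genome-wide heterozygosity (H), nucleotide diversity (Pi) and theta W were 7.6×10^-3, 0.0039 and 0.0040, respectively, indicating relatively rich genetic diversity. Tajima's D for the population was -0.06832, possibly reflecting unbalanced selection within the population. 【Conclusion】This study provides genomic data to support further analysis of the genetic mechanisms underlying economic traits and the conservation of genetic diversity in the Sanjiang cattle breed.

Sanjiang cattle; genome; second-generation sequencing; SNP; indel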

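0 Introduction

【Research significance】Sanjiang cattle originate from Wenchuan County, Aba Tibetan Autonomous Prefecture, Sichuan Province. The main production area comprises the townships of Sanjiang, Baishi, Shuimo and Yingxiu, and the breed is also found in Lixian, Maowen and other counties[1]. Sanjiang cattle have a relatively long trunk, good draft performance, good meat quality, and strong disease resistance and adaptability. The breed was shaped by long-term natural selection and artificial breeding and is a very valuable gene pool. However, local economic and social development, changes in agricultural production, and the 2008 Wenchuan earthquake, which restricted the space available to the production area, have sharply reduced the farming scale and population size, and the breed is on the verge of extinction. Conserving the genetic resources of Sanjiang cattle is therefore particularly important[2]. 【Previous research progress】The genome contains all the genetic information of a species, and whole-genome sequencing is the core technology for reading it, revealing the diversity and complexity of genomic information. The earliest sequencing techniques date from the mid-1960s[3-5]. First-generation sequencing was established in the late 1970s, chiefly the dideoxy chain-termination method of Sanger et al.[6] and the chemical degradation method of Maxam et al.[7]. With advances in biotechnology, second-generation sequencing, i.e. massively parallel DNA sequencing platforms, came into wide use, mainly Roche/454 FLX pyrosequencing, Illumina/Solexa sequencing by synthesis, and SOLiD sequencing by ligation. Continued progress in sequencing allows more comprehensive and deeper analysis of genomes, and makes it possible to analyze the genomes of rare species and to carry out large-scale functional genomics studies of transcriptomes, expression profiles and small RNAs. The completion of the Human Genome Project opened the era of whole-genome sequencing across species. The bovine genome sequence and the HapMap project[8-9] identified a considerable number of genetic variants, among which single nucleotide polymorphisms (SNPs) are the most widely studied type and are used to identify genomic regions associated with phenotypic variation in cattle. Whole genomes of many cattle breeds have now been sequenced[8,10-17]. Using the Illumina Genome Analyzer II platform, Eck et al.[8] detected 2.4 million SNPs in a Fleckvieh animal; with the same platform, Kawahara-Miki et al.[10] identified 6.03 million SNPs; and using SOLiD technology, Stothard et al.[11] compared genomic variation in a Black Angus and a Holstein bull, identifying about 7 million SNPs and 790 copy number variants. SNPs derived from whole-genome sequencing (WGS) have also been carried into genome-wide association studies, which can predict important economic traits with higher accuracy and detect important signals across the whole genome[18-19]. 【Starting point of this study】Genome-level variation in cattle has been studied extensively in recent years, but no whole-genome study of Sanjiang cattle has been reported, and research on the genetic resources of the breed remains scarce. 【Key questions to be addressed】To reveal variation in Sanjiang cattle at the genome level and examine its genetic diversity, providing genomic data to support further analysis of the genetic mechanisms underlying economic traits and the conservation of the breed's genetic diversity.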

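1 Materials and methods

1.1 Materials

Samples were collected in April 2015 in Sanjiang Township and Shuimo Town, Wenchuan County, Aba Prefecture, Sichuan Province, the two main distribution areas of Sanjiang cattle. Individuals with a yellow coat, relatively large body size and typical breed characteristics were selected, and 50 ear tissue samples were collected and preserved in 75% ethanol. In the laboratory the preservative was poured off and the samples were stored at -80℃ until use.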

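1.2 DNA library construction and sequencing

Genomic DNA was extracted by the phenol-chloroform method, and its purity and concentration were checked by 1.5% agarose gel electrophoresis and the A260/A280 ratio. Genomic DNA from the 50 samples was pooled at equal concentration and equal volume and randomly sheared with a Covaris S2. Fragments of the required length (about 500 bp) were recovered by electrophoresis, adapters were ligated, and clusters were prepared. The inserts were then sequenced on an Illumina HiSeq 2000 using the paired-end method with a read length of 150 bp to obtain the sequencing data.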

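1.3 Quality control and filtering of sequencing data

To ensure data quality, the raw sequencing data underwent quality control and filtering before analysis, which reduces noise in the data. Raw data quality was assessed from base composition and base quality values (Fig. 1). Fig. 1-a shows that the proportion of low-quality bases (Q<20) was low. Fig. 1-b shows that the A and T curves coincide, as do the G and C curves, indicating balanced base composition and good sequencing results suitable for further analysis. Raw reads that contained adapters or were of low quality were filtered out to obtain high-quality clean data, on which all subsequent analyses were based. Filtering removed read pairs containing adapter sequence; read pairs in which either read had more than 10% N bases (N denotes a base that could not be determined); and read pairs in which low-quality bases (quality value Q≤7) made up more than 30% of the bases of either read.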
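The three pair-level filtering rules above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the `Read` class, the function names and the exact-substring adapter check are assumptions made for the example, while the thresholds (10% N bases, Q≤7, 30%) come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Read:
    """One sequenced read: bases plus per-base Phred quality scores."""
    seq: str
    quals: List[int]


def read_fails(read: Read, adapter: Optional[str] = None,
               max_n_frac: float = 0.10, low_q: int = 7,
               max_low_q_frac: float = 0.30) -> bool:
    """Return True if a single read violates any of the three filters."""
    length = len(read.seq)
    if length == 0:
        return True
    # Rule 1: adapter contamination (assumption: exact substring match).
    if adapter and adapter in read.seq:
        return True
    # Rule 2: more than 10% undetermined (N) bases.
    if read.seq.upper().count("N") / length > max_n_frac:
        return True
    # Rule 3: bases with Q <= 7 make up more than 30% of the read.
    low_quality = sum(1 for q in read.quals if q <= low_q)
    return low_quality / length > max_low_q_frac


def keep_pair(read1: Read, read2: Read, adapter: Optional[str] = None) -> bool:
    """A pair is kept only if neither mate fails; otherwise both are dropped."""
    return not (read_fails(read1, adapter) or read_fails(read2, adapter))
```

Because the rules are applied per pair, one bad mate removes both reads, which keeps the paired-end files synchronized for alignment.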

Fig. 1 (a) Base quality values; (b) Base composition

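1.4 Genome data assembly and sequencing data processing

Reads were aligned to the reference genome with BWA[20]. The alignments were summarized and preprocessed (sorting, duplicate removal, etc.) with SAMtools and Picard-tools. SNPs and indels were called with the Genome Analysis Toolkit (GATK)[21]: starting from all candidate SNPs obtained from the alignment, sites at which the detected genotype was polymorphic relative to the reference sequence were filtered to obtain a high-confidence SNP/indel set. The resulting SNPs and indels were written in VCF format and compared with the dbSNP database to identify novel SNPs and indels. Structural variants (SVs) were analyzed with BreakDancer[22], and ReSeqTools[23] was used for annotation, statistics and plotting of the variant results.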
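The comparison against dbSNP amounts to splitting the call set into known and novel variants. The sketch below assumes that a variant is matched on chromosome, position, reference allele and alternate allele; the text does not state the matching criterion used in the study, and the `Variant` tuple layout and function name are hypothetical.

```python
from typing import Iterable, List, Tuple

# (chromosome, 1-based position, reference allele, alternate allele)
Variant = Tuple[str, int, str, str]


def split_known_novel(called: Iterable[Variant],
                      known: Iterable[Variant]) -> Tuple[List[Variant], List[Variant]]:
    """Split called variants into those present in a known set and the rest."""
    known_keys = set(known)  # hashing gives constant-time membership tests
    matched: List[Variant] = []
    novel: List[Variant] = []
    for variant in called:
        (matched if variant in known_keys else novel).append(variant)
    return matched, novel
```

With real data the known set would be read from a database release rather than held in a Python list, but the set-membership logic is the same.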

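2 Results

2.1 Data output

2.1.1 Clean data  Sequencing produced 77.8 Gb of clean data for the Sanjiang cattle genome, summarized in Table 1. Using the Bos taurus genome UMD 3.1 (GCA_000003055.3) as the reference, the clean reads were aligned with BWA[20] (Table 2). Sequencing depth was 25.32× and coverage exceeded 99%, indicating high single-base accuracy. The proportions of reads and bases mapped to the reference genome were 86.55% and 86.51%, respectively, indicating that the sequenced sample is highly similar and closely related to the reference species.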
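The mapping percentages can be reproduced directly from the read and base counts given in the abstract. The short sketch below does only that arithmetic; the constant and function names are chosen for the example.

```python
def percent(part: int, total: int) -> float:
    """Share of `part` in `total`, as a percentage rounded to two decimals."""
    return round(100.0 * part / total, 2)


# Counts as reported in the abstract.
TOTAL_READS = 778_403_444
TOTAL_BASES = 77_840_344_400
MAPPED_READS = 673_670_505
MAPPED_BASES = 67_341_451_555
PAIRED_READS = 635_242_898
PAIRED_BASES = 63_512_636_924

summary = {
    "mapped_reads_pct": percent(MAPPED_READS, TOTAL_READS),
    "mapped_bases_pct": percent(MAPPED_BASES, TOTAL_BASES),
    "paired_reads_pct": percent(PAIRED_READS, TOTAL_READS),
    "paired_bases_pct": percent(PAIRED_BASES, TOTAL_BASES),
}

for name, value in summary.items():
    print(f"{name}: {value}%")
```

The four values agree with the reported 86.55%, 86.51%, 81.61% and 81.59%.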

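2.1.2 Sequencing depth and coverage by chromosome  Sequencing depth and coverage were tallied for each chromosome. Whole-genome depth was 25.32×, highest on chromosome 14 (26.55×) and lowest on the X chromosome (21.54×). Genome coverage was 99.22%, with the X chromosome at 97.59%. As shown in Fig. 2, the length covered by reads is proportional to chromosome length.

2.1.3 GC content  GC content affects sequencing: regions of high or low GC are harder to sequence, so some sequences cannot be determined accurately. Fig. 3 shows good depth (about 25×) across the whole range of GC content with no obvious bias, indicating good library construction and sequencing quality.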

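Table 1 Clean data of Sanjiang cattle

Table 2 Mapping of Sanjiang cattle data to the reference genome

Fig. 2 Length of chromosome regions covered by reads

Fig. 3 GC content and sequencing depth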

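2.2 SNP detection

SNPs were called with the GATK UnifiedGenotyper. A total of 20 477 130 SNPs were detected, a density of roughly one variant per 131 bases; the distribution of these variants can help locate candidate genomic regions associated with economic traits. The SNPs were compared with the dbSNP database (Fig. 4), which contained 90 045 399 SNPs. A total of 2 147 988 Sanjiang cattle SNPs had no match in the database and were considered novel, accounting for 2.4% of total SNPs. There were 989 686 homozygous SNPs (4.83%) and 19 487 444 heterozygous SNPs (95.17%), a homozygous/heterozygous ratio of 1:19.7. Intergenic SNPs numbered 15 009 500 (73.29% of total SNPs); most SNPs lay in intergenic regions and introns, with a minority in exons, splice sites and untranslated regions (Table 3). The ratio of transitions (TS) to transversions (TV) is an important indicator of random sequencing error and a measure of SNP quality, with an empirical value of TS/TV > 2.1[24]. Sanjiang cattle had 14 800 438 transitions and 6 680 058 transversions, a TS/TV ratio of 2.215 (Fig. 4), above the empirical value, indicating that most of the identified SNPs are accurate. Among all SNPs, 1 462 altered a splice site or changed a start or stop codon: 727 splice-site SNPs, 117 SNPs changing a start codon to a non-start codon, 530 SNPs introducing a premature stop codon, and 88 SNPs changing a stop codon to a non-stop codon. Their distribution across chromosomes is shown in Fig. 5.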
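As an illustration of how the TS/TV ratio is obtained, the sketch below classifies single-base substitutions as transitions (purine to purine, or pyrimidine to pyrimidine) or transversions, and then recomputes the summary ratios from the counts reported above. The function names are chosen for the example and are not part of any tool used in the study.

```python
from typing import Iterable, Tuple

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def is_transition(ref: str, alt: str) -> bool:
    """True for A<->G or C<->T; False for any purine<->pyrimidine change."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("reference and alternate alleles are identical")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(substitutions: Iterable[Tuple[str, str]]) -> float:
    """Transition/transversion ratio for (ref, alt) single-base pairs."""
    subs = list(substitutions)
    ts = sum(1 for ref, alt in subs if is_transition(ref, alt))
    tv = len(subs) - ts
    if tv == 0:
        raise ValueError("no transversions: ratio is undefined")
    return ts / tv


# Counts as reported in the text.
REPORTED_TS, REPORTED_TV = 14_800_438, 6_680_058
REPORTED_HOM, REPORTED_HET = 989_686, 19_487_444
REPORTED_NONSYN, REPORTED_SYN = 57_621, 83_797

print(REPORTED_TS / REPORTED_TV)        # about 2.2156, reported as 2.215
print(REPORTED_HET / REPORTED_HOM)      # about 19.7 heterozygous per homozygous SNP
print(REPORTED_NONSYN / REPORTED_SYN)   # about 0.69
```

For example, `ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")])` counts two transitions and one transversion.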

Fig. 4 SNPs (indels) compared with the dbSNP database, and the transition/transversion ratio

Fig. 5 Number of SNPs with each mutation effect on each chromosome

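Many phenotypes in humans and other animals are associated with non-synonymous SNPs (nsSNPs)[25], and SNP annotation provides the basis for linking SNP sites to function. This study detected 57 621 non-synonymous and 83 797 synonymous variants, a non-synonymous/synonymous SNP ratio of 0.69. Annotation of the nsSNPs with the Ensembl[26] database yielded 9 017 genes (electronic supplementary table 1, http://pan.baidu.com/s/1qXN18dA). The 108 genes carrying the most nsSNPs were subjected to functional enrichment analysis with DAVID[27] (electronic supplementary table 2, http://pan.baidu.com/s/1qXN18dA). The functional annotations fell into 7 categories, concentrated in biochemical, metabolic and immune processes (electronic supplementary table 3, http://pan.baidu.com/s/1qXN18dA). Among the immune-related terms, GO:0006955 covers immune response and GO:0019882 covers antigen processing and presentation, both important to the organism; the genes involved include [...]. The relationship between nsSNPs and important economic traits such as meat quality, milk yield and growth rate was also analyzed. A total of 567 genes matched genes reported for important economic traits[10,28-34] and were annotated (electronic supplementary table 4, http://pan.baidu.com/s/1qXN18dA): 471 related to meat quality, 77 to disease resistance, 21 to milk production, 10 to growth and 8 to reproduction. The 567 genes include some with overlapping functions, for example [...] and [...], associated with both meat quality and growth; [...] and [...], associated with both meat quality and disease resistance; and [...], associated with both milk production and disease resistance. They also include well-studied genes with relatively clear mechanisms, such as the growth hormone, growth hormone receptor, leptin receptor and prolactin receptor genes[29,32].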

Table 3 Annotation of SNPs and indels

Fig. 6 Number of SNPs and indels on each chromosome

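2.3 Indel detection

Recent studies treat deletions and insertions longer than 50 bp as structural variants, and those shorter than 50 bp collectively as indels[35]. Indels are the most abundant type of variant in the genome after SNPs. A total of 1 355 308 indels were detected in Sanjiang cattle, comprising 693 180 deletions (51.15%) and 662 148 insertions (48.85%). Comparison with the database identified 90 180 novel indels, 6.7% of total indels (Fig. 4). There were 161 198 homozygous indels (11.89%) and 1 194 110 heterozygous indels (88.11%). According to the indel annotation (Table 3), intergenic indels were the most numerous at 982 443 (72.49% of total indels). There were 1 545 indels in CDS regions, 4 137 in exons, and 3 606 and 240 in 3′ and 5′ untranslated regions, respectively. The numbers of SNPs and indels on each chromosome are shown in Fig. 6. Except on chromosomes 11, 12 and 13, the number of SNPs per chromosome tracked chromosome length, decreasing as the chromosomes became shorter; the number of indels per chromosome likewise decreased with chromosome length. The length distributions of insertions and deletions genome-wide and in CDS regions are shown in Figs. 7 and 8. Genome-wide, the number of deletions and insertions decreased as their length increased; no such trend was seen in CDS regions. Both figures nevertheless show that insertions and deletions were concentrated at 1-10 bp, with 1-3 bp the most frequent. Based on the paired-end alignments, insert sizes were checked for anomalies and deletions were analyzed, detecting 1 906 structural variants in total.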
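The size-based distinction between indels and structural variants, and the length distribution summarized above, can be sketched as follows. The code assumes VCF-style reference and alternate alleles and treats a length difference of exactly 50 bp as a structural variant; the text only says "shorter than" and "longer than" 50 bp, so that boundary choice and all names here are assumptions for the example.

```python
from collections import Counter
from typing import Iterable, Tuple

SV_THRESHOLD_BP = 50  # assumption: lengths of 50 bp or more count as structural variants


def classify_length_variant(ref: str, alt: str) -> Tuple[str, int, str]:
    """Return (kind, size in bp, size class) for an insertion or deletion."""
    diff = len(alt) - len(ref)
    if diff == 0:
        raise ValueError("alleles have equal length: not an insertion or deletion")
    kind = "insertion" if diff > 0 else "deletion"
    size = abs(diff)
    size_class = "indel" if size < SV_THRESHOLD_BP else "structural_variant"
    return kind, size, size_class


def indel_length_histogram(variants: Iterable[Tuple[str, str]]) -> Counter:
    """Count small indels by (kind, size); larger events are skipped."""
    counts: Counter = Counter()
    for ref, alt in variants:
        kind, size, size_class = classify_length_variant(ref, alt)
        if size_class == "indel":
            counts[(kind, size)] += 1
    return counts
```

A histogram built this way is the kind of summary plotted in the indel length figures, with separate counts for insertions and deletions at each length.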

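2.4 Genome heterozygosity and population nucleotide diversity

Heterozygosity (H) and nucleotide diversity (Pi) are indicators of the level of polymorphism (Fig. 9). Aligning the reads to the reference genome identified 19 487 444 heterozygous SNPs in Sanjiang cattle, giving a genome-wide heterozygosity of 7.6×10^-3, which indicates high genetic diversity in the breed. Genome-wide Pi was 0.0039, likewise indicating relatively rich genetic diversity.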

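2.5 Genome-wide Tajima's D and theta W

Tajima's D tests whether the target DNA sequences have evolved according to the neutral model; a negative D may result from genetic hitchhiking[36]. Tajima's D for the Sanjiang cattle population was -0.06832 (Fig. 9), suggesting unbalanced selection in the population. Theta W (Watterson's estimator) is an indicator of population polymorphism, estimated from the number of segregating sites under an idealized neutral model at mutation-drift equilibrium. Genome-wide theta W in Sanjiang cattle was 0.0040 (Fig. 9), indicating relatively rich genetic diversity.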
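For readers unfamiliar with these statistics, the sketch below computes nucleotide diversity, Watterson's theta and Tajima's D from a small set of aligned haplotype sequences using the standard formulas of Watterson (1975) and Tajima (1989). It is illustrative only: the study analyzed pooled sequencing data with ReSeqTools, and how that software derives the values is not described in the text. The values returned here are per locus; dividing Pi and theta W by the alignment length gives per-site values comparable in scale to those reported.

```python
import math
from itertools import combinations
from typing import List, Tuple


def segregating_sites(seqs: List[str]) -> int:
    """Number of alignment columns with more than one distinct base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs: List[str]) -> float:
    """Pi per locus: average number of differences over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def tajimas_d(seqs: List[str]) -> Tuple[float, float, float]:
    """Return (pi, theta_w, D) for equal-length aligned sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance is zero below that)")
    s = segregating_sites(seqs)
    if s == 0:
        raise ValueError("no segregating sites: D is undefined")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    pi = mean_pairwise_differences(seqs)
    theta_w = s / a1  # Watterson's estimator from the number of segregating sites
    d = (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return pi, theta_w, d
```

D is the normalized difference between the two diversity estimates, so a value near zero, like the one reported here, means that Pi and theta W are nearly equal.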

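Fig. 7 Length distribution of indels across the whole genome

Fig. 8 Length distribution of indels in CDS regions

Fig. 9 (a) Genome-wide heterozygosity H; (b) genome-wide polymorphism index Pi; (c) genome-wide polymorphism index theta W; (d) genome-wide Tajima's D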

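3 Discussion

In this study, we used the Illumina HiSeq 2000 platform to sequence the whole genome of the endangered Sanjiang cattle breed. Because the population is small, it is especially important to sequence individuals that represent the breed. To limit the effect of individual differences and to capture information from more individuals at lower cost, and so reflect the genetic diversity of the breed as a whole, DNA from 50 individuals was pooled for sequencing. Sequencing produced 77.8 Gb of clean data; 86.55% of reads, 81.61% of paired reads, 86.51% of bases and 81.59% of paired bases mapped to the reference genome. Sequencing depth was 25.32× and coverage exceeded 99%, giving high single-base accuracy. Compared with previous sequencing of taurine cattle[8,11], the depth was higher, so the detected variants are well supported[15].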

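GATK analysis identified 20 477 130 SNPs and 1 355 308 indels on the 29 autosomes and the X chromosome. The Sanjiang cattle population is small, only a little over 2 000 head, yet the number of SNPs is large, indicating relatively rich genetic diversity. Of the SNPs, 19 487 444 (95.17%) were heterozygous and 989 686 (4.83%) were homozygous, similar to the 2 234 514 (90.3%) heterozygous and 239 370 (9.7%) homozygous SNPs that Shin et al.[37] obtained by sequencing 10 Korean bulls. The homozygous/heterozygous SNP ratio of 1:19.7 was markedly lower than in the Japanese cattle studied by Kawahara-Miki et al.[10] (1:1.2) and the Korean cattle studied by Choi et al.[12] (1:1.92). From a sequencing point of view, a homozygous SNP in a pooled sample is a site at which all samples in the pool carry the same allele and that allele differs from the reference genome, whereas a heterozygous SNP is a site at which more than one allele is present in the pool. The low proportion of homozygous SNPs and high heterozygosity in Sanjiang cattle may therefore reflect large differences among the 50 sequenced individuals. It may also be that Sanjiang cattle have undergone little selective breeding, have a low probability of inbreeding, and exchange genes frequently with other cattle, so that breed-specific functional genes are being lost, which makes conservation of the breed particularly important. Indels made up about 5.3% of all variants (indels plus SNPs), slightly higher than reported by Kawahara-Miki et al.[10] and Choi et al.[12]. The transition/transversion ratio of 2.21 was higher than the 2.1 obtained by Choi et al.[16] for Korean cattle breeds. Comparison with the database identified 2 147 988 novel SNPs and 90 180 novel indels, 2.4% of total SNPs and 6.7% of total indels, respectively. This is probably because, as genome sequencing has advanced in recent years, many new SNPs and indels have been discovered and the databases have become more complete, so the matched fraction has grown and novel variants have become rarer. Most indels were short: deletions ranged from 1 to 29 bp and insertions from 1 to 44 bp, and both were concentrated at 1-10 bp, with 1-3 bp the most frequent. A similar pattern has been observed in human genome data[38]. In the Sanjiang cattle data, close to 84.7% of insertions and 79.6% of deletions were shorter than 3 bp. The numbers of SNPs and indels detected on the 29 autosomes were proportional to chromosome length, as expected. The X chromosome had the lowest mutation rate, 4.33%; in studies of small populations, the X chromosome has a lower mutation rate than the autosomes[39].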

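Annotation of the nsSNPs with the Ensembl database yielded 9 017 genes, consistent with the result reported by Eck et al.[8] and higher than in the Japanese cattle studied by Kawahara-Miki et al.[10] and the Korean cattle studied by Lee et al.[14]. Annotation of the non-synonymous SNPs identified 567 genes related to economic traits, of which 471, 77, 21, 10 and 8 were related to meat quality, disease resistance, milk production, growth and reproduction, respectively. Sanjiang cattle were mainly used for draft work in the fields and in recent years have been shifting toward dual draft and beef use. Some of the meat-quality genes found here have been reported in other cattle breeds. For example, variants in fatty acid binding protein 4 (FABP4) are associated with the palmitoleic acid content of intramuscular fat[40], variants in the urotensin 2 receptor (UTS2R) with fat deposition in skeletal muscle[41], and variants in calpain 1 (CAPN1) with beef tenderness in Aberdeen Angus[42]; [...] and [...] have been associated with carcass fat content in Nellore, Holstein-Friesian, Angus, Charolais, Hereford and Simmental cattle[34,43]; and the titin gene has also been identified as an important candidate gene affecting meat quality[44]. In this study, meat-quality genes such as [...] carried many nsSNPs (>10). Other genes are related to development and disease: [...] and [...] are associated with breeding values in Holstein cattle[45]; a protein kinase gene is associated with early embryonic development[46]; a Y-linked peptide repeat domain gene plays a key role in bull spermatogenesis[47]; and sex-determining region Y is an important candidate marker for sperm quality and fertility in bulls[48]. NRCAM, PNPLA8 and CTTNBP2 are important candidate genes for Weaver syndrome in Brown Swiss cattle[49]. Pigmentation in mammals results from the presence or absence of melanin in hair or skin. The main genes affecting pigment synthesis include tyrosinase-related protein 1, dopachrome tautomerase, a serine peptidase, melanocortin 1 receptor, tyrosinase and a signaling protein. In rodents, the switch between black and yellow is caused by [...] and its antagonist [...]. The MITF gene regulates the expression of [...]. The coat of Sanjiang cattle is mainly yellow, followed by black and grayish black, and many genes associated with coat color in other breeds were also found in Sanjiang cattle. For example, CORIN has been associated with the yellow coat of Korean cattle[37] and may also be an important gene for the yellow coat of Sanjiang cattle. Genes such as [...] have been found to alter hair follicle phenotype[50].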

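4 Conclusion

Whole-genome sequencing of Sanjiang cattle produced 77.8 Gb of clean data and identified a large number of genetic variants, indicating relatively rich genetic diversity in the breed and providing genomic data to support further study of its genetic characteristics.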

        References

        [1] SUN F Y, LIU J. Distribution and ecological features of Sanjiang cattle breed. 草业与畜牧, 2009(9): 51-52. (in Chinese)

        [2] CHEN Z H, GU L, ZHONG J C. Study on the polymorphism of the Bola-DRB3 gene exon 2 in the Sanjiang Cattle by PCR-RFLP method. 西南民族大学学报(自然科学版), 2008, 33(4): 782-787. (in Chinese)

        [3] HOLLEY R W, EVERETT G A, MADISON J T, ZAMIR A. Nucleotide sequences in the yeast alanine transfer ribonucleic acid., 1965, 240(5): 2122-2128.

        [4] FRESCO J R, ADAMS A, ASCIONE R, HENLEY D, LINDAHL T. Tertiary structure in transfer ribonucleic acids//Cold Spring Harbor Laboratory Press, 1966, 31: 527-537.

        [5] CELANDER D W, CECH T R. Visualizing the higher order folding of a catalytic RNA molecule., 1991, 251(4992): 401-407.

        [6] SANGER F, NICKLEN S, COULSON A R. DNA sequencing with chain-terminating inhibitors., 1977, 74(12): 5463-5467.

        [7] MAXAM A M, GILBERT W. Sequencing end-labeled DNA with base-specific chemical cleavages., 1979, 65(1): 499-560.

        [8] ECK S H, BENET-PAGÈS A, FLISIKOWSKI K, MEITINGER T, FRIES R, STROM T M. Whole genome sequencing of a single Bos taurus animal for single nucleotide polymorphism discovery., 2009, 10(8): R82.

        [9] GIBBS R A, BELMONT J W, Hardenbol P, WILLIS T D, YU F L, YANG H M, CHANG L Y, HUANG W, LIU B, SHEN Y, et al. The international HapMap project., 2003, 426(6968): 789-796.

        [10] KAWAHARA-MIKI R, TSUDA K, Shiwa Y, ARAI-KICHISE Y, MATSUMOTO T, KANESAKI Y, ODA S, EBIHARA S, YAJIMA S, YOSHIKAWA H, KONO T. Whole-genome resequencing shows numerous genes with nonsynonymous SNPs in the Japanese native cattle Kuchinoshima-Ushi., 2011, 12(1): 103.

        [11] STOTHARD P, CHOI J W, BASU U, SUMNER-THOMSON J M, MENG Y, LIAO X, MOORE S S. Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery., 2011, 12(1): 1.

        [12] CHOI J W, CHUNG W H, LEE K T, LEE K T, CHOI J W, JUNG K S, CHO Y, KIM N, KIM T H. Whole genome resequencing of Heugu (Korean Black Cattle) for the genome-wide SNP discovery., 2013, 33(6): 715-722.

        [13] CHOI J W, LIAO X, PARK S, JEON H J, CHUNG W H, STOTHARD P, PARK Y S, LEE J K, LEE K T, KIM S H, OH J D, KIM N, KIM T H, LEE H K, LEE S J. Massively parallel sequencing of Chikso (Korean brindle cattle) to discover genome-wide SNPs and InDels., 2013, 36(3): 203-211.

        [14] LEE K T, CHUNG W H, LEE S Y, CHOI J W, KIM J, LIM D, LEE S,JANG G W, KIM B, CHOY Y H, LIAO X, STOTHARD P, MOORE S S, LEE S H, AHN S, KIM N, KIM T H. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity., 2013, 14(1): 519.

        [15] CHOI J W, LIAO X, STOTHARD P, CHUNG W H, JEON H J, MILLER S P, CHOI S Y, LEE J K, YANG B, LEE K T, HAN K J, KIM H C, JEONG D, OH J D, KIM N, KIM T H, LEE H K, LEE S J. Whole-genome analyses of Korean native and Holstein cattle breeds by massively parallel sequencing., 2014, 9(7): e101127.

        [16] CHOI J W, CHOI B H, LEE S H, LEE S S, KIM H C, YU D, CHUNG W H, LEE K T, CHAI H H, CHO Y M, LIM D. Whole-genome resequencing analysis of hanwoo and yanbian cattle to identify genome-wide SNPs and signatures of selection., 2015, 38(5): 466.

        [17] SASAKI S, WATANABE T, NISHIMURA S, SUGIMOTO Y. Genome-wide identification of copy number variation using high-density single-nucleotide polymorphism array in Japanese Black cattle., 2016, 17(1): 1.

        [18] DAETWYLER H D, CAPITAN A, PAUSCH H, STOTHARD P, VAN BINSBERGEN R, BRØNDUM R F, LIAO X, DJARI A, RODRIGUEZ S C, GROHS C, ESQUERRÉ D, BOUCHEZ O, ROSSIGNOL M N, KLOPP C, ROCHA D, FRITZ S, EGGEN A, BOWMAN P J, COOTE D, CHAMBERLAIN A J, ANDERSON C, VANTASSELL C P, HULSEGGE I, GODDARD M E, GULDBRANDTSEN B, LUND M S, VEERKAMP R F, BOICHARD D A, FRIES R, HAYES B J. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle., 2014, 46(8): 858-865.

        [19] QANBARI S, PAUSCH H, JANSEN S, SOMEL M, STROM T M, FRIES R, NIELSEN R, SIMIANER H. Classic selective sweeps revealed by massive sequencing in cattle., 2014, 10(2): e1004148.

        [20] LI H, DURBIN R. Fast and accurate short read alignment with Burrows-Wheeler transform., 2009, 25(14): 1754-1760.

        [21] MCKENNA A, HANNA M, BANKS E, SIVACHENKO A, CIBULSKIS K, KERNYTSKY A, GARIMELLA K, ALTSHULER D, GABRIEL S, DALY M, DEPRISTO M A. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data., 2010, 20(9): 1297-1303.

        [22] CHEN K, WALLIS J W, MCLELLAN M D, LARSON D E, KALICKI J M, POHL C S, MCGRATH S D, WENDL M C, ZHANG Q, LOCKE D P, SHI X, FULTON R S, LEY T J, WILSON R K, DING L, MARDIS E R. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation., 2009, 6: 677-681.

        [23] HE W, ZHAO S, LIU X, DONG S, LV J, LIU D, WANG J, MENG Z. ReSeqTools: an integrated toolkit for large-scale next-generation sequencing based resequencing analysis., 2013, 12(4): 6275-6283.

        [24] 1000 GENOMES PROJECT CONSORTIUM, ABECASIS G R, AUTON A, BROOKS L D, DEPRISTO M A, DURBIN R M, HANDSAKER R E, KANG H M, MARTH G T, MCVEAN G A. An integrated map of genetic variation from 1,092 human genomes., 2012, 491(7422): 56-65.

        [25] STENSON P D, BALL E V, MORT M, PHILLIPS A D, SHIEL J A, THOMAS N S, ABEYSINGHE S, KRAWCZAK M, COOPER D N. Human gene mutation database (HGMD®): 2003 update., 2003, 21(6): 577-581.

        [26] FLICEK P, AMODE M R, BARRELL D, BEAL K, BRENT S, CARVALHO-SILVA D, CLAPHAM P, COATES G, FAIRLEY S, FITZGERALD S, GIL L, GORDON L, HENDRIX M, HOURLIER T, JOHNSON N, KÄHÄRI A K, KEEFE D, KEENAN S, KINSELLA R, KOMOROWSKA M, KOSCIELNY G, KULESHA E, LARSSON P, LONGDEN L, MCLAREN W, MUFFATO M, OVERDUIN B, PIGNATELLI M, PRITCHARD B, RIAT H S, RITCHIE G R S, RUFFIER M, SCHUSTER M, SOBRAL D, TANG Y A, TAYLOR K, TREVANION S, VANDROVCOVA J, WHITE S, WILSON M, WILDER S P, AKEN B L, BIRNEY E, CUNNINGHAM F, DUNHAM L, DURBIN R, FERNÁNDEZ-SUAREZ X M, HARROW J, HERRERO J, HUBBARD T J P, PARKER A, PROCTOR G, SPUDICH G, VOGEL J, YATES A, ZADISSA A, SEARLE S M J. Ensembl 2012., 2011: gkr991.

        [27] HUANG D W, SHERMAN B T, LEMPICKI R A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources., 2009, 4(1): 44-57.

        [28] HIENDLEDER S, THOMSEN H, REINSCH N, BENNEWITZ J, LEYHE-HORN B, LOOFT C, XU N, MEDJUGORAC I, RUSS I, KÜHN C, BROCKMANN G A, BLÜMEL J, BRENIG B, REINHARDT F, REENTS R, AVERDUNK G, SCHWERIN M, FÖRSTER M, KALM E, ERHARDT G. Mapping of QTL for body conformation and behavior in cattle., 2003, 94(6): 496-506.

        [29] NKRUMAH J D, LI C, YU J, HANSEN C, KEISLER D H, MOORE S S. Polymorphisms in the bovine leptin promoter associated with serum leptin concentration, growth, feed intake, feeding behavior, and measures of carcass merit., 2005, 83(1): 20-28.

        [30] HU Z L, FRITZ E R, REECY J M. AnimalQTLdb: a livestock QTL database tool set for positional QTL information mining and beyond., 2007, 35(suppl. 1): D604-D609.

        [31] HU Z L, REECY J M. Animal QTLdb: beyond a repository. A public platform for QTL comparisons and integration with diverse types of structural genomic information., 2007, 18(1): 1-4.

        [32] THOMAS M G, ENNS R M, SHIRLEY K L, GARCIA M D, GARRETT A J, SILVER G A. Associations of DNA polymorphisms in growth hormone and its transcriptional regulators with growth and carcass traits in two populations of Brangus bulls., 2007, 6(1): 222-237.

        [33] BAGNATO A, SCHIAVINI F, ROSSONI A, MALTECCA C, DOLEZAL M, MEDUGORAC I, SÖLKNER J, RUSSO V, FONTANESI L, FRIEDMANN A, SOLLER M, LIPKIN E. Quantitative trait loci affecting milk yield and protein percentage in a three-country Brown Swiss population., 2008, 91(2): 767-783.

        [34] FERRAZ J B S, PINTO L F B, MEIRELLES F V, ELER J P, DE REZENDE F M, OLIVEIRA E C, ALMEIDA H B, WOODWARD B, NKRUMAH D. Association of single nucleotide polymorphisms with carcass traits in Nellore cattle., 2009, 8(4): 1360-1366.

        [35] ALBERS C A, LUNTER G, MACARTHUR D G, MCVEAN G, OUWEHAND W H, DURBIN R. Dindel: accurate indel calls from short-read data., 2011, 21(6): 961-973.

        [36] NIELSEN R. Molecular signatures of natural selection., 2005, 39: 197-218.

        [37] Shin Y, Jung H J, Jung M, Yoo S I, Subramaniyam S, Markkandan K, Kang J M, Rai R, Park J, Kim J J. Discovery of gene sources for economic traits in Hanwoo by whole- genome resequencing., 2016, 29(9): 1353-1362.

        [38] FUJIMOTO A, NAKAGAWA H, HOSONO N, NAKANO K, ABE T, BOROEVICH K A, NAGASAKI M, YAMAGUCHI R, SHIBUYA T, KUBO M, MIYANO S, NAKAMURA Y, TSUNODA T. Whole- genome sequencing and comprehensive variant analysis of a Japanese individual using massively parallel sequencing., 2010, 42(11): 931-936.

        [39] MAKOVA K D, LI W H. Strong male-driven evolution of DNA sequences in humans and apes., 2002, 416(6881): 624-626.

        [40] HOASHI S, HINENOYA T, TANAKA A, OHSAKI H, SASAZAKI S, TANIGUCHI M, OYAMA K, MUKAI F, MANNEN H. Association between fatty acid compositions and genotypes of FABP4 and LXR-alpha in Japanese Black cattle., 2008, 9(1): 1.

        [41] JIANG Z, MICHAL J J, TOBEY D J, WANG Z, MACNEIL M D, MAGNUSON N S. Comparative understanding of UTS2 and UTS2R genes for their involvement in type 2 diabetes mellitus., 2008, 4(2): 96-102.

        [42] GILL J L, BISHOP S C, MCCORQUODALE C, WILLIAMS J L, WIENER P. Association of selected SNP with carcass and taste panel assessed meat quality traits in a commercial population of Aberdeen Angus-sired beef cattle., 2009, 41(1): 36.

        [43] BUCHANAN F C, FITZSIMMONS C J, VAN KESSEL A G, Thue T D, Winkelman-Sim D C, Schmutz S M. Association of a missense mutation in the bovine leptin gene with carcass fat content and leptin mRNA levels., 2002, 34(1): 105-116.

        [44] Watanabe N, Satoh Y, Fujita T, Ohta T, Kose H, Muramatsu Y, Yamamoto T, Yamada T. Distribution of allele frequencies atand*between Japanese Black and four other cattle breeds with differing historical selection for marbling., 2011 4: 10.

        [45] LIEFERS S C, VEERKAMP R F, PAS M F W, DELAVAUD C, CHILLIARD Y, LENDE T. A missense mutation in the bovine leptin receptor gene is associated with leptin concentrations during late pregnancy., 2004, 35(2): 138-141.

        [46] YANG Q E, OZAWA M, ZHANG K, JOHNSON S E, EALY A D. The requirement for protein kinase C delta (PRKCD) during preimplantation bovine embryo development., 2014, 28(4): 482-490.

        [47] LIU Y, QIN X, SONG X Z, JIANG H Y, SHEN Y F, DURBIN K J, LIEN S, KENT M P, SODELAND M, REN Y R, ZHANG L, SODERGREN E, HAVLAK P, WORLEY K C, WEINSTOCK G M, GIBBS R A. Bos taurus genome assembly., 2009, 10(1): 1.

        [48] MISHRA C, PALAI T K, SARANGI L N, PRUSTY B R, MAHARANA B R. Candidate gene markers for sperm quality and fertility in bulls., 2013, 6: 905-910.

        [49] McClure M, Kim E, Bickhart D, Null D, Cooper T, Cole J, Wiggans J, Ajmone-Marsan P, Colli L, Santus E, Liu G, Schroeder S, Matukumalli L, Tassell C V, Sonstegard T. Fine mapping for Weaver syndrome in Brown Swiss cattle and the identification of 41 concordant mutations across NRCAM, PNPLA8 and CTTNBP2., 2013, 8(3): e59251.

        [50] Van den Bossche J, Malissen B, Mantovani A, De Baetselier P J A, Ginderachter V. Regulation and function of the E-cadherin/catenin complex in cells of the monocyte- macrophage lineage and DCs., 2012, 119(7): 1623-1633.

(Responsible editor: LIN Jianfei)

        The Whole Genome Data Analysis of Sanjiang Cattle

SONG Nana1,2, ZHONG Jincheng1,2, CHAI Zhixin1,2, WANG Qi1,2, HE Shiming3, WU Jinbo3, JIAN Shanglin4, RAN Qiang5, MENG Xin5, HU Hongchun4

(1 Key Laboratory of Animal Genetics and Breeding of State Ethnic Affairs Commission and Ministry of Education, Southwest University for Nationalities, Chengdu 610041; 2 Institute of Tibetan Plateau Research, Southwest University for Nationalities, Chengdu 610041; 3 Animal Husbandry Science Institute of Aba Autonomous Prefecture, Wenchuan 623000, Sichuan; 4 Animal Husbandry and Veterinary Station of Aba Autonomous Prefecture, Wenchuan 623000, Sichuan; 5 Animal Husbandry and Veterinary Station of Wenchuan, Wenchuan 623000, Sichuan)

【Objective】The objective of this paper is to study the genetic diversity of the Sanjiang cattle population and to discuss its genetic variation at the genome level. 【Method】Genomic DNA was extracted from fifty individuals and mixed at equal concentration and equal volume to construct a pooled DNA sample. The genomic DNA was randomly sheared with a Covaris S2, DNA fragments of 500 bp were recovered by electrophoresis, and a DNA library was constructed. Sequencing data were obtained on the Illumina HiSeq 2000. The short reads were mapped to the bovine reference genome (UMD 3.1) with BWA to detect genomic variants in Sanjiang cattle. The re-sequencing data were analyzed with SAMtools, Picard-tools, GATK and Reseqtools, and the SNPs and indels were annotated with the Ensembl, DAVID and dbSNP databases. 【Result】A total of 77.8 Gb of sequence data were generated by whole-genome sequencing; 99.31% of the reference genome was covered at a mapping depth of 25.32-fold. In total, 778 403 444 reads and 77 840 344 400 bases were obtained, of which 673 670 505 reads (86.55%) and 67 341 451 555 bases (86.51%) mapped to the bovine reference genome (UMD 3.1); 635 242 898 reads (81.61%) and 63 512 636 924 bases (81.59%) mapped as pairs. A total of 20 477 130 SNPs and 1 355 308 small indels were identified, of which 2 147 988 SNPs (2.4%) and 90 180 indels (6.7%) were novel. Of the SNPs, 989 686 (4.83%) were homozygous and 19 487 444 (95.17%) were heterozygous, a homozygous/heterozygous ratio of 1:19.7. There were 14 800 438 transitions and 6 680 058 transversions, a transition/transversion (TS/TV) ratio of 2.215. There were 727 splice-site SNPs, 117 SNPs converting a start codon into a non-start codon, 530 SNPs introducing a premature stop codon, and 88 SNPs converting a stop codon into a non-stop codon. A total of 57 621 non-synonymous SNPs and 83 797 synonymous SNPs were detected, a ratio of 0.69. Non-synonymous SNPs were detected in 9 017 genes, of which 567 were assigned as trait-associated genes: 471, 77, 21, 10 and 8 genes were related to meat quality, disease resistance, milk production, growth rate and fecundity, respectively, with some genes having overlapping functions. Of the indels, 693 180 (51.15%) were deletions and 662 148 (48.85%) were insertions; 161 198 (11.89%) were homozygous and 1 194 110 (88.11%) were heterozygous. Most variants were located in intergenic regions and introns. Genome-wide heterozygosity (H), nucleotide diversity (Pi) and theta W of Sanjiang cattle were 7.6×10^-3, 0.0039 and 0.0040, respectively, indicating abundant genetic diversity. Tajima's D of the Sanjiang cattle population was -0.06832, suggesting unbalanced selection within the population. 【Conclusion】The results of this research provide valuable genomic data for further investigation of the genetic mechanisms underlying traits of interest and for protection of the genetic diversity of the Sanjiang cattle breed.

        Sanjiang cattle; genome; next generation sequencing; SNP; indel

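Received: 2016-06-12; Accepted: 2016-11-07

Funding: Sichuan Provincial Department of Science and Technology project (2015JY0248); Central Universities program for serving the development of ethnic minority regions (2015NFW01)

SONG Nana, Tel: 13688499824; E-mail: songnana28@126.com. Corresponding author: ZHONG Jincheng, E-mail: zhongjincheng518@126.com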
