張佩珩 卜東波 熊勁 譚光明
Abstract: The development and widespread adoption of next-generation sequencing have made biological sequence data grow far faster than the growth in computer processing power predicted by Moore's Law. In this project we will analyze in depth the characteristics of various kinds of genomic data, study targeted methods for efficient data compression and transmission, and study new data storage system architectures. We will study methods for processing data directly in the compressed space, considering storage, compression, processing, and application together, and develop search methods suited to very large-scale genomic data. We will also analyze the characteristics of sequencing data and the computing-resource demands of common sequencing data processing tasks, explore new software-hardware models and possible new architectures, and explore how new computing service models can be applied to the storage, transmission, and processing of sequencing data. This work will help prepare computing technology for the arrival of the individual genome era, while promoting innovation and development in China's related information technology and industry.
Key Words: Deep Sequencing; Big Data; Computing Model; Architecture; Sequence Alignment; Sequence Assembly; Sequence Compression
Full-text link (real-name registration required): http://www.nstrs.cn/xiangxiBG.aspx?id=50827&flag=1