
        Xunzi the LLM: A Way for People to Access Ancient Chinese Texts
        大型語言模型“荀子”讓人們接觸中國古籍

        2024-11-06 00:00:00
        時代英語·高一 2024年7期
        關鍵詞:荀子 古籍 檢索

        Thousands of years ago, texts appeared on animal bones, bronzes, bamboo slips, and silk brocades before they were written on paper. But now these ancient Chinese texts have a new container.

        In December 2023, a research team from Nanjing Agricultural University rolled out Xunzi, a large language model (LLM), and XunziChat, its chat model, in association with Gulian, a professional publisher of ancient Chinese texts.

        Wang Dongbo, the leader of the research team, said that the large language model was named after Xunzi because Xunzi was not only a prominent Confucian philosopher during the late Warring States Period (475–221 BC), but also a pioneer in presenting and explaining theories of linguistics in ancient China.

        When asked why he and his partners made the large language model, Wang explained that traditional Chinese characters, vertical layout, and the absence of pausing and punctuation are all obstacles that readers have to overcome when they read traditional texts.

        To create Xunzi the LLM, Wang and his partners first did a lot of research. Since 2013, his team has worked tirelessly to digitize Chinese classics like the Siku Quanshu, or the Complete Library in Four Sections. “The hard work involves a large-scale corpus of two billion Chinese characters, which has laid a solid foundation for the large language model,” said Wang.

        幾千年前,文字先是寫在獸骨、青銅器、竹簡和織錦上,然后才被人們寫在紙上。但如今,這些古老的中文文本已經有了新載體。

        2023年12月,南京農業大學的一個研究團隊,與一家專業的古籍出版公司古聯聯手,推出了大型語言模型荀子和荀子對話模型。

        研究團隊帶頭人王東波表示,該大型語言模型以荀子的名字命名,是因為荀子不僅是戰國(公元前475年至公元前221年)晚期著名的儒學思想家,還是提出和解釋中國古代語言學理論的先驅者。

        當被問及他和他的同伴創建這個大型語言模型的原因時,王東波解釋道:繁體字、豎版、缺少停頓和標點符號都是讀者在閱讀繁體文本時需要克服的障礙。

        為了創建大型語言模型荀子,王東波和他的同伴們先做了大量的研究。自2013年以來,他的團隊始終致力于將《四庫全書》等中國經典書籍數字化。“經過辛勤努力,我們建立了20億個漢字的大型語料庫,為建立大型語言模型奠定了堅實的基礎。”王東波說。

        Their efforts seem to have paid off. Xunzi the LLM can now tag, translate, punctuate, and comprehend passages of ancient Chinese texts. It can even do part-of-speech analysis and retrieve specific information, such as names, events, and places, from a text.

        With this LLM, ancient Chinese texts can be accessed by more Chinese people, including students. For instance, if users type shangu into the chat box, they will not only discover what it is translated to but also see that it can refer to a person’s courtesy name in certain ancient Chinese texts. Through Xunzi’s retrieval function, users can get more specific cultural information based on courtesy names.

        “The model can help us mine for more information hidden in our cultural legacy and find unnoticed models and connections,” said Wang.

        But Wang and his team aren’t simply focused on target users in China. They are aiming at the rest of the world as well. They have shared the LLM on GitHub and other websites, allowing users to download and use it for free. “Our team is committed to the philosophy of making our data and model globally accessible. We hope this will encourage more people to appreciate excellent traditional Chinese culture,” Wang explained.

        他們的努力似乎得到了回報。現在,大型語言模型荀子可以對中國古代文本的片段進行標記、翻譯、加標點和閱讀理解。它甚至可以進行詞性分析并檢索特定信息,如文本中的名稱、事件和地點。

        通過這個大型語言模型,包括學生在內的更多中國人,可以接觸到中國古籍。例如,如果用戶在聊天框中輸入shangu的拼音,它不僅能識別出山谷一詞,還會給用戶指出與這個詞相關的、古籍中一個中國文人的字等。通過荀子的檢索功能,用戶可以根據古人的字獲取更具體的文化信息。

        “這個模型可以幫助我們挖掘更多隱藏在文化遺產中的信息,找到未被注意到的樣本和關聯。”王東波說。

        然而,王東波和他的團隊不僅著眼于中國的目標用戶,還將目光投向了世界其他地區。他們在GitHub和其他網站上共享了荀子,允許用戶免費下載和使用。“我們團隊秉持著讓我們的數據和模型能在全球范圍內被人們使用的理念,希望以此鼓勵更多人了解中國優秀傳統文化。”王東波解釋道。

        Word Bank

        theory /ˈθɪəri/ n. 理論;原理

        pause /pɔːz/ v. 暫停;停頓

        The woman spoke almost without pausing for breath.

        obstacle /ˈɒbstəkl/ n. 障礙;阻礙

        analysis /əˈnæləsɪs/ n. (對事物的)分析

        appreciate /əˈpriːʃieɪt/ v. 欣賞;賞識

        You can’t really appreciate foreign literature in translation.
