王燕軍
Abstract: Distributed routing algorithms are widely used in cognitive radio networks (CRNs). This paper analyzes the distributed cooperative multi-agent routing problem in multi-hop CRNs. The problem is modeled as a decentralized partially observable Markov decision process (DEC-POMDP) under the constraint that the interference caused by secondary users to primary users stays below a predefined threshold, while the end-to-end delay is controlled. A multi-agent learning algorithm is then applied to solve the model, yielding multi-agent learning-based routing (MALR). Experimental results show that the proposed routing controls the delay and reduces the interference probability.
Keywords: cognitive radio network; MALR; Markov decision process; interference reduction; multi-agent learning; delay control
CLC number: TN915.04-34; TP393    Document code: A    Article ID: 1004-373X(2019)19-0023-05
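As wireless applications expand, the demand for radio spectrum keeps growing. When spectrum is idle, licensed users, also called primary users (PUs), have priority in accessing it. Cognitive radio networks (CRNs) address the reuse of licensed spectrum [1-2]. In a CRN, secondary users (SUs) may access the licensed spectrum provided that they do not interfere with PU transmissions. Like conventional wireless networks, a CRN can be centralized or distributed (ad hoc). In a centralized network, a single base station provides spectrum access and single-hop communication for the SUs. In a distributed network, SUs communicate with other users in the network over multiple hops.

Unlike routing in conventional multi-hop wireless networks, routing in CRNs poses additional challenges, and several factors must be considered in its design. First, the routing protocol should be based on a realistic model of PU activity. Second, CRNs are inherently distributed: because SUs cannot rely on a common control channel to receive network-wide information, they must make routing decisions from local information only, so the routing scheme itself must be distributed. Third, the routing performance of SU traffic depends heavily on the CRN environment, in particular on the activity of the PUs and on the traffic of other SUs. Routing must therefore account for rapid changes in the CRN environment [3-6].

This paper therefore considers the distributed cooperative multi-agent routing problem in CRNs. The constraint is that the number of PU packets lost because of SU transmissions must stay below a predefined threshold. PU activity is modeled with a Markov modulated Poisson process (MMPP), the routing problem is formulated on that basis, and multi-agent learning is used to solve it and obtain stable routes. Experimental data show that the proposed multi-agent learning-based routing (MALR) effectively reduces delay and controls the interference rate.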
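To make the MMPP model of PU activity concrete, the following is a minimal sketch of a two-state MMPP simulator. It is an illustration only: the function name `simulate_mmpp`, the restriction to two states, and all rate values are assumptions and are not taken from the paper.

```python
import random


def simulate_mmpp(rates, switch_rates, horizon, seed=0):
    """Return PU packet arrival times on [0, horizon) for a two-state MMPP.

    rates[i]        -- assumed Poisson arrival rate while the chain is in state i
    switch_rates[i] -- assumed rate of leaving state i (sojourn ~ Exp(switch_rates[i]))
    """
    rng = random.Random(seed)
    t, state, arrivals = 0.0, 0, []
    while t < horizon:
        # Exponential sojourn in the current state, truncated at the horizon.
        stay_until = min(t + rng.expovariate(switch_rates[state]), horizon)
        # Within one sojourn, arrivals form an ordinary Poisson process.
        if rates[state] > 0:
            nxt = t + rng.expovariate(rates[state])
            while nxt < stay_until:
                arrivals.append(nxt)
                nxt += rng.expovariate(rates[state])
        # Switch to the other state (two-state chain).
        t, state = stay_until, 1 - state
    return arrivals


# Hypothetical parameters: a quiet state (0.5 packets/s) and a bursty state (5 packets/s).
arrivals = simulate_mmpp(rates=(0.5, 5.0), switch_rates=(0.1, 0.2), horizon=1000.0)
print(len(arrivals) / 1000.0)  # empirical mean arrival rate
```

With these assumed values the modulating chain spends 2/3 of the time in the quiet state and 1/3 in the bursty state, so the long-run mean rate is 0.5 × 2/3 + 5 × 1/3 = 2 packets/s; the empirical rate from a finite run fluctuates around that value.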
1.1  System model
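This paper analyzes the routing problem in cognitive radio networks and proposes MALR, a routing scheme based on multi-agent learning. The problem is first modeled as a Markov decision process and then solved with a multi-agent learning algorithm, so that data are delivered quickly while interference to other links is kept under control. Experimental data show that, compared with the FPLA and OPERA algorithms, the proposed MALR routing reduces transmission delay and lowers the interference rate.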
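The idea of SUs learning routes from local information while being penalized for interfering with PUs can be sketched as follows. This is not the MALR algorithm itself: it uses a Q-routing-style tabular update as a stand-in, and the topology, delays, collision probabilities, and penalty weight are all hypothetical.

```python
import random

# Hypothetical 4-node topology: S reaches D through relay A or relay B.
NEIGHBORS = {"S": ["A", "B"], "A": ["D"], "B": ["D"], "D": []}
# Assumed per-link delay and probability that using the link collides with a PU.
DELAY = {("S", "A"): 1.0, ("S", "B"): 2.0, ("A", "D"): 1.0, ("B", "D"): 1.0}
PU_COLLISION = {("S", "A"): 0.9, ("S", "B"): 0.0, ("A", "D"): 0.0, ("B", "D"): 0.0}
PENALTY = 10.0  # assumed cost charged for each collision with a PU
ALPHA = 0.1     # learning rate
EPSILON = 0.1   # exploration probability


def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    # One table per SU node: q[node][next_hop] = estimated cost to reach D.
    q = {n: {m: 0.0 for m in nbrs} for n, nbrs in NEIGHBORS.items()}
    for _ in range(episodes):
        node = "S"
        while node != "D":
            nbrs = NEIGHBORS[node]
            if rng.random() < EPSILON:
                nxt = rng.choice(nbrs)  # explore
            else:
                nxt = min(nbrs, key=lambda m: q[node][m])  # exploit
            collided = rng.random() < PU_COLLISION[(node, nxt)]
            cost = DELAY[(node, nxt)] + (PENALTY if collided else 0.0)
            # The neighbor reports its own best estimate; D reports zero.
            downstream = min(q[nxt].values()) if q[nxt] else 0.0
            q[node][nxt] += ALPHA * (cost + downstream - q[node][nxt])
            node = nxt
    return q


def best_route(q, src="S", dst="D"):
    route = [src]
    while route[-1] != dst:
        node = route[-1]
        route.append(min(q[node], key=q[node].get))
    return route


q = train()
print(best_route(q))
```

Each node updates only its own table from the cost of its outgoing link and the estimate advertised by the chosen neighbor, which mirrors the local-information setting described above. Folding the interference constraint into the cost as a fixed penalty is a simplification chosen for brevity; under these assumed numbers the learned route favors the slower relay B because the link through A collides with PU traffic most of the time.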
References
[1] ABDELAZIZ S, ELNAINAY M. Metric-based taxonomy of routing protocols for cognitive radio Ad Hoc networks [J]. Journal of network and computer applications, 2014, 40(3): 151-163.
[2] AI-RAWI H A A, YAN K L A, MOHAMD H, et al. A reinforcement learning-based routing scheme for cognitive radio Ad Hoc networks [C]// Proceedings of 2014 IFIP Wireless and Mobile Networking Conference. Vilamoura: IEEE, 2014: 1-8.
[3] BARVE S, KULKARNI P. Multi-agent reinforcement learning based opportunistic routing and channel assignment for mobile cognitive radio Ad Hoc network [J]. Mobile networks and applications, 2014, 19(6): 720-730.
[4] SHEN Yanxia, XUE Xiaosong. Path optimization strategy of WSNs mobile beacon nodes [J]. Transducer and microsystem technologies, 2012, 31(12): 42-46.
[5] CHEN Yourong, WANG Zhangquan, CHENG Juhua, et al. Lifetime optimized routing algorithm based on shortest path tree [J]. Chinese journal of sensors and actuators, 2012, 25(3): 406-413.
[6] CALEFFI M, AKYILDIZ I F. OPERA: optimal routing metric for cognitive radio Ad Hoc networks [J]. IEEE transactions on wireless communication, 2012, 11(5): 2884-2894.
[7] CHU S C A, ALFA A S. A model for bursty PU channel and its impact on the study of cognitive radio networks [C]// Proceedings of 2013 International Wireless Communications and Mobile Computing Conference. Sardinia: IEEE, 2013: 461-466.
[8] DING L, MELODIA T, BATALAMA N. Distributed resource allocation in cognitive and cooperative Ad Hoc networks through joint routing, relay selection and spectrum allocation [J]. Computer networks, 2015, 83(3): 315-331.
[9] EL-SHERIF A A, MOHAMED A. Joint routing and resource allocation for delay minimization in cognitive radio based mesh networks [J]. IEEE transactions on wireless communication, 2014, 13(5): 186-197.
[10] GRONDMAN I, BUSONIU L. A survey of actor-critic reinforcement learning: standard and natural policy gradients [J]. IEEE transactions on systems, man and cybernetics, 2012, 42(6): 1291-1307.
[11] KAE W C, HOSSAIN E. Estimation of primary user parameters in cognitive radio systems via hidden Markov model [J]. IEEE transactions on signal processing, 2013, 61(7): 782-795.
[12] LIANG Q, WANG X, TIAN X. Two-dimensional route switching in cognitive radio networks: a game-theoretical framework [J]. IEEE/ACM transactions on networking, 2015, 2(23): 1053-1066.
[13] PING S, AIJAZ A. SACRP: a spectrum aggregation-based cooperative routing protocol for cognitive radio Ad-Hoc networks [J]. IEEE transactions on communications, 2015, 63(8): 2015-2030.
[14] ZHU Quanyan, YUAN Zhou, SONG Jubin, et al. Interference aware routing game for cognitive radio multi-hop networks [J]. IEEE journal on selected areas in communications, 2012, 30(10): 2006-2015.