Traffic sign recognition algorithm based on multi-instance deep learning and loss function optimization
ZHANG Yongxiong1, 3, WANG Liangming2, LI Dong1
(1. School of Software Engineering, South China University of Technology, Guangzhou 510006, China;
2. School of Computer Science & Engineering, South China University of Technology, Guangzhou 510006, China;
3. Guangzhou College of Technology and Business, Guangzhou 510850, China)
Abstract: Traffic signs come in many types and appear in changing environments, which lowers the accuracy of automatic recognition. To address this, a traffic sign recognition algorithm based on multi-instance deep learning is proposed. A training set covering color, geometric, and regional features is designed from sample image patches and their corresponding labels to obtain the correspondence between sample features and labels. Based on weight-correction feedback, the logical relation between bags and labels is derived, and a multi-instance training operator is established to classify traffic signs accurately. The loss function of the training set is then calculated, and the response of the optimal classifier is used to reduce the training loss. Finally, a background constraint driven by large-scale sample data is formed to remove ambiguous training data from the instances and complete the recognition. Recognition software was developed on the QT platform. Experimental results show that, compared with current traffic sign recognition techniques, the proposed algorithm achieves higher recognition accuracy and robustness across various types of traffic signs, and has some application value in fields such as intelligent vehicles and automatic traffic monitoring.
Keywords: traffic sign recognition; loss function optimization; training set; multi-instance; deep learning; background constraint
CLC number: TN911.73-34; TP391 Document code: A Article ID: 1004-373X(2018)15-0133-04
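As the number of cars in cities rises and traffic congestion and accidents become more frequent, active safety driving technology is receiving growing attention from industry and the public. Traffic sign recognition is one of the core components of this technology and plays an important role in active safety systems: when a fatigued driver fails to notice a traffic sign, an automatic recognition algorithm can recognize it accurately and in time and alert the driver, which helps avoid traffic accidents and achieves the goal of active safe driving [1]. The current difficulty in traffic sign recognition is that there are many types of signs, and the visual features of each sign change with weather, time of day, and season. The sample volume is therefore huge, which challenges the interference resistance of recognition algorithms [2].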
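Researchers in China have obtained some results in traffic sign recognition. Reference [3] proposed a traffic sign recognition algorithm based on an optimized convolutional neural network (CNN) structure, which optimizes the CNN by combining three methods: batch normalization (BN), greedy layer-wise pretraining (GLP), and replacing the classifier with a support vector machine (SVM). However, this technique does not fully account for the massive volume and variability of traffic sign samples, which often reduces recognition accuracy. Reference [4] proposed a traffic sign recognition algorithm based on block-wise adaptive feature fusion. A traffic sign consists of an outer contour and an inner indicator symbol; HOG features describe image contours well but are sensitive to noise, whereas LBP features capture image detail well. The method therefore fuses block-wise HOG and LBP features adaptively: the weight coefficient obtained from the gradient histogram of each block determines whether the block belongs to the contour or to the inner symbol. The HOG feature is weighted more heavily in the former case and the LBP feature in the latter, and the adaptively and serially fused features are fed to an SVM for recognition. However, this approach is still algorithm-driven rather than big-data-driven, and it is often inaccurate when a traffic sign falls outside the range of the samples.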
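To improve the recognition rate of traffic signs, this study proposes a multi-instance deep learning algorithm built on a data-driven approach. Large-scale samples are collected, organized, and annotated; feature vectors (color, geometric, and regional features) are combined to establish the logical relation between the training data and the algorithm; finally, the loss function of the training set is calculated to further remove redundancy in the feature data and complete accurate recognition of traffic signs.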
Human learning mainly forms summaries or rules from past experience; machine learning is the process of training on existing data samples to obtain a classifier or a fitting function. By learning process, it is divided into supervised, semi-supervised, and unsupervised learning. In supervised learning every training sample has a known label, in semi-supervised learning some samples are unlabeled, and in unsupervised learning no sample is labeled [5].
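Multi-instance learning involves two concepts: bags and instances. A bag consists of multiple instances; in this study, one image is a bag and the image patches segmented from it are the instances. In multi-instance learning, bags carry class labels while instances do not, and the final goal is to predict the class of a new bag [6]. Because the instances used to train the classifier have no class labels while the bags they belong to do, multi-instance learning in this paper is a learning method that lies between supervised and unsupervised learning and differs from semi-supervised learning. In addition, under the multi-instance learning mechanism of this paper, a bag is positive if it contains at least one instance that the classifier labels "+"; conversely, it is negative if all of its instances are labeled "-" [6].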
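The bag-labeling rule described above can be sketched in a few lines of Python. This is only an illustration of the standard multi-instance assumption, not the paper's implementation: the scalar instance scores, the 0.5 threshold, and the names `bag_label` and `toy_classifier` are all hypothetical.

```python
from typing import Callable, Sequence


def bag_label(instances: Sequence[float],
              instance_classifier: Callable[[float], int]) -> int:
    """Return +1 if at least one instance is classified positive, else -1."""
    return 1 if any(instance_classifier(x) > 0 for x in instances) else -1


def toy_classifier(score: float) -> int:
    """Hypothetical instance classifier: thresholds a scalar score at 0.5."""
    return 1 if score >= 0.5 else -1


# Each bag stands in for one image; each number stands in for one patch.
positive_bag = [0.1, 0.2, 0.9]   # one patch scores above the threshold
negative_bag = [0.1, 0.3, 0.4]   # no patch scores above the threshold

print(bag_label(positive_bag, toy_classifier))
print(bag_label(negative_bag, toy_classifier))
```

Only the bag-level result is compared with a ground-truth label during training; the per-instance decisions stay latent.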
In this paper, when a bag is labeled A, all data in the bag are labeled A; when a bag is labeled B, at least one datum in the bag is labeled B. The goal is to classify each type of traffic sign, that is, each bag, into its own category. Figure 1 shows the wide variety of traffic signs that are the recognition targets of this study.
This study first designs a training set $X$, where $X_i$ denotes the $i$-th bag in $X$. Each bag contains a series of instances, for example $m$ instances.
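As a rough illustration of how such a training set might be organized, the sketch below stores each bag as a labeled image split into unlabeled patches. The grayscale nested-list image, the non-overlapping fixed-size patching, the `Bag` class, and the label string `"speed_limit"` are assumptions made for the example; the paper's actual segmentation and its color, geometric, and regional features are not reproduced here.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[int]]  # assumed: grayscale image as rows of pixel values


@dataclass
class Bag:
    label: str              # bag-level label (one per image)
    instances: List[Image]  # unlabeled patches cut from the image


def split_into_patches(image: Image, size: int) -> List[Image]:
    """Cut an image into non-overlapping size x size patches, row by row."""
    patches = []
    for top in range(0, len(image) - size + 1, size):
        for left in range(0, len(image[0]) - size + 1, size):
            patch = [row[left:left + size] for row in image[top:top + size]]
            patches.append(patch)
    return patches


# A 4x4 toy image yields m = 4 instances of size 2x2.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
bag = Bag(label="speed_limit", instances=split_into_patches(image, 2))
training_set = [bag]  # X; training_set[i] plays the role of X_i

print(len(bag.instances))
```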
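To demonstrate the advantages of the proposed algorithm, the techniques of Refs. [3-4], which perform well in traffic sign recognition, are used as the control group, and recognition software is developed on the QT platform. The experimental parameters are: multi-instance model factors = [38, 96, 12, 4] and feature model = [12, 7, 57, 34].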
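The interface of the traffic sign system developed in this paper is shown in Fig. 4a). Its functions include starting real-time video, enabling active safety, multi-instance learning, deep learning, and intelligent recognition. This study first designs a training set covering color, geometric, and regional features and establishes the relation between bags and labels, then calculates the loss function to obtain the optimal classifier. As shown in Fig. 4b), the proposed algorithm recognizes the traffic signs accurately.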
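The idea of choosing a classifier by its training-set loss can be illustrated with a generic bag-level loss. The sketch below assumes a max-pooling rule (a bag's probability is that of its highest-scoring instance) and binary cross-entropy; both are common choices in multi-instance learning and are used here as stand-ins, since the paper's own loss function is not given in this text. All function names are hypothetical.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def bag_probability(instance_scores: List[float]) -> float:
    """Assumed pooling rule: the bag is as positive as its best instance."""
    return max(sigmoid(s) for s in instance_scores)


def training_set_loss(bags: List[Tuple[List[float], int]]) -> float:
    """Mean binary cross-entropy over bags; labels are 1 (positive) or 0."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for scores, label in bags:
        p = min(max(bag_probability(scores), eps), 1.0 - eps)
        total += -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return total / len(bags)


# Instance scores from two hypothetical classifiers on the same two bags.
good = [([-3.0, 4.0], 1), ([-3.0, -4.0], 0)]
bad = [([-3.0, -4.0], 1), ([3.0, -4.0], 0)]

print(training_set_loss(good) < training_set_loss(bad))
```

Under these assumptions, the classifier whose instance scores agree with the bag labels gets the lower loss and would be the one retained.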
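The control method of Ref. [3] optimizes the CNN structure by combining three methods, including replacing the classifier with a support vector machine (SVM). However, this technique does not fully account for the massive volume and variability of traffic sign samples, which often reduces recognition accuracy. As shown in Fig. 4c), recognition errors occur: the sign in the upper-left corner is misrecognized.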
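The control method of Ref. [4] uses the weight coefficient obtained from block-wise gradient histograms to determine whether each block belongs to the contour or to the inner symbol. The HOG feature is weighted more heavily in the former case and the LBP feature in the latter, and the adaptively and serially fused features are fed to an SVM for recognition. However, this approach is still algorithm-driven rather than big-data-driven, and recognition is often inaccurate when a traffic sign falls outside the range of the samples. As shown in Fig. 4d), recognition errors occur: the sign in the lower-right corner is misrecognized.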
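To further test the stability of the three algorithms, the 500 000 traffic signs compiled in this paper are used as the test database; the results are given in Table 1. As Table 1 shows, the proposed algorithm is the most robust: on this large sample set its recognition accuracy is 96.3%, while the algorithms of Refs. [3] and [4] reach 91.7% and 93.4%, respectively. This indicates that the proposed traffic sign recognition algorithm is more robust.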
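To put these percentages in absolute terms, the short calculation below converts each reported accuracy into a count of misrecognized signs on the 500 000-sign test set. It uses only the figures quoted above and assumes every test sample counts equally; the dictionary keys are labels chosen for the example.

```python
TEST_SET_SIZE = 500_000

# Reported accuracies in per-mille, to keep the arithmetic in integers.
accuracy_per_mille = {"proposed": 963, "ref3": 917, "ref4": 934}


def misrecognized(n_samples: int, acc_per_mille: int) -> int:
    """Number of samples not recognized correctly at the given accuracy."""
    return n_samples * (1000 - acc_per_mille) // 1000


errors = {name: misrecognized(TEST_SET_SIZE, acc)
          for name, acc in accuracy_per_mille.items()}

for name, count in errors.items():
    print(f"{name}: {count} misrecognized signs")
```

On these figures the proposed algorithm misrecognizes 18 500 signs, against 41 500 for Ref. [3] and 33 000 for Ref. [4].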
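Traffic sign recognition algorithms are easily disturbed by changes in sign type, environment, and time, which leaves recognition systems with insufficient recognition capability and impairs system function. To address this, this paper proposes a traffic sign recognition algorithm based on multi-instance deep learning, covering large-scale sample collection and annotation, multi-instance deep learning, and integrated intelligent recognition. Traditional algorithm-driven techniques write rules into the program by hand; combining big-data samples with deep learning compensates for this shortcoming. The data-driven deep learning technique adopted here gives the system more intelligent feedback and self-adjustment while maintaining recognition accuracy and stability. The experimental results show that the proposed traffic sign recognition algorithm has higher accuracy and stability.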
References
[1] XIE Jin, CAI Zixing, DENG Haitao, et al. Classification of traffic signs based on image invariant feature depth learning [J]. Journal of computer aided design and computer graphics, 2017, 29(4): 632-640.
[2] KRŠÁK E, TOTH Š. Traffic sign recognition and localization for databases of traffic signs [J]. Acta electrotechnica et informatica, 2011, 11(4): 31-35.
[3] WANG Xiaobin, HUANG Jinjie, LIU Wenju. Traffic sign recognition based on optimized convolutional neural network [J]. Computer applications, 2017, 37(2): 530-534.
[4] GE Xia, YU Fengqin, CHEN Ying. Traffic sign recognition based on block adaptive fusion feature [J]. Computer engineering and applications, 2017, 53(3): 188-192.
[5] ZHOU Shusen, ZOU Hailin, LIU Chanjuan. Deep extractive networks for supervised learning [J]. Optik: international journal for light and electron optics, 2016, 127(20): 9008-9019.
[6] FARIA A W C, COELHO F G F, SILVA A M. A new approach for multiple instance learning based on positive instance selection and kernel density estimation [J]. Engineering applications of artificial intelligence, 2017, 59(12): 196-204.
[7] LIU Zhanwen, ZHAO Xiangmo, LI Qiang, et al. A traffic sign recognition method based on graph model and convolutional neural network [J]. Journal of traffic and transportation engineering, 2016, 16(5): 122-131.
[8] ZHONG Jingjing, TSE P W, WANG Dong. Novel Bayesian inference on optimal parameters of support vector machines and its application to industrial survey data classification [J]. Neurocomputing, 2016, 211: 159-171.
[9] YU Yongtao. Bag-of-visual-phrases and hierarchical deep models for traffic sign detection and recognition in mobile laser scanning data [J]. ISPRS journal of photogrammetry and remote sensing, 2016, 11(7): 36-39.
[10] OUERHANI Y. Advanced driver assistance system: road sign identification using VIAPIX system and a correlation technique [J]. Optics and lasers in engineering, 2016, 12(2): 71-76.