Increasing the real-time dynamic identification efficiency of sugarcane nodes by improved YOLOv3 network
Li Shangping1, Li Xianghui1, Zhang Ke2, Li Kaihua1, Yuan Honglei1, Huang Zongxiao2
(1. School of Information Science and Engineering, Guangxi University for Nationalities, Nanning 530006, China; 2. School of Mechanical Engineering, Guangxi University, Nanning 530004, China)
To popularize good-variety, good-method planting technology based on pre-cut sugarcane seed, and in conjunction with the development of an intelligent transverse seed-cutting machine for pre-cut sugarcane, this study realizes continuous, dynamic, intelligent recognition of seed-cane features by the seed-cutting device. Surface data of whole sugarcane stalks are collected continuously and dynamically by a camera built into the black-box section of the seed-cutting machine, and an intelligent recognition model is built with a convolutional neural network based on an improved YOLOv3 network. The camera inside the imaging unit locates and recognizes, in real time, the stem-node image features of each whole cane fed into the recognition system; the system compares recognition results, updates the node data promptly, identifies and marks the node positions, and after data processing sends the real-time node information to a multi-blade numerically controlled cutting table for real-time cutting. Training and test results show that the model achieves a precision of 96.89% for stem nodes, a recall of 90.64%, an average precision (AP) of 90.38%, and an average recognition time of 28.7 ms. Compared with the original network, AP increases by 2.26 percentage points, precision decreases by 0.61 percentage points, recall increases by 2.33 percentage points, and recognition time is shortened by 22.8 ms. The method realizes continuous, real-time dynamic recognition of sugarcane seed canes and provides a data basis for the development of the intelligent transverse seed-cutting machine for pre-cut sugarcane seed.
convolutional neural network; machine vision; model; YOLOv3 network; sugarcane stem node; recognition and location
As an important tropical cash crop, sugarcane is closely tied to China's national development and people's livelihood. Its planting and production, however, still face many problems. The sugarcane planters currently used in China are mainly of the real-time seed-cutting type, which suffers from high labor intensity, uneven sowing, and high seed-cane consumption [1]. In recent years, an advanced planting practice, pre-cut seed planting, has been proposed: canes of fine varieties are cut into segments as required, screened and disinfected, and then planted. The method lends itself to mechanized planting and offers high germination rates, low seed consumption, high cane yield, and low planting cost. It first requires pre-cutting of the seed cane. To meet this production need, developing an intelligent transverse seed-cutting machine for pre-cut sugarcane requires fast, effective recognition and location of features such as stem nodes, with results output in a short time, so that seed cane can be produced on a factory scale.
Existing work on sugarcane stem-node recognition is still limited to single segments or basic image processing and recognition; fast processing methods for whole-cane images remain scarce. Huang et al. [2] studied node recognition based on local means: the image was mean-filtered and segmented on HSV color-space components, then a window was moved laterally along the centerline of the G-B color-difference image at a fixed step to compute average gray values, and the position with the maximum gray value was taken as the node position. The method handles only a single cane segment, its result depends on the chosen step and fixed template width, and it achieved a recognition rate of 90% in 0.48 s. Lu et al. [3] explored machine-vision feature extraction and recognition of nodes: two HSV component images of a cane-segment image were processed separately and combined with an AND operation, regions of the composite image were divided and features extracted, and a support vector machine separated node and internode information to obtain node positions; the average recognition rates of node count and position were 94.11% and 91.52%, with a single-image execution time of 0.76 s, again without handling whole canes. Zhang et al. [4] recognized and located nodes by image processing: background conversion, median filtering, and thresholding yielded a binary image, and image segmentation, flipping, and pixel-value statistics gave the node positions; the processed data were single nodes rather than whole canes, with a time of 0.3 s. Zhang et al. [5] studied node recognition and location with hyperspectral imaging, collecting data with a spectrometer above the acquisition device and building a model from characteristic node wavebands; recognition, however, was limited to the region around the node and could not cover a whole cane. Wei [6] discussed cane positioning and cutting with intelligent algorithms, using an improved particle swarm algorithm to optimize a support vector machine model that recognizes cane images segment by segment. Wang et al. [7] studied mechanized sugarcane planting with computer-vision recognition, locating nodes with a fuzzy clustering algorithm at a recognition rate of 80%. Zhang et al. [8] recognized nodes and detected buds by image processing, with a multi-node recognition rate of 80% and a recognition time of 0.507 s.
At present, the state of stem-node recognition research [9-11] and the available algorithms can realize neither intelligent, mechanized seed cutting nor the demands of routine production. Convolutional neural networks [12-17] have proven highly effective in image recognition and classification, with notable results not only in everyday applications such as face recognition [18] and license-plate recognition [19], among others [20], but also in agriculture, where they have been applied to plant-leaf classification [21], fruit recognition [22-24], individual-animal identification [25-26], and crop pest and disease recognition [27-29], providing a theoretical reference for recognizing sugarcane surface features.
To meet the cutting requirements of the self-developed seed-cutting machine, features of whole canes must be recognized quickly. This paper proposes an improved YOLOv3 network for recognizing and locating sugarcane features. A stem-node recognition model is built by network training, and detection speed and recognition rate are further improved over the accuracy and speed of the original network. The system processes data captured by the machine's imaging unit in real time, recognizes and locates the node features of canes moving on the conveyor chain, and transmits the resulting data to the downstream cutting device for real-time cutting, so that whole canes are recognized and processed quickly in real time. Combined with the other parts of the intelligent seed-cutting machine system designed by our group, the whole cutting process becomes mechanized and intelligent, with the aim of raising the machine's production efficiency, reducing manual labor and time, and providing a research basis for factory-scale production of pre-cut sugarcane seed.
The YOLOv3 network is an improved version of the YOLO object detection and localization algorithm, proposed by J. Redmon and A. Farhadi in 2018 [30]. It is more accurate than YOLOv2 [31], and compared with other networks [32-34] it detects and locates faster with a higher recognition rate. YOLOv3 computes its loss function on feature maps of different scales, so that targets of different sizes are detected separately. The loss function is shown in Eq. (1):

$$
\begin{aligned}
\text{loss} = {}& \sum_{i=0}^{S^2}\sum_{j=0}^{B} I_{ij}^{\text{obj}}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\right]
+ \sum_{i=0}^{S^2}\sum_{j=0}^{B} I_{ij}^{\text{obj}}\left[(w_i-\hat{w}_i)^2+(h_i-\hat{h}_i)^2\right] \\
& + \sum_{i=0}^{S^2}\sum_{j=0}^{B}\left(I_{ij}^{\text{obj}}+I_{ij}^{\text{noobj}}\right)\operatorname{BCE}\!\left(C_i,\hat{C}_i\right)
+ \sum_{i=0}^{S^2}\sum_{j=0}^{B} I_{ij}^{\text{obj}}\sum_{c}\operatorname{BCE}\!\left(p_i(c),\hat{p}_i(c)\right)
\end{aligned}
\tag{1}
$$

where $S^2$ is the number of grid cells of the output feature map and $B$ the number of anchor boxes per cell; $I_{ij}^{\text{obj}}$ equals 1 when anchor $j$ of cell $i$ is responsible for a real stem-node box and 0 otherwise, and $I_{ij}^{\text{noobj}}$ is its complement with ignored prediction boxes excluded; $x_i$, $y_i$ are the center coordinates of a predicted box and $w_i$, $h_i$ its width and height; $C_i$ is the stem-node confidence and $p_i(c)$ the probability of class $c$ (here the single class, stem node); the circumflex (^) marks model predictions. The first part of the loss computes the prediction-box center coordinates, optimizing the predicted $x$ and $y$; the second computes and optimizes the predicted width and height; the third optimizes the node confidence while reducing computation by excluding ignored prediction boxes; the fourth computes the class loss. Binary cross-entropy, denoted $\operatorname{BCE}(\cdot,\cdot)$, is used throughout. The loss is evaluated on feature maps of different scales, and during training the ADAM algorithm [35] optimizes the loss function and updates the network parameters.
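The binary cross-entropy used in the confidence and class terms above can be sketched in plain Python (an illustrative helper, not the paper's implementation):

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired target/prediction lists."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(y_true)

# Confidence targets: 1 for anchors responsible for a node, 0 elsewhere
loss = binary_cross_entropy([1.0, 0.0, 1.0, 0.0], [0.9, 0.1, 0.8, 0.2])
```

The clipping step mirrors what deep-learning frameworks do internally so that a confident wrong prediction yields a large but finite penalty.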
鑒于原始YOLOv3網(wǎng)絡(luò)層數(shù)較多,為進(jìn)一步提高網(wǎng)絡(luò)識(shí)別定位速度,本文在原有網(wǎng)絡(luò)基礎(chǔ)上做出修改,減少網(wǎng)絡(luò)層數(shù),改變最后輸出層的特征圖尺寸,并減少anchors數(shù)量,由9個(gè)減少為6個(gè),達(dá)到提升識(shí)別速度的效果。具體方式通過減少殘差結(jié)構(gòu),構(gòu)建改進(jìn)網(wǎng)絡(luò),網(wǎng)絡(luò)主要結(jié)構(gòu)如圖1所示,將原始網(wǎng)絡(luò)中殘差結(jié)構(gòu)3與殘差結(jié)構(gòu)4的殘差塊數(shù)量減為原來的1/4,并刪減掉殘差結(jié)構(gòu)5,網(wǎng)絡(luò)共減少70層。在正向傳播過程中,padding模式中的same模式通過設(shè)置卷積步長(zhǎng)為1,使得進(jìn)行卷積操作后的輸出與輸入尺寸相同,valid表示進(jìn)行卷積操作時(shí)對(duì)輸入數(shù)據(jù)不進(jìn)行填充,通過與卷積操作的步長(zhǎng)共同作用,使得運(yùn)算后特征圖維度減半。模型中各卷積層使用L2參數(shù)正則化對(duì)權(quán)重矩陣進(jìn)行正則化,并在每個(gè)批次的訓(xùn)練上對(duì)前一層的輸出進(jìn)行規(guī)范化,使其數(shù)據(jù)均值接近0,標(biāo)準(zhǔn)差為1,從而達(dá)到加速收斂、控制過擬合、降低網(wǎng)絡(luò)對(duì)初始權(quán)重不敏感并允許網(wǎng)絡(luò)使用較大的學(xué)習(xí)率的目的,并使用LeakyRelu函數(shù)作為激活函數(shù)。為了擴(kuò)大網(wǎng)絡(luò)對(duì)圖像特征的提取數(shù)量,網(wǎng)絡(luò)通過跳層將網(wǎng)絡(luò)早期特征數(shù)據(jù)與多次降維后采集的特征數(shù)據(jù)進(jìn)行連接。數(shù)據(jù)經(jīng)過4次步長(zhǎng)為2的卷積層降維后,通過卷積運(yùn)算得到第一個(gè)輸出,特征圖尺寸為26×26,將第一個(gè)輸出值前的卷積運(yùn)算結(jié)果通過上采樣與50層的輸出結(jié)合,經(jīng)過卷積運(yùn)算得到特征圖尺寸為52×52的輸出,之后將第二個(gè)輸出值前的卷積運(yùn)算結(jié)果通過上采樣與32層的輸出結(jié)合,經(jīng)過卷積運(yùn)算得到第三個(gè)輸出,特征圖尺寸為104×104。網(wǎng)絡(luò)在最后輸出層使用linear函數(shù)作為激活函數(shù)。改進(jìn)的YOLOv3網(wǎng)絡(luò)正向傳播過程如圖2所示。
Fig. 1 Main structure of the improved YOLOv3 network
Fig. 2 Forward propagation of the improved YOLOv3 network
Samples were fine sugarcane varieties grown at the Fusui sugarcane breeding base of Guangxi University, Guangxi Collaborative Innovation Center for Sugarcane Production, with a growth period of about 6-8 months and an average stalk diameter of 30 mm, in 3 batches of about 180 canes each. Images were acquired under the imaging unit of the group's self-developed intelligent transverse seed-cutting machine for pre-cut sugarcane.
The overall structure of the machine, designed and built by our group, is shown in Fig. 3a. It consists of four parts, three of which are shown: the hydraulic unit, the cutting unit, and the imaging unit. The imaging unit acquires the cane images, as shown in Fig. 3b. Inside it, a chain-type conveyor continuously feeds seed canes; driven by a 57 mm-frame stepper motor under PLC control, the conveyor feeds canes transversely, one after another, into the image acquisition system. The conveying speed is continuously adjustable from 0.01 to 0.5 m/s as required.
Fig. 3 Test platform
Statistical tests of the effect of conveying speed on image sharpness show that too low a speed hurts throughput, while too high a speed blurs the image. The system recognized best at a conveying speed of 0.1 m/s and an average illuminance of 430.7 lx, so a conveying speed of 0.1 m/s was used in the tests. Video was captured by a camera mounted about 1.5 m above the plane of the conveyor chain inside the acquisition unit, a distortion-free RMONCAM G200 1080P camera, at 1920×1080 pixels and 20 frames/s. Frames were extracted from the video and labeled to generate the training, validation, and test sets needed for model training: 650 images of whole canes containing about 12 000 stem nodes, divided into 450 training, 50 validation, and 150 test images, all at a resolution of 416×416 pixels.
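The 450/50/150 split described above can be sketched as follows (a hypothetical helper; the paper does not specify the exact split procedure):

```python
import random

def split_dataset(paths, n_train=450, n_val=50, seed=0):
    """Shuffle image paths, then slice into train/validation/test subsets."""
    paths = list(paths)
    random.Random(seed).shuffle(paths)     # reproducible shuffle
    return (paths[:n_train],
            paths[n_train:n_train + n_val],
            paths[n_train + n_val:])

train, val, test = split_dataset([f"cane_{i:03d}.jpg" for i in range(650)])
```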
The processing platform was a desktop computer with an AMD 2700X processor (3.7 GHz), 8 GB of RAM, and a GeForce GTX 1060 6 GB graphics card, running Windows 10. The programs were written in Python, calling the Keras and OpenCV libraries among others, and run in Spyder.
Improved YOLOv3 networks of different structures were built and trained on the training data set, feeding 6 images per batch and keeping the best model according to the validation data. After training, model performance was measured on the test set and compared with the results of the original network; the best-performing model was selected as the stem-node recognition model, its recognition behavior was analyzed, and the recognized data were further processed to refine the improved-YOLOv3 dynamic node recognition method. The test flow is shown in Fig. 4.
Fig. 4 Test flow chart
In machine learning, model performance can be evaluated by precision, recall, and average precision (AP). AP is the area under the P-R curve formed by precision and recall. Precision is the ratio of true positives to all predicted positives (including negatives wrongly predicted as positive), i.e., the proportion of real positives among the model's positive predictions; recall is the ratio of predicted true positives to the real positives in the samples, i.e., the proportion of actual positives that the model finds. They are computed as Eqs. (2)-(4):

P = TP / (TP + FP) (2)
R = TP / (TP + FN) (3)
AP = Σ_{k=1}^{n} P(k) ΔR(k) (4)

where TP is the number of detected positives, i.e., correctly detected stem nodes; TN the number of detected negatives, i.e., unboxed non-node parts of the cane; FN the number of positives detected as negatives, i.e., missed nodes; FP the number of negatives detected as positives, i.e., non-node regions detected as nodes; and n the number of points at which precision drops along the P-R curve. Together, precision, recall, and AP characterize model performance accurately.
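Eqs. (2)-(4) translate directly into Python; the numbers below use the per-group averages reported later in the paper (1 571 TP, 50 FP, 163 FN) and reproduce values close to the reported precision and recall:

```python
def precision_recall(tp, fp, fn):
    """Eq. (2) and Eq. (3): P = TP/(TP+FP), R = TP/(TP+FN)."""
    return tp / (tp + fp), tp / (tp + fn)

def average_precision(points):
    """Eq. (4)-style AP: sum of precision times recall increment
    over (precision, recall) points of the P-R curve."""
    ap, prev_r = 0.0, 0.0
    for p, r in points:
        ap += p * (r - prev_r)
        prev_r = r
    return ap

p, r = precision_recall(1571, 50, 163)    # roughly 0.969 and 0.906
```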
The training and validation sets were fed into the network for 4 000 training iterations, giving the loss curves of Fig. 5. Around iteration 500 the validation loss begins to level off; after about 2 000 iterations the training loss keeps falling while the validation loss no longer decreases, fluctuating around 45, indicating that the network has begun overfitting the training data. The number of training iterations was therefore set to 2 000. During training the network keeps the weights and other parameters that are optimal according to the validation loss.
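Picking the stopping point from the loss curves (validation loss flattening while training loss keeps falling) amounts to a simple patience rule; a sketch with a hypothetical helper, not the paper's training code:

```python
def best_iteration(val_losses, patience=3, min_delta=0.0):
    """Index of the last significant validation-loss improvement;
    training would be cut off `patience` evaluations after it."""
    best = float("inf")
    best_i = 0
    wait = 0
    for i, v in enumerate(val_losses):
        if v < best - min_delta:
            best, best_i, wait = v, i, 0   # new best: reset patience
        else:
            wait += 1
            if wait >= patience:           # no improvement for a while
                break
    return best_i

# Validation loss falls, then hovers around 45 as in Fig. 5
stop = best_iteration([120, 90, 70, 55, 45, 46, 45.5, 45.2])
```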
Fig. 5 Loss curves for sugarcane stem-node recognition
Since the cutter position must be obtained in actual seed cutting, the stem nodes must be located. Ten groups of 100 test images were randomly drawn from the 150 test images, covering various illumination levels and canes with soil on the surface. Each group contained on average 1 734 nodes. The improved model detected on average 1 622 node targets per group, of which 50 were FP and 1 571 TP; the original model detected 1 570 per group, of which 39 were FP and 1 531 TP. Test performance is expressed by AP, precision, and recall. Because precision and recall trade off (as one rises, the other falls), a balance point between the two was sought according to the actual cutting requirements, and the optimal model was obtained after testing with different score and IoU thresholds. The averages of the 10 test groups are compared in Table 1: average AP 90.38%, average precision 96.89%, average recall 90.64%, and average recognition time 28.7 ms. Relative to the original network, average AP rises by 2.26 percentage points, average precision falls by 0.61 percentage points, average recall rises by 2.33 percentage points, and recognition time drops by 22.8 ms.
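The IoU threshold used when matching predicted boxes against labels is the standard intersection-over-union; a minimal version for axis-aligned boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))   # overlap width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))   # overlap height
    inter = iw * ih
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union > 0 else 0.0
```

A detection counts as TP when its IoU with a labeled node exceeds the chosen threshold; raising the threshold trades recall for localization strictness.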
Table 1 Test comparison of the two models
Table 2 compares the stem-node recognition methods. Against the other methods, the present method is fast and has high average precision; it recognizes the node features of a whole cane in each image and completes the recognition of all nodes in an image in the shortest time. The tests were run under actual seed-cutting conditions, so the method has high practical value.
Table 2 Comparison of sugarcane stem-node recognition methods
Over the 10 test groups the average precision AP was 90.38%; of the average 1 734 targets per group, 1 571 stem nodes were recognized and 163 were missed. The results are shown in Fig. 6, where the large boxes mark node positions predicted by the model, the small boxes mark manually labeled node positions, and boxes containing slashes mark wrong node predictions. Fig. 6a shows typical model test results; unrecognized nodes, i.e., the FN cases, are circled in Fig. 6b. The AP of 90.38% is explained mainly by two causes: first, some nodes are not recognized (FN), which accounts for the larger share; second, other parts of the stalk are recognized as nodes, chiefly detection errors in regions near true nodes (FP), as shown in Fig. 6c.
對(duì)于以上情況,在實(shí)際切種過程中,識(shí)別定位系統(tǒng)通過對(duì)實(shí)時(shí)視頻數(shù)據(jù)進(jìn)行處理,對(duì)單根甘蔗跟蹤識(shí)別定位,更新莖節(jié)信息以減少實(shí)際識(shí)別過程中的FN,提高識(shí)別率;對(duì)于FP的情況,在每根甘蔗識(shí)別完成后,對(duì)莖節(jié)數(shù)據(jù)進(jìn)行整合,根據(jù)甘蔗生長(zhǎng)情況,本文根據(jù)預(yù)測(cè)框的坐標(biāo)信息設(shè)置閾值為5像素值,當(dāng)兩個(gè)莖節(jié)預(yù)測(cè)框中心點(diǎn)之間的距離小于閾值時(shí),比對(duì)莖節(jié)預(yù)測(cè)框的概率值,舍棄小的莖節(jié)預(yù)測(cè)框,從而達(dá)到減小FP的目的,通過10組試驗(yàn),準(zhǔn)確率提高1%。莖節(jié)之間距離計(jì)算公式為式(5),其中,c為相鄰莖節(jié)預(yù)測(cè)框之間距離,u、u分別為2個(gè)相鄰莖節(jié)預(yù)測(cè)框中心點(diǎn)橫坐標(biāo),為預(yù)測(cè)框總數(shù)。
c=u?u(1≤<≤)(5)
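The post-processing rule of Eq. (5), dropping the lower-scoring of two boxes whose center abscissas lie within the 5-pixel threshold, can be sketched as follows (boxes represented as hypothetical (center_x, score) pairs):

```python
def merge_close_boxes(boxes, threshold=5.0):
    """Keep higher-scoring boxes; drop any box whose center-x lies
    within `threshold` pixels of an already-kept box (Eq. (5))."""
    result = []
    for u, score in sorted(boxes, key=lambda b: -b[1]):  # best first
        if all(abs(u - v) >= threshold for v, _ in result):
            result.append((u, score))
    return sorted(result)

# Two predictions 3 px apart collapse to the higher-scoring one
kept = merge_close_boxes([(100.0, 0.9), (103.0, 0.6), (200.0, 0.8)])
```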
Fig. 6 Analysis of recognition results
By the nature of detection networks, input images are converted to a fixed size before entering the network, so the total per-image processing time consists of two parts: the time to resize the input image, and the time for the network model to process it. The latter is constant, depending only on the network's size and not on the input image size. Tests on 1 000 images show that the improved network processes images markedly faster than the original network, saving about 23 ms. In actual cutting, the cutter handles the information of only one cane at a time, and data on the other canes gathered simultaneously on the conveyor chain go unused; therefore, while preserving recognition, the captured image keeps its full width and only its middle region is retained. Measuring the resizing time while growing the input size from 1920×149 to 1920×1080 pixels gives Fig. 7, with the image height on the horizontal axis and the model's resizing time on the vertical axis; the time grows with input size, from 7 ms up to 35 ms. To raise the overall speed of the recognition and location system, the input image is therefore pre-cropped to its middle band of 1920×150 pixels, cutting the resizing time by about 28 ms.
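The pre-cropping step can be sketched with a NumPy slice that keeps the full frame width and a 150-pixel band centered vertically (assuming the cane lies in the middle of the frame):

```python
import numpy as np

def crop_middle_band(frame, band_height=150):
    """Keep full image width; cut a horizontal band centered vertically."""
    h = frame.shape[0]
    top = (h - band_height) // 2
    return frame[top:top + band_height, :]

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)   # a 1920x1080 frame
band = crop_middle_band(frame)                       # 1920x150 region
```

Slicing is a view, not a copy, so this step itself adds essentially no overhead before the resize.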
Fig. 7 Relationship between input image size and recognition-system processing time
本文提出了基于改進(jìn)的YOLOv3網(wǎng)絡(luò)的甘蔗莖節(jié)特征識(shí)別方法。本方法通過減少中間卷積層構(gòu)成的殘差結(jié)構(gòu)數(shù)量,改變輸出特征圖尺寸以及減少anchors數(shù)量,對(duì)原始網(wǎng)絡(luò)進(jìn)行改進(jìn),在甘蔗莖節(jié)特征識(shí)別速度以及識(shí)別率方面得到提高,并通過對(duì)改進(jìn)網(wǎng)絡(luò)的輸出數(shù)據(jù)進(jìn)行分析計(jì)算,進(jìn)一步提高方法的準(zhǔn)確率。
1) In tests, the node detection and location model based on the improved YOLOv3 network reaches an average precision of 90.38% with an average recognition time of 28.7 ms, 22.8 ms faster and 2.26 percentage points higher in average precision than the original network, showing that the improved network can be applied in an actual seed-cutting machine for fast, accurate, real-time recognition and location.
2) A data-processing stage added after the network output consolidates the model's predictions and, based on cane growth habits, computes the center coordinates of the node prediction boxes and removes erroneous boxes, raising precision by 1 percentage point.
3) To further raise the machine's overall processing speed, the video frames from the camera unit are pre-processed to 1920×150 pixels, shortening overall processing time by about 28 ms.
[1]劉慶庭,莫建霖,李廷化,等. 我國甘蔗種植機(jī)技術(shù)現(xiàn)狀及存在的關(guān)鍵技術(shù)問題[J]. 甘蔗糖業(yè),2011(5):52-58.
Liu Qingting, Mo Jianlin, Li Tinghua, et al. Current technical situation and key technical problems of sugarcane planters in China[J]. Sugarcane and Canesugar, 2011(5): 52-58. (in Chinese)
[2]黃亦其,黃體森,黃媚章,等. 基于局部均值的甘蔗莖節(jié)識(shí)別[J]. 中國農(nóng)機(jī)化學(xué)報(bào),2017,38(2):76-80.
Huang Yiqi, Huang Tisen, Huang Meizhang, et al. Recognition of sugarcane nodes based on local mean[J]. Journal of Chinese Agricultural Mechanization, 2017, 38(2): 76-80. (in Chinese with English abstract)
[3]陸尚平,文友先,葛維,等.基于機(jī)器視覺的甘蔗莖節(jié)特征提取與識(shí)別[J]. 農(nóng)業(yè)機(jī)械學(xué)報(bào),2010,41(10):190-194.
Lu Shangping, Wen Youxian, Ge Wei, et al. Recognition and features extraction of sugarcane nodes based on machine vision[J]. Transactions of the Chinese Society for Agricultural Machinery, 2010, 41(10): 190-194. (in Chinese with English abstract)
[4]張衛(wèi)正,董壽銀,齊曉祥,等.基于圖像處理的甘蔗莖節(jié)識(shí)別與定位[J]. 農(nóng)機(jī)化研究,2016,38(4):217-221,257.
Zhang Weizheng, Dong Shouyin, Qi Xiaoxiang, et al. The identification and location of sugarcane internode based on image processing[J]. Journal of Agricultural Mechanization Research, 2016, 38(4): 217-221, 257. (in Chinese with English abstract)
[5]張衛(wèi)正,張偉偉,張煥龍,等.基于高光譜成像技術(shù)的甘蔗莖節(jié)識(shí)別與定位方法研究[J]. 輕工學(xué)報(bào),2017,32(5):95-102.
Zhang Weizheng, Zhang Weiwei, Zhang Huanlong, et al. Research on identification and location method of sugarcane node based on hyperspectral imaging technology[J]. Journal of Light Industry, 2017, 32(5): 95-102. (in Chinese with English abstract)
[6]韋相貴.基于智能算法的甘蔗定位切割方法[J]. 江蘇農(nóng)業(yè)科學(xué),2016,44(4):394-398.
Wei Xianggui. Sugarcane positioning and cutting method based on intelligent algorithm[J]. Jiangsu Agricultural Sciences, 2016, 44(4): 394-398. (in Chinese)
[7]王盛,李明.基于計(jì)算機(jī)視覺識(shí)別技術(shù)的甘蔗種植機(jī)械化研究[J]. 農(nóng)機(jī)化研究,2017,39(6):198-201.
Wang Sheng, Li Ming. Mechanization of sugarcane planting based on computer vision identification technology[J]. Journal of Agricultural Mechanization Research, 2017, 39(6): 198-201. (in Chinese with English abstract)
[8]張東紅,吳玉秀,陳晨.基于圖像處理的甘蔗莖節(jié)識(shí)別與蔗芽檢測(cè)[J]. 洛陽理工學(xué)院學(xué)報(bào):自然科學(xué)版,2019,29(2):67-72.
Zhang Donghong, Wu Yuxiu, Chen Chen. Sugarcane stem section identification and sugarcane bud detection based on image processing[J]. Journal of Luoyang Institute of Science and Technology: Natural Science Edition, 2019, 29(2): 67-72. (in Chinese with English abstract)
[9]陸尚平. 基于機(jī)器視覺的甘蔗莖節(jié)識(shí)別與蔗芽檢測(cè)研究[D]. 武漢:華中農(nóng)業(yè)大學(xué),2011.
Lu Shangping. Research on Sugarcane Internodes and Sugarcane Buds Identification Based on Machine Vision[D]. Wuhan: Huazhong Agricultural University, 2011. (in Chinese with English abstract)
[10]張圓圓,何永玲,王躍飛,等.顏色空間圖像處理技術(shù)在蔗節(jié)識(shí)別上的應(yīng)用[J]. 農(nóng)機(jī)化研究,2020,42(1):231-236.
Zhang Yuanyuan, He Yongling, Wang Yuefei, et al. Application of color space image processing technology in sugarcane node recognition[J]. Journal of Agricultural Mechanization Research, 2020, 42(1): 231-236. (in Chinese with English abstract)
[11]石昌友,王美麗,劉欣然,等.基于機(jī)器視覺的不同類型甘蔗莖節(jié)識(shí)別[J]. 計(jì)算機(jī)應(yīng)用,2019,39(4):1208-1213.
Shi Changyou, Wang Meili, Liu Xinran, et al. Node recognition for different types of sugarcanes based on machine vision[J]. Journal of Computer Applications, 2019, 39(4): 1208-1213. (in Chinese with English abstract)
[12]Lecun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[13]Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]// International Conference on Neural Information Processing Systems. 2012.
[14]Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[R]. arXiv: 1409.1556, 2014.
[15]He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770-778.
[16]Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2015, 39(6): 1137-1149.
[17]Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[R]. arXiv: 1506.02640, 2015.
[18]Xiong C, Zhao X, Tang D, et al. Conditional convolutional neural network for modality-aware face recognition[C]// 2015 IEEE International Conference on Computer Vision (ICCV). IEEE Computer Society, 2015.
[19]Zhao Z, Yang S, Ma X. Chinese license plate recognition using a convolutional neural network[C]// Computational Intelligence and Industrial Application, 2008. PACIIA '08. Pacific-Asia Workshop on. IEEE, 2009.
[20]余永維,殷國富,殷鷹,等. 基于深度學(xué)習(xí)網(wǎng)絡(luò)的射線圖像缺陷識(shí)別方法[J]. 儀器儀表學(xué)報(bào),2014,35(9):2012-2019.
Yu Yongwei, Yin Guofu, Yin Ying, et al. Defect recognition for radiographic image based on deep learning network[J]. Journal of Instrumentation, 2014, 35(9): 2012-2019. (in Chinese with English abstract)
[21]龔丁禧,曹長(zhǎng)榮. 基于卷積神經(jīng)網(wǎng)絡(luò)的植物葉片分類[J].計(jì)算機(jī)與現(xiàn)代化,2014(4):12-15,19.
Gong Dingxi, Cao Changrong. Plant leaf classification based on CNN[J]. Computer and Modernization, 2014(4): 12-15, 19. (in Chinese with English abstract)
[22]薛月菊,黃寧,涂淑琴,等. 未成熟芒果的改進(jìn)YOLOv2識(shí)別方法[J]. 農(nóng)業(yè)工程學(xué)報(bào),2018,34(7):173-179.
Xue Yueju, Huang Ning, Tu Shuqin, et al. Immature mango detection based on improved YOLOv2[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(7): 173-179. (in Chinese with English abstract)
[23]傅隆生,馮亞利,Elkamil Tola,等.基于卷積神經(jīng)網(wǎng)絡(luò)的田間多簇獼猴桃圖像識(shí)別方法[J]. 農(nóng)業(yè)工程學(xué)報(bào),2018,34(2):205-211.
Fu Longsheng, Feng Yali, Elkamil Tola, et al. Image recognition method of multi-cluster kiwifruit in field based on convolutional neural networks[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(2): 205-211. (in Chinese with English abstract)
[24]趙德安,吳任迪,劉曉洋,等.基于YOLO深度卷積神經(jīng)網(wǎng)絡(luò)的復(fù)雜背景下機(jī)器人采摘蘋果定位[J]. 農(nóng)業(yè)工程學(xué)報(bào),2019,35(3):164-173.
Zhao Dean, Wu Rendi, Liu Xiaoyang, et al. Apple positioning based on YOLO deep convolutional neural network for picking robot in complex background[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(3): 164-173. (in Chinese with English abstract)
[25]胡志偉,楊華,婁甜田,等.基于全卷積網(wǎng)絡(luò)的生豬輪廓提取[J]. 華南農(nóng)業(yè)大學(xué)學(xué)報(bào),2018,39(6):111-119.
Hu Zhiwei, Yang Hua, Lou Tiantian, et al. Extraction of pig contour based on fully convolutional networks[J]. Journal of South China Agricultural University, 2018, 39(6): 111-119. (in Chinese with English abstract)
[26]趙凱旋,何東健.基于卷積神經(jīng)網(wǎng)絡(luò)的奶牛個(gè)體身份識(shí)別方法[J]. 農(nóng)業(yè)工程學(xué)報(bào),2015,31(5):181-187.
Zhao Kaixuan, He Dongjian. Recognition of individual dairy cattle based on convolutional neural networks[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2015, 31(5): 181-187. (in Chinese with English abstract)
[27]楊國國,鮑一丹,劉子毅. 基于圖像顯著性分析與卷積神經(jīng)網(wǎng)絡(luò)的茶園害蟲定位與識(shí)別[J]. 農(nóng)業(yè)工程學(xué)報(bào),2017,33(6):156-162.
Yang Guoguo, Bao Yidan, Liu Ziyi. Localization and recognition of pests in tea plantation based on image saliency analysis and convolutional neural network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(6): 156-162. (in Chinese with English abstract)
[28]劉媛. 基于深度學(xué)習(xí)的葡萄葉片病害識(shí)別方法研究[D]. 蘭州:甘肅農(nóng)業(yè)大學(xué),2018.
Liu Yuan. Research on Methods for Grape Leaf Disease Recognition Based on Deep Learning[D]. Lanzhou: Gansu Agricultural University, 2018. (in Chinese with English abstract)
[29]張建華,孔繁濤,吳建寨,等.基于改進(jìn)VGG卷積神經(jīng)網(wǎng)絡(luò)的棉花病害識(shí)別模型[J]. 中國農(nóng)業(yè)大學(xué)學(xué)報(bào),2018,23(11):161-171.
Zhang Jianhua, Kong Fantao, Wu Jianzhai, et al. Cotton disease identification model based on improved VGG convolution neural network[J]. Journal of China Agricultural University, 2018, 23(11): 161-171. (in Chinese with English abstract)
[30]Redmon J, Farhadi A. YOLOv3: An incremental improvement[R]. arXiv:1804.02767, 2018.
[31]Redmon J, Farhadi A. YOLO9000: Better, faster, stronger[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017: 6517-6525.
[32]Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector[C]//European Conference on Computer Vision. Springer, Cham, 2016: 21-37.
[33]Fu C Y, Liu W, Ranga A, et al. DSSD: Deconvolutional single shot detector[R]. arXiv:1701.06659, 2017.
[34]Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 1: 2999-3007.
[35]Kingma D P, Ba J. Adam: A method for stochastic optimization[R]. arXiv:1412.6980, 2014.
Increasing the real-time dynamic identification efficiency of sugarcane nodes by improved YOLOv3 network
Li Shangping1, Li Xianghui1, Zhang Ke2, Li Kaihua1, Yuan Honglei1, Huang Zongxiao2
(1. School of Information Science and Engineering, Guangxi University for Nationalities, Nanning 530006, China; 2. School of Mechanical Engineering, Guangxi University, Nanning 530004, China)
To popularize pre-cut sugarcane seed technology and good-method cultivation, in conjunction with the development of an intelligent transverse seed-cutting machine for pre-cut sugarcane, and to realize continuous, dynamic, intelligent recognition of seed-cane characteristics by the seed-cutting device, this study established an intelligent recognition convolutional neural network model based on an improved YOLOv3 network, continuously and dynamically collecting surface data of whole canes through a camera built into the black box of the cutting machine. The camera inside the system carried out real-time location and recognition of the stem-node image features of whole canes fed into the recognition system. By comparing recognition information, the improved network updated the node data promptly, identified and marked the node positions, and then obtained real-time node information through data processing, which was transmitted to the multi-blade cutting table for real-time cutting. A node recognition system based on the improved YOLOv3 network was thus established, with image acquisition by the camera in the cutting system. Video data of sugarcane were collected before network training, and the image data were processed to build the training, validation, and test sets. Different improved models were trained and tested, and the best one was selected as the model of this paper. Training and test results showed that the model's precision for stem nodes was 96.89%, the recall was 90.64%, the average precision (AP) was 90.38%, and the average recognition time per image was 28.7 ms.
Compared with the original network, AP improved by 2.26 percentage points, precision decreased by 0.61 percentage points, recall increased by 2.33 percentage points, and recognition time was shortened by 22.8 ms. At present, sugarcane node recognition still remains at single-segment or basic image processing and recognition, and fast processing methods for whole-cane images are lacking. In this study we proposed using the improved YOLOv3 network to recognize and locate cane features and to establish the node recognition model through network training. On the basis of the accuracy and speed of the original network's identification and location, detection speed and recognition rate were further improved. A whole cane can be identified and processed quickly in real time, meeting the needs of seed cutting. Combined with the other parts of the intelligent seed-cutting machine system designed by our research group, the whole cutting process can be mechanized and made intelligent, which can greatly improve cutting quality, reduce labor intensity and time, and raise production efficiency. The work provides a research basis for factory-scale production of pre-cut sugarcane seed; the continuous, real-time dynamic identification of seed canes lays an application foundation for the development of the intelligent transverse seed-cutting machine for pre-cut sugarcane.
convolutional neural network; machine vision; models; YOLOv3 network; sugarcane nodes; identification
Li Shangping, Li Xianghui, Zhang Ke, Li Kaihua, Yuan Honglei, Huang Zongxiao. Increasing the real-time dynamic identification efficiency of sugarcane nodes by improved YOLOv3 network[J]. 農(nóng)業(yè)工程學(xué)報(bào) (Transactions of the CSAE), 2019, 35(23): 185-191. doi:10.11975/j.issn.1002-6819.2019.23.023 http://www.tcsae.org
Li Shangping, Li Xianghui, Zhang Ke, Li Kaihua, Yuan Honglei, Huang Zongxiao. Increasing the real-time dynamic identification efficiency of sugarcane nodes by improved YOLOv3 network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(23): 185-191. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.2019.23.023 http://www.tcsae.org
Received: 2019-07-12    Revised: 2019-10-27
Supported by the Guangxi Science and Technology Key Research and Development Program (Guike AB16380199)
Li Shangping, PhD, professor, doctoral supervisor; research field: agricultural mechanization engineering. Email: spli501@vip.sina.com
doi: 10.11975/j.issn.1002-6819.2019.23.023
CLC number: TP391.4
Document code: A
Article ID: 1002-6819(2019)-23-0185-07