Real-time classification and detection of citrus based on an improved SSD
Li Shanjun1,2,3,4,5, Hu Dingyi1,2, Gao Shumin1,2, Lin Jiahao1,2, An Xiaosong1,2, Zhu Ming1,2※
(1. College of Engineering, Huazhong Agricultural University, Wuhan 430070, China; 2. Key Laboratory of Agricultural Equipment in Mid-lower Yangtze River, Ministry of Agriculture and Rural Affairs, Wuhan 430070, China; 3. National Modern Agricultural (Citrus) Industry Technology System, Wuhan 430070, China; 4. National R&D Center for Citrus Preservation Technology, Wuhan 430070, China; 5. Citrus Mechanization Research Base, Ministry of Agriculture and Rural Affairs, Wuhan 430070, China)
Detecting surface defects during manual sorting of citrus is time-consuming and laborious. This paper proposes a real-time classification and detection method for citrus based on an improved SSD deep learning model. On a modified, self-built waxing-machine test bench, 2 500 image samples were collected, each containing multiple citrus fruits of multiple classes; 2 000 images were randomly selected as the training set and 500 as the test set. The dataset contains 19 507 normal, 9 097 skin-lesion and 4 327 mechanically damaged citrus fruits. The method feeds each image to the one-stage detection model SSD-ResNet18, which returns the position and class of every citrus in the image, thereby achieving classification and detection. Taking the mean average precision (mAP) over the per-class average precision (AP) as the accuracy metric and the average detection time as the speed metric, comparative experiments were conducted with different feature maps, different input resolutions, and four feature extraction networks: ResNet18, MobileNetV3, ESPNetV2 and VoVNet39. The results show that the C4 and C5 feature maps with a resolution of 768×768 pixels are the most suitable, and that ResNet18 has a clear advantage in detection speed. The final model reaches an mAP of 87.89%, 0.34 percentage points higher than the original SSD's 87.55%, with an average detection time of 20.27 ms, roughly one fifth of the original SSD's 108.83 ms (a 5.4-fold speedup). The model can classify and detect multiple citrus fruits of multiple classes simultaneously in real time, and can serve as a technical reference for identifying surface-defective citrus on automated sorting lines.
object recognition; model; nondestructive detection; citrus; surface defects; deep learning; SSD; ResNet18
Quality grading is a key step in fruit processing lines: proper grading creates more economic value, and surface-defect condition is one of the most important indicators of fruit quality [1]. At present, however, surface-defective fruit is mostly screened out by hand, which is labor-intensive and costly in manpower and money. Automating the classification and detection of fruit is therefore of practical importance.
Researchers at home and abroad have applied a variety of methods to fruit surface-defect recognition. Li et al. [2] detected surface defects on navel oranges with an illumination-reflectance model, with an overall accuracy above 99%. Zhao et al. [3] used a support vector machine to recognize defective red dates with 96.2% accuracy. Zhang et al. [4] applied hyperspectral imaging to the nondestructive detection of citrus defects, reaching a recognition rate of 94%. Dong et al. [5] combined hyperspectral imaging with principal component analysis and B-spline lighting correction to detect citrus defects with 96.5% accuracy. Zou et al. [6] built a system of three color cameras that photographs each apple nine times from different angles and flags an apple as defective if any photo contains more than one region of interest, with a misclassification rate of 4.2%. Sharif et al. [7] selected optimal features via optimized weighted segmentation and feature selection and fed them to a multi-class support vector machine to classify citrus disease types, achieving 89% accuracy on their combined dataset.
The above methods were developed on small samples and usually recognize only one fruit at a time, limiting their efficiency. Deep learning, which has advanced rapidly in recent years, can address these problems; agricultural detection studies based on it fall mainly into three categories: semantic segmentation [8-10], object detection [11-17] and instance segmentation [18-20]. Zhao et al. [21] located apples against complex backgrounds with a YOLO model, reaching a mean average precision (mAP) of 87.71% at a video detection rate of 60 frames/s. Wang et al. [22] recognized apples before fruit thinning with an R-FCN deep learning model, with a false-recognition rate of 4.9% and an average processing time of 0.187 s per image. Dias et al. [23] built a deep learning model for instance segmentation of apple blossoms. Tian et al. [24] detected apples at different growth stages with an improved YOLO-V3 model, reaching an F1 score of 81.7%.
This paper proposes a real-time classification and detection method for harvested citrus based on the SSD deep learning model. The model recognizes multiple citrus fruits of multiple classes simultaneously, providing technical support for real-time sorting of defective citrus on production lines.
The citrus used in the experiments is the Newhall navel orange, collected from orchards in Zigui County, Yichang. Images were taken with the 12-megapixel rear camera of a Xiaomi 8 SE under natural light on a modified small waxing machine that simulates the production-line background, with the camera fixed above the machine for a top-down view; Fig.1 shows the image acquisition device. Before each shot, a varying number of citrus fruits were placed between the rollers of the machine; the rollers then rotated automatically at 0.72 r/s, turning the fruits, and the camera took one photo per second. Different faces of each rotating fruit were thus captured, yielding more complete data and increasing the data volume. The captured images have a resolution of 2 448×2 448 pixels; 2 500 image samples were taken in total, with examples shown in Fig.2.
Fig.1 Schematic of the image acquisition device: 1. Roller 2. Frame 3. Phone mount 4. Smartphone 5. Motor
Fig.2 Examples of image samples
The citrus fruits are divided into three classes: normal, skin lesions, and mechanical damage. Normal fruits have essentially no lesion marks on the peel and can be sold directly. Skin-lesion fruits mostly show lesion marks; their appearance is damaged, but the flesh of most such fruits is intact, so they can still be sold to juice and canning factories for processing and retain some value. Mechanically damaged fruits have ruptured peel, rot very easily, and are usually discarded as worthless. Fig.3 shows the three classes.
Fig.3 The three citrus classes: a. Normal; b. Skin lesions; c. Mechanical damage
From the 2 500 image samples, 2 000 were randomly selected as the training set and the remaining 500 as the test set; the number of citrus fruits of each class across the 2 500 samples is listed in Table 1.
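The random 2 000/500 split can be sketched in a few lines; the `seed` and the integer image ids are illustrative, not from the paper:

```python
import random

def split_dataset(image_ids, n_train, seed=0):
    """Randomly partition image ids into a training set and a test set."""
    rng = random.Random(seed)
    shuffled = rng.sample(image_ids, len(image_ids))  # shuffled copy
    return shuffled[:n_train], shuffled[n_train:]

# 2 500 samples -> 2 000 training, 500 test
train_ids, test_ids = split_dataset(list(range(2500)), n_train=2000)
```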
Table 1 Classes and counts of citrus in the dataset
The image samples were annotated with LabelImg in COCO dataset format. Some fruits show both skin lesions and mechanical damage; these were labeled as mechanically damaged. To increase the tolerance of the trained model, fruits whose lesion or crack features are inconspicuous, e.g. with only slight cracks or lesions at the fruit edge, were labeled as normal.
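The two annotation rules above can be expressed as a small priority function; the flag names are hypothetical, chosen only to mirror the stated rules:

```python
def assign_label(has_lesion, has_damage, prominent=True):
    """Annotation rules used in this paper:
    - inconspicuous defects (e.g. only at the fruit edge) count as normal;
    - mechanical damage takes priority over skin lesions."""
    if not prominent:
        return "normal"
    if has_damage:
        return "mechanical_damage"
    if has_lesion:
        return "skin_lesion"
    return "normal"
```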
To improve training and the generalization of the model, data augmentation was applied to the dataset. Since the appearance features of a citrus do not change with viewing direction, two augmentation methods were used: horizontal flipping and vertical flipping.
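Because flipping an image also mirrors its box annotations while leaving the class label unchanged, the two augmentations reduce to coordinate transforms like the following sketch (boxes in corner format [x_min, y_min, x_max, y_max], in pixels):

```python
def hflip_boxes(boxes, width):
    """Mirror boxes horizontally inside an image of the given pixel width."""
    return [[width - x2, y1, width - x1, y2] for x1, y1, x2, y2 in boxes]

def vflip_boxes(boxes, height):
    """Mirror boxes vertically inside an image of the given pixel height."""
    return [[x1, height - y2, x2, height - y1] for x1, y1, x2, y2 in boxes]
```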
The experiments ran on Windows 10 with a GeForce GTX 1060 GPU (6 GB of video memory), an Intel(R) Core(TM) i5-8500 CPU @ 3.00 GHz and 16 GB of RAM. The models were built, trained and validated in Python on the PyTorch deep learning framework, with CUDA 10.1 as the parallel computing framework.
SSD [25] is a classic and effective one-stage object detection model. It first obtains feature maps through a feature extraction network (backbone), then predicts a large number of bounding boxes from the feature maps, and finally determines the class and position of each object in the image via non-maximum suppression. SSD is usually trained at an input resolution of 300×300 or 512×512 pixels; all subsequent experiments involving the original SSD in this paper use 512×512 pixels.
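The final suppression step can be sketched in plain Python; the `iou` helper and greedy loop below are a minimal illustration of the procedure, not the exact implementation used in the paper:

```python
def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    drop boxes overlapping it above the threshold, and repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```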
2.1.1 ResNet18特征提取網(wǎng)絡(luò)
The original SSD uses VGG16 [26] as its backbone, but VGG16 has a huge number of parameters and is too slow for real-time classification and detection of citrus on a production line. This paper therefore replaces the SSD backbone with the ResNet18 deep residual network, which has only 18 layers, computes faster, and requires only about 1/10 of the floating-point operations of VGG16 [27]. This better satisfies the real-time requirement, and it also lets the model converge faster during training, reducing training time.
2.1.2 Feature map selection
The feature maps taken from a backbone are usually the outputs of the layer after each downsampling step and before the next one. Let Ci denote the feature map obtained after the i-th downsampling of the network, so the map after the first downsampling is C1; current backbones typically use the three maps C3, C4 and C5. A detection model assigns a fixed set of prior boxes (anchors) to each feature map, usually three types per map. Running k-means clustering with k=9 on the manually labeled boxes of the 2 500-image dataset gives nine anchor types whose relative sizes range from 0.084 5 to 0.151 1; at a resolution of 768×768 pixels this corresponds to anchor sizes of 64.90 to 116.04 pixels. However, the effective receptive field of C3 is typically 32 pixels, versus 64 for C4 and 128 for C5 [28], so C3 is not necessarily suitable for the citrus detection task in this paper. Therefore, only the C4 and C5 feature maps are used, and k-means clustering is rerun to obtain six anchor types, which are assigned to C4 and C5. Fig.4 shows the structure of the SSD-ResNet18 model.
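The anchor clustering can be illustrated with a plain k-means; this is a simplified 1-D sketch over scalar box scales with made-up values (the paper clusters the labeled boxes of the real dataset), not the actual anchor computation:

```python
import random

def kmeans_1d(values, k, iters=100, seed=0):
    """Plain k-means on scalar box scales: assign each value to its
    nearest center, recompute centers as cluster means, repeat."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[min(range(k), key=lambda j: abs(v - centers[j]))].append(v)
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return sorted(centers)

# illustrative relative box sizes, two obvious groups
scales = [0.08, 0.09, 0.08, 0.09, 0.15, 0.16, 0.15, 0.16]
anchor_scales = kmeans_1d(scales, k=2)
```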
Fig.4 Architecture of the SSD-ResNet18 model
The loss function used to train SSD-ResNet18 consists of a confidence loss L_conf and a localization loss L_loc.
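As defined in the original SSD paper [25], which SSD-ResNet18 inherits, the combined objective is

```latex
L(x, c, l, g) = \frac{1}{N}\Big( L_{conf}(x, c) + \alpha\, L_{loc}(x, l, g) \Big)
```

where N is the number of default boxes matched to ground truth, L_conf is the softmax confidence loss over the class scores c, L_loc is the Smooth L1 loss between the predicted box parameters l and the ground-truth boxes g, and α is a balancing weight (set to 1 in [25]).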
This paper uses mAP as the overall detection accuracy metric of the model and AP as the per-class accuracy metric. Both depend on precision (P) and recall (R), computed as

P = TP / (TP + FP)
R = TP / (TP + FN)

where TP is the number of samples correctly assigned to the positive class, FP the number wrongly assigned to the positive class, and FN the number wrongly assigned to the negative class.
Plotting the computed precision against recall, with recall on the horizontal axis and precision on the vertical axis, gives the precision-recall curve of a class; AP is the area under this curve:

AP = ∫₀¹ P(R) dR
mAP is the mean of the AP values over all classes:

mAP = (1/n) Σ AP(i), i = 1, …, n

where n is the number of classes and AP(i) is the AP of class i.
Detection speed is evaluated by the average time, in ms, that the model takes to detect one image.
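The accuracy metrics above can be computed as in this minimal sketch; trapezoidal integration stands in for the exact AP integration, whose details vary between benchmarks:

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from detection counts."""
    return tp / (tp + fp), tp / (tp + fn)

def average_precision(recalls, precisions):
    """Approximate the area under a precision-recall curve with the
    trapezoidal rule; points must be sorted by increasing recall."""
    ap = 0.0
    for i in range(1, len(recalls)):
        ap += (recalls[i] - recalls[i - 1]) * (precisions[i] + precisions[i - 1]) / 2
    return ap

def mean_average_precision(aps):
    """mAP is the mean of the per-class AP values."""
    return sum(aps) / len(aps)
```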
To speed up convergence and improve training, every model in this paper loads parameters pre-trained on ImageNet. Training uses the SGD (stochastic gradient descent) optimizer [29] with a batch size of 12, momentum of 0.9, an initial learning rate of 0.001, cosine learning-rate decay [29] as the scheduler, a weight decay of 0.000 1, and 50 epochs. Because cosine decay drives the learning rate to 0 by the end of training, the loss keeps falling throughout, and the model's accuracy levels off in the late stage; the model saved at the end of training, i.e. after the 50th epoch, is therefore used for further evaluation on the test set. Fig.5 shows the training loss and test-set mAP of the original SSD and SSD-ResNet18 during training.
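The cosine decay referred to above follows the SGDR form [29]; this is a per-epoch sketch (in practice the rate may be updated per iteration), assuming the paper's 50 epochs and 0.001 initial rate:

```python
import math

def cosine_lr(epoch, total_epochs=50, base_lr=0.001):
    """Cosine learning-rate decay: starts at base_lr and reaches 0
    at the final epoch."""
    return 0.5 * base_lr * (1 + math.cos(math.pi * epoch / total_epochs))

schedule = [cosine_lr(e) for e in range(51)]
```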
Fig.5 Training loss and test-set mAP of the original SSD and SSD-ResNet18
As discussed above, the C3 feature map is not necessarily suited to detecting the citrus dataset with SSD-ResNet18. Table 2 compares results with the C3, C4, C5 feature maps against C4, C5 only. The model without C3 achieves an mAP 4.2 percentage points higher and an average detection time 1 ms shorter. A likely explanation is that once anchors matching citrus size are assigned, some fruits are routed to C3 for detection, but its effective receptive field of only 32 pixels cannot capture the features of a whole fruit, causing misjudgments; removing C3 therefore clearly raises mAP, and the reduced computation also brings a modest gain in detection speed.
Table 2 Results with different feature maps
Choosing a suitable training resolution helps the model's classification and detection performance. Five resolutions were compared: 512×512, 640×640, 768×768, 896×896 and 1 024×1 024 pixels, with results in Table 3. Raising the resolution from 512×512 to 768×768 pixels increases the average detection time by 7.33 ms but raises mAP by a substantial 2.28 percentage points. From 768×768 to 1 024×1 024 pixels, mAP merely fluctuates around 88% while the average detection time grows by another 10.04 ms, so further increases in resolution bring no benefit. Overall, 768×768 pixels suits the model best.
The test set was then run on both the SSD-ResNet18 model (768×768 pixels, C4 and C5 feature maps) and the original SSD; the per-class results are given in Table 4. The two models have similar mAP, but SSD-ResNet18 needs only about one fifth of the original SSD's detection time, and its AP on mechanically damaged citrus is 1.56 percentage points higher, indicating improved detection of that class. Both models recognize normal citrus best and mechanically damaged citrus worst, which is probably related to the number of samples of each class in the dataset. Increasing the data for the skin-lesion and mechanical-damage classes, or tracking individual fruits so that peel information from all faces can be combined in the decision, might further improve recognition, especially of lesioned and damaged fruits.
Table 3 Classification and detection results at different resolutions
Table 4 Classification and detection results of the original SSD and SSD-ResNet18
Note: N: Normal; SL: Skin lesions; MD: Mechanical damage
Fig.6 shows concrete detection results of the two models. Both localize citrus very accurately, with no missed fruits. Their classification performance is also similar, with a high classification accuracy, and the error rate does not rise as the number of fruits in an image grows, indicating that both models classify well.
Fig.6 Detection results: a. Original image; b. Manual annotation; c. Original SSD; d. SSD-ResNet18
MobileNetV3 [30], ESPNetV2 [31] and VoVNet39 [32] are among the best current feature extraction networks. MobileNetV3 and ESPNetV2 are lightweight networks with few parameters, suited to mobile devices; VoVNet39 is a heavyweight network with many layers and parameters that performs well on difficult classification and detection problems. Keeping the rest of the model unchanged, each of these three backbones was swapped in and compared with ResNet18, with results in Table 5. The mAP differences among the four backbones are very small, so their detection accuracy is comparable, but ResNet18 is the fastest: 10.52 ms faster than MobileNetV3, 16.78 ms faster than ESPNetV2 and 36.76 ms faster than VoVNet39, giving it a clear advantage for real-time detection.
Table 5 Comparison of the four feature extraction networks
This paper proposed a real-time classification and detection method for citrus production lines based on the SSD-ResNet18 deep learning model, distinguishing normal, skin-lesion and mechanically damaged citrus. ResNet18, with its low floating-point cost, was chosen as the backbone to speed up detection; predictions use the C4 and C5 feature maps; and the dataset resolution was set to 768×768 pixels, raising classification and detection accuracy. The final model reaches an mAP of 87.89% with an average detection time of 20.72 ms. Compared with the original SSD, its accuracy is similar but its average detection time is 88.11 ms shorter, a marked speedup. With high accuracy and much faster detection, the model offers a reference for automated sorting of surface-defective citrus on production lines.
[1] Ying Yibin, Rao Xiuqin, Zhao Yun, et al. Application of machine vision technique to quality automatic identification of agricultural products (Ⅰ)[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2000, 16(1): 103-108. (in Chinese with English abstract)
[2] Li Jiangbo, Rao Xiuqin, Ying Yibin. Detection of navel surface defects based on illumination-reflectance model[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2011, 27(7): 338-342. (in Chinese with English abstract)
[3] Zhao Jiewen, Liu Shaopeng, Zou Xiaobo, et al. Recognition of defect Chinese dates by machine vision and support vector machine[J]. Transactions of the Chinese Society for Agricultural Machinery, 2008, 39(3): 113-115. (in Chinese with English abstract)
[4] Zhang Hailiang, Gao Junfeng, He Yong. Nondestructive detection of citrus defection using hyper-spectra imaging technology[J]. Transactions of the Chinese Society for Agricultural Machinery, 2013, 44(9): 177-181. (in Chinese with English abstract)
[5] Dong C, Yang Y, Zhang J, et al. Detection of thrips defect on green-peel citrus using hyperspectral imaging technology combining PCA and B-spline lighting correction method[J]. Journal of Integrative Agriculture, 2014, 13(10): 2229-2235.
[6]Zou X B, Zhao J W, Li Y X, et al. In-line detection of apple defects using three color cameras system[J]. Computers and Electronics in Agriculture, 2010, 70(1): 129-134.
[7] Sharif M, Khan M A, Iqbal Z, et al. Detection and classification of citrus diseases in agriculture based on optimized weighted segmentation and feature selection[J]. Computers and Electronics in Agriculture, 2018, 150: 220-234.
[8]Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440.
[9]Kestur R, Meduri A, Narasipura O. MangoNet: A deep semantic segmentation architecture for a method to detect and count mangoes in an open orchard[J]. Engineering Applications of Artificial Intelligence, 2019, 77: 59-69.
[10]Li Y, Cao Z, Lu H, et al. In-field cotton detection via region-based semantic image segmentation[J]. Computers and Electronics in Agriculture, 2016, 127: 475-486.
[11]Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[C]//Advances in Neural Information Processing Systems. 2015: 91-99.
[12]Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 779-788.
[13] Sun Zhe, Zhang Chunlong, Ge Luzhen, et al. Image detection method for broccoli seedlings in field based on Faster R-CNN[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(7): 216-221. (in Chinese with English abstract)
[14] Liu Hui, Zhang Lishuai, Shen Yue, et al. Real-time pedestrian detection in orchard based on improved SSD[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(4): 29-35. (in Chinese with English abstract)
[15] Bi Song, Gao Feng, Chen Junwen, et al. Detection method of citrus based on deep convolution neural network[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(5): 181-186. (in Chinese with English abstract)
[16]Tian Yunong, Yang Guodong, Wang Zhe, et al. Detection of apple lesions in orchards based on deep learning methods of CycleGAN and YOLOV3-Dense[J]. Journal of Sensors, 2019: 1-13.
[17] Peng Hongxing, Huang Bo, Shao Yuanyuan, et al. General improved SSD model for picking object recognition of multiple fruits in natural environment[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(16): 155-162. (in Chinese with English abstract)
[18]He K, Gkioxari G, Dollár P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 2961-2969.
[19]Qiao Y, Truman M, Sukkarieh S. Cattle segmentation and contour extraction based on Mask R-CNN for precision livestock farming[J]. Computers and Electronics in Agriculture, 2019, 165: 1-9.
[20] Gao Yun, Guo Jiliang, Li Xuan, et al. Instance-level segmentation method for group pig images based on deep learning[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(4): 186-194. (in Chinese with English abstract)
[21] Zhao Dean, Wu Rendi, Liu Xiaoyang, et al. Apple positioning based on YOLO deep convolutional neural network for picking robot in complex background[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(3): 164-173. (in Chinese with English abstract)
[22] Wang Dandan, He Dongjian. Recognition of apple targets before fruits thinning by robot based on R-FCN deep convolution neural network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(3): 156-163. (in Chinese with English abstract)
[23]Dias P A, Tabb A, Medeiros H. Apple flower detection using deep convolutional networks[J]. Computers in Industry, 2018, 99: 17-28.
[24] Tian Y, Yang G, Wang Z, et al. Apple detection during different growth stages in orchards using the improved YOLO-V3 model[J]. Computers and Electronics in Agriculture, 2019, 157: 417-426.
[25]Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector[C]//European Conference on Computer Vision. 2016: 21-37.
[26] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.
[27]He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770-778.
[28] Zhang S, Zhu X, Lei Z, et al. S3FD: Single shot scale-invariant face detector[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 192-201.
[29] Loshchilov I, Hutter F. SGDR: Stochastic gradient descent with warm restarts[J]. arXiv preprint arXiv:1608.03983, 2016.
[30] Howard A, Sandler M, Chu G, et al. Searching for MobileNetV3[J]. arXiv preprint arXiv:1905.02244, 2019.
[31] Mehta S, Rastegari M, Shapiro L, et al. ESPNetv2: A light-weight, power efficient, and general purpose convolutional neural network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 9190-9200.
[32] Lee Y, Hwang J, Lee S, et al. An energy and GPU-computation efficient backbone network for real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2019.
Real-time classification and detection of citrus based on improved single shot multibox detector
Li Shanjun1,2,3,4,5, Hu Dingyi1,2, Gao Shumin1,2, Lin Jiahao1,2, An Xiaosong1,2, Zhu Ming1,2※
(1. College of Engineering, Huazhong Agricultural University, Wuhan 430070, China; 2. Key Laboratory of Agricultural Equipment in Mid-lower Yangtze River, Ministry of Agriculture and Rural Affairs, Wuhan 430070, China; 3. National Modern Agricultural (Citrus) Industry Technology System, Wuhan 430070, China; 4. National R&D Center for Citrus Preservation Technology, Wuhan 430070, China; 5. Citrus Mechanization Research Base, Ministry of Agriculture and Rural Affairs, Wuhan 430070, China)
Manually classifying citrus by its surface defects is tedious and time-consuming, and a new real-time method is proposed in this paper based on an improved SSD deep learning model. On a modified waxing-machine test bench, 2 500 images were taken, each containing multiple citrus fruits, of which 2 000 were randomly selected as the training set and 500 as the test set. The dataset contained 19 507 normal, 9 097 skin-defect and 4 327 mechanically damaged citrus fruits. Considering that traditional methods using near-infrared spectra, support vector machines, and HSV or RGB color space models are inefficient at detecting surface defects of citrus and can only identify one fruit at a time, we proposed an improved method that processes each image with the one-stage detection model SSD-ResNet18. The method first obtains feature maps through the backbone, then predicts a large number of bounding boxes from the feature maps, and finally determines the location and category of each citrus using confidence scores and non-maximum suppression, so a batch of citrus can be detected at once. We used the mAP (mean average precision) as the accuracy index and the mean detection time as the speed index. Optimization used the SGD (stochastic gradient descent) algorithm with a cosine-decay learning-rate scheduler, which drives the learning rate to 0 at the end of training; the loss therefore declines throughout, the model is stable at the end of training, and the final saved model is used for evaluation. While the original SSD uses VGG16 as its backbone, VGG16 has a huge number of parameters and is computationally inefficient; we replaced it with ResNet18, which requires roughly 1/10 of the floating-point operations of VGG16.
An improved feature-map configuration was obtained by analyzing the effective receptive fields of the feature maps against the size of citrus in the images, with anchors obtained by k-means clustering of the manually labeled boxes. A suitable input resolution was chosen by comparing five resolutions: 512×512, 640×640, 768×768, 896×896 and 1 024×1 024 pixels. The results showed that the mAP of SSD-ResNet18 was 87.89%, 0.34 percentage points higher than the original SSD's 87.55%. The average detection time of SSD-ResNet18 was 20.72 ms, roughly one fifth of the original SSD's 108.83 ms. The per-class AP of SSD-ResNet18 was 94.72%, 85.79% and 83.17% for normal, skin-lesion and mechanically damaged citrus, respectively. We compared MobileNetV3, ESPNetV2, VoVNet39 and ResNet18 as backbones and found no significant difference in accuracy, but ResNet18's detection time was 10.52 ms, 16.78 ms and 36.76 ms shorter than MobileNetV3, ESPNetV2 and VoVNet39, respectively. The proposed method meets the speed requirement of real-time citrus production lines and can effectively classify and detect multiple citrus fruits simultaneously.
object recognition; models; nondestructive detection; citrus; surface defects; deep learning; SSD; ResNet18
Li Shanjun, Hu Dingyi, Gao Shumin, Lin Jiahao, An Xiaosong, Zhu Ming. Real-time classification and detection of citrus based on improved single short multibox detecter[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(24): 307-313. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.2019.24.036 http://www.tcsae.org
Received: 2019-10-06
Revised: 2019-12-07
Funding: Earmarked Fund for the China Agriculture Research System (Citrus) (CARS-26); National Key R&D Program of China (2017YFD0202001); Citrus Whole-Process Mechanization Research Base Construction Project (農(nóng)計(jì)發(fā)[2017]19號(hào)); Hubei Agricultural Science and Technology Innovation Action Project
Li Shanjun, associate professor, Ph.D., engaged in research on fruit production mechanization technology and intelligent equipment. Email: shanjunlee@163.com
※Corresponding author: Zhu Ming, research fellow, doctoral supervisor, engaged in research on agricultural product processing and intelligent agricultural equipment. Email: 13801392760@163.com
10.11975/j.issn.1002-6819.2019.24.036
TP391.4
A
1002-6819(2019)-24-0307-07