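A fine-grained distillation model for apple leaf disease identification
LI Daxiang, HUA Cuiyun, LIU Ying
(School of Communications and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China)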
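To improve the accuracy of lightweight convolutional neural networks (CNN) in apple leaf disease identification and make them better suited to deployment on smart-agriculture mobile terminals, this study designed a fine-grained knowledge distillation (FGKD) model. First, contextual information and spatial-semantic relationships were used to design a contextual spatial attention (SA) module and a fine-grained feature extraction (FGFE) module, respectively, which were embedded into ResNet50 and a purpose-designed lightweight CNN to serve as the teacher and student networks. Then, SA and FGFE knowledge distillation loss functions were constructed to transfer the teacher network's feature extraction and fine-grained knowledge representation abilities to the student network, strengthening its local feature extraction and high-level semantic expression for apple leaf disease images, so that the lightweight student network approaches the performance of the complex teacher network with very few parameters. Comparative experiments on a standard apple leaf disease dataset showed that the distilled student network reached an accuracy of 98.60% with a parameter size of only 0.75 MB and an average inference time of 25.51 ms, which meets the requirements of practical smart-agriculture mobile terminals and enables fast, accurate automatic identification of apple leaf diseases.
Keywords: computer vision; image processing; apple tree leaf disease identification; fine-grained knowledge distillation; contextual spatial attention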
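Apple trees are readily affected by weather, environment, and microorganisms during growth and develop various diseases as a result. Leaves are where symptoms appear most often. Because the diseased regions are small and the symptoms of different diseases resemble one another, growers relying only on visual inspection and experience cannot diagnose the disease type in time, which causes heavy losses. Developing recognition algorithms for apple leaf diseases based on computer vision is therefore an important means of ensuring efficient and sustainable apple production[1].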
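In recent years, deep learning has produced a series of results in crop disease identification because it extracts disease features automatically and avoids reliance on hand-crafted features. SHIN et al.[2] used six different deep learning models to detect powdery mildew in a strawberry dataset. ZHANG Shanwen et al.[3] proposed a cucumber diseased-leaf segmentation method based on a multi-scale fusion convolutional neural network (CNN), with an average segmentation accuracy of 93.12%. AGARWAL et al.[4] proposed a CNN-based tomato leaf disease detection method. LI Zimao et al.[5] proposed SE-DenseNet-FL, a transfer-learning-based tea disease identification method that uses SE-Net and Focal Loss and reaches 92.66% accuracy with small and imbalanced samples. LI Daxiang et al.[6] proposed interactively coupling global and local features to improve feature extraction for apple leaf disease images, reaching 98.23% accuracy. HU et al.[7] proposed a CNN algorithm for corn leaf disease identification that augments the training set and applies transfer learning; the optimized CNN reached an average accuracy of 97.6% on a subset of the Plant Village dataset containing four classes of corn leaves. Although deep learning methods achieve good accuracy on specific disease identification tasks, they have many parameters, heavy computation, and complex models, which limits their practicality.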
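To address mobile deployment of deep learning models, researchers have proposed various lightweight architectures that provide strong technical support for deploying crop disease identification methods. SEMBIRING et al.[8] proposed a lightweight CNN for classifying diseases in tomato leaf images. DURMUS et al.[9] used SqueezeNet to detect tomato leaf diseases. BIR et al.[10] used a pretrained EfficientNet-B0 for tomato leaf disease identification and achieved accuracy comparable to the state of the art while keeping model size and computation low. WANG Chunshan et al.[11] built on ResNet18 by adding a multi-scale feature extraction module that changes how the residual layers are connected, decomposing large convolution kernels, and using group convolution; the resulting Multi-scale ResNet model markedly reduced parameters and storage, achieving 95.95% accuracy on Plant Village and 93.05% on a self-collected dataset of seven diseases in real environments. LIU et al.[12] proposed a new CNN structure for apple leaf disease identification, formed by cascading an AlexNet-precursor network with an Inception network; replacing the fully connected layers of the traditional AlexNet with Inception modules markedly reduced the number of trainable parameters and thus the storage requirement. However, these CNN models directly apply or modify existing lightweight convolutional networks and are not further optimized for the fine-grained nature of apple leaf diseases, namely "small inter-class variance and large intra-class variance"[13-14].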
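To capture both the coarse-grained global features and the fine-grained local lesions of apple leaf diseases, existing methods usually adopt multiple CNN branches or attention modules[15-16], which add parameters and do not meet the requirements of smart agriculture for mobile deployment. This paper proposes a fine-grained knowledge distillation (FGKD) model for apple leaf disease identification. Contextual information and spatial-semantic relationships are used to design a spatial attention module and a fine-grained feature extraction module, respectively, which are embedded in both the teacher network and the purpose-designed lightweight student network. The teacher network then guides the student's learning so that the lightweight student, with very few parameters, attends more closely to the diseased regions of the leaf and extracts more discriminative fine-grained features, improving identification accuracy.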
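The experimental data come from Plant Village[17] and from images collected by Northwest A&F University[18]: 1) Plant Village: this plant disease dataset contains 54 305 leaf images of 14 plant species covering 38 common disease classes, all taken against a uniform background under laboratory conditions; only the four apple leaf classes (scab, black rot, cedar apple rust, and healthy) are used here. 2) Northwest A&F University apple disease dataset: most images were acquired on sunny days in good light and some on rainy days, and the varied acquisition conditions further increase the diversity of the dataset; it contains five apple leaf diseases (mosaic, rust, grey spot, Alternaria leaf spot, and brown spot).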
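To ensure effective training and avoid overfitting caused by insufficient training data, the image sets were augmented: each original image was expanded using random flipping, scaling, and brightness transformations. Details of the image sets before and after augmentation are given in Table 1.
Table 1 Dataset information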
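The augmentation step can be illustrated with a minimal sketch. It is not the authors' pipeline: images are represented as nested lists of grey values for brevity, the scale transform is omitted, and the function names, the 0.5 flip probability, and the 0.8 to 1.2 brightness range are all assumptions chosen for illustration.

```python
import random

def hflip(img):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in img]

def vflip(img):
    """Mirror an image top to bottom."""
    return img[::-1]

def adjust_brightness(img, factor):
    """Scale every pixel by `factor` and clip to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

def augment(img, rng):
    """Apply random flips and a random brightness change (assumed ranges)."""
    out = img
    if rng.random() < 0.5:          # assumed flip probability
        out = hflip(out)
    if rng.random() < 0.5:
        out = vflip(out)
    return adjust_brightness(out, rng.uniform(0.8, 1.2))  # assumed range

image = [[10, 20, 30],
         [40, 50, 60]]
augmented = augment(image, random.Random(0))
```

Each call to `augment` yields one additional training image of the same size as the original, so repeated calls expand every source image into several variants.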
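1) Accuracy
Accuracy is the proportion of correctly classified samples among all samples[11].
2) Confusion matrix
A confusion matrix is a table summarizing the predictions of a classification model; it tallies the records in a dataset in matrix form by true class and predicted class[7].
3) Number of parameters
The number of parameters is an important indicator for deep learning algorithms and corresponds to space complexity; fewer parameters mean lower memory consumption. In a deep CNN, the parameters lie mainly in the convolutional and fully connected layers[12].
4) Average inference time
In deep learning, inference is one forward pass of the neural network, that is, feeding input data to the network and obtaining the output. Average inference time is used to judge whether a model is suitable for deployment on mobile devices[12].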
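The first two metrics can be computed directly from label lists. The sketch below is illustrative only; the function names and the toy labels are assumptions, not data from the experiments.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts samples whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def accuracy(cm):
    """Correctly classified samples (the diagonal) divided by all samples."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

# Toy labels for three classes (made up for illustration).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
acc = accuracy(cm)
```

Accuracy is thus a single summary of the matrix, while the off-diagonal cells show which pairs of classes are confused.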
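This section presents the fine-grained distillation model (FGKD) for apple leaf disease identification, shown in Fig.1. The model has three main parts: a complex teacher network, a lightweight student network, and a distillation function module. The distillation functions make the teacher network teach the student network how to extract fine-grained features, so that the student, with a parameter count suitable for mobile devices, attains local feature extraction and high-level semantic expression abilities comparable to the teacher's and thus higher disease identification accuracy.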
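In a knowledge distillation model, the teacher network guides the training of the student network, so its performance directly affects the accuracy of the whole model. As shown for the teacher network in Fig.1, this study uses contextual information and local semantic relationships to design a spatial attention (SA) module and a fine-grained feature extraction (FGFE) module, which are added after the first and fifth convolution blocks of ResNet50[19], respectively, to form the teacher network. Specifically, the SA module is placed after the first convolution block so that the teacher's early convolutions focus on the lesion regions of the apple leaf image, and the FGFE module is placed after the fifth convolution block to strengthen the FGKD model's high-level semantic expression of apple leaf disease images.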
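2.1.1 SA module based on contextual information
Discriminative lesions are usually distributed over local regions of the apple leaf image, scattered and irregular in shape. To make the teacher network's early convolutions focus on these lesion regions, the most common approach is to generate the SA map with max and average pooling[20-21]. Applied to apple leaf disease identification, these methods have two drawbacks: the attention is generated only from the maximum or average value along the channel dimension, ignoring the contextual information of neighbouring pixels, and it does not adapt to rotation of the lesion region.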
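The pooling-based attention criticized here can be sketched as follows, to make the limitation concrete. This is a simplified illustration in the style of CBAM[21], not the SA module proposed in this paper: CBAM mixes the two pooled maps with a learned convolution, which is replaced here by a fixed average (an assumption made so that the example needs no trained weights). The function names are hypothetical.

```python
import math

def channel_pool(feature):
    """Collapse feature[c][y][x] along the channel axis into max and mean maps."""
    C, H, W = len(feature), len(feature[0]), len(feature[0][0])
    max_map = [[max(feature[c][y][x] for c in range(C)) for x in range(W)]
               for y in range(H)]
    avg_map = [[sum(feature[c][y][x] for c in range(C)) / C for x in range(W)]
               for y in range(H)]
    return max_map, avg_map

def pooled_attention(feature):
    """Sigmoid of the averaged pooled maps (assumed stand-in for a learned conv)."""
    max_map, avg_map = channel_pool(feature)
    return [[1.0 / (1.0 + math.exp(-0.5 * (m + a))) for m, a in zip(mr, ar)]
            for mr, ar in zip(max_map, avg_map)]

# Toy feature map with C=2 channels and a 2x2 spatial grid.
feature = [[[1, 2], [3, 4]],
           [[4, 3], [2, 1]]]
max_map, avg_map = channel_pool(feature)
attention = pooled_attention(feature)
```

Each value in `max_map` and `avg_map` depends only on the channel vector at that one pixel, which is the lack of neighbourhood context noted above.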
Fig.1 Schematic diagram of the fine-grained distillation model (FGKD) for apple leaf disease identification
Note: F is the feature map input to the SA module; W, H, and C are its width, height, and number of channels; F1 is the feature map after the 1×1 convolution; S(i, j) is the context-based similarity matrix; α is the SA weight matrix; the output is the feature map after SA.
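In summary, the SA weighting process reweights the input feature map F with the SA weight matrix α to produce the feature map after SA.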
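2.1.2 FGFE module based on spatial-semantic relationships
Different apple diseases differ only subtly in some local part of the leaf and show "high intra-class variance and low inter-class variance", which makes this a typical fine-grained image recognition problem. This section therefore designs an FGFE module based on spatial-semantic relationships, as shown in Fig.3.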
Note: The input is the feature map output by the fifth residual module; W', H', and C' are its width, height, and number of channels; Global feature denotes the global feature; x and y are the horizontal and vertical axes; vL denotes the local features, from which the salient features are selected; SR is the spatial dependency, CR the channel dependency, and SCR the spatial-semantic relationship; the output is the fused fine-grained feature; MP is max pooling.
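1) Salient feature selection
To select discriminative local regions from the feature map and obtain the fine-grained features they contain, the data at each spatial position of the feature map are extracted along the channel dimension; each such vector is called a "local feature" and denoted vL.
2) FGFE method aggregating spatial-semantic relationships
The "spatial-semantic relationship" SCR is defined from the spatial dependency SR and the semantic dependency CR.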
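The local-feature readout of item 1) can be sketched as below. The readout itself follows the description in the text; the selection criterion does not. Ranking positions by squared L2 norm is an assumption made only to complete the example, since the paper's own salience measure is not recoverable from this text, and the function names are hypothetical.

```python
def local_features(feature):
    """Map each position (x, y) of feature[c][y][x] to its channel vector vL."""
    C, H, W = len(feature), len(feature[0]), len(feature[0][0])
    return {(x, y): [feature[c][y][x] for c in range(C)]
            for y in range(H) for x in range(W)}

def top_k_salient(feats, k):
    """Return the k positions with the largest squared L2 norm (assumed criterion)."""
    ranked = sorted(feats.items(),
                    key=lambda item: sum(v * v for v in item[1]),
                    reverse=True)
    return [pos for pos, _ in ranked[:k]]

# Toy feature map with C'=2 channels and a 2x2 spatial grid.
feature = [[[1, 0], [0, 3]],
           [[2, 0], [0, 4]]]
feats = local_features(feature)
salient = top_k_salient(feats, 1)
```

A feature map of size W'×H'×C' therefore yields W'·H' local features, each a C'-dimensional vector.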
2.2.1 Student network design
The complex teacher network is accurate but cannot be loaded onto mobile devices. A lightweight student network with five convolution modules was therefore designed with reference to the ResNet18 architecture[19], as listed in Table 2. To further reduce parameters, the ordinary convolutions in all modules except the first were replaced with depthwise separable convolutions (DSC)[22]. The same SA and FGFE modules as in the teacher network were added after the first and fifth convolution modules, respectively, so that the student keeps its lightweight structure while distilling the teacher's feature extraction and high-level semantic representation abilities.
Table 2 Student network design
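The parameter saving from depthwise separable convolution can be checked with a short calculation. The sketch ignores bias terms, and the 3×3 kernel with 64 input and 128 output channels is an assumed example layer, not a row of Table 2.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def dsc_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Assumed example layer: 3x3 kernel, 64 -> 128 channels.
standard = conv_params(3, 64, 128)
separable = dsc_params(3, 64, 128)
ratio = separable / standard      # equals 1/c_out + 1/k**2
```

For this layer the separable form needs 8 768 weights instead of 73 728, about 12% of the original. The saving applies only to the convolutions that are replaced, so the reduction for a whole network is smaller.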
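2.2.2 Knowledge distillation design
Knowledge distillation is an effective model compression method first proposed by HINTON et al.[23]. It transfers knowledge from a complex teacher network to a lightweight student network, which reduces the dependence of the student's performance on model complexity and lowers training and deployment costs, making CNN models easier to implement and deploy. ROMERO et al.[24] extended knowledge distillation by using both the output-layer and intermediate-layer features of the teacher as guidance for the student. YIM et al.[25] treated the distilled knowledge as a flow of solving a problem, computed as the inner product between features of different layers; this lets the student learn faster, can make it outperform the teacher, and is suitable for transfer learning. In addition, ZAGORUYKO et al.[26] introduced attention, ZHOU et al.[27] proposed weight sharing between the teacher and student networks, and AHN et al.[28] introduced mutual-information-based distillation. To transfer the SA and fine-grained knowledge of the teacher network to the student network, this section designs two knowledge distillation functions:
1) SA knowledge distillation
2) Fine-grained knowledge distillation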
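For orientation, the classic soft-target distillation loss of HINTON et al.[23] can be sketched as follows. This is not the SA or FGFE distillation function of this paper, whose formulas are not reproduced in this text; it only shows the general idea of combining a label loss with a teacher-matching term. The temperature T = 4 and weight lam = 0.5 are assumed values.

```python
import math

def softmax(logits, T=1.0):
    """Softmax with temperature T; subtracting the max keeps exp() stable."""
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kd_loss(student_logits, teacher_logits, label, T=4.0, lam=0.5):
    """(1 - lam) * cross-entropy + lam * T^2 * KL(teacher || student)."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s))
    ce = -math.log(softmax(student_logits)[label])
    return (1 - lam) * ce + lam * T * T * kl

same = kd_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], label=2, lam=1.0)
diff = kd_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0], label=2, lam=1.0)
```

The teacher-matching term vanishes when the student reproduces the teacher's softened outputs and grows as the two distributions diverge. Feature-level distillation such as the SA and FGFE losses follows the same pattern with intermediate features in place of logits.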
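Algorithm: Training and testing of the student network with knowledge distillation
Preprocessing: split the images in the image set into batches of size Q;
Step 1: Train the teacher network
For epoch in Epochs:
3) Update the learning rate l with the cosine decay strategy.
End for
Step 2: Train the student network
For epoch in Epochs:
7) Update the learning rate l with the cosine decay strategy.
End for
Step 3: Identify the test images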
The software and hardware configuration used in the experiments is listed in Table 3. All images were first resized to 224×224 to match the model input, and the dataset was then randomly split into training and test sets at a ratio of 8:2. The Adam optimizer was used with an initial learning rate l of 0.001, updated by cosine annealing decay; the batch size Q was 32 and the number of epochs was 300.
Table 3 Experimental platform configuration
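The cosine annealing schedule can be written out explicitly. The sketch uses the stated initial learning rate of 0.001 and 300 epochs; the minimum learning rate of 0 and the function name are assumptions, since the text does not give them.

```python
import math

def cosine_lr(epoch, total_epochs=300, lr0=1e-3, lr_min=0.0):
    """Cosine annealing: lr falls from lr0 to lr_min along half a cosine period."""
    cos_term = 1 + math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr0 - lr_min) * cos_term

schedule = [cosine_lr(e) for e in range(301)]
```

The rate starts at 0.001, passes through half that value at epoch 150, and reaches the minimum at epoch 300, decreasing monotonically throughout.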
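To verify the effectiveness of the proposed FGKD model, comparative experiments were conducted on the apple leaf disease datasets described above against recent disease identification algorithms and classic CNN models, covering deep learning, fine-grained recognition, and lightweight network methods. All networks were pretrained on the Plant Village dataset, and the parameters were then transferred to the Northwest A&F University dataset for the experiments. Identification accuracy and parameter counts are given in Table 4.
Table 4 Comparison of model results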
Table 4 shows that the student network of the proposed FGKD model has 0.75 M parameters, fewer than every other model and therefore the lowest model complexity. Compared with the classic MobileNet V3[33] and ShuffleNet V2[34], its parameter count is lower by 50.66% and 40.48%, respectively, and it is 97.10% lower than that of the teacher network. In accuracy, the proposed FGKD model reaches 98.60%, the highest of all methods except the teacher network. The FGKD model is therefore effective for apple disease identification: with very few parameters it approaches the teacher's accuracy and is better suited to resource-constrained agricultural IoT terminals. Fig.4 shows the loss curves over iterations 1~70 of training.
Fig.4 Loss curves during model training
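The relative reductions quoted for Table 4 follow the usual formula (baseline − new) / baseline. The sketch below applies it in both directions; the baseline sizes it prints are back-calculated from the reported percentages and the 0.75 M student size, so they are implied values rather than figures read from Table 4. The function names are hypothetical.

```python
def reduction_pct(baseline, new):
    """Percentage by which `new` is smaller than `baseline`."""
    return 100.0 * (baseline - new) / baseline

def implied_baseline(new, pct):
    """Baseline size that gives a `pct` percent reduction down to `new`."""
    return new / (1.0 - pct / 100.0)

student = 0.75  # M parameters, as reported
reported = {"MobileNet V3": 50.66, "ShuffleNet V2": 40.48, "Teacher": 97.10}
implied = {name: implied_baseline(student, pct) for name, pct in reported.items()}
for name, size in implied.items():
    print(f"{name}: about {size:.2f} M parameters implied")
```

The same two functions apply unchanged to the inference-time reductions, where the quantity compared is milliseconds rather than parameters.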
The two loss curves in Fig.4 show that the teacher model fluctuates strongly early in training, with its loss levelling off as training proceeds and converging after about 117 epochs. The student network of the FGKD model fluctuates little and converges after about 60 epochs, faster than the teacher. This is mainly because the two distillation losses transfer the trained teacher's feature extraction and fine-grained knowledge representation abilities to the student, which improves the student's training efficiency.
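To examine the inference time of the proposed FGKD model, the perf_counter function was used to record, every 10 iterations over iterations 1~100, the average inference time (ms) of several CNN models on apple leaf disease identification, as shown in Fig.5.
Fig.5 Average inference time of different models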
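A timing loop of this kind can be sketched with `time.perf_counter` from the Python standard library. The `dummy_model` function is a hypothetical stand-in for a network's forward pass, and the two warm-up calls are an assumption; the authors' measurement code is not given in the text.

```python
import time

def average_inference_ms(predict, inputs, warmup=2):
    """Mean wall-clock time per call to `predict`, in milliseconds."""
    for x in inputs[:warmup]:       # warm-up calls are not timed
        predict(x)
    start = time.perf_counter()
    for x in inputs:
        predict(x)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / len(inputs)

def dummy_model(x):
    """Hypothetical stand-in for a forward pass."""
    return sum(v * v for v in x)

inputs = [[float(i)] * 1000 for i in range(50)]
ms = average_inference_ms(dummy_model, inputs)
```

Averaging over many calls smooths out scheduling noise, which is why the reported figure is a mean rather than a single measurement.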
The curves in Fig.5 show that the student network of the proposed FGKD model has an average inference time of about 25.51 ms, lower than every other model, and that it is stable overall and meets real-time requirements. Its average inference time is 60.66% lower than that of the classic ShuffleNet V2[34] and 72.65% lower than that of the teacher network. This is mainly because the FGFE module has low complexity and the student is a lightweight CNN (Table 4 shows that its parameter size is only 0.75 MB).
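To examine how the proposed FGKD model performs on each class of apple leaf disease, the confusion matrix for the five diseases is plotted in Fig.6. Each column corresponds to a predicted label, and the column sum is the proportion of images predicted as that class; each row corresponds to the true label of the test images, and the row sum is the proportion of images in that class. Diagonal cells give the rate of correct predictions, and off-diagonal cells give the misclassification rates between classes.
Fig.6 Confusion matrix of the apple disease model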
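A confusion matrix of counts can be converted to rates by normalizing each row, which is one common convention and is assumed here; the text does not state which normalization Fig.6 uses. The counts below are made up for illustration.

```python
def row_normalize(cm):
    """Divide each row by its sum so that cell (t, p) is the rate of class t predicted as p."""
    out = []
    for row in cm:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

def per_class_rate(cm):
    """Diagonal of the row-normalized matrix: the correct-prediction rate per class."""
    norm = row_normalize(cm)
    return [norm[i][i] for i in range(len(norm))]

# Toy counts for two classes (not experimental data).
counts = [[9, 1],
          [2, 8]]
rates = row_normalize(counts)
diagonal = per_class_rate(counts)
```

Under this convention every row sums to 1, so a class that is often mistaken for another shows up as a large off-diagonal entry in its row.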
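On the test set of the five apple diseases in Fig.6, the average identification accuracy reaches 98.60%. The classes most easily confused are Alternaria leaf spot and rust, because their lesions have similar texture and shape: Alternaria leaf spot produces brown, round necrotic spots on the leaf surface, whereas rust produces small orange-red dots, and the two are easily confused.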
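To verify that the SA module designed in the proposed FGKD model attends to local regions during apple leaf disease identification, the trained student network was visualized with Grad-CAM[35], as shown in Fig.7. Grad-CAM uses gradients to compute the importance of each neuron in the last convolutional layer, and the visualization shows which image regions are important for the model's classification decision; redder regions in the heat map are more discriminative and are the regions the model is most interested in when classifying the image. The heat maps in Fig.7 show that, compared with the classic MobileNet V3 and ViT networks, the FGKD student network focuses on more precise and complete regions of interest when identifying diseases against complex backgrounds. The main reasons are: the SA module lets the student network adapt to lesion regions; the FGFE module helps it extract more informative fine-grained regions; and SA and FGFE knowledge distillation transfers the teacher's feature extraction and fine-grained knowledge representation abilities to the student, strengthening its local feature extraction and high-level semantic expression for apple leaf disease images. The three components complement one another and bring the student's performance closer to the teacher's.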
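The Grad-CAM computation can be sketched on plain lists to show how the heat map is formed. The activations and gradients below are toy values; in practice they come from the last convolutional layer and from backpropagating the class score, and the function name is hypothetical.

```python
def grad_cam(activations, gradients):
    """Heat map from activations[k][y][x] and gradients[k][y][x] (d score / d activation)."""
    K, H, W = len(activations), len(activations[0]), len(activations[0][0])
    # Channel weight: global average of that channel's gradients.
    weights = [sum(sum(row) for row in gradients[k]) / (H * W) for k in range(K)]
    # Weighted sum over channels followed by ReLU.
    cam = [[max(0.0, sum(weights[k] * activations[k][y][x] for k in range(K)))
            for x in range(W)] for y in range(H)]
    peak = max(max(row) for row in cam)
    if peak > 0:                      # scale to 0..1 for display
        cam = [[v / peak for v in row] for row in cam]
    return cam

activations = [[[1, 0], [0, 2]],
               [[0, 3], [0, 0]]]
gradients = [[[1, 1], [1, 1]],
             [[-1, -1], [-1, -1]]]
heatmap = grad_cam(activations, gradients)
```

Positions where positively weighted channels are active receive high values and are drawn in red, while the ReLU discards positions that argue against the class.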
To verify the effectiveness of the SA and FGFE knowledge distillation modules in apple leaf disease identification, ablation experiments were run on the FGKD model, distilling different knowledge from the teacher to the student; the results are given in Table 5. "Baseline" is the student network trained with labels as the only supervision, that is, LSHTL() in Eq.(20) keeps only LSCE(); no distilled knowledge is used, and all five convolution modules use ordinary convolutions. "Student_1", "Student_2", "Student_3", and "Student_4" are student networks using "DSC", "DSC+FGFE", "DSC+SA", and "DSC+SA+FGFE", respectively, to guide training. "Teacher" is the teacher network trained in Step 1 of the Algorithm.
Note: The larger the weight, the more it contributes to the correct classification of apple leaf diseases.
The ablation results in Table 5 show that replacing ordinary convolutions with DSC in "Student_1" reduces the student's parameter count to 84.96% of "Baseline" with almost the same accuracy. Compared with "Student_1", the original student network without any distillation, "Student_3" with SA distillation improves average accuracy by 1.59 percentage points, and "Student_2" with FGFE distillation improves accuracy by 7.35 percentage points. The DSC, FGFE, and SA modules are therefore effective and improve disease classification accuracy. The main reasons are: the SA module combines contextual information with spatial attention, and the SA distillation function passes the teacher's SA knowledge to the student, improving its ability to extract local lesion information; the FGFE module aggregates the fine-grained features of each training image using spatial-semantic relationships during student training, and the FGFE distillation function transfers the teacher's FGFE knowledge to the student, strengthening its extraction and expression of high-level disease semantics. "Student_4", which uses both distillation modules, improves more than either module alone and reaches 98.60% accuracy, close to the teacher network (whose parameter count is far higher), confirming that the two distillation modules complement each other and are effective for apple disease identification.
Table 5 Parameter counts and accuracy of the models in the ablation experiments
Note: √ means the module was used in the experiment, × means the module was not used, * marks the comparison baseline, and ↑ indicates the amount of increase.
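To further verify the practical performance of the proposed FGKD model, an experimental dataset was built from images taken between July and October 2022 at the Qincui ("Qin crisp") apple orchard planting base in Yicun, Xiaoqiu Town, Yaozhou District, Tongchuan City, Shaanxi Province. Images were taken with a Huawei nova8 smartphone at a distance of 10~15 cm and a resolution of 2 268×4 032; 2 213 disease images were collected (535 brown spot, 669 scab, 515 mosaic, and 494 rust). The augmented dataset was randomly split into a training set (80%) and a test set (20%), and the model was trained and tested as described for the comprehensive experiments; the confusion matrix is shown in Fig.8.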
Fig.8 Example of disease in Qin crisp apple orchard
Fig.8 Confusion matrix for the practical-application validation of apple diseases
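The random 80/20 split can be sketched as follows. The example uses the 2 213 raw images only to make the arithmetic concrete, because the size of the augmented set is not stated; the fixed seed and the function name are assumptions.

```python
import random

def split_dataset(items, train_ratio=0.8, seed=0):
    """Shuffle a copy of `items` and cut it into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # seeded for a repeatable split
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

image_ids = range(2213)                     # stand-in for image file names
train_ids, test_ids = split_dataset(image_ids)
```

Every image lands in exactly one of the two parts, so the test accuracy is measured on images the model has not seen during training, provided that augmented copies of one source image are kept on the same side of the split.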
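The confusion matrix in Fig.8 shows that on the field-collected dataset the average identification accuracy on the test set of the four apple diseases reaches 98.38%. The classes most easily confused are brown spot and scab, because their lesions have similar colour and spot morphology: both are usually dark brown, and their spots usually consist of several near-circular textured structures joined into a lesion pattern.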
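To address the varied morphology and small area of apple leaf lesions, this study designed a fine-grained distillation model for apple leaf disease identification. Comparative experiments and analysis lead to the following conclusions:
1) The student network of the FGKD (fine-grained knowledge distillation) model has 0.75 M parameters, reaches 98.60% identification accuracy on the standard dataset, and has an average inference time of about 25.51 ms. Compared with other mainstream fine-grained and lightweight convolutional neural networks, the FGKD model identifies diseases more accurately with fewer parameters and a shorter inference time.
2) The SA module and SA distillation function, which combine contextual information with spatial attention, improve the model's ability to extract local lesion information. In the ablation experiments, "Student_3" with SA distillation improved average accuracy by 1.59 percentage points over "Student_1", the original student network without any distillation.
3) Aggregating fine-grained features through spatial-semantic relationships strengthens the extraction and expression of high-level disease semantics. In the ablation experiments, "Student_2" trained with FGFE improved accuracy by 7.35 percentage points over "Student_1".
In summary, the SA and FGFE knowledge distillation modules designed in this study transfer the teacher network's feature extraction and fine-grained knowledge representation abilities to the student network, so that the lightweight student approaches the complex teacher's performance with very few parameters and outperforms the other methods compared. This facilitates deployment on mobile and other small devices and improves the practicality of deep learning models for disease identification. Future work will focus on interpretability analysis of fine-grained disease identification against complex backgrounds, further reduce the parameter count and inference time, and extend the approach to other crop diseases to broaden the model's applicability.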
[1] SHAO Mingyue, ZHANG Jianhua, FENG Quan, et al. Research progress of deep learning in detection and recognition of plant leaf diseases[J]. Smart Agriculture, 2022, 4(1): 29-46. (in Chinese with English abstract)
[2] SHIN J, CHANG Y K, HEUNG B, et al. A deep learning approach for RGB image-based powdery mildew disease detection on strawberry leaves[J]. Computers and Electronics in Agriculture, 2021, 183: 106042.
[3] ZHANG Shanwen, WANG Zhen, WANG Zuliang. Method for image segmentation of cucumber disease leaves based on multi-scale fusion convolutional neural networks[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(16): 149-157. (in Chinese with English abstract)
[4] AGARWAL M, SINGH A, ARJARIA S, et al. ToLeD: Tomato leaf disease detection using convolution neural network[J]. Procedia Computer Science, 2020, 167: 293-301.
[5] LI Zimao, XU Jie, ZHENG Lu, et al. Small sample recognition method of tea disease based on improved DenseNet[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(10): 182-190. (in Chinese with English abstract)
[6] LI Daxiang, ZENG Xiaotong, LIU Ying. Apple leaf disease identification model by coupling global and patch features[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(16): 207-214. (in Chinese with English abstract)
[7] HU R, ZHANG S, WANG P, et al. The identification of corn leaf diseases based on transfer learning and data augmentation[C]//Proceedings of the 2020 3rd International Conference on Computer Science and Software Engineering. Beijing, China: 2020: 58-65.
[8] SEMBIRING A, AWAY Y, ARNIA F, et al. Development of concise convolutional neural network for tomato plant disease classification based on leaf images[C]// International Conference on Industrial Automation, Smart Grid and its Application (ICIASGA) 2020. Jawa Timur, Indonesia: IOP Publishing, 2021, 1845(1): 012009.
[9] DURMUS H, GÜNEŞ E O, KIRCI M. Disease detection on the leaves of the tomato plants by using deep learning[C]// 2017 6th International Conference on Agro-geoinformatics. Fairfax, VA, USA: IEEE, 2017: 1-5.
[10] BIR P, KUMAR R, SINGH G. Transfer learning based tomato leaf disease detection for mobile applications[C]// 2020 IEEE International Conference on Computing, Power and Communication Technologies (GUCON). Greater Noida, India: IEEE, 2020: 34-39.
[11] WANG Chunshan, ZHAO Chunjiang, WU Huarui, et al. Recognizing crop diseases using bimodal joint representation learning[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(11): 180-188. (in Chinese with English abstract)
[12] LIU B, ZHANG Y, HE D J, et al. Identification of apple leaf diseases based on deep convolutional neural networks[J]. Symmetry, 2017, 10(1): 11.
[13] XIANG X, ZHANG Y, JIN L, et al. Sub-region localized hashing for fine-grained image retrieval[J]. IEEE Transactions on Image Processing, 2021, 31: 314-326.
[14] LIU X, MIN W, MEI S, et al. Plant disease recognition: A large-scale benchmark dataset and a visual region and loss reweighting approach[J]. IEEE Transactions on Image Processing, 2021, 30: 2003-2015.
[15] WU Y, FENG X, CHEN G. Plant leaf diseases fine-grained categorization using convolutional neural networks[J]. IEEE Access, 2022, 10: 41087-41096.
[16] ZHENG X, SUN H, LU X, et al. Rotation-invariant attention network for hyperspectral image classification[J]. IEEE Transactions on Image Processing, 2022, 31: 4251-4265.
[17] HUGHES D, SALATHE M. An open access repository of images on plant health to enable the development of mobile disease diagnostics [EB/OL]. (2015-11-25). [2022-11-24]. https://arxiv.org/abs/1511.08060.
[18] ZHOU Minmin. Apple Foliage Diseases Recognition in Android System with Transfer Learning-based[D]. Yangling: Northwest A&F University, 2019. (in Chinese with English abstract)
[19] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 770-778.
[20] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 7132-7141.
[21] WOO S, PARK J, LEE J Y, et al. Cbam: Convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision (ECCV). Munich, Germany: Springer, Cham, 2018: 3-19.
[22] CHOLLET F. Xception: Deep learning with depthwise separable convolutions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 1251-1258.
[23] HINTON G, VINYALS O, DEAN J. Distilling the knowledge in a neural network [EB/OL]. (2015-03-09). [2022-11-24]. https://arxiv.org/abs/1503.02531.
[24] ROMERO A, BALLAS N, KAHOU S E, et al. Fitnets: Hints for thin deep nets [EB/OL]. (2014-12-19). [2022-11-24]. https://arxiv.org/abs/1412.6550.
[25] YIM J, JOO D, BAE J, et al. A gift from knowledge distillation: Fast optimization, network minimization and transfer learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA, IEEE, 2017: 4133-4141.
[26] ZAGORUYKO S, KOMODAKIS N. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer [EB/OL]. (2016-12-12). [2022-11-24]. https://arxiv.org/abs/1612.03928.
[27] ZHOU G, FAN Y, CUI R, et al. Rocket launching: A universal and efficient framework for training well-performing light net[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto, California USA: Computer Science, 2018, 32(1).
[28] AHN S, HU S X, DAMIANOU A, et al. Variational information distillation for knowledge transfer[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA: IEEE, 2019: 9163-9171.
[29] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale [EB/OL]. (2020-10-22). [2022-11-24]. https://arxiv.org/abs/2010.11929.
[30] MEHTA S, RASTEGARI M. Mobilevit: light-weight, general-purpose, and mobile-friendly vision transformer [EB/OL].(2021-10-05).[2022-11-24].https://arxiv.org/abs/2110.02178.
[31] DU R, CHANG D, BHUNIA A K, et al. Fine-grained visual classification via progressive multi-granularity training of jigsaw patches[C]//European Conference on Computer Vision. Glasgow, UK: Springer, Cham, 2020: 153-168.
[32] CHANG D, DING Y, XIE J, et al. The devil is in the channels: Mutual-channel loss for fine-grained image classification[J]. IEEE Transactions on Image Processing, 2020, 29: 4683-4695.
[33] HOWARD A, SANDLER M, CHU G, et al. Searching for mobilenetv3[C]//Proceedings of the IEEE/CVF international conference on computer vision. Seoul, Korea (South): IEEE,2019: 1314-1324.
[34] MA N, ZHANG X, ZHENG H T, et al. Shufflenet v2: Practical guidelines for efficient cnn architecture design[C]//Proceedings of the European Conference on Computer Vision (ECCV). Munich, Germany: Springer, Cham,2018: 116-131.
[35] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-cam: Visual explanations from deep networks via gradient-based localization[C]//Proceedings of the IEEE International Conference on Computer Vision.Venice, Italy:IEEE, 2017: 618-626.
Identifying apple leaf disease using a fine-grained distillation model
LI Daxiang, HUA Cuiyun, LIU Ying
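(School of Communications and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China)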
Apple trees are susceptible to various diseases caused by weather, environment, and microorganisms. Leaves are the parts where symptoms most commonly appear. Because diseased areas are small and symptoms are similar, manual observation and experience-based judgment cannot diagnose the disease type in time, resulting in large losses in apple production. Deep learning can automatically extract the features of crop diseases, but it suffers from an excessive number of parameters, high computational cost, and complex models, which limits its practicality. Various lightweight architectures have been constructed to support the deployment of crop disease identification. However, directly applying or improving existing lightweight convolutional neural networks (CNN) fails to address the fine-grained problem of "small variance between classes and large variance within classes" in apple leaf diseases. Multiple CNN frameworks or attention modules can capture the coarse-grained global and fine-grained local features of apple leaf diseases, but they add parameters and do not meet the requirements of smart agriculture for mobile deployment. In this study, a fine-grained knowledge distillation (FGKD) model was proposed to improve the accuracy of lightweight CNNs in apple leaf disease identification and make them suitable for deployment on smart-agriculture mobile terminals. Firstly, contextual information and spatial-semantic relations were used to design the spatial attention (SA) and fine-grained feature extraction (FGFE) modules, respectively, which were embedded into ResNet50 and the designed lightweight CNN as the teacher and student networks. Secondly, the SA and FGFE knowledge distillation loss functions were constructed to transfer the feature extraction and fine-grained knowledge representation of the teacher to the student network, in order to enhance the student's local feature extraction and high-level semantic expression for apple leaf disease images. As a result, the performance of the lightweight student network was close to that of the complex teacher network with a small number of parameters. Comparative tests were carried out on the standard apple leaf disease dataset. The results show that the accuracy of the student network was 98.60% after knowledge distillation, while the model parameter size was only 0.75 MB and the average inference time was 25.51 ms. Apple leaf diseases were identified automatically, rapidly, and accurately, meeting the needs of practical smart-agriculture mobile terminals. The SA module and SA distillation function combined contextual information with spatial attention and effectively improved the extraction of local information about the disease. Aggregating fine-grained features through spatial-semantic relationships enhanced the extraction and expression of high-level semantic information about the disease.
Keywords: computer vision; image processing; apple tree leaf disease identification; fine-grained knowledge distillation; contextual spatial attention
2022-11-24
2023-03-22
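Funding: National Natural Science Foundation of China (62071379); Natural Science Foundation of Shaanxi Province (2017KW-013); Innovation Fund of Xi'an University of Posts and Telecommunications (CXJJYL2022014)
LI Daxiang, Ph.D., associate professor, master's supervisor; research interests: remote sensing image classification, disease image recognition, and machine learning. Email: www_ldx@163.com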
10.11975/j.issn.1002-6819.202211209
S24; TP391.4; S431.9
A
1002-6819(2023)-07-0185-10
LI Daxiang, HUA Cuiyun, LIU Ying. Identifying apple leaf disease using a fine-grained distillation model[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2023, 39(7): 185-194. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.202211209 http://www.tcsae.org
Transactions of the Chinese Society of Agricultural Engineering, 2023, No. 7