LI Yang, JIANG Bitao*, LI Xiaobin, TIAN Jing, and SONG Xiaorui
1. Department of Space Information, Space Engineering University, Beijing 101400, China; 2. Beijing Institute of Remote Sensing Information, Beijing 100192, China
Abstract: Given the sparsity of hyperspectral images (HSIs), dictionary learning frameworks have been widely used for unsupervised spectral unmixing. However, existing dictionary learning-based unmixing methods lack robustness in noisy contexts. To improve performance, this study proposes a new unsupervised spectral unmixing method. Because both the endmembers and the abundances must satisfy nonnegativity constraints, the unsupervised spectral unmixing problem is modeled as a nonnegative dictionary learning problem. To improve the accuracy of endmember extraction, a new form of the objective function is introduced into dictionary learning, which increases robustness to noisy HSI data. Then, by introducing total variation (TV) terms into the proposed spectral unmixing based on robust nonnegative dictionary learning (RNDLSU), the spatial-contextual information of the HSI is used as prior knowledge when the abundances are computed by sparse unmixing. Experimental results show that the method performs favorably under varying noise conditions, especially at low signal-to-noise ratios.
Keywords: hyperspectral image (HSI), nonnegative dictionary learning, norm loss function, unsupervised unmixing.
In hyperspectral remote sensing imaging, each captured pixel generally has a mixed spectrum composed of several pure constituent spectra, owing to the low spatial resolution of hyperspectral cameras and the diversity of spectral signatures in natural scenes [1-3]. To a great extent, mixed pixels limit applications of hyperspectral images (HSIs) such as classification [4-7], target detection [8-11], and change detection [11-13]. Many real-world applications require subpixel-level accuracy; thus, the unmixing process is essential in the analysis of HSIs. In most cases, the spectral signatures of the pure materials in an HSI, termed endmembers, are not known in advance. Thus, unmixing under unsupervised conditions consists of at least two steps, namely endmember extraction and abundance coefficient estimation. In this paper, we assume that the mixed pixels in the HSI are generated by a linear mixing model. In addition, the observed HSI, the endmembers in the image, and the corresponding abundances are all nonnegative according to their physical meaning.
Unsupervised hyperspectral unmixing has recently become a popular research topic, and many methods have been proposed to address this problem. These methods can be classified into three categories: geometrical-based methods, statistical-based methods, and machine learning based methods. Among them, geometrical-based methods, such as vertex component analysis (VCA) [14], minimum volume constrained nonnegative matrix factorization (NMF) [15], and robust collaborative NMF (RCoNMF) [16], usually constrain the volume of the convex simplex to estimate the endmembers in HSIs. The idea of statistical-based approaches, such as gradient descent maximum entropy (GDME) [17], is to turn the unmixing problem into a statistical inference problem.
In recent years, machine learning based approaches have received extensive attention because they make full use of the sparsity of HSIs [18-22]. Their theoretical foundation is dictionary learning and sparse representation theory. An online dictionary learning-based hyperspectral sparse unmixing method was proposed in [23], which casts endmember extraction as a dictionary learning problem. Despite its favorable performance on highly mixed HSI data, the method is highly sensitive to noise and computationally expensive. Subsequently, an endmember extraction method based on online robust dictionary learning (EEORDL) was proposed to enhance the robustness of the unmixing process to noise [24]. This method also performs quite satisfactorily in HSI reconstruction. More recently, an untied denoising autoencoder with sparsity (uDAS) was developed for spectral unmixing, which improves the adaptability of the autoencoder to noise to some extent [25].
However, machine learning based approaches are not robust enough for noisy HSI data because these approaches generally use a least squares loss function, which is sensitive to noise, for endmember extraction. To overcome this drawback, the robustness of the unmixing approach can be improved by choosing a more robust loss function.
In the current work, we propose a novel unsupervised spectral unmixing method based on robust nonnegative dictionary learning (RNDLSU). In this method, we modify the objective function for dictionary learning used in endmember extraction to enhance the robustness to noisy HSI data. We also replace the projected block-coordinate descent algorithm used for updating the dictionary atoms with an efficient multiplicative updating scheme to reduce the computational complexity.
The contributions of this study are as follows. First, we model the unsupervised unmixing task as a nonnegative dictionary learning problem and use the $\ell_1$ norm on the reconstruction error term to enhance the adaptability to noise and the unmixing performance under low signal-to-noise ratio (SNR) conditions. Second, a multiplicative updating algorithm is used for dictionary learning to address the nonsmoothness of the error term and the sparsity regularization term and to reduce the computational complexity.
In accordance with the linear spectral mixture model, the observed HSI $X$ can be formulated as
$$X = DA + N \tag{1}$$
where $X = [x_1, x_2, \cdots, x_i, \cdots, x_n] \in \mathbb{R}^{L \times n}$ is the original noisy mixture and each column $x_i = [x_{1i}, x_{2i}, \cdots, x_{ji}, \cdots, x_{Li}]^{T} \in \mathbb{R}^{L}$ denotes one mixed pixel with $L$ spectral bands; $D = [d_1, d_2, \cdots, d_k, \cdots, d_m] \in \mathbb{R}^{L \times m}$ represents the endmember matrix, with each column denoting one of the $m$ endmember signatures; $A = [a_1, a_2, \cdots, a_n] \in \mathbb{R}^{m \times n}$ is the corresponding abundance matrix, with each column giving the mixing coefficients of the $m$ endmember signatures in one mixed pixel; and $N \in \mathbb{R}^{L \times n}$ stands for the noise in the image.
Moreover, owing to the physical constraints on the endmembers and the corresponding abundances in the HSI, the endmember matrix and the abundance matrix should satisfy $D \geq 0$ and $A \geq 0$.
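The linear mixing model and its nonnegativity constraints can be illustrated with a small synthetic example. The following is a minimal sketch, not the data generation used in the experiments: the sizes, the uniform endmember signatures, the Dirichlet-distributed abundances (which additionally sum to one), and the function name `synthesize_mixture` are all assumptions made for illustration.

```python
import numpy as np

def synthesize_mixture(L=50, m=3, n=200, noise_std=0.01, seed=0):
    """Generate X = D A + N with nonnegative D and A (illustrative sizes)."""
    rng = np.random.default_rng(seed)
    D = rng.uniform(0.0, 1.0, size=(L, m))       # endmember signatures as columns
    A = rng.dirichlet(np.ones(m), size=n).T      # abundances, one column per pixel
    N = noise_std * rng.standard_normal((L, n))  # additive noise
    X = D @ A + N                                # observed mixed pixels
    return X, D, A, N

X, D, A, N = synthesize_mixture()
print(X.shape, D.shape, A.shape)
```

Each column of `X` is one mixed pixel with `L` bands, matching the column conventions of the model.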
Based on the theory of dictionary learning and sparse representation, the endmember matrix $D$ can be regarded as a dictionary learned from the original HSI data, with each dictionary atom corresponding to one endmember of the mixed pixels. The abundance coefficient matrix $A$ can be regarded as the sparse code matrix. Thus, the endmember matrix can be obtained by using dictionary learning approaches. Then, with the obtained endmembers, the abundances can be solved by sparse coding.
Following the above analysis, existing dictionary learning-based hyperspectral unmixing methods mostly obtain the dictionary, that is, the endmembers, by optimizing the least absolute shrinkage and selection operator (LASSO) objective function [23]:
$$\min_{D,A} \; \frac{1}{2}\|X - DA\|_F^2 + \lambda \|A\|_1 \tag{2}$$
where $\lambda > 0$ is the regularization parameter.
In real-world scenarios, observed HSI data can be noisy. The high sensitivity of the least squares loss in (2) to noise is evidenced by its large bias in noisy contexts [26]. Thus, in this paper, we replace the $\ell_2$ norm on the reconstruction error term in (2) with a more robust $\ell_1$ norm. Considering that the original HSI data $X$, the endmembers $D$, and the abundances $A$ are all nonnegative, robust dictionary learning used for endmember extraction becomes
$$\min_{D,A} \; \|X - DA\|_1 + \lambda \|A\|_1 \quad \text{s.t.} \quad D \geq 0,\; A \geq 0 \tag{4}$$
Optimization (4) is the problem that must be solved to obtain the dictionary, that is, the endmembers, in the proposed RNDLSU.
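The difference in noise sensitivity between the two reconstruction losses can be seen on a toy example. The sketch below is illustrative only: the matrices are made up, and the function names `least_squares_loss` and `l1_loss` are hypothetical. It corrupts a single entry of an otherwise perfectly reconstructed mixture and evaluates both losses.

```python
import numpy as np

def least_squares_loss(X, D, A):
    """Squared Frobenius reconstruction error, 0.5 * ||X - DA||_F^2."""
    R = X - D @ A
    return 0.5 * np.sum(R ** 2)

def l1_loss(X, D, A):
    """Entrywise l1 reconstruction error, ||X - DA||_1."""
    return np.sum(np.abs(X - D @ A))

D = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
A = np.array([[0.5, 0.2], [0.5, 0.8]])
X_clean = D @ A
X_outlier = X_clean.copy()
X_outlier[0, 0] += 10.0  # one corrupted entry

ls_value = least_squares_loss(X_outlier, D, A)  # grows with the square of the outlier
l1_value = l1_loss(X_outlier, D, A)             # grows linearly with the outlier
print(ls_value, l1_value)
```

Because the squared loss penalizes the corrupted entry quadratically, a minimizer is pulled toward the outlier more strongly than under the $\ell_1$ loss.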
It is known that the optimization in (4) is nonconvex. However, the global optimal solution for one variable ($D$ or $A$) can be obtained when the other variable is fixed. Before solving for $D$ and $A$, we must first initialize these two variables. We first initialize the endmember matrix $D$ with VCA and then initialize the abundance matrix $A$ with the nonnegative least squares algorithm by using the initialized $D$. Because of the two nonsmooth $\ell_1$ norm terms, it is somewhat challenging to solve the optimization problem (4). Fortunately, the sparsity term in (4) is combined with a nonnegativity constraint, so the second $\ell_1$ term can be represented as
$$\|A\|_1 = \sum_{i,j} A_{ij} = \operatorname{tr}(E^{T} A)$$
where $E \in \mathbb{R}^{m \times n}$ is an all-one matrix.
Therefore, to solve the optimization problem in (4), a matrix $W$ is defined by
where $\delta$ is a small value that prevents the overflow of $W_{ij}$ when the corresponding entry of the residual $X - DA$ is zero and makes the objective function differentiable. In the current work, $\delta$ is set to machine precision in the experiments on both synthetic and real-world data sets. Thus, when $\delta \to 0$, we have
Therefore, considering that $\delta$ is nonzero, the final objective function minimized to obtain the dictionary, that is, the endmembers, in the proposed RNDLSU is as follows:
According to the theory of nonnegative dictionary learning, when one of the variables $A$ and $D$ is fixed, the multiplicative updating rules for optimization (9) are as follows:
where $\odot$ stands for the Hadamard product. The Hadamard product is assumed to take precedence over the regular matrix product, so that an expression such as $D^{T} W \odot X$ is read as $D^{T}(W \odot X)$.
By solving the optimization in (9), the dictionary $D$ can be obtained, which is also the endmember matrix discussed at the beginning of this subsection.
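The alternating scheme described in this subsection can be sketched in NumPy. This is a hedged illustration rather than the authors' exact algorithm: it assumes the weights $W_{ij} = 1/\sqrt{(X - DA)_{ij}^2 + \delta^2}$, the smoothed objective $\sum_{i,j}\sqrt{(X - DA)_{ij}^2 + \delta^2} + \lambda \sum_{i,j} A_{ij}$, and the weighted-NMF style multiplicative updates that follow from them. The function names, the random initialization (the paper uses VCA and nonnegative least squares), and the constant `tiny` are assumptions.

```python
import numpy as np

def smoothed_objective(X, D, A, lam, delta):
    """Smoothed l1 reconstruction error plus l1 sparsity on nonnegative A."""
    R = X - D @ A
    return np.sum(np.sqrt(R ** 2 + delta ** 2)) + lam * np.sum(A)

def robust_nonneg_dl(X, D, A, lam=0.01, delta=np.finfo(float).eps,
                     n_iter=200, tiny=1e-12):
    """Alternating multiplicative updates (assumed form, see lead-in)."""
    D, A = D.copy(), A.copy()
    for _ in range(n_iter):
        # Reweighting: small residuals receive large weights.
        W = 1.0 / np.sqrt((X - D @ A) ** 2 + delta ** 2)
        # Update A with D and W fixed; * is the Hadamard product.
        A *= (D.T @ (W * X)) / (D.T @ (W * (D @ A)) + lam + tiny)
        W = 1.0 / np.sqrt((X - D @ A) ** 2 + delta ** 2)
        # Update D with A and W fixed.
        D *= ((W * X) @ A.T) / ((W * (D @ A)) @ A.T + tiny)
    return D, A

rng = np.random.default_rng(0)
D_true = rng.uniform(0.1, 1.0, size=(30, 3))
A_true = rng.dirichlet(np.ones(3), size=100).T
X = D_true @ A_true + 0.01 * np.abs(rng.standard_normal((30, 100)))  # keeps X >= 0

D0 = rng.uniform(0.1, 1.0, size=(30, 3))   # random start, for illustration only
A0 = rng.uniform(0.1, 1.0, size=(3, 100))  # random start, for illustration only
D_hat, A_hat = robust_nonneg_dl(X, D0, A0)

delta = np.finfo(float).eps
print(smoothed_objective(X, D0, A0, 0.01, delta),
      smoothed_objective(X, D_hat, A_hat, 0.01, delta))
```

Multiplicative updates keep $D$ and $A$ nonnegative as long as the initial values and $X$ are nonnegative, which is why no explicit projection step appears in the loop.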
With the obtained endmembers, the corresponding abundances can be solved. Given the material distribution in real scenarios, a single mixed pixel most likely contains only a limited number of endmembers, typically two or three, although the total number of spectral signatures in a particular HSI data set may be quite large. Consequently, the abundance vector $a_i$ of each pixel in the scene is sparse. A sparse unmixing method is therefore used to solve for the abundance coefficient matrix $A$.
Moreover, considering the correlation between each mixed pixel and its neighbors, neighboring pixels should have similar coefficients for the same spectral signature. Therefore, we introduce the total variation (TV) regularizer into the sparse unmixing. The abundance coefficient matrix is then obtained by solving the following optimization problem:
$$\min_{A} \; \frac{1}{2}\|X - DA\|_F^2 + \lambda_c \|A\|_1 + \lambda_{TV}\,\mathrm{TV}(A) \quad \text{s.t.} \quad A \geq 0 \tag{11}$$
where
$$\mathrm{TV}(A) = \sum_{\{i,j\} \in \varphi} \|a_i - a_j\|_1$$
is a vector extension of the nonisotropic TV, which guarantees that the abundance coefficients of the same endmember change smoothly between adjacent pixels. The regularization parameters $\lambda_c$ and $\lambda_{TV}$ are both nonnegative. $\varphi$ denotes the set of horizontally and vertically neighboring pixel pairs in the image.
The minimization problem (11) can be solved by sparse unmixing via variable splitting augmented Lagrangian and TV (SUnSAL-TV) [27], because it is a constrained basis pursuit denoising problem that incorporates the spatial-contextual information of the HSI data. Following the analysis in [27], we explicitly enforce the abundance nonnegativity constraint but not the abundance sum-to-one constraint when performing the abundance estimation. Finally, we obtain the abundances $A$.
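The nonisotropic TV of an abundance matrix can be evaluated directly, which helps when checking how strongly a given solution is penalized. The sketch below is illustrative: it assumes that the pixels are stored row by row, so that each row of $A$ can be reshaped into a `height` by `width` abundance map, and the function name `anisotropic_tv` is hypothetical. It only evaluates the regularizer and does not solve the unmixing problem.

```python
import numpy as np

def anisotropic_tv(A, height, width):
    """Sum of l1 differences between horizontally and vertically adjacent pixels.

    A has shape (m, height * width); pixels are assumed stored row by row.
    """
    maps = A.reshape(A.shape[0], height, width)
    horizontal = np.abs(np.diff(maps, axis=2)).sum()  # left-right neighbor pairs
    vertical = np.abs(np.diff(maps, axis=1)).sum()    # up-down neighbor pairs
    return horizontal + vertical

# Two endmembers on a 2 x 2 image: one flat map and one with a vertical edge.
A = np.array([[0.5, 0.5, 0.5, 0.5],
              [0.0, 1.0, 0.0, 1.0]])
tv_value = anisotropic_tv(A, 2, 2)
print(tv_value)
```

The flat map contributes nothing, while the map with an edge contributes one unit per row, so smoother abundance maps are penalized less.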
Algorithm 1 presents the pseudocode for spectral unmixing based on RNDLSU.
To verify the validity and competitiveness of the proposed unsupervised unmixing method RNDLSU, experiments are conducted on synthetic and real data. The performance of RNDLSU is assessed through both visual observation and quantitative measures.
For a quantitative comparison, three categories of evaluation indexes are used in this paper. The index used to measure the quality of the endmember extraction is the spectral angle distance (SAD) [25].
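The SAD between a reference signature and an estimated signature is the angle between the two spectra. The following sketch uses the usual arccosine definition; the function name and the toy vectors are assumptions made for illustration.

```python
import numpy as np

def spectral_angle_distance(d_true, d_est):
    """Angle (radians) between a reference and an estimated endmember signature."""
    cos = np.dot(d_true, d_est) / (np.linalg.norm(d_true) * np.linalg.norm(d_est))
    return np.arccos(np.clip(cos, -1.0, 1.0))  # clip guards against rounding error

d = np.array([0.2, 0.4, 0.6])
sad_scaled = spectral_angle_distance(d, 3.0 * d)  # scaling leaves the angle near zero
sad_orthogonal = spectral_angle_distance(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
print(sad_scaled, sad_orthogonal)
```

Because the measure depends only on the angle, it is insensitive to a global scaling of the estimated signature; smaller values indicate better endmember extraction.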
Similarly, the index used to evaluate the accuracy of the abundance estimation is the abundance angle distance (AAD) [25]. The metric used to evaluate the reconstruction of the spectral mixtures is the signal to reconstruction error (SRE) [25].
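Both indexes can be sketched in a few lines. The forms below are common definitions and are assumptions here: the AAD is taken as the per-pixel angle between the true and estimated abundance vectors, and the SRE is taken as the ratio of signal energy to reconstruction error energy in decibels. The exact averaging used in [25] may differ, and the function names are hypothetical.

```python
import numpy as np

def abundance_angle_distance(A_true, A_est):
    """Per-pixel angle (radians) between true and estimated abundance vectors."""
    num = np.sum(A_true * A_est, axis=0)
    den = np.linalg.norm(A_true, axis=0) * np.linalg.norm(A_est, axis=0)
    return np.arccos(np.clip(num / den, -1.0, 1.0))

def signal_to_reconstruction_error_db(X, X_hat):
    """Signal energy over reconstruction error energy, in dB (assumed form)."""
    return 10.0 * np.log10(np.sum(X ** 2) / np.sum((X - X_hat) ** 2))

A_true = np.array([[1.0, 0.0], [0.0, 1.0]])
A_est = np.array([[1.0, 1.0], [0.0, 1.0]])
aad = abundance_angle_distance(A_true, A_est)  # first pixel matches, second is 45 degrees off

X = np.array([[1.0, 2.0], [3.0, 4.0]])
sre = signal_to_reconstruction_error_db(X, 0.9 * X)  # 10 percent amplitude error
print(aad, sre)
```

A smaller AAD and a larger SRE both indicate better unmixing.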
This synthetic data set is extensively used for unmixing [27]. Generated from nine spectral signatures, the data set contains pixels. To mimic noise in real scenarios, both white noise and correlated noise are considered, the latter obtained by low-pass filtering independent and identically distributed (i.i.d.) Gaussian noise with a normalized cutoff frequency of $5\pi/L$. Three SNR levels are used, namely 20 dB, 25 dB, and 30 dB.
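Contaminating a clean mixture at a prescribed SNR can be sketched as follows. This is a minimal illustration for the white-noise case only, assuming the SNR is defined as the ratio of total signal energy to total noise energy in decibels; the low-pass filtering step for correlated noise is omitted, and the function name and test data are hypothetical.

```python
import numpy as np

def add_white_noise(X, snr_db, seed=0):
    """Add i.i.d. Gaussian noise scaled so that the realized SNR equals snr_db."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(X.shape)
    signal_power = np.sum(X ** 2)
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise *= np.sqrt(target_noise_power / np.sum(noise ** 2))  # rescale to the target energy
    return X + noise, noise

rng = np.random.default_rng(1)
X_clean = rng.uniform(0.0, 1.0, size=(20, 100))
X_noisy, noise = add_white_noise(X_clean, snr_db=20.0)
measured_snr = 10.0 * np.log10(np.sum(X_clean ** 2) / np.sum(noise ** 2))
print(measured_snr)
```

Rescaling the sampled noise makes the realized SNR match the target for the specific draw, which keeps comparisons across the 20 dB, 25 dB, and 30 dB settings consistent.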
We compare the proposed RNDLSU with four representative methods. Among them, RCoNMF is a typical geometrical-based method; GDME is a statistical-based algorithm; EEORDL is a dictionary learning-based algorithm; and uDAS is a machine learning based approach.
Table 1 presents the unmixing results of the different methods on the synthetic data under white noise and correlated noise conditions. Most methods perform well under high SNR conditions. As the SNR decreases, the performance of the algorithms decreases to varying degrees. Specifically, RCoNMF and GDME perform relatively stably under the various SNR conditions, but their accuracy is slightly worse. Both EEORDL and uDAS perform well, particularly under low SNR conditions. The latter incorporates a denoising constraint into the network to avoid additional error. RNDLSU achieves the best or comparable results under all SNR conditions, particularly low SNR conditions.
To evaluate the effectiveness of RNDLSU in handling outliers, we run RNDLSU and the compared methods on the synthetic data set with five outliers added [29]. Table 2 shows the experimental results on handling outliers in the HSI data. Although the results of RNDLSU with outliers are not as good as those without outliers (Table 1), RNDLSU outperforms the compared approaches in handling outliers and is more robust.
Table 1 Quantitative indexes of different unsupervised unmixing algorithms on synthetic data (without outliers)
Table 2 Quantitative indexes of different unsupervised unmixing algorithms on synthetic data (with five outliers)
Fig. 1 and Fig. 2 show the endmember estimates and the corresponding abundance maps, respectively, obtained by different algorithms for SNR = 20 dB without outliers. Fig. 1 shows that the endmember signature extracted by RNDLSU is the most similar to the ground truth signature, qualitatively illustrating the accuracy of the endmember extraction of the proposed approach. As shown in Fig. 2, the abundance map obtained by RNDLSU is the most similar to the ground truth among all the algorithms. Qualitatively, the RNDLSU solution achieves better spatial consistency and a more accurate spatial distribution of the materials. Specifically, the regions with low fractional abundance are more homogeneous, and the regions with high abundance are better represented.
Fig. 1 Endmember estimates obtained by different methods when SNR = 20 dB without outliers (correlated noise)
Fig. 2 Abundance maps obtained by different methods when SNR = 20 dB without outliers (correlated noise)
To analyze the influence of the regularization parameters $\lambda$, $\lambda_c$, and $\lambda_{TV}$ on the performance of the proposed RNDLSU, we report the SAD for different values of $\lambda$ in Table 3 and show the relationship between $\lambda_c$, $\lambda_{TV}$, and the SRE for RNDLSU with fixed $\lambda$. As shown in Table 3, a smaller $\lambda$ yields a smaller SAD, which means that the endmember extraction of RNDLSU performs better. The reason is that the value of $\lambda$ determines how closely the learned endmembers follow the spectral details of the HSI. Thus, in practical applications, $\lambda$ is usually set to a small value to make the endmember extraction more accurate.
Table 3 SAD values of the proposed RNDLSU under different λ on synthetic data (SNR=20 dB)
As Fig. 3 shows, choosing a smaller $\lambda_c$ and a larger $\lambda_{TV}$ yields a high SRE. The values should generally satisfy $\lambda_{TV} < 10^{-1}$ and $\lambda_{TV} > \lambda_c$. The reason is that $\lambda_c$, like $\lambda$, is a sparsity constraint parameter, so its selection rule is similar to that of $\lambda$, whereas $\lambda_{TV}$ directly controls the strength of the spatial information constraint in the proposed method. The value of $\lambda_{TV}$ should be determined according to the spatial resolution of the observed HSI and the degree of mixing of the ground features. Generally, $\lambda_{TV}$ should be somewhat larger.
Fig. 3 SRE as a function of the parameters $\lambda_c$ and $\lambda_{TV}$ for RNDLSU when SNR = 20 dB
The real data used in this study come from the widely used Urban data set [30]. The data set comprises 210 spectral channels, with a nominal spectral resolution of 10 nm and wavelengths ranging from 0.4 μm to 2.5 μm. In preprocessing, the water vapor absorption bands are removed, and 162 spectral bands are retained. Four endmembers, namely asphalt, grass, trees, and roofs, are present in the image.
Table 4 lists the SAD values between the endmember estimates and the reference spectral signatures for the different methods on the Urban data set. Over the four endmembers in the image, RNDLSU obtains the best mean SAD, which illustrates its unmixing performance on real-world data.
Table 4 SAD values of different unmixing methods with the urban data set
Fig. 4-Fig. 7 show the corresponding abundance maps of the endmembers asphalt, grass, tree, and roof, respectively, obtained by different algorithms; they show that the proposed RNDLSU achieves the best unmixing results.
Fig. 4 Abundance maps obtained by different methods for endmember asphalt in urban data set
Fig. 5 Abundance maps obtained by different methods for endmember grass in urban data set
Fig. 6 Abundance maps obtained by different methods for endmember tree in urban data set
Fig. 7 Abundance maps obtained by different methods for endmember roof in urban data set
In this paper, we propose the RNDLSU algorithm for unsupervised hyperspectral unmixing in noisy environments. Unlike existing dictionary learning-based unmixing methods, RNDLSU introduces a more robust norm as the loss function to improve the accuracy of endmember extraction during dictionary learning. When solving for the abundance coefficients, the spatial information of the HSI is taken into consideration by introducing the TV term, which enforces smooth transitions of the abundance values of the same endmember between adjacent pixels and thereby improves the spatial consistency of the abundance estimation. Experiments on synthetic and real data show that, compared with other state-of-the-art unmixing methods, RNDLSU achieves better unmixing performance under different SNR conditions, especially when the SNR is low, and attains higher endmember extraction and abundance estimation accuracy. In addition, we analyze the parameter selection rules to improve the practicality of the proposed RNDLSU.
Journal of Systems Engineering and Electronics, 2022, Issue 2