WANG Juan,ZHAO Jie
(College of Science,Zhongyuan University of Technology,Zhengzhou,Henan Province,450007,China)
Abstract: To improve the effectiveness of image processing, an efficient invariant representation must be stable to deformations applied to images. This motivates the study of image representations that define a Euclidean metric stable to these deformations. This paper focuses on two aspects. First, two properties of expected scattering and averaged scattering, namely Lipschitz continuity and translation invariance, are proved in detail. These properties support the claim that expected scattering and averaged scattering are invariant, stable and informative representations. Second, texture classification based on expected scattering and on averaged scattering is analyzed. Energy features based on expected scattering and averaged scattering are calculated and used for classification. Experimental results show that, starting with the seventh feature, the two approaches achieve good performance in texture image classification.
Key words: Translation invariance; Lipschitz continuity; Texture image classification; Expected scattering; Averaged scattering
Image processing has long been an important research topic, and feature extraction is one of its basic techniques. However, general datasets are exposed to variability in shape, clutter, appearance, etc. Hence, image processing requires a feature extraction step that eliminates this variability while remaining stable to deformations and keeping enough information to discriminate. In 2012, the scattering transform, which iterates over wavelet transforms and modulus operators, was introduced by S. Mallat[16]. It is an invariant, stable and informative representation. The scattering transform has been successfully applied to many classification and recognition tasks, including audio classification (e.g., [2], [3]), texture image classification (e.g., [7], [20], [21], etc.) and handwritten digit recognition (e.g., [7]). Its efficiency can be partly explained by its stability to deformation in the Euclidean norm.
In 2013, S. Mallat and I. Waldspurger[17] proposed two models, expected scattering and averaged scattering. These are mathematical models of general scattering models of deep neural networks with $l^2$ pooling. S. Mallat and I. Waldspurger[17] proved that expected scattering is contractive and preserves the mean-square norm, and that averaged scattering is contractive and preserves norms.
Expected scattering and averaged scattering are constructed from the general scattering introduced by S. Mallat[16]. By analyzing them, we find that they have two useful properties, Lipschitz continuity and translation invariance, which help to improve experimental results in image processing. Lipschitz continuity states that the transform changes little under small deformations of the signal, which ensures that the transform is stable to deformations. Translation invariance states that the asymptotic scattering metrics of expected scattering and averaged scattering are translation invariant. Stability and invariants play a major role in physics[9], and they are being applied to signal information processing. Therefore, the properties we obtain are valuable for signal processing.
Texture analysis is one of the important steps in image processing. Methods for analyzing texture are very diverse and differ from each other chiefly in how texture features are extracted. Texture analysis approaches can usually be grouped into four categories: model-based methods, statistical methods, structural methods and transform methods. Model-based texture analysis[18], using fractal models, attempts to interpret an image texture by means of a generative image model. Statistical methods mainly describe the texture of regions in an image through higher-order moments of their grayscale histograms[22]. Structural approaches[10] represent texture by well-defined primitives and provide a good symbolic description of the image. Transform-based techniques, such as Fourier[19], Gabor[8] and wavelet transforms[12], convert the image into a new form using the spatial frequency properties of the pixel intensity variations. Compared with the Fourier and Gabor transforms, wavelet transforms have several advantages that make them attractive for texture analysis. However, wavelet transforms are not translation invariant[13].
This paper shows theoretically that expected scattering and averaged scattering are translation invariant, so they overcome the lack of translation invariance of the wavelet transform. Expected scattering and averaged scattering can provide an image representation that is stable to elastic deformation. In this study, we propose new feature extraction methods based on expected scattering and averaged scattering, and apply them to texture classification. Experimental results show that, compared with the wavelet transform, the two approaches achieve good performance in texture image classification.
The rest of this paper is structured as follows. Section 2 reviews expected scattering, which is a representation of high-dimensional probability distributions, and shows that it satisfies Lipschitz continuity and translation invariance. Section 3 considers averaged scattering, which can be estimated by block averaging, and proves its Lipschitz continuity and translation invariance. Feature extraction and texture classification are explained in Section 4. In Section 5, texture classification results using energy features are discussed in detail. Finally, concluding remarks are given in Section 6.
We begin by specifying our notation. Throughout this paper, $\|\cdot\|$ is the $L^2$ norm. Suppose $t\in\mathbb{R}$ and $\xi(t)\in\mathbb{R}$. We denote by $|H\xi|$ the norm of the Hessian tensor. Suppose that $X(t)$ is a random process satisfying $|E(X(t-\xi(t)))-E(X(t))|\le\epsilon$ and $|\sigma_X(t-\xi(t))-\sigma_X(t)|\le\epsilon$. Let $L_\xi X(t)=X(t-\xi(t))$ denote the deformation of $X(t)$ by $\xi(t)$. Suppose that $x$ is a signal and $\{\psi_{\lambda_m}\}$ is a series of dilations and rotations of the mother wavelet $\psi$, whereis the mean of m, $\psi_{2^jr}(x)=2^{dj}\psi(2^jr^{-1}x)$. We write $[A,B]=AB-BA$ and $a\vee b=\max\{a,b\}$. The letters $C$ and $M$ denote positive constants whose values may vary from place to place.
A scattering transform provides a model for feed-forward deep networks with $l^2$ pooling[14]-[15]. Let the input signal be $X_0=X$ and set $N_0=N$. An expected scattering computes each network layer $X_{m+1}\in\mathbb{R}^{N_m}$ by transforming the previous layer $X_m$, that is,
where $\mathrm{Wav}[\lambda]X=\{X\star\psi_\lambda\}_{\lambda\in\Lambda_\infty}$ and $N_{m+1}>N_m=J^mN$. So the elements of the propagated layer $m+1$ are
We can compute $X_{m+1}$ from $X_m-E(X_m)$ by iteratively computing
Since $\int\phi_{2^J}(X)\,dX=1$, it follows that
The wavelet commutator applied to X is
where
The operator $[\mathrm{Wav}_J,L_\xi]^*[\mathrm{Wav}_J,L_\xi]$ has a singular kernel along the diagonal, but its norm is bounded.
Lemma 1 There exists a constant C > 0 such that for allsatisfying
Proof The proof of this lemma follows essentially the same steps as S. Mallat[16], with the obvious modifications.
The following lemma indicates that the scattering distance produced by a random deformation has an upper bound.
Lemma 2 There exist constants C and M such that for all independent stationary processes ξ and X satisfying ‖▽ξ‖∞≤with probability 1,ifthen
with
Proof Let
We decompose
Since admissible scattering wavelets satisfy
then we obtain that
Now we shall prove that
with
where
From the known formula,it gives
where
Hence,we shall first prove that for any stationary process X,
is true,where
It follows from(1),(2)and(3)that
Hence the proof of(1)is ended by verifying that
and $B(\xi)=E(D^2(\xi))$ with
From Lemma 1 we get $\|[\mathrm{Wav}_J,L_\xi]\|\le D(\xi)$. By applying to
it results that
Hence,we obtain that
where K(ξ) ≥ B(ξ).This completes the proof.
A characteristic of deformed stationary processes is that small stationary deformations of stationary processes produce small modifications of the scattering distance[4]. The following lemma proves that if $X$ is bounded then the expected value of $\widehat{U}X$ is Lipschitz continuous.
Lemma 3 There exist constants C and M such that for X and ξ satisfying ‖▽ξ‖∞≤with probability 1,ifthen
with
It is enough to prove that
By formula (1), for this purpose we shall first prove that
The above formula is established from Lemma 2.This completes the proof of Lemma 3.
Expected scattering provides a representation of the probability distribution of X.The following theorem proves that if X is bounded then expected scattering is Lipschitz continuous.
Theorem 1 There exist constants C and M such that for X and ξ satisfyingwith probability 1,ifthen
with
Proof
Through Lemma 1,Lemma 2 and Lemma 3,we know that Theorem 1 is true.
Let c be a constant.If ξ=c,thenis a translation ofWe show thatis translation invariant in the following theorem.
Theorem 2 Let c be translation variable.There exists a constant M such that for X satisfying
then
Proof:From Theorem 1 we know that
Let $\xi=c$; then $\|\nabla\xi\|_\infty=0$ and $\|H\xi\|_\infty=0$. Letting $J$ go to $\infty$, an easy calculation shows that
i.e., $\|EL_cX-EX\|=0$. The proof is finished.
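The invariance stated in Theorem 2 can be illustrated numerically on a toy case. The sketch below is not the expected scattering operator of this paper: it assumes a periodic discrete signal, a hypothetical two-tap zero-mean filter `haar` standing in for a wavelet, and a global average in place of the expected value. Under those assumptions, the average of the modulus of the filtered signal does not change when the signal is circularly translated.

```python
def circular_convolve(signal, kernel):
    """Circular (periodic) convolution of a signal with a short kernel."""
    n = len(signal)
    return [
        sum(kernel[k] * signal[(t - k) % n] for k in range(len(kernel)))
        for t in range(n)
    ]


def averaged_modulus(signal, kernel):
    """Global average of |signal * kernel|, a toy first-order coefficient."""
    filtered = circular_convolve(signal, kernel)
    return sum(abs(v) for v in filtered) / len(filtered)


def translate(signal, c):
    """Circular translation L_c x(t) = x(t - c)."""
    n = len(signal)
    return [signal[(t - c) % n] for t in range(n)]


# Hypothetical zero-mean filter (assumption, not the paper's wavelet).
haar = [0.5, -0.5]
x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]

coeff = averaged_modulus(x, haar)
coeff_shifted = averaged_modulus(translate(x, 3), haar)
```

Here `coeff` and `coeff_shifted` coincide because translation only permutes the filtered samples, and the average ignores their order.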
To classify a signal x,which is the realization of any unknown class Xl,E(is estimated from an averaged scattering transform.
Suppose that x is a random vector defined in RN.Make=x.So elements of propagated layers m+1 is
where Wavm+1x=x ? ψλm+1.Each expected value is estimated by a block averaging Am.x is randomly divided into m-1 parts,letbe averaged over blocks Bj,mof size Bj?,m,which defines a partition of{1,···,Nm}:
where $j\in\{1,2,\cdots,m-1\}$.
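As a rough illustration of block averaging, the following sketch replaces each block of a layer by its mean. It assumes contiguous blocks of equal size that partition the index set; the function name `block_average` and the toy values are hypothetical, and the random division into parts described above is not reproduced.

```python
def block_average(values, block_size):
    """Average consecutive blocks that partition the indices of `values`."""
    if block_size <= 0 or len(values) % block_size != 0:
        raise ValueError("block_size must be positive and divide len(values)")
    return [
        sum(values[start:start + block_size]) / block_size
        for start in range(0, len(values), block_size)
    ]


# Toy layer with six coefficients, averaged over three blocks of size two.
layer = [2.0, 4.0, 1.0, 3.0, 10.0, 0.0]
averages = block_average(layer, 2)
```

Each entry of `averages` plays the role of an empirical estimate of an expected value over one block.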
The averaged scattering transform outputs the block averages of all layers
The averaged scattering is non-expansive and preserves the mean-square norm[23].The following theorem proves that the averaged scattering is Lipschitz continuous.
Theorem 3 There exist constants C and M such that for x and ξ satisfyingwith probability 1,ifthen
with
By a direct computation,we have
We now write
This,together with Theorem 1,gives the desired result.
Let c be a constant.If ξ=c,thenis a translation of.The following theorem proves that ALcis translation invariant.
Theorem 4 Let c be translation variable.There exists a constant M such that for x satisfying
then
Proof From Theorem 2,we know that
Let $\xi=c$. Then $\|\nabla\xi\|_\infty=0$ and $\|H\xi\|_\infty=0$. Letting $J$ go to $\infty$, we get $K(\xi)=0$. It follows from Theorem 3 that
that is,
Thus,the proof is complete.
Texture classification involves two phases, learning and classification. In the learning phase, the original image is decomposed using expected scattering and averaged scattering, respectively. Energy features[11] of all the subbands are calculated by using the equation
where $E_k$ is the energy of the $k$th subband of dimension $M\times N$ with coefficients $x_k(i,j)$, $i=1,2,\cdots,M$, $j=1,2,\cdots,N$. These features are stored in the database for the purpose of classification.
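A minimal sketch of the energy computation follows. It assumes that the energy of a subband is the mean absolute value of its coefficients, which is a common choice in the texture literature; the exact formula used by the authors should be checked against reference [11]. The names `subband_energy` and `energy_features` are hypothetical.

```python
def subband_energy(subband):
    """Mean absolute value of an M x N subband given as a list of rows.

    Assumption: E_k = (1 / (M * N)) * sum over i, j of |x_k(i, j)|.
    """
    rows = len(subband)
    cols = len(subband[0])
    total = sum(abs(value) for row in subband for value in row)
    return total / (rows * cols)


def energy_features(subbands):
    """One energy value per subband, in the order the subbands are given."""
    return [subband_energy(band) for band in subbands]


# Two toy 2 x 2 subbands standing in for scattering coefficient matrices.
toy_subbands = [
    [[1.0, -1.0], [2.0, -2.0]],
    [[0.0, 4.0], [-4.0, 0.0]],
]
features = energy_features(toy_subbands)
```

The resulting list is the feature vector that would be stored for one image region.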
In the classification phase, an unknown texture image is decomposed using expected scattering and averaged scattering, respectively. Its features are obtained by equation (4). The feature vector derived from the unknown image is compared with the corresponding feature vectors in the database using the distance formula given in the following equation
where $P$ is the total number of features used, $i=1,2,\cdots,Q$, $Q$ is the number of images in the database, $f_j(x)$ denotes the $j$th feature of the unknown texture image $x$, and $f_j(i)$ represents the $j$th feature of the $i$th texture image.
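The comparison step can be sketched as below. The distance is assumed here to be the sum of absolute differences over the first P features; this is an assumption, since other distances fit the same description. The function name `feature_distance` and the toy vectors are hypothetical.

```python
def feature_distance(unknown, reference, num_features=None):
    """Sum of |f_j(x) - f_j(i)| over the first P features (assumed form)."""
    p = len(unknown) if num_features is None else num_features
    return sum(abs(unknown[j] - reference[j]) for j in range(p))


# Toy database of Q = 3 feature vectors and one unknown feature vector.
database = [[1.0, 2.0, 3.0], [2.0, 2.0, 5.0], [0.0, 1.0, 1.0]]
unknown = [2.0, 2.5, 4.0]

distances = [feature_distance(unknown, ref) for ref in database]
distances_first_two = [feature_distance(unknown, ref, 2) for ref in database]
```

Passing `num_features` makes it possible to study how the result changes as more features are involved, which is how the experiments are reported.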
Let the minimum distance over $i$ be
as given in reference[5]. In classification, if $D_{i\min}$ is attained at $i$, then $x$ is regarded as the $i$th texture image. The success of classification is assessed by the classification success rate, calculated using the following formula[1]
where S is the number of sub-images correctly classified and T represents the total number of sub-images which are derived from texture image database.
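The decision rule and the success rate can be sketched as follows, assuming that the success rate is the percentage S / T x 100. The names `classify` and `success_rate`, the distance table and the labels are hypothetical toy values, not experimental data.

```python
def classify(distances):
    """Index i at which the distance D(i) is smallest."""
    return min(range(len(distances)), key=lambda i: distances[i])


def success_rate(predicted, actual):
    """Percentage of correctly classified sub-images: 100 * S / T."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)


# Each row holds the distances from one unknown sub-image to three classes.
distance_table = [
    [0.2, 1.5, 3.0],
    [2.0, 0.1, 0.9],
    [1.1, 0.4, 0.7],
    [3.0, 2.0, 0.5],
]
true_labels = [0, 1, 2, 2]

predictions = [classify(row) for row in distance_table]
rate = success_rate(predictions, true_labels)
```

As a point of arithmetic only, with T = 84 sub-images a rate of about 10.71% corresponds to S = 9 under this formula.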
In this study, experiments are conducted with 20 monochrome texture images, each of size 512×512, obtained from the VisTex image database[24]. Each texture image is randomly subdivided into 84 image regions of size 64×64, so that the database contains a total of 1680 image regions. The feature vector of each image region is calculated from the subbands of the expected scattering and averaged scattering decompositions.
The number of subbands obtained in the decomposition varies with the parameters of expected scattering and averaged scattering. Taking into account experimental performance and computational complexity, the maximum number of scattering layers is 2 in our experiments. The other parameters are set as follows: the number of orientations is 4 and the maximum scale of the transform is 2. The resulting numbers of decomposition matrices in the zeroth, first and second layers of the transform are 1, 8 and 16, respectively.
For comparative analysis, the expected scattering and averaged scattering transforms are replaced by the wavelet transform. Each image region is decomposed into subbands up to the third level of decomposition, so that a total of 12 subbands are created by the wavelet transform. Energy features are calculated for the coefficients of every subband and stored in the feature library for classification.
In this paper, the experimental results are obtained by averaging over 20 trials. Fig.1 shows that the maximum mean accuracy rate for energy features of the expected scattering transform is 100%. The maximum success rate is 100% after 2 energy features. The lowest minimum success rate is only 10.71%; however, as the number of features increases, the minimum success rate eventually reaches 100%.
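One way to draw the image regions is sketched below. It assumes uniformly random top-left corners, a sampling scheme the text does not specify; `random_corners`, `crop` and the seed are hypothetical. Since only 64 disjoint 64×64 regions fit in a 512×512 image, 84 regions per image necessarily overlap to some extent.

```python
import random


def random_corners(image_size, region_size, count, seed=0):
    """Top-left corners of `count` random square regions inside the image."""
    rng = random.Random(seed)
    limit = image_size - region_size  # largest valid top-left coordinate
    return [(rng.randint(0, limit), rng.randint(0, limit)) for _ in range(count)]


def crop(image, top, left, size):
    """Extract a size x size region from an image stored as a list of rows."""
    return [row[left:left + size] for row in image[top:top + size]]


corners = random_corners(512, 64, 84)
total_regions = 20 * len(corners)  # 20 texture images, 84 regions each

# Small demonstration of the cropping helper on an 8 x 8 toy image.
toy_image = [[r * 8 + c for c in range(8)] for r in range(8)]
toy_region = crop(toy_image, 2, 3, 2)
```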
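The subband counts 1, 8 and 16 can be reproduced by a simple counting rule. The sketch assumes that layer m contains one matrix per choice of m orientations and per strictly increasing sequence of m scales, which is a common convention for scattering paths; whether the authors' implementation counts paths this way is an assumption. The function name `subbands_in_layer` is hypothetical.

```python
from math import comb


def subbands_in_layer(orientations, scales, layer):
    """Number of scattering paths of length `layer` (assumed counting rule)."""
    return orientations ** layer * comb(scales, layer)


# 4 orientations, maximum scale 2, layers 0 to 2.
counts = [subbands_in_layer(4, 2, m) for m in range(3)]
total = sum(counts)
```

Under this rule the decomposition yields 25 matrices in total, hence up to 25 energy features per image region.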
Figure 1:Texture Classification Results Obtained From Energy Features of Expected Scattering Transform.
The experimental results obtained with features of the averaged scattering transform are shown in Fig.2. The highest average accuracy rate is almost 100%. As with the expected scattering transform, the maximum success rate reaches 100% beginning with the third energy feature. Fig.2 also indicates that, as the number of features increases, the minimum success rate rises from 10.71% to 98.81%.
Fig.3 demonstrates that, starting with the seventh feature, the average success rate obtained from features of the expected scattering transform is higher than that achieved by features of the wavelet transform, and the largest gap is 2.62%. Fig.4 shows that, from the sixth feature onward, the mean success rate obtained with features of the averaged scattering transform is higher than that obtained with features of the wavelet transform, and the maximum improvement is 2.44%.
Figure 2:Texture Classification Results Obtained From Energy Features of Averaged Scattering Transform.
Figure 3:Texture Classification Results.They are Obtained From Energy Features of Expected Scattering Transform and Wavelet Transform,Respectively.
Figure 4:Texture Classification Results.They are Obtained From Energy Features of Averaged Scattering Transform and Wavelet Transform,Respectively.
In this paper, Lipschitz continuity and translation invariance of expected scattering and averaged scattering are proved in detail. We also report a new approach to extracting features of texture images based on expected scattering and averaged scattering, respectively, and apply these features to texture image classification. From the experiments conducted with texture images, it is concluded that, for both expected scattering and averaged scattering, the classification results not only improve as the number of features increases, but also improve significantly when the features of the first layer are involved.
Acknowledgements We would like to express our gratitude to Prof. Jiangshe Zhang for his valuable comments and suggestions, which led to a substantial improvement of this paper.
Chinese Quarterly Journal of Mathematics, 2019, Issue 2