Zhuo He, Zhaohui Tang*, Zhihao Yan, Jinping Liu
1 College of Information Science and Engineering,Central South University,Changsha 410083,China
2 Key Laboratory of High Performance Computing and Stochastic Information Processing of Ministry of Education of China,College of Mathematics and Computer Science,Hunan Normal University,Changsha 410081,China
Keywords: DTCWT; Working condition; Integrated classification model; Zinc fast roughing
ABSTRACT The surface texture of mineral flotation froth is widely acknowledged as an important index of the flotation process. The surface texture feature is closely related to the flotation working conditions and hence can be used as a visual indicator of the zinc fast roughing working condition. A novel working condition identification method based on the dual-tree complex wavelet transform (DTCWT) is proposed for process monitoring of zinc fast roughing. A three-level DTCWT is first applied to decompose the froth image into different directions and resolutions, and the energy parameter of each sub-image is then extracted as the froth texture feature. Next, an improved random forest integrated classification (iRFIC) model with 10-fold cross-validation is introduced as the classifier to identify the roughing working condition; it mitigates the shortcomings of a single model, overcomes feature redundancy, and achieves higher generalization performance. Extensive experiments have verified the effectiveness of the proposed method.
Froth flotation is the most widely used beneficiation method,but it is difficult to achieve optimal control of the method because of the long flotation process,the ambiguous internal mechanism and the key process parameters which cannot be detected online[1].For a long time,the working conditions were mainly determined by experienced operators through observing the froth states of flotation,which lacks a uniform standard,resulting in low utilization of the ore and high mineral production costs.Effective,low-cost,and on-line inspection technologies on the assembly production line are urgently needed in mass production processes[2].In recent years,with the development of the image processing technology,computer technology,and other related fields,the discrimination of abnormal conditions by machine vision monitoring has inspired various research topics[3].
At present, researchers in China and abroad have carried out some work on analyzing the working condition of the flotation process. Tang proposed a performance recognition method for the antimony flotation process, in which mutual coupling and obvious differences in importance exist among froth image features [4]. Zhang developed a novel method for determining the complex working conditions of flotation through statistical modeling of froth images [5]. Peng proposed a working condition recognition method based on an improved neighboring gray level dependence matrix (NGLDM) and an interval data classifier for the antimony roughing process [6]. These methods analyzed flotation conditions from different angles and achieved a certain degree of success, which illustrates the feasibility of using machine vision for flotation condition analysis.
Dual-Tree Complex Wavelet Transform(DTCWT)has related applications in many fields,such as image processing and fault diagnosis,because it solves the problems of shift variance and low directional selectivity in two and higher dimensions found with the commonly used discrete wavelet transform(DWT)[7,8].This paper first puts forward an image feature extraction method of froth flotation based on DTCWT,then proposes an improved random forest model to monitor how well the process of zinc flotation is working.Finally,the accuracy of the proposed method is verified by industrial data.
The zinc flotation process is an important part of the lead–zinc flotation process.According to the difference of the physical and chemical properties of the mineral surface,the useful zinc particles are selectively attached to the bubbles under the action of flotation reagents,and then rise to the top of the flotation tank.The useless ore particles fall to the bottom of the pulp in the flotation cell,to achieve effective separation of minerals.The flow diagram of the zinc flotation process is shown in Fig.1.(See Table 1.)
As shown in Fig. 1, zinc fast roughing is the upstream portion of the zinc flotation process. The froth layer of the zinc fast roughing process enters the cleaning stage, while the bottom flow of zinc fast roughing enters the roughing process. Zinc fast roughing conditions therefore directly affect the status of the follow-up processes. Good control of the zinc fast roughing module can ensure the concentrate grade and improve the recovery rate. Therefore, research on the zinc fast roughing module has great economic significance.
Fig.1.Zinc flotation process.
Recently, an improved variant of the discrete wavelet transform, the Dual-Tree Complex Wavelet Transform, was put forward [9]. Using two real wavelet decompositions, the two-dimensional DTCWT is implemented with two trees for the rows of the image and two trees for the columns. The resulting wavelet coefficients are then combined by simple sum and difference operations to give real and imaginary wavelet coefficients. This gives six sub-images which are approximately shift-invariant and oriented at ±15°, ±45° and ±75°. The method deals with the shift variance phenomenon by using parallel wavelet filtering with directional support (six planes) [10]. Furthermore, the filtering introduces a half-sample delay between the wavelet branches of the two trees, which allows near shift invariance and perfect reconstruction [11].
The complex wavelet function is $\psi(t)=\psi_h(t)+j\psi_g(t)$, where $\psi_h(t)$ and $\psi_g(t)$ are real wavelets. The two-dimensional DTCWT used for image analysis is given by Eq. (3), and the one-dimensional DTCWT by Eqs. (1) and (2), respectively; then:
Table 1 The steps of heterogeneous integration model
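The real parts of the six directional wavelets of the two-dimensional dual-tree complex wavelet transform are ($i=1,2,3$):
where $\phi_h(\cdot)$ and $\phi_g(\cdot)$ are low-pass functions, $\psi_h(\cdot)$ and $\psi_g(\cdot)$ are high-pass functions, and $\psi_{1,1}(x,y)=\phi_h(x)\psi_h(y)$, $\psi_{1,2}(x,y)=\psi_h(x)\phi_h(y)$, $\psi_{1,3}(x,y)=\psi_h(x)\psi_h(y)$, $\psi_{2,1}(x,y)=\phi_g(x)\psi_g(y)$, $\psi_{2,2}(x,y)=\psi_g(x)\phi_g(y)$, $\psi_{2,3}(x,y)=\psi_g(x)\psi_g(y)$.
The imaginary parts of the six directional wavelets of the two-dimensional dual-tree complex wavelet transform are:
where $\psi_{3,1}(x,y)=\phi_g(x)\psi_h(y)$, $\psi_{3,2}(x,y)=\psi_g(x)\phi_h(y)$, $\psi_{3,3}(x,y)=\psi_g(x)\psi_h(y)$, $\psi_{4,1}(x,y)=\phi_h(x)\psi_g(y)$, $\psi_{4,2}(x,y)=\psi_h(x)\phi_g(y)$, $\psi_{4,3}(x,y)=\psi_h(x)\psi_g(y)$.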
The six-direction selective bandpass image is composed of the real component obtained by Eq.(4)and the imaginary part obtained by Eq.(5).
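For reference, in the commonly cited formulation of the two-dimensional DTCWT, the six real and six imaginary oriented wavelets are usually written as follows for $i=1,2,3$. This is the textbook form and is only assumed to correspond to Eqs. (4) and (5); the component wavelets $\psi_{1,i},\ldots,\psi_{4,i}$ are the separable products listed in the text.

```latex
\begin{align*}
\text{Real part:}\quad
\psi_i(x,y)     &= \tfrac{1}{\sqrt{2}}\bigl(\psi_{1,i}(x,y) - \psi_{2,i}(x,y)\bigr), &
\psi_{i+3}(x,y) &= \tfrac{1}{\sqrt{2}}\bigl(\psi_{1,i}(x,y) + \psi_{2,i}(x,y)\bigr), \\
\text{Imaginary part:}\quad
\psi_i(x,y)     &= \tfrac{1}{\sqrt{2}}\bigl(\psi_{3,i}(x,y) + \psi_{4,i}(x,y)\bigr), &
\psi_{i+3}(x,y) &= \tfrac{1}{\sqrt{2}}\bigl(\psi_{3,i}(x,y) - \psi_{4,i}(x,y)\bigr).
\end{align*}
```

Each complex directional wavelet is then the real part plus $j$ times the matching imaginary part, which yields the six oriented sub-bands per decomposition level.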
Energy, variance and entropy are used as the texture features to represent the working condition of the flotation process in zinc fast roughing. After the multi-scale DTCWT decomposition, the energy, variance and entropy of each decomposed sub-image are defined as: where R(ik) is the energy, V(ik) is the variance and E(dk) is the entropy; M is the width of the sub-image, N is the height of the sub-image, and f(m,n) is the gray value of the pixel at (m,n). The texture feature obtained by a k-level wavelet decomposition is
As shown in Fig. 2, (a) is a typical froth image and (b) shows the sub-images after 3-level wavelet decomposition of (a). Texture features can be obtained by energy, variance and entropy analysis of (b).
After the texture features have been extracted, classification can be realized by analyzing the energy of the froth image. Random forest, a machine learning method for classification and regression, was proposed by Leo Breiman and Adele Cutler in 2001 and is widely applied in many areas [12].
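The three statistics can be illustrated with a short sketch. The exact definitions did not survive in the text, so the forms used here are assumptions: energy as the mean squared coefficient magnitude, variance of the magnitudes about their mean, and Shannon entropy of the normalized energy distribution. The function names `subband_features` and `texture_vector` are hypothetical.

```python
import math

def subband_features(sub):
    """Return (energy, variance, entropy) for one sub-image given as a 2-D list.

    Assumed definitions (not taken from the paper):
      energy   = mean of |f(m, n)|^2 over the M x N sub-image
      variance = mean squared deviation of |f(m, n)| from its mean
      entropy  = Shannon entropy (bits) of the normalized energy distribution
    """
    mags = [abs(v) for row in sub for v in row]  # works for real or complex values
    n = len(mags)                                # n = M * N
    total = sum(m * m for m in mags)
    energy = total / n
    mean = sum(mags) / n
    variance = sum((m - mean) ** 2 for m in mags) / n
    entropy = 0.0
    if total > 0:
        for m in mags:
            p = m * m / total
            if p > 0:
                entropy -= p * math.log2(p)
    return energy, variance, entropy

def texture_vector(subbands):
    """Concatenate the three statistics of every sub-band into one feature list."""
    feats = []
    for sub in subbands:
        feats.extend(subband_features(sub))
    return feats
```

With 3 levels and 6 directions per level, `texture_vector` would receive 18 sub-bands and return 54 values; keeping only the energy term gives 18 features.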
Random forest, a collection of tree classifiers, is an ensemble learning method. Each base classifier is built using the classification and regression tree (CART) algorithm without pruning. For every base classifier, the random forest draws a bootstrap sample of the training data and uses the out-of-bag samples to calculate error estimates. At each node of a tree, only a small random subset of the variables is considered as candidate splitting variables. In this way, many different base classifiers are produced; hence the name "random forest." Integrating multiple base classifiers improves the generalization, effectiveness and reliability of the model. Despite this, it is not ideal to have multiple base classifiers.
where $G(a_i,b_i,x)$ is the output function of the $i$th hidden node; $a_i$ and $b_i$ are the hidden-layer parameters; $\beta$ holds the output weights of the hidden-layer nodes; $h(x)=[G(a_1,b_1,x),\ldots,G(a_i,b_i,x)]$ is the hidden-layer feature mapping; and $G(\cdot)$ is an activation function, which may be a sigmoid, sine or RBF.
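The bootstrap and out-of-bag mechanism described above can be sketched in a few lines. The sample count 280 and feature count 18 are taken from the experimental section; reading "the number of feature subsets is 10" as 10 features per sub-model is an assumption, and both function names are hypothetical.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; the never-drawn ones are out-of-bag."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in chosen]
    return in_bag, out_of_bag

def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices at random (sampling without replacement)."""
    return sorted(rng.sample(range(n_features), k))

rng = random.Random(0)                         # fixed seed for repeatability
in_bag, oob = bootstrap_split(280, rng)        # one training subset and its OOB set
features = random_feature_subset(18, 10, rng)  # assumed: 10 of 18 features per model
```

On average about 36.8% of the samples end up out-of-bag for each bootstrap draw, which is what makes them usable as a built-in validation set.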
Fig.2.Dual-Tree complex wavelet transform.
ELM solves for the output weight $\beta$ by minimizing the training error and the norm of the output weights at the same time, namely by solving the following constrained optimization problem:
s.t. $h(x_i)\beta = y_i - \xi_i$, $i=1,2,\ldots,N$, where $\xi_i$ is the training error and $C$ is the penalty parameter.
Eq.(7)can be turned into the dual problem in Eq.(8):
where every Lagrange function corresponds to the i th sample.Derivation of Eq.(8):
where α =[α1,…,αN].Solved by Eq.(9):
For a small sample of training data,Eq.(8)can be represented as:
The output function of ELM can be obtained by Eqs.(10)and(11):
By replacing the hidden layer mapping of ELM with the kernel function satisfying the Mercer condition(such as RBF and the polynomial kernel),the output of the kernel extreme learning machine(KELM)can be represented as:
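For reference, the standard ELM and KELM expressions from the literature are collected below. They are the textbook forms and are only assumed to correspond to Eqs. (6)–(12) of this paper; $H$ denotes the matrix whose rows are $h(x_i)$, $Y=[y_1,\ldots,y_N]^{T}$, and $\Omega$ is the kernel matrix.

```latex
\begin{align*}
f(x) &= h(x)\beta \\
\min_{\beta,\xi}\;& \tfrac{1}{2}\lVert\beta\rVert^{2} + \tfrac{C}{2}\sum_{i=1}^{N}\xi_i^{2}
      \quad \text{s.t.}\quad h(x_i)\beta = y_i - \xi_i,\; i=1,\ldots,N \\
L(\beta,\xi,\alpha) &= \tfrac{1}{2}\lVert\beta\rVert^{2} + \tfrac{C}{2}\sum_{i=1}^{N}\xi_i^{2}
      - \sum_{i=1}^{N}\alpha_i\bigl(h(x_i)\beta - y_i + \xi_i\bigr) \\
\frac{\partial L}{\partial \beta} = 0 &\;\Rightarrow\; \beta = H^{T}\alpha, \qquad
\frac{\partial L}{\partial \xi_i} = 0 \;\Rightarrow\; \alpha_i = C\xi_i, \qquad
\frac{\partial L}{\partial \alpha_i} = 0 \;\Rightarrow\; h(x_i)\beta - y_i + \xi_i = 0 \\
\beta &= H^{T}\Bigl(\tfrac{I}{C} + HH^{T}\Bigr)^{-1} Y \\
f(x) &= h(x)H^{T}\Bigl(\tfrac{I}{C} + HH^{T}\Bigr)^{-1} Y \\
f(x) &= \bigl[K(x,x_1),\ldots,K(x,x_N)\bigr]\Bigl(\tfrac{I}{C} + \Omega\Bigr)^{-1} Y,
      \qquad \Omega_{ij} = K(x_i,x_j)
\end{align*}
```

The last line follows from the one before it by replacing the inner products $h(x)\,h(x_j)^{T}$ with a Mercer kernel $K(x,x_j)$, so the hidden-layer mapping never has to be computed explicitly.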
This research uses the KELM with the RBF kernel because the RBF kernel function has a good local characteristic and strong learning ability.Adjacent sample data has a large impact on kernel function value.
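A minimal KELM classifier with an RBF kernel can be sketched as follows, assuming NumPy is available. The kernel parameterization $\exp(-\lVert u-v\rVert^{2}/\gamma)$, the +1/−1 one-hot target encoding, and the default values of `C` and `gamma` are illustrative assumptions rather than the settings used in the paper; the class name `KELM` is hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """K[i, j] = exp(-||A_i - B_j||^2 / gamma) (assumed parameterization)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / gamma)

class KELM:
    """Kernel extreme learning machine: solve (I/C + Omega) alpha = T."""

    def __init__(self, C=1.0, gamma=1.0):
        self.C, self.gamma = C, gamma

    def fit(self, X, y):
        self.X = np.asarray(X, dtype=float)
        self.classes = np.unique(y)
        # One column per class, +1 for the true class and -1 otherwise.
        T = np.where(np.asarray(y)[:, None] == self.classes[None, :], 1.0, -1.0)
        omega = rbf_kernel(self.X, self.X, self.gamma)
        n = len(self.X)
        self.alpha = np.linalg.solve(np.eye(n) / self.C + omega, T)
        return self

    def predict(self, X):
        K = rbf_kernel(np.asarray(X, dtype=float), self.X, self.gamma)
        # The class whose output column scores highest wins.
        return self.classes[np.argmax(K @ self.alpha, axis=1)]
```

Because the RBF kernel decays quickly with distance, a prediction is dominated by the training samples nearest to the query point, which is the local characteristic mentioned above.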
The number of training sets after resampling in this work is J,and every CART or KELM is trained by using each training set.Then,a sub model with higher accuracy is used as the j th model output:
where Xjis the j th training data subset.fjis the j th sub-model obtained by KELM or CART.
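The per-subset choice between a CART and a KELM sub-model can be sketched as below. Models are represented as plain callables that map one sample to a label. Scoring the candidates on held-out (for example out-of-bag) samples and combining the selected sub-models by majority vote are assumptions made for illustration, and all function names are hypothetical.

```python
from collections import Counter

def accuracy(model, X, y):
    """Fraction of samples for which the model's label matches the target."""
    hits = sum(1 for xi, yi in zip(X, y) if model(xi) == yi)
    return hits / len(y)

def pick_submodel(candidates, X_val, y_val):
    """Keep the more accurate candidate; on a tie the first one is kept."""
    return max(candidates, key=lambda m: accuracy(m, X_val, y_val))

def majority_vote(models, x):
    """Combine the selected sub-models by taking the most common predicted label."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]
```

In use, `pick_submodel([cart_j, kelm_j], X_oob_j, y_oob_j)` would be called once for each of the J training subsets, and `majority_vote` applied to the J selected models.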
In conclusion,the steps of the heterogeneous integration model are:
Fig.3.10-Fold cross-validation.
The performance and generalization error of the entire random forest were assessed using stratified 10-fold cross-validation, which is currently the preferred technique in data mining [13]. Compared with the split-sample (test and validation) approach, this method decreases the variance of the prediction error, leading to a more accurate estimate of model prediction performance. Furthermore, it maximizes the use of data for both training and validation while avoiding overfitting and overlap between test and validation data [14].
Briefly, the dataset is randomly divided into 10 equal folds. Each fold is used in turn as the validation set, with the remaining folds as the training set. The overall validation result is the average over the 10 resulting models (Fig. 3).
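The fold construction can be sketched without any external library. This is a minimal illustration of stratified k-fold splitting, not the authors' implementation; `stratified_kfold` and `cross_val_accuracy` are hypothetical names, and `fit_predict` stands for any routine that trains on the training folds and returns labels for the validation fold.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs with each class spread evenly over k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)  # deal the class round-robin across folds
    for i in range(k):
        val = folds[i]
        train = [idx for j, f in enumerate(folds) if j != i for idx in f]
        yield train, val

def cross_val_accuracy(fit_predict, X, y, k=10):
    """Average validation accuracy over the k folds."""
    scores = []
    for train, val in stratified_kfold(y, k):
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in val])
        scores.append(sum(p == y[i] for p, i in zip(preds, val)) / len(val))
    return sum(scores) / len(scores)
```

With four classes of 100 images each, every validation fold holds 10 images per class and every training set holds the remaining 360.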
In this section, the proposed method is simulated and validated using industrial production data obtained from the zinc fast roughing process of a lead–zinc flotation plant in China. We collected data, including zinc grade and froth images, every 20 min for 2 months (24 h a day). Based on the experience of operators in the plant, four typical classes of froth image were selected, with 100 froth images per class. As shown in Fig. 5, they are the under flow image, the over flow image, the higher-grade image under normal working conditions and the lower-grade image under the same normal conditions.
The three-level Dual-Tree Complex Wavelet Transform is used because of its precision and computational complexity relative to other decomposition levels (Fig. 4). The training data set is $X_{280\times 18}$, the working condition categories are $T_1$, $T_2$, $T_3$ and $T_4$, and the category vector of working conditions is $Y=[y_1,y_2,\ldots,y_{280}]$, where $y_i\in\{1,2,3,4\}$. The penalty parameter and kernel parameter of the KELM used in the improved random forest are determined by grid search; the value of the penalty parameter is 28 and the value of the kernel parameter is 24. The number of models is 100 and the number of feature subsets is 10, chosen according to repeated test results and model complexity.
Fig.4.Classified accuracy in different level of DTCWT.
Fig.5.Froth images of four different typical conditions.
The DTCWT texture feature vector of a froth image decomposed over three levels has a high dimension: each level has sub-images in 6 directions, and each sub-image has 3 texture features, giving 3 × 6 × 3 = 54 features. This large number of features not only increases the computational burden but also contains redundant elements, which creates great difficulty for the subsequent industrial control system. An important property of the random forest algorithm is that it can evaluate the importance of variables, and it provides four strategies for doing so. We use out-of-bag data to evaluate the importance of every feature (Figs. 4–9).
The commonly used gray-level co-occurrence matrix (GLCM) [15] serves as a baseline for comparing texture feature classification ability with the method proposed in this paper. The classification performance of the random forest algorithm and the improved random forest algorithm is also compared in Fig. 10.
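One common way to score features on out-of-bag data is permutation importance: shuffle one feature column and record how much the accuracy drops. The sketch below illustrates that idea only; it is an assumption that this matches the strategy used here, and `permutation_importance` is a hypothetical helper operating on a model given as a callable.

```python
import random

def permutation_importance(model, X, y, n_features, seed=0):
    """Accuracy drop when each feature column is shuffled in turn.

    X is a list of feature lists (e.g. the out-of-bag samples), y the labels,
    and model a callable mapping one feature list to a predicted label.
    """
    rng = random.Random(seed)

    def acc(rows):
        return sum(model(r) == t for r, t in zip(rows, y)) / len(y)

    baseline = acc(X)
    importances = []
    for f in range(n_features):
        column = [row[f] for row in X]
        rng.shuffle(column)  # break the link between feature f and the label
        permuted = [row[:f] + [v] + row[f + 1:] for row, v in zip(X, column)]
        importances.append(baseline - acc(permuted))
    return importances
```

A feature the model never relies on gets an importance of exactly zero, so ranking the 54 features by this score is one way to expose the redundant ones.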
Fig. 6. The importance of every energy feature.
Fig. 7. The importance of every variance feature.
Fig. 8. The importance of every entropy feature.
Fig.9.The feature weight of every feature.
Fig.10.The accuracy of working condition recognition.
On the 400 images, the traditional image texture classification method reaches a highest accuracy of 86%. With our method, 89 images were correctly predicted as the under flow image, 92 as the over flow image, 87 as the higher-grade image and 88 as the lower-grade image. The DTCWT and improved random forest method thus achieved an average prediction accuracy of 89%.
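The reported average follows directly from the per-class counts, as this short check shows; the dictionary keys are just labels for the four classes named in the text.

```python
# Correctly classified images per class, out of 100 images in each class.
correct = {"under flow": 89, "over flow": 92, "higher grade": 87, "lower grade": 88}
per_class_total = 100

per_class_accuracy = {name: n / per_class_total for name, n in correct.items()}
overall = sum(correct.values()) / (per_class_total * len(correct))  # 356 / 400
```

Because every class has the same number of images, the overall accuracy equals the mean of the four per-class accuracies.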
As shown in Fig.10,the texture feature of the zinc fast roughing froth image obtained by the method in this paper can accurately reflect the change of working condition of zinc fast roughing,and this proves the feasibility and validity of this method.The improved random forest algorithm can reduce the single model error and enhance the reliability of the working condition recognition.
This article proposes a texture feature extraction method based on the DTCWT in an attempt to overcome the limitations of the single-rate method. The method has the advantages of approximate shift invariance and directional selectivity. Furthermore, an improved random forest integrated classification model, which combines KELM and the classification and regression tree (CART) as basic classifiers, is introduced to reduce the classification error of a single model. The method, which has been validated using industrial data, can effectively represent the process characteristics under different conditions and can accurately identify the production conditions of zinc flotation, providing effective guidance for field production.
Chinese Journal of Chemical Engineering, 2018, Issue 8