1.College of Air Traffic Management,Civil Aviation University of China,Tianjin 300300,P.R.China;2.Research Institute of Civil Aviation Safety,Civil Aviation University of China,Tianjin 300300,P.R.China;3.Polaris Aviation Technology Co.,Ltd.,Tianjin 300300,P.R.China
Abstract: Prediction often has to be carried out with small samples and insufficient information. To address this problem, we present a performance comparison study of prediction and optimization algorithms based on experimental data analysis. Through a large number of prediction and optimization experiments, the accuracy and stability of the prediction methods and the correction ability of the optimization methods are studied. First, five traditional single prediction methods are used to process small samples with insufficient information, and the standard deviation method is used to assign weights to the five methods for combined forecasting. The prediction results are ranked by accuracy, and the mean and variance of the rankings reflect the accuracy and stability of each prediction method. Second, the error elimination optimization method (EEOM) is proposed. The prediction results are corrected separately by EEOM, Markov optimization and two-layer optimization to obtain more accurate prediction results. The degrees of precision improvement and decline are used to reflect the correction ability of each optimization method. The results show that combined prediction has the best accuracy and stability among the prediction methods, and that error elimination optimization has the best correction ability among the optimization methods. The combination of the two methods can solve the problem of prediction with small samples and insufficient information well. Finally, the accuracy of combined prediction followed by error elimination optimization is verified by predicting the number of unsafe events in civil aviation in a certain year.
Key words: small sample and poor information; prediction method performance; optimization method performance; combined prediction; error elimination optimization model; Markov optimization
Managers often predict future trends based on small samples and poor information. To achieve the expected prediction effect, it is particularly important to select appropriate forecasting methods and to optimize the prediction results if necessary. This paper focuses on the problem of information prediction with small samples.
The problem of poor sample information is characterized by a lack of information and a small number of samples. Lei et al.[1] established a GM(1,1) model for time-interval prediction of soft foundation settlement using grey theory, modified it with a residual-error GM(1,1) model, and compared the results with the logarithmic curve estimation method. The results showed that the model was more accurate and more consistent with reality. Chen et al.[2] used the grey fuzzy dynamic model to predict the production of municipal solid waste based on limited samples, and the prediction precision was higher than that of the traditional grey dynamic model. Bruno et al.[3] studied coastal dynamics with polynomial prediction and compared it with linear regression prediction. The results showed that the polynomial prediction model was more suitable for this problem. Using the time series neural network method and the rolling weight adjustment method, Yang et al.[4] predicted wind speed, and the precision of the prediction results was higher than that of the time series prediction method. Barbounis et al.[5] used a local recurrent neural network with internal dynamics to study wind speed prediction. The simulation results showed that this model performed better than other network models. Niu et al.[6] used the support vector machine (SVM) method to predict short-term load based on data mining, and the prediction results showed that this method had higher prediction precision than the ordinary back propagation (BP) neural network model. Muzaffar et al.[7] used a special recurrent neural network, the long short-term memory network, to predict short-term loads. Measured by root mean square error (RMSE) and mean absolute percentage error (MAPE), the prediction precision was higher than that of traditional methods and could be further improved.
Dudek[8] proposed a linear regression method with a single predictive variable to predict short-term power load, and compared its performance with the autoregressive integrated moving average (ARIMA) model, the exponential smoothing model, the neural network model and other models, confirming the high precision of the method. Li et al.[9] used an adaptive exponential smoothing model to predict the short-term travel time of urban arterial streets, and the model could deal with almost all kinds of traffic conditions. Combining the artificial neural network prediction method with the Markov model, Pourmousavi et al.[10] predicted wind speed and improved the prediction precision.
Although these prediction methods can achieve the purpose of prediction and the optimization models can effectively improve the prediction precision, they are only suitable for specific research problems. For a different prediction problem, the prediction method has to be selected again. Therefore, this paper studies the prediction problem of poor information events with small samples. We establish a combined prediction model based on the prediction error and an optimization model for the prediction results, and analyze the performance of both. To test and verify the performance of the models, we conduct a large number of prediction experiments to assess the accuracy and stability of the prediction methods and the correction ability of the optimization models. A variety of samples are used in the prediction experiments to ensure the universality of the prediction samples and the generality of the prediction model and the optimization model.
The combinatorial prediction method comprehensively analyzes and combines the results of different methods for the same problem. Its purpose is to improve the prediction precision as much as possible by jointly using the information provided by the different methods. While an event is still developing, it is often difficult for a single prediction model to fit the frequent fluctuations closely. Compared with a single prediction model, the combinatorial prediction model can obtain a better prediction result than any single prediction model, reduce the systematic error of prediction, and significantly improve the prediction effect.
The combinatorial prediction is shown as

$$y=\sum_{i=1}^{q} w_i y_i$$

where $y$ is the result of combination prediction, $y_i$ the prediction result of the $i$th traditional single prediction method, $w_i$ the weight coefficient of the $i$th traditional single prediction method, $s_i$ the standard deviation of the prediction result of the $i$th traditional single prediction method, and $q$ the number of traditional single prediction methods.
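The weighted combination can be illustrated with a minimal Python sketch. The weight rule used here, $w_i=(S-s_i)/(S(q-1))$ with $S=\sum_i s_i$, is a common form of the standard deviation method and is an assumption, since the text does not spell out the weight formula. The function names and all numbers are hypothetical.

```python
def std_weights(stds):
    """Assumed standard-deviation rule: w_i = (S - s_i) / (S * (q - 1)), S = sum(s_i)."""
    q = len(stds)
    total = sum(stds)
    if q < 2 or total <= 0:
        raise ValueError("need at least two methods and a positive total deviation")
    # A method with a smaller standard deviation receives a larger weight.
    return [(total - s) / (total * (q - 1)) for s in stds]


def combine(predictions, weights):
    """Weighted combination y = sum_i w_i * y_i."""
    if len(predictions) != len(weights):
        raise ValueError("one weight per prediction is required")
    return sum(w * y for w, y in zip(weights, predictions))


# Hypothetical values for five single methods (not taken from the paper).
stds = [2.0, 4.0, 6.0, 8.0, 20.0]
preds = [100.0, 104.0, 98.0, 110.0, 90.0]
weights = std_weights(stds)
combined = combine(preds, weights)
print(weights, combined)
```

Under this rule the weights sum to one by construction, so the combined value always lies between the smallest and largest single prediction.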
Prediction results always contain errors, and no optimization model can eliminate the prediction error completely. Error elimination is therefore defined as reducing the average prediction error to a level that is acceptable to the forecaster. Based on this definition, a new optimization model for prediction results is established and named the error elimination optimization model (EEOM). The model is described as follows.
The initial predicted value is processed as
where the quantities involved are the relative error of the initial predicted value, the initial prediction result, the true value, the overall average error level of the initial prediction result, and the number of sample data $K$.
The acceptable average prediction error level is $\varepsilon$. If the overall average error level of the initial prediction result does not exceed $\varepsilon$, the initial prediction result is accepted; otherwise the iteration is given as
where the quantities involved are the optimized value of the predicted result after the $l$th iteration ($l=1,2,3,\dots,n$), the overall average error level after the $l$th optimization iteration, the prediction error of the $k$th sample datum after the $l$th optimization iteration ($k=1,2,3,\dots,K$), and the optimization result of the predicted value of the $k$th sample datum after the $l$th optimization iteration. When the overall average error level no longer exceeds $\varepsilon$, the iteration ends, and the prediction optimization result is shown as
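The iterative idea can be sketched in Python as follows. This is only one possible reading of the description: the update rule (rescaling every value by one minus the signed mean relative error) and the use of the signed mean as the "overall average error level" are assumptions, not the formulas of the paper. The names `eeom_sketch`, `eps` and `max_iter` are hypothetical, and the true values are assumed to be nonzero.

```python
def mean_relative_error(predicted, actual):
    """Signed mean of (prediction - truth) / truth over the sample."""
    return sum((p - a) / a for p, a in zip(predicted, actual)) / len(actual)


def eeom_sketch(fitted, actual, forecast, eps=0.01, max_iter=50):
    """Rescale fitted values and the forecast until |mean relative error| <= eps."""
    fitted = list(fitted)
    for _ in range(max_iter):
        mean_err = mean_relative_error(fitted, actual)
        if abs(mean_err) <= eps:
            break  # average error has reached the acceptable level
        factor = 1.0 - mean_err  # assumed update rule, not the paper's
        if factor <= 0:
            raise ValueError("mean relative error too large for this update rule")
        fitted = [f * factor for f in fitted]
        forecast *= factor
    return fitted, forecast


# Hypothetical sample: three known values and one forecast to be corrected.
actual = [100.0, 200.0, 300.0]
fitted = [120.0, 230.0, 330.0]
new_fitted, new_forecast = eeom_sketch(fitted, actual, forecast=400.0)
print(new_fitted, new_forecast)
```

With this particular rule the mean relative error $m$ becomes $-m^2$ after one step, so the loop stops after a few iterations whenever $|m|<1$.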
Markov optimization studies the transfer law between states according to the division of data states to predict the future trend of the system[11].
For each prediction method,the relative values of the original sequence and the prediction sequence are calculated as
where the two quantities involved are the original sequence value and the predicted sequence value.
The relative values of the predicted results are divided into $n$ kinds of states, denoted as $E_1, E_2, \dots, E_n$. The interval of each state is $[e_{is}, e_{ir}]$ ($i=1,2,\dots,n$), where $e_{is}$ is the minimum value of the interval and $e_{ir}$ the maximum value. Each relative value $C$ falls in one of the states $E_i$. The probability of transferring from state $E_i$ ($i=1,2,\dots,n$) to another state $E_j$ ($j=1,2,\dots,n$) is $P_{ij}$, which is called the state transfer probability. $P_{ij}$ is calculated as

$$P_{ij}=\frac{C_{ij}}{C_i}$$

where $C_i$ is the total number of occurrences of state $E_i$ and $C_{ij}$ the number of transfers from state $E_i$ to $E_j$. Then the state transition probability matrix $P$ is shown as

$$P=\begin{bmatrix}P_{11}&P_{12}&\cdots&P_{1n}\\P_{21}&P_{22}&\cdots&P_{2n}\\\vdots&\vdots&&\vdots\\P_{n1}&P_{n2}&\cdots&P_{nn}\end{bmatrix}$$
By using the state transition probability matrix $P$, the possible future states and trends can be predicted from the current states. The relative value state $E_i$ can be obtained from the matrix $P$ and the predicted result of the prediction model. The median value $e_i$ of the relative value state interval $[e_{is}, e_{ir}]$, that is, the relative value of state $E_i$, is used as the optimization coefficient of the predicted result. Then the optimization result $y$ can be calculated as
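The Markov correction steps can be sketched in Python as below. Several points are assumptions made for illustration: the relative value is taken as the ratio of original to predicted value, the most probable next state is chosen from the row of the current state, the correction multiplies the forecast by the interval midpoint, and each row of the matrix is normalized by the number of transitions leaving the state. All names and numbers are hypothetical.

```python
def assign_state(value, bounds):
    """Return the index of the state interval [low, high) that contains value."""
    for i, (low, high) in enumerate(bounds):
        is_last = i == len(bounds) - 1
        if low <= value < high or (is_last and value == high):
            return i
    raise ValueError(f"{value} lies outside every state interval")


def transition_matrix(states, n):
    """Estimate P[i][j] as (transitions i -> j) / (transitions leaving i)."""
    counts = [[0] * n for _ in range(n)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def markov_correct(forecast, current_state, matrix, bounds):
    """Scale the forecast by the midpoint of the most probable next state."""
    row = matrix[current_state]
    next_state = max(range(len(row)), key=row.__getitem__)
    low, high = bounds[next_state]
    return forecast * (low + high) / 2.0


# Hypothetical ratios of original to predicted values, and three state intervals.
ratios = [0.95, 1.02, 1.08, 0.97, 1.03, 1.06, 0.96, 1.01]
bounds = [(0.90, 1.00), (1.00, 1.05), (1.05, 1.10)]
states = [assign_state(r, bounds) for r in ratios]
P = transition_matrix(states, len(bounds))
corrected = markov_correct(200.0, states[-1], P, bounds)
print(states, P, corrected)
```

In practice the state intervals are chosen from the spread of the observed relative values, and a state that never occurs keeps an all-zero row in this sketch.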
2.1.1 Evaluation principle
A variety of prediction models, including the combinatorial prediction model, are used to conduct prediction experiments, and the prediction precision of the results of each prediction model is calculated as
where the quantities involved are the prediction precision of the $i$th prediction model for the $j$th item of event $l$, the statistical true value $y_j^{(l)}$ of the $j$th item of event $l$, and the value predicted by the $i$th prediction model for the $j$th item of experiment $l$.
In each experiment, the prediction models are ranked by prediction precision from high to low. After a large number of experiments, the mean and variance of the ranking of each prediction method are calculated. The mean of the ranking reflects the accuracy of the method, and the variance reflects its stability. Together, the accuracy and stability of a prediction model reflect its advantages and disadvantages.
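The ranking procedure can be sketched in Python as follows. The precision measure used here, one minus the absolute relative error, is an assumption for illustration, as are the model names and the data. Ties are broken by the order in which models are listed.

```python
import statistics


def precision(predicted, actual):
    """Assumed precision measure: 1 - |predicted - actual| / |actual|."""
    return 1.0 - abs(predicted - actual) / abs(actual)


def rank_models(precisions):
    """Map each model to its rank, where rank 1 is the most precise."""
    ordered = sorted(precisions, key=precisions.get, reverse=True)
    return {model: rank for rank, model in enumerate(ordered, start=1)}


def rank_summary(experiments):
    """Return {model: (mean rank, population variance of rank)}."""
    ranks = {}
    for actual, predictions in experiments:
        scores = {m: precision(v, actual) for m, v in predictions.items()}
        for model, rank in rank_models(scores).items():
            ranks.setdefault(model, []).append(rank)
    return {m: (statistics.mean(r), statistics.pvariance(r)) for m, r in ranks.items()}


# Hypothetical experiments: (true value, {model: predicted value}).
experiments = [
    (100.0, {"linear": 90.0, "grey": 105.0, "combined": 102.0}),
    (50.0, {"linear": 51.0, "grey": 45.0, "combined": 49.5}),
    (200.0, {"linear": 230.0, "grey": 190.0, "combined": 204.0}),
]
summary = rank_summary(experiments)
print(summary)
```

A low mean rank indicates an accurate method and a low rank variance indicates a stable one, which matches how the two statistics are read in the text.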
2.1.2 Verification analysis
This paper uses historical data to predict the number of unsafe civil aviation events, take-off and landing flights, turnover of passenger traffic, total mail volume, etc. in five years, involving 150 prediction experiments, each of which involves six prediction models. In the combinatorial prediction model, the weight of each single prediction model is determined by the standard deviation of its prediction result. After the prediction models are established and applied, the prediction precision ranking of each prediction model is obtained, as shown in Table 1.
The ranking data in Table 1 can be used to calculate the rank mean and variance of each prediction method. The results are shown in Table 2.
Although the rank variance of the exponential prediction model is the lowest and its stability is the best, its accuracy is the worst among all the prediction models, so the exponential prediction model cannot be used for prediction in most cases. Although the accuracy of the polynomial prediction is close to that of the combinatorial prediction, its stability is poor, so the polynomial prediction model is not suitable for general problems. Therefore, the mean and variance of the ranking of each prediction method show that the combinatorial prediction model can be used as a prediction method for general poor information events because of its accuracy and stability.
Table 1 Ranking statistics of forecasting methods
Table 2 Ranking mean and variance of forecasting methods
The optimization models involved in the comparison are EEOM, the Markov optimization model and the two-layer optimization model. The two-layer optimization model combines the error elimination optimization with the Markov optimization: after the first optimization model has been applied, a second optimization model is used to further modify the first optimization result.
2.2.1 Evaluation principles
The set of prediction data is $n=\{1,2,3,\dots,N\}$. The set of optimization models is $m=\{1,2,3,\dots,M\}$. The difference $\varepsilon(n)$ between two optimization models $i$ and $j$ is the accuracy of model $i$ minus the accuracy of model $j$ on the $n$th prediction datum.
If $\varepsilon(n)$ is positive, the optimization model $i$ has higher precision and better correction performance. Otherwise, the optimization model $j$ has better correction performance. The correction performance of an optimization model refers to its ability to bring the predicted result close to the real value.
The statistical function that counts the times when model $i$ has higher precision than model $j$ is shown as
And the precision difference sequence $\varepsilon(n)$ is processed as
where $\varepsilon'$ is the set with elements of small precision difference, $\varepsilon''$ the set with elements of medium precision difference, $\varepsilon'''$ the set with elements of large precision difference, and $[\cdot]$ the integer operator.
After the predicted results are optimized by an optimization model, the precision of the prediction is usually improved, but occasionally it decreases. Therefore, in addition to the precision itself, the degree of precision decline can also be used to evaluate the correction ability of the optimization model.
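The comparison of two optimization models can be sketched in Python as below: it counts how often each model is more precise, sorts the precision differences into small, medium and large classes, and counts how often optimization lowers the precision. The thresholds 0.014 and 0.025 are the class boundaries reported for the EEOM and Markov comparison; the function names and all precision values are hypothetical.

```python
def bucket(size, small, large):
    """Classify an absolute precision difference by two thresholds."""
    if size < small:
        return "small"
    if size < large:
        return "medium"
    return "large"


def compare(prec_i, prec_j, small, large):
    """Count wins of model i and model j per difference class, plus ties."""
    wins = {
        "i": {"small": 0, "medium": 0, "large": 0},
        "j": {"small": 0, "medium": 0, "large": 0},
    }
    ties = 0
    for p_i, p_j in zip(prec_i, prec_j):
        diff = p_i - p_j
        if diff == 0:
            ties += 1
            continue
        winner = "i" if diff > 0 else "j"
        wins[winner][bucket(abs(diff), small, large)] += 1
    return wins, ties


def count_declines(before, after):
    """Number of experiments where optimization lowered the precision."""
    return sum(1 for b, a in zip(before, after) if a < b)


# Hypothetical precisions before optimization and after models i and j.
before = [0.93, 0.91, 0.96, 0.82, 0.95]
prec_i = [0.95, 0.90, 0.99, 0.80, 0.97]
prec_j = [0.94, 0.92, 0.95, 0.80, 0.955]
wins, ties = compare(prec_i, prec_j, small=0.014, large=0.025)
declines_i = count_declines(before, prec_i)
print(wins, ties, declines_i)
```

Keeping ties separate from wins mirrors the later experiments, in which some pairs of optimization results have identical precision.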
2.2.2 Verification analysis
EEOM is first compared with the Markov optimization model to find the superior of the two. To further verify the correction ability of the two-layer optimization model, the degree of precision change obtained with the superior model is then compared with that obtained with the two-layer optimization model.
(1)EEOM and Markov optimization model
In this paper, 78 optimization experiments are conducted. In each optimization experiment, EEOM and the Markov optimization model are used to optimize the predicted results, and the times when each of the two optimization models gives the higher precision are counted separately. EEOM gives the higher precision 43 times, and the Markov optimization model 35 times.
Comparing the optimization precision of EEOM with that of the Markov optimization model, the optimization precision difference of the two optimization models is divided into three categories: small difference, large difference and great difference. The interval of the small difference values is [0.000, 0.014), that of the large difference values is [0.014, 0.025), and that of the great difference values is [0.025, 0.205]. After the precision difference is classified, the comparison results of accuracy between EEOM and the Markov optimization model are shown in Table 3.
Table 3 Comparison of higher precision times of EEOM and the Markov optimization model
EEOM results in a decline of precision 36 times, and the Markov optimization model 28 times. The precision reduction of EEOM and the Markov optimization model is shown in Fig.1.
Fig.1 Precision reduction of the EEOM and the Markov optimization
It can be seen from Table 3 that the correction ability of EEOM is better than that of the Markov optimization model. When the precision of EEOM is higher than that of the Markov optimization model, the precision difference between the two methods is great 17 times. When the precision of the Markov optimization model is higher than that of EEOM, the precision difference is great only 9 times. It can also be seen from Fig.1 that the stability of EEOM is better than that of the Markov optimization model. Therefore, when performing two-layer optimization, EEOM is used as the first-layer optimization method.
(2) EEOM and the two-layer optimization model
In this part, 72 optimization experiments are conducted. In each optimization experiment, EEOM and the two-layer optimization model are used to optimize the predicted results, and the times when each of the two optimization models gives the higher precision are counted separately. EEOM gives the higher precision 35 times, and the two-layer optimization model 29 times. The optimization precision of the two models is the same 8 times.
Comparing the optimization precision of EEOM with that of the two-layer optimization model, the optimization precision difference of the two is divided into three categories: small difference, large difference and great difference. The interval of the small difference values is [0.000, 0.010), that of the large difference values is [0.010, 0.019), and that of the great difference values is [0.019, 0.340]. After the precision difference is classified, the comparison results of accuracy between EEOM and the two-layer optimization model are shown in Table 4.
Table 4 Comparison of higher precision times of EEOM and the two-layer optimization model
EEOM results in a decline of precision 29 times, and the two-layer optimization model 27 times. The precision reduction of EEOM and the two-layer optimization model is shown in Fig.2.
Fig.2 Precision reduction of EEOM and the two-layer optimization
It can be seen from Table 4 that the two-layer optimization model gives higher optimization precision than EEOM relatively few times, and that it achieves a great precision difference less often than EEOM. In the experiment, EEOM has higher precision with a great difference 15 times, while the two-layer optimization model does so only 9 times. In addition, it can be seen from Fig.2 that the stability of EEOM is better than that of the two-layer optimization model. The optimization stability is reduced when the Markov optimization model is applied after EEOM. Therefore, after using EEOM, it is not necessary to carry out the two-layer optimization.
In this paper, a large number of prediction and optimization experiments are conducted to analyze the performance of prediction models and optimization models. It can be found that the stability of the polynomial prediction model is low, the precision of the exponential prediction model is poor, and the precision and stability of the grey prediction, the linear prediction and the logistic curve prediction are poor. It can also be found that the correction ability of the Markov optimization model and the two-layer optimization model is poorer than that of EEOM. Therefore, to solve the prediction problem of poor information events with small samples, the combinatorial prediction model with good precision and stability can be used for prediction, and EEOM with good correction ability can be used to optimize the prediction results.
The data of CAAC unsafe events from 2000 to 2017 can be obtained from the production bulletin of the Civil Aviation Administration of China (CAAC), as shown in Table 5. The data from 2000 to 2016 are used as forecast data, and the data of 2017 are used as validation data. The results of this prediction and optimization example can further verify the above conclusions.
Table 5 Data sample
The exponential forecast model, the grey prediction model, the polynomial prediction model, the logistic curve prediction model and the linear regression prediction model are used to obtain the predicted results. According to Eq.(1), the weights of the single prediction models involved in the combinatorial prediction model are calculated to establish the combinatorial prediction model, and its predicted result is then obtained. According to Eq.(9), the prediction precisions of the six prediction models are calculated. The comparison of the predicted results and the prediction precisions of the six prediction models is shown in Fig.3.
Fig.3 Predicted result and prediction precision
EEOM, the Markov optimization model and the two-layer optimization model are used to optimize the predicted results obtained by the exponential forecast model, the grey prediction model, the polynomial prediction model, the logistic curve prediction model, the linear regression prediction model and the combinatorial prediction model. Then, the optimization precision of each optimization model is calculated. The optimized results and the optimization precisions are shown in Table 6.
It can be seen from Table 6 that two combinations have the same high accuracy: the combinatorial prediction model followed by the Markov optimization model, and the combinatorial prediction model followed by EEOM. However, the previous 300 prediction and optimization experiments show that using the combinatorial prediction model to predict first and then using EEOM to optimize has the highest accuracy and stability. The correction ability of EEOM is the best among the optimization models analyzed in this paper. Therefore, the combinatorial prediction model and EEOM are suitable for solving the small sample and poor information prediction problem.
Several prediction and optimization experiments are conducted to analyze the performance of prediction models and optimization models.
This paper randomly selects 25 events with small samples and poor information to carry out 150 prediction experiments with six prediction models. It can be found from the experiments that the stability of the polynomial prediction model is low, the precision of the exponential prediction model is poor, and the grey prediction, the linear prediction and the logistic curve prediction have poor precision. The combinatorial prediction model is superior to the other prediction models in both stability and prediction accuracy.
One hundred and fifty optimization experiments are conducted in this paper to analyze the performance of the optimization models. It can be found from the optimization experiments that the correction ability of the Markov optimization model and the two-layer optimization model is poorer than that of EEOM. EEOM has both high stability and good correction ability. It can also be found that the precision cannot be further improved by using the Markov optimization model after EEOM. Therefore, the EEOM proposed in this paper is suitable for optimizing the predicted results of small sample and poor information events.
Therefore, to solve the prediction problem of poor information events with small samples, the combinatorial prediction model with good precision and stability can be used for prediction, and EEOM with good correction ability can be used to optimize the prediction results.
Transactions of Nanjing University of Aeronautics and Astronautics, 2021, Issue 2