XIONG Pingping, CHEN Shiting, and YAN Shuli
1. School of Management Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China; 2. College of Mathematics and Statistics, Nanjing University of Information Science and Technology, Nanjing 210044, China
Abstract: In this paper, an optimization model is proposed to simulate and predict the current situation of smog. The model takes the interval grey number sequence with a known possibility function as the original data, and constructs a time-delay nonlinear multivariable grey model MGM(1,m|τ,γ) based on the new kernel and degree of greyness sequences, taking the time-delay and nonlinearity of the data into account. The time-delay parameter is determined by the maximum value of the grey time-delay absolute correlation degree, and the nonlinear parameter is determined by the minimum value of the average relative error. To verify the feasibility of the model, smog related data of Nanjing city are used for simulation and prediction. Compared with the other four models, the new model has higher simulation and prediction accuracy.
Keywords: kernel, degree of greyness, time-delay, nonlinear, smog.
Smog is a general term for fog and haze, and it is one of the important indexes for assessing air quality. Natural, industrial, and urbanization factors cause the emission of fine particles to exceed the maximum carrying capacity of the atmospheric cycle, resulting in smog that is extensive in both time and space [1]. In recent years, smog has remained a focus of public attention, and its harm is significant, including a marked impact on the quality of China’s economic development [2] and irreversible harm to human health [3]. Summarizing the smog data of recent years shows that the seasonal distribution of smog is “low in summer, high in winter, and intermediate in spring and autumn” [4]. Geographically, smog occurs more frequently in northern cities than in southern cities, and its impact is characterized by long duration and wide range. How to deal with smog efficiently is a problem that needs to be tackled at home and abroad, and the goal is to find a balance between economic development and environmental protection. Although many scholars have carried out in-depth research on the causes [5,6], prediction [7–9], and prevention and control [10–12] of smog, and smog control has achieved significant results, there is still a long way to go.

For smog simulation and prediction, many effective methods are available, such as the coupled four-dimensional local ensemble transform Kalman filter (4D-LETKF) and weather research and forecasting model coupled with chemistry (WRF-Chem) system [13], the autoregressive integrated moving average (ARIMA) model [14], and the nonparametric panel model [15]. These models provide theoretical support for forecasting smog weather. However, smog related data have a large degree of greyness, that is, the monitoring data contain errors or are missing. Selecting a model targeted at this property is therefore the key to modeling. Grey system theory takes uncertain systems with small samples and poor information as its object, so it can handle the large degree of greyness of smog data effectively. Its feasibility and effectiveness have been shown in many works [16,17].
The grey system, that is, a system with incomplete information, was put forward by Deng for systems with small samples and poor information. It has been applied to water consumption [18], medicine [19], energy [20], and other fields. The most widely used models in grey system theory include the grey model GM(1,N), the multivariable grey model MGM(1,m), and the Verhulst model. The most typical is the GM(1,N) model, which models one system characteristic behavior sequence and N related factors’ behavior sequences by differential fitting. Since the GM(1,N) model was developed, improvements based on support vector machine regression [21], research on non-equidistant time series [22], and other work have provided the basis for follow-up models. The GM(1,N) model has only one system characteristic behavior sequence, whereas practical problems often involve more than one. The MGM(1,m) model was proposed to resolve this contradiction, and its prediction accuracy is higher than that of the GM(1,1) model applied to multiple interacting variables [23]. In further work on the MGM(1,m) model, scholars have addressed the problems of background value [24], unequal time interval [25], etc.
Most of the grey models mentioned above are based on real numbers. However, data in existing statistical yearbooks are often missing, which may be due to system error, accidental error, etc. Research on interval grey numbers therefore has practical significance, and its feasibility for the GM(1,N) model and the MGM(1,m) model has been shown [26,27]. Existing methods for interval grey number operations include describing and calculating failure probability ranges and reliability based on the universal grey number to carry out interval operations [28], transforming the grey number into a “white part” and a “grey part” for simulation and forecasting [29], extending the calculation of the “kernel and grey degree” of the interval grey number [30], and taking the “center of gravity” as the dividing point and decomposing the grey interval into upper and lower units [31].

In addition, the traditional grey model often considers neither the time-delay effect of past data on current data nor the nonlinear relationship between variables. This contradicts the behavior of complex systems in real life and production, and the most direct consequence is large deviations in model simulation and prediction. Therefore, it is necessary to introduce a time-delay parameter and a nonlinear parameter. Considering the time-delay effect expands the application scope of the grey model [32], and the stability of the grey model with time-delay has been demonstrated [33]. The study of nonlinear relations has led to in-depth work on the grey Bernoulli model [34–36], whose unbiased property has been proved and applied [37]. At present, the time-delay parameters can be solved by grey time-delay correlation analysis [38], the recursive least squares algorithm [39], the geometric similarity grey relational grade (GRG) model [40], and other methods, and the nonlinear parameters can be solved by the genetic algorithm [41], the self-memory algorithm [42], and other methods. However, few studies have introduced time-delay parameters and nonlinear parameters into the MGM(1,m) model, which leaves room for improvement.
The innovation of this paper is that, for the interval grey number sequence with known distribution information, a new model is established that considers both the influence of previous data on current data and the nonlinear relationship between variables in the construction of the grey model. It is the first time that the time-delay term and the nonlinear term are introduced for the new kernel and degree of greyness sequences. Firstly, the possibility function of the interval grey number is determined by the expert scoring method, and then the new kernel and degree of greyness sequences are calculated. Next, the time-delay nonlinear multivariable grey model MGM(1,m|τ,γ) is established to simulate and predict the kernel sequence and the degree of greyness sequence respectively. The time-delay parameters τ are determined by the grey time-delay absolute correlation degree, and the nonlinear parameters γ are determined by the minimum average relative error. Finally, the upper and lower bounds of the interval grey number are restored and the error is tested. To verify the feasibility of the model, the smog data of Nanjing city are selected for case analysis. The results show that the proposed model has higher simulation and prediction accuracy than the comparison models.
The limitation of cognitive behavior and the complexity of objective things mean that variables often cannot be described with accurate real numbers. The interval grey number, marked as $\otimes$, denotes a number with incomplete information that takes its value in a certain interval, and it provides a new direction for solving such problems effectively. Kernel and degree of greyness are two representative basic attributes of the interval grey number. Representing the interval grey number by its kernel and degree of greyness helps to transform grey number operations into real number operations. When the distribution information of the interval grey number is known, the possibility function is known. The definitions of kernel and degree of greyness are given as follows:
Definition 1 [43] Let sequence where grey number The universe of grey number that is Let the possibility function of grey number and satisfies then the kernel of grey number is
Definition 2 [43] When the distribution information is unknown, we take the possibility function of the grey number interval as 1, which means that all values in the interval are equally possible. Accordingly, the value of the possibility function outside the interval grey number but within the universe is 0, recorded as
then the kernel and degree of greyness of grey number with unknown interval distribution information are
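As a minimal sketch of the unknown-distribution case in Definition 2, the Python code below assumes the standard grey-system formulas: the kernel of an interval grey number [a,b] is its midpoint (a+b)/2, and its degree of greyness is the interval length divided by the length of the universe. The function names and the example values are illustrative assumptions, and the sketch does not cover the possibility-function-weighted new kernel of Definition 1.

```python
def kernel_and_greyness(lower, upper, universe):
    """Return (kernel, degree of greyness) of the interval grey number [lower, upper].

    Assumes the unknown-distribution case: kernel = midpoint,
    degree of greyness = interval length / universe length.
    """
    u_low, u_high = universe
    if not (u_low <= lower <= upper <= u_high):
        raise ValueError("interval must lie inside the universe")
    kernel = (lower + upper) / 2.0
    greyness = (upper - lower) / (u_high - u_low)
    return kernel, greyness


def restore_bounds(kernel, greyness, universe):
    """Invert kernel_and_greyness: recover (lower, upper) from the pair."""
    u_low, u_high = universe
    half_width = greyness * (u_high - u_low) / 2.0
    return kernel - half_width, kernel + half_width


# Illustrative values only (not data from the case study).
k, g = kernel_and_greyness(60.0, 100.0, universe=(0.0, 200.0))
print(k, g)  # 80.0 0.2
print(restore_bounds(k, g, universe=(0.0, 200.0)))  # (60.0, 100.0)
```

Because the mapping is invertible, simulated kernel and degree of greyness sequences can be converted back to interval bounds, which is the role the restoration step plays later in the modeling procedure.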
2.2.1 MGM(1,m|τ,γ) model based on the new kernel sequence
is called the whitening differential equation of the time-delay nonlinear MGM(1,m|τ,γ) model based on the new kernel sequence.
Definition 4 By discretizing the above whitening differential equations,
Theorem 1 The least squares estimation of the parameter sequence satisfies
(i) When $n=m+2$, the estimate equals $P^{-1}Q_j$, $j=1,2,\cdots,m$;
(ii) When $n>m+2$, the estimate equals $(P^{T}P)^{-1}P^{T}Q_j$, $j=1,2,\cdots,m$;
(iii) When n

Proof According to the discrete equation of the MGM(1,m|τ,γ) model based on the new kernel sequence, can be obtained, where $j=1,2,\cdots,m$.

(i) When $n=m+2$, the matrix $P$ is nonsingular, that is, $P$ has an inverse matrix, then

Theorem 2 The discrete solution of MGM(1,m|τ,γ) based on the new kernel sequence is

The theorem is proved by transforming it into the matrix form. □

2.2.2 MGM(1,m|τ,γ) model based on the new degree of greyness sequence

Since the construction process of the MGM(1,m|τ,γ) model based on the new degree of greyness sequence is similar to that of the MGM(1,m|τ,γ) model based on the new kernel sequence, this section does not repeat its modeling mechanism. The discrete solution of the MGM(1,m|τ,γ) model based on the new degree of greyness sequence is

2.3 Solution of time-delay parameter τ

In order to solve the parameters of the above model, the time-delay parameter τ and the nonlinear parameter γ should be determined first. In this section, the time-delay grey absolute correlation degree is selected to calculate the time-delay parameter τ. It is obtained by calculating the absolute correlation degree between the reference sequence and the contrast sequence with time-delay, and the number of time-delay periods corresponding to the maximum absolute correlation degree is selected as the solution [45]. The specific method is as follows. This section takes the MGM(1,m|τ,γ) model based on the new kernel sequence as an example, and only considers the case in which the time-delay parameter is an integer.

When $\varepsilon_{ji}$ reaches the maximum value, the corresponding time-delay parameter $\tau_{ji}$ is solved. The expression is

The arithmetic mean of all the values of $\tau_{ji}$ is the time-delay parameter τ of the model, and the expression is as follows:

The above is the process of determining τ.
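The selection of an integer time-delay can be sketched in Python as follows. The sketch uses the standard absolute degree of grey incidence (zero-starting-point images and the ratio (1+|s0|+|si|)/(1+|s0|+|si|+|si−s0|)). Since the time-delay expression itself is not reproduced in the text, pairing the reference value at time k with the contrast value at time k−τ is an assumption, as are the function names and the sample data.

```python
def absolute_incidence(reference, contrast):
    """Absolute degree of grey incidence between two equal-length sequences."""
    if len(reference) != len(contrast) or len(reference) < 2:
        raise ValueError("sequences must have equal length of at least 2")

    def signed_area(seq):
        # Zero-starting-point image, then interior points plus half the last point.
        zeroed = [v - seq[0] for v in seq]
        return sum(zeroed[1:-1]) + 0.5 * zeroed[-1]

    s_ref, s_con = signed_area(reference), signed_area(contrast)
    base = 1.0 + abs(s_ref) + abs(s_con)
    return base / (base + abs(s_con - s_ref))


def best_integer_delay(reference, contrast, max_delay):
    """Return (tau, scores) where tau in 0..max_delay maximizes the incidence."""
    n = len(reference)
    if len(contrast) != n or max_delay > n - 2:
        raise ValueError("need equal lengths and max_delay <= n - 2")
    scores = {}
    for tau in range(max_delay + 1):
        # Assumed pairing: reference(k) is compared with contrast(k - tau).
        scores[tau] = absolute_incidence(reference[tau:], contrast[:n - tau])
    tau_best = max(scores, key=scores.get)
    return tau_best, scores


# Made-up data: the reference repeats the contrast sequence two steps later.
contrast = [1, 3, 2, 5, 4, 6, 8, 7, 9, 12]
reference = [2, 2] + contrast[:-2]
tau, scores = best_integer_delay(reference, contrast, max_delay=3)
print(tau)  # 2
```

Applying `best_integer_delay` to every pair of sequences (j,i) and averaging the resulting lags would give the model-level τ described above.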
2.4 Solution of nonlinear parameter γ

After the time-delay parameter τ is determined, the nonlinear parameter γ is determined with the objective of minimizing the average relative error, subject to the constraints of the model itself. In this section, we take the MGM(1,m|τ,γ) model based on the new kernel sequence as an example:

It can be seen from the above formula that different values of $\gamma_j$ give different values of avg(e(k)). To minimize avg(e(k)), the genetic algorithm or the particle swarm optimization algorithm can be used. After the nonlinear parameter $\gamma_j$ is determined, the relevant parameters of the MGM(1,m|τ,γ) model are also obtained.

2.5 Error test

Suppose that the simulated and predicted values of the new kernel and degree of greyness of interval grey number respectively, and the estimated values of upper and lower bounds obtained by reduction are respectively. Then according to (1) and (2), the following results can be obtained:

The relative errors of the upper and lower bounds of the interval grey number sequence are as follows:

where $j=1,2,\cdots,m$ and $k=1,2,\cdots,n$. The average relative error is

The error test standard can be seen from the prediction accuracy, and the evaluation standard is shown in Table 1 [46].

Table 1 Average relative error and prediction accuracy of the model

3.Modeling steps

Step 1 Collect data and get the interval grey number sequence.
Step 2 Determine the possibility function of the interval grey number sequence according to the expert scoring method. The expert scoring method comprehensively counts the opinions of most experts on data that cannot be quantitatively analyzed by other methods.
Step 3 Calculate the new kernel and degree of greyness sequences of the interval grey number sequence by Definition 1.
Step 4 Establish the MGM(1,m|τ,γ) model for the new kernel and degree of greyness sequences obtained in the above steps.
Step 5 Determine the time-delay parameter τ and the nonlinear parameter γ, and then substitute them into the model to obtain the simulation and prediction values.
Step 6 Calculate and restore the upper and lower bounds of the interval grey number sequence according to (11).
Step 7 Do the error test.

The flow chart is shown in Fig.1.

Fig.1 Modeling flow chart

4.Case analysis

The severity of smog is closely related to the concentration of pollutants in the air. The air quality index (AQI) monitors the concentrations of sulfur dioxide, nitrogen dioxide, PM10, PM2.5, carbon monoxide, and ozone, and evaluation standards for it are published by the state. Therefore, it is reasonable to choose AQI as one of the variables. In addition, visibility is mainly determined by atmospheric transparency: the better the atmospheric transparency, the greater the visibility distance. Smog directly affects atmospheric transparency, so visibility, as another variable, can directly and positively describe the air quality. Therefore, the trend of smog is simulated and predicted through the study of AQI and visibility data.

4.1 MGM(1,2|τ,γ) model based on new kernel and degree of greyness sequences

In this section, AQI and visibility are selected as the behavior sequences, i.e., m=2, to construct the MGM(1,2|τ,γ) grey model based on the new kernel and degree of greyness sequences. A total of 33 daily data of Nanjing from December 18, 2020 to January 19, 2021 are selected. By taking the maximum and minimum values of each factor over the six days before a given day, the interval grey numbers of the 27 days from December 24, 2020 to January 19, 2021 are generated. Among them, the 13 days of data from January 5, 2021 to January 17, 2021 are used as simulation data, i.e., l=11, n=13, and the data on January 18 and 19, 2021 are used for prediction. In this paper, the domain of AQI is $\otimes_1\in[0,200]$ and the domain of visibility is $\otimes_2\in[0,200]$.

Step 1 The interval grey number sequence is obtained according to the above method.
Step 2 Determine the possibility function of the interval grey number sequence according to the expert scoring method, as shown in Table 2.

Table 2 Possibility function

Step 3 Calculate the new kernel and degree of greyness sequences of the interval grey number sequence.
Step 4 Establish the MGM(1,2|τ,γ) model for the new kernel and degree of greyness sequences respectively.
Step 5 According to the principle of the maximum time-delay grey absolute correlation degree and the arithmetic average, we get the time-delay period τ=6 of the new kernel sequence and τ′=7 of the new degree of greyness sequence. Then, based on the principle of the minimum average relative error, γ1=1.342 9 and γ2=1.189 9 are calculated for the new kernel sequence, and γ1′=0.059 2 and γ2′=0.001 7 for the new degree of greyness sequence. Finally, the calculated parameters are substituted into the model to obtain the simulation and prediction values.
Step 6 Calculate and restore the upper and lower bounds of the interval grey number sequence according to (11), as shown in Table 3.

Table 3 Simulated and predicted values

Step 7 Test the error. The related error is shown in Table 4, where L means the lower bound of the interval grey number, and U means the upper bound.
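The interval construction used for Step 1 of this case (minimum and maximum of each factor over the six days before a given day) can be sketched in Python as below. The function name and the sample readings are illustrative assumptions, not the Nanjing data. The sketch assumes the window excludes the day itself, which is consistent with 33 daily observations yielding 27 intervals.

```python
def rolling_interval_grey_numbers(daily_values, window=6):
    """Build (min, max) intervals from the `window` days preceding each day."""
    if window < 1 or len(daily_values) <= window:
        raise ValueError("need more observations than the window length")
    intervals = []
    for t in range(window, len(daily_values)):
        previous = daily_values[t - window:t]  # the `window` days before day t
        intervals.append((min(previous), max(previous)))
    return intervals


# Made-up readings, NOT the Nanjing observations.
daily = [82, 75, 90, 110, 95, 70, 65, 88, 120, 101]
intervals = rolling_interval_grey_numbers(daily, window=6)
print(len(intervals))  # 4
print(intervals[0])    # (70, 110)

# With 33 daily observations and a six-day window, 27 intervals result.
print(len(rolling_interval_grey_numbers(list(range(33)), window=6)))  # 27
```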
Table 4 Relative error

4.2 Model comparison

This section compares the new model with the univariate regression model, the MGM(1,m) model based on the conventional kernel and degree of greyness sequences, the MGM(1,m) model based on the new kernel and degree of greyness sequences, and the time-delay multivariable grey model (MGM(1,m|τ)) based on the new kernel and degree of greyness sequences. The MGM(1,m) model based on the conventional kernel and degree of greyness sequences is built on the interval grey number sequence whose possibility function is unknown. The MGM(1,m) model based on the new kernel and degree of greyness sequences considers neither the time-delay effect nor nonlinearity, i.e., τ=0, γj=1; the MGM(1,m|τ) model based on the new kernel and degree of greyness sequences does not consider nonlinearity, i.e., γj=1. The univariate regression model, the MGM(1,m) model based on the conventional kernel and degree of greyness sequences, the MGM(1,m) model based on the new kernel and degree of greyness sequences, the MGM(1,m|τ) model based on the new kernel and degree of greyness sequences, and the new model are recorded as Model 1, Model 2, Model 3, Model 4, and Model 5 respectively. The intervals and errors obtained from the four comparison models are shown in Table 5 and Table 6.
Table 5 Comparison of simulated and predicted values

Table 6 Comparison of relative errors

The following analysis can be obtained from Tables 3 to 6. For the simulation errors of AQI and visibility, the average errors of the new model are 3.51% and 3.42% respectively, which are less than 10% and are the lowest among the five models. The results show that the new model is not the best of the five models in terms of the simulation error of the lower bound of the AQI sequence. However, considering all the simulation results, the new model still has the highest simulation accuracy. For the prediction errors of AQI and visibility, the average errors of the new model are 8.87% and 3.45% respectively, which are less than 10%. The model comparison shows that the average prediction errors of the AQI series of Model 1 and Model 3 are less than that of the new model, but the average prediction errors of the visibility series of Model 1 and Model 3 are 28.48% and 20.43% respectively, which are much higher than that of the new model. Therefore, taking both series together, the prediction accuracy of the new model is the highest among the five models.

To show the error differences between the five models intuitively, the average relative error of each sequence is plotted in a comparison chart, as shown in Fig.2, where APE means absolute percentage error.
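The error measures compared above can be sketched in Python as follows. The exact error formulas are not reproduced in the text, so the sketch assumes the usual absolute percentage error |fitted − actual|/|actual| × 100% for each bound and averages the lower-bound and upper-bound errors at each time point. The function names and the sample intervals are illustrative assumptions, not values from Tables 3 to 6.

```python
def absolute_percentage_errors(actual, fitted):
    """Return the APE (in percent) of each fitted value against its actual value."""
    if len(actual) != len(fitted):
        raise ValueError("sequences must have equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be nonzero")
    return [abs(f - a) / abs(a) * 100.0 for a, f in zip(actual, fitted)]


def mean_interval_error(actual_intervals, fitted_intervals):
    """Average APE over the lower and upper bounds of an interval sequence."""
    lower = absolute_percentage_errors(
        [lo for lo, _ in actual_intervals], [lo for lo, _ in fitted_intervals])
    upper = absolute_percentage_errors(
        [up for _, up in actual_intervals], [up for _, up in fitted_intervals])
    # Assumed convention: average the two bound errors at each time point.
    per_point = [(l + u) / 2.0 for l, u in zip(lower, upper)]
    return sum(per_point) / len(per_point)


# Made-up intervals for illustration only.
actual = [(50.0, 100.0), (40.0, 80.0)]
fitted = [(55.0, 90.0), (40.0, 84.0)]
error = mean_interval_error(actual, fitted)
print(round(error, 2))  # 6.25
print(error < 10.0)     # True: below the 10% level used in the discussion
```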
Fig.2 Comparison of average relative error of AQI

It can be seen from Fig.2 that the errors of the models fluctuate, which is caused by sudden increases or decreases in the original interval data. The comparison shows that the error trends of the new model and Model 1 are more stable than those of the other three models. Although the accuracy of the new model is not the highest at every individual point, its overall error is less than 10%, that is, its simulation accuracy and prediction accuracy are high.

It can be seen from Fig.3 that the new model has the best simulation and prediction effect among the five models. In particular, the average prediction accuracy of the new model is significantly higher than that of the other models. The overall error trend of the new model is stable, and there is no abnormal fluctuation. Comparing the new model with Models 1 to 3 shows that accounting for the time-delay and nonlinearity of the original sequence effectively improves accuracy when the interval distribution information is known, which reflects the feasibility and effectiveness of the new model.
Fig.3 Comparison of average relative error of visibility

5.Conclusions

In this paper, for interval grey numbers with known distribution information, and considering the time-delay and nonlinearity of the original sequence, a time-delay nonlinear MGM(1,m|τ,γ) grey model based on the new kernel and degree of greyness sequences is established. The modeling steps are as follows. Firstly, the possibility function of the interval grey number is determined according to the expert scoring method. Then, the new kernel and degree of greyness sequences are obtained from their definitions for the case in which the distribution information is known, and the time-delay nonlinear MGM(1,m|τ,γ) model is established for the two sets of sequences. Finally, the upper and lower bounds are restored and the error is checked. The above establishes the theoretical feasibility of the new model. To verify its feasibility in practical application, this paper uses the smog related data of Nanjing city in Jiangsu Province for case analysis, and compares the new model with the univariate regression model, the MGM(1,m) model based on the conventional kernel and degree of greyness sequences, the MGM(1,m) model based on the new kernel and degree of greyness sequences, and the MGM(1,m|τ) model based on the new kernel and degree of greyness sequences. Comprehensive analysis and discussion show that the new model has high simulation accuracy and prediction accuracy.

However, this research can be improved. Given the diversity of methods for determining the time-delay parameters and nonlinear parameters, the model can be optimized by improving the parameter determination method, which will provide ideas for expanding and perfecting the model.
Journal of Systems Engineering and Electronics, 2022, Issue 2