Zunwen HE, Yue LI, Yan ZHANG, Wancheng ZHANG, Kaien ZHANG, Liu GUO, Haiming WANG
1School of Information and Electronics,Beijing Institute of Technology,Beijing 100081,China
2State Key Laboratory of Millimeter Waves,Southeast University,Nanjing 210096,China
Abstract: Asymmetric massive multiple-input multiple-output (MIMO) systems have been proposed to reduce the burden of data processing and hardware cost in sixth-generation mobile networks (6G). However, in the asymmetric massive MIMO system, reciprocity between the uplink (UL) and downlink (DL) wireless channels is not valid. As a result, pilots are required to be sent by both the base station (BS) and user equipment (UE) to predict double-directional channels, which consumes more transmission and computational resources. In this paper, we propose an ensemble-transfer-learning-based channel parameter prediction method for asymmetric massive MIMO systems. It can predict multiple DL channel parameters including path loss (PL), multipath number, delay spread (DS), and angular spread. Both the UL channel parameters and environment features are chosen to predict the DL parameters. Also, we propose a two-step feature selection algorithm based on the SHapley Additive exPlanations (SHAP) value and the minimum description length (MDL) criterion to reduce the computational complexity and the negative impact on model accuracy caused by weakly correlated or uncorrelated features. In addition, the instance transfer method is introduced to support the prediction model in new propagation conditions, where it is difficult to collect enough training data in a short time. Simulation results show that the proposed method is more accurate than the back propagation neural network (BPNN) and the 3GPP TR 38.901 channel model. Additionally, the proposed instance-transfer-based method outperforms the method without transfer learning in predicting DL parameters when the beamwidth or the communication sector changes.
Key words: Asymmetric massive multiple-input multiple-output (MIMO) system; Channel model; Ensemble learning; Instance transfer; Parameter prediction
To fulfill the requirements of high spectrum efficiency and high energy efficiency, massive multiple-input multiple-output (MIMO) arrays are widely used in fifth-generation mobile networks (5G). However, as the number of antenna elements increases continually to meet the increased demand in sixth-generation mobile networks (6G), the signal processing burden and hardware costs become higher and higher (Albreem et al., 2019, 2021; Radhakrishnan et al., 2021; Qiu et al., 2022). To solve this problem, asymmetric full-digital beamforming massive MIMO systems (Hong et al., 2020, 2021) have been put forward. Unlike conventional large-scale beamforming arrays, the asymmetric massive MIMO system employs a smaller number of receiving (Rx) radio frequency (RF) chains than transmitting (Tx) chains, which can reduce the hardware cost and energy consumption. This asymmetric system with non-reciprocal Tx/Rx beam patterns has also been proven to realize wider coverage in the uplink (UL) and offer higher gain beams in the downlink (DL) (Hong et al., 2021). Thus, there is great potential for its application in future wireless communication networks.
However,an emerging problem in the asymmetric massive MIMO system is that the reciprocity between the UL and DL wireless channels is not valid.This means that even in a time division duplexing(TDD)system,it is impossible to directly use the UL channel estimation results to predict the DL channels.Thus,pilots are required by both base station(BS) and user equipment (UE),which occupy more transmission and computational resources(Wimalajeewa et al.,2017).This problem becomes severe in the DL because the number of Tx antenna elements on the BS side is very large.
Actually, the non-reciprocity between the UL and DL channels results from different Tx and Rx beam patterns; i.e., the UL/DL signals do not travel under the same propagation conditions. Analyzing the relationship between beam patterns and channel parameters can be useful in characterizing this non-reciprocity. Some research has been conducted to investigate the impacts of the beamwidth on channel parameters and can be divided into two categories. The first category is based on empirical models. In several studies (Wu et al., 2015; Kim et al., 2016; Li et al., 2017; Erden et al., 2020; Jiang et al., 2021), the channel parameter distributions of different Tx beamwidths were obtained by conducting signal acquisition experiments, to fit empirical models that can be used for channel parameter estimation. The estimation results can be quickly calculated by the empirical models. However, they could provide only statistical results and the accuracy was limited. In addition, they were applicable only to specific beamwidths and scenarios. The second category is to employ ray-tracing techniques, which are based on the principles of geometric optics (Tuan et al., 2016; Chen YS et al., 2019). The ray-tracing-based methods can provide accurate results, but may require a great amount of computation time.
Recently, methods based on artificial intelligence (AI) (Ye et al., 2018; Joo et al., 2019; Yang GS et al., 2019; Yang YW et al., 2019; Han et al., 2020; Lin B et al., 2021; Zhang S et al., 2021; Zhang SB et al., 2021) enabled fast and accurate channel statistical parameter prediction without relying on electromagnetic propagation maps. Although these works considered only a single link, i.e., UL or DL, they provided inspiration to solve the prediction problem in the non-reciprocal channels. In this paper, we propose a method based on ensemble learning and instance transfer to predict DL channel parameters, which are important for the link budget and adaptive transmission in asymmetric massive MIMO systems. Although the selected scenario is a typical urban one, our proposed method can also be applied in other outdoor scenarios, such as suburban or rural ones. It should be mentioned that the transfer learning module may be useful only when the new scenario is similar to the original one; e.g., they are different sectors in the same cell. The intrinsic physical mechanism is that although the Tx and Rx beam patterns are different, the BS and UE are located in the same area. Therefore, the UL and DL channels share a common part of the propagation environment. An AI-based model is built to characterize the relationship among the UL channels, DL channels, and propagation environments. Our work is the first one to apply ensemble transfer learning for channel parameter prediction in asymmetric massive MIMO systems. We investigate the correlation between the UL and DL propagation environments and design a two-stage feature selection method, which can reduce the negative impact of redundant features. An ensemble learning approach is used to combine multiple weak learners to improve the generalization of the channel parameter prediction model. In addition, to realize fast network deployment in new propagation conditions, we propose an instance transfer scheme that can use the knowledge within the original condition for the new model training. The proposed method is helpful for the design and optimization of asymmetric massive MIMO systems under non-reciprocal channel conditions.
In our proposed model, both the environment information and UL channel parameters are chosen as features to participate in the DL channel parameter prediction. However, too many training features or irrelevant features may increase the computational complexity of the model and even reduce the accuracy (Liu and Tang, 2014; Wang D et al., 2015; Lundberg and Lee, 2017; Yang GS et al., 2019; Wang ZG et al., 2020). Thus, we design a two-step feature selection algorithm. Classical feature selection methods quantify the importance of multiple features and then find the best number of features by exhaustive search. In contrast, our algorithm automatically determines the number of most relevant features and therefore reduces the computational complexity of feature selection. Then, an ensemble-learning-based prediction model composed of multiple weak learners is proposed to predict DL channel parameters.
A sufficient amount of training data is essential to ensure the prediction accuracy of AI-based methods. In practice, obtaining a large amount of training data under new propagation conditions is time-consuming and computationally intensive, thus limiting the generalization performance of AI-based methods. To settle this problem, we propose an instance transfer method to achieve prediction in new propagation conditions by applying the existing data from the original condition to the new model training.
Moreover,the aforementioned works concerning channel parameter prediction focus on path loss(PL) (Wu et al.,2015;Kim et al.,2016;Tuan et al.,2016;Li et al.,2017;Chen YS et al.,2019;Joo et al.,2019;Yang GS et al.,2019;Yang YW et al.,2019;Erden et al.,2020;Han et al.,2020)or delay spread(DS) (Kim et al.,2016;Li et al.,2017;Yang GS et al.,2019;Han et al.,2020).In this paper,we realize the prediction of many important parameters including PL,path number(PN),DS,azimuth angular spread of arrival (AASA),elevation angular spread of arrival(EASA),azimuth angular spread of departure(AASD),and elevation angular spread of departure (EASD).The prediction results are compared with the data generated by commercial ray-tracing software.It is verified that the proposed method can provide acceptable prediction accuracy with low complexity.
The major novelties and contributions of this paper are summarized as follows:
1. A prediction method based on ensemble learning and instance transfer is proposed to predict DL parameters of non-reciprocal channels. The proposed method is able to predict multiple DL channel parameters including PL, PN, DS, and angular spreads. Simulation results show that this method can achieve rapid channel parameter prediction while ensuring accuracy.
2. We propose a two-step feature selection algorithm that can determine a feature importance ranking and an optimal feature combination. This algorithm reduces the computational complexity of feature selection and is beneficial for improving prediction accuracy.
3.The instance transfer method is introduced to assist the prediction model in new propagation conditions.By using this transfer-learning-based approach,it is possible to deploy our method in a new propagation condition where enough training data are difficult to collect within a short time.
As illustrated in Fig. 1, we consider an asymmetric massive MIMO system in an urban scenario, which is composed of a BS and n UEs. The array on the BS side is equipped with non-reciprocal Tx/Rx beam patterns.
Fig.1 A typical application scenario of the asymmetric massive MIMO system (BS: base station;MIMO:multiple-input multiple-output;UE: user equipment)
Environment features are those features that reflect the characteristics of the communication environment. In addition to the commonly used UE coordinates and propagation distance (Luo et al., 2019; Yang GS et al., 2019), we further consider the properties of buildings between the BS and the i-th UE. The environment features are given by
Fig.2 The explanation of environment features (BS:base station;UE: user equipment)
In practice, we can obtain partial data for training purposes by measurement or other means. The training set includes the UL channel parameters, environment features, and DL channel parameters of the UEs whose DL parameters are known, where i′ denotes the index of such a UE. To predict the DL parameters of other UEs in the same condition, the first step of the proposed prediction method is to select features from the UL parameters and environment features as inputs of the ensemble learning model. We design a two-step feature selection method, in which the SHapley Additive exPlanations (SHAP) algorithm is employed to obtain the feature importance, and then the minimum description length (MDL) algorithm is used to determine the number of selected features. Second, the ensemble learning model is built with the training set to realize DL channel parameter prediction. The UL channel parameters and the environment features of the channel to be predicted, where i′′ denotes the index of the UE awaiting prediction, are then inputted into the trained ensemble learning model to predict the DL channel parameters.
The accuracy of the built model relies on the amount of training data. However, in practical applications, it is difficult to obtain sufficient training data in a short time for a new propagation condition. Thus, we introduce the instance transfer method to quickly build the DL channel parameter prediction model in a new condition. The new condition contains changes in the Tx beamwidth of the BS or changes in the sector where the UEs are located. Assume that the index of the UE in the training set within the new condition is j′. Similarly, this training set consists of the UL channel parameters, environment features, and DL channel parameters. Unlike the above prediction steps, we perform feature screening for the training set under the new channel propagation condition based on the feature selection results obtained in the first step. In the third step, the instance transfer model is trained with the training set of the new condition together with that of the original condition. The fourth step is to input the UL channel parameters and environment features under the new condition into the trained instance transfer model, which then outputs the corresponding DL channel parameters under the new condition.
In this section, we consider the prediction of DL parameters in the same propagation condition as the training set. As mentioned above, to achieve balance between the prediction accuracy and the time consumption, the first step of the prediction model training process is selecting the features from the UL channel parameters and environment features of the training set. The second step is to feed the selected features and DL parameters of the training samples into the weak learners of the ensemble learning model for training. The overall process is shown in Fig. 3.
Fig.3 Framework of the proposed downlink parameter prediction in the same condition (SHAP: SHapley Additive exPlanations;MDL: minimum description length)
In Section 2,we have listed many features including the UL parameters and environment features.However,some features may have little relevance to the predicted targets.Using these features for training will increase the burden on the model,and even cause a decline in the accuracy of predictions.Moreover,for different DL channel parameters and different propagation conditions,the relationship between the features and the predicted parameters may differ.We propose a two-step feature selection method.The features are interpreted with the SHAP algorithm.The optimal number of features is then decided with the MDL algorithm.
3.1.1 Feature interpretation
The SHAP algorithm measures how much each feature contributes to the prediction for each training sample. The median SHAP value of each feature over the training samples reflects its importance for prediction and is used as the measure of feature importance. Consider the i′-th training sample; if we denote the basic prediction model used to calculate the SHAP values as G, the feature interpretation process can be expressed as
3.1.2 Estimation of the number of features
After calculating the median SHAP values of all features, we introduce the MDL-based threshold calculation algorithm to compute the number of selected features. The MDL criterion is commonly used for accurate source enumeration (Wax and Ziskind, 1989; Huang et al., 2009; Bazzi et al., 2016; Lin CH et al., 2018). In this paper, the algorithm is used to determine the number of features involved in the prediction. The number of selected features computed by MDL is denoted as K̂. The objective function of the MDL criterion used in calculating the feature selection threshold is defined as
Then, we select the features with the top K̂ importance values. These features compose the new training set and are inputted into the ensemble learning model to realize the prediction of DL channel parameters.
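The ranking-and-selection step can be illustrated with a minimal sketch. It computes exact Shapley values for a small toy model, ranks the features by the median absolute Shapley value over the samples, and keeps the top K. Several things here are assumptions made only for illustration: `toy_model` stands in for the basic prediction model G, the all-zero `baseline` input is hypothetical, the absolute value is taken before the median, and K is passed in by hand because the MDL objective function is not reproduced in this sketch.

```python
from itertools import combinations
from math import factorial
from statistics import median


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)
    phi = [0.0] * n
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_j = [x[k] if k in present else baseline[k] for k in range(n)]
                with_j = list(without_j)
                with_j[j] = x[j]
                phi[j] += weight * (model(with_j) - model(without_j))
    return phi


def rank_features(model, samples, baseline):
    """Rank features by the median absolute Shapley value (an assumption)."""
    per_sample = [shapley_values(model, x, baseline) for x in samples]
    importance = [median(abs(phi[j]) for phi in per_sample)
                  for j in range(len(baseline))]
    order = sorted(range(len(importance)), key=lambda j: importance[j], reverse=True)
    return order, importance


def toy_model(v):
    """Hypothetical stand-in for G: feature 2 is irrelevant by construction."""
    return 3.0 * v[0] + 1.0 * v[1] + 0.0 * v[2]


samples = [[1.0, 2.0, 5.0], [2.0, -1.0, 4.0], [-1.0, 0.5, 3.0]]
baseline = [0.0, 0.0, 0.0]
order, importance = rank_features(toy_model, samples, baseline)
selected = order[:2]  # K = 2 chosen by hand; the paper obtains K from MDL
```

The exact enumeration is exponential in the number of features, so it is only practical for a handful of features; library implementations of SHAP rely on model-specific or sampling-based approximations instead.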
Boosting is the most popular branch of ensemble learning. The focal point of boosting is reducing bias. Boosting methods therefore build a strong ensemble learner from several base learners, thus enabling powerful generalization. Three boosting methods, namely adaptive boosting (AdaBoost) (Hu et al., 2008), extreme gradient boosting (XGBoost) (Chen TQ and Guestrin, 2016), and light gradient boosting machine (LightGBM) (Wang DH et al., 2017), are selected for DL parameter prediction.
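How boosting turns weak learners into a stronger one can be seen in a small sketch. It is not AdaBoost, XGBoost, or LightGBM; it is a plain gradient-boosting loop for squared error with depth-1 regression trees on a single feature, and the toy data, the number of rounds, and the learning rate are all hypothetical choices for illustration.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split (depth-1 regression tree) for 1-D inputs."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all inputs identical: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.3):
    """Each round fits a stump to the residuals left by the previous rounds."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def boosted_predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def mae(ys, preds):
    return sum(abs(y - p) for y, p in zip(ys, preds)) / len(ys)


# Toy "path loss versus position" data with a local bump (hypothetical).
xs = [float(i) for i in range(20)]
ys = [80.0 + 2.0 * x + (10.0 if 8 <= x <= 12 else 0.0) for x in xs]
model = fit_boosted(xs, ys)
mae_const = mae(ys, [sum(ys) / len(ys)] * len(ys))
mae_boost = mae(ys, [boosted_predict(model, x) for x in xs])
```

A single stump can only produce two output levels, yet the sum of many stumps follows both the trend and the local bump, which is the bias reduction referred to above.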
Based on the method depicted in Section 3, the DL parameters of the channel under the same propagation conditions can be predicted. When we consider deployment in a new propagation condition, the Tx beamwidth may take other values, and the users may be located in different sectors. It is difficult to obtain enough training samples under the new condition in a short time, which affects the prediction accuracy of the data-driven ensemble learning model. In this paper, we further propose a method based on instance transfer for the prediction of DL channel parameters with a small number of samples under new propagation conditions.
We employ the two-stage TrAdaBoost.R2(Pardoe and Stone,2010) algorithm,which divides the weight adjustment of source domain instances and target domain instances into two stages.In the first stage,only the weights of source instances are adjusted to a certain point and the weights of target instances remain unchanged.In the second stage,the weights of all source instances are frozen and the weights of the target instances are updated.The process of the proposed prediction method based on the two-stage TrAdaBoost.R2 algorithm is described as follows.
where w_{1,i} represents the weight of the i-th sample in the first iteration and is given by
The predicted values of the DL channel parameters can be acquired by inputting the selected features of the new propagation condition in the trained DL channel parameter prediction model.
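The first-stage weight adjustment can be sketched as follows. This is a minimal illustration that assumes the schedule described by Pardoe and Stone (2010): source weights are scaled by β raised to the adjusted error of each instance, target weights are only renormalized, and β is chosen so that the total target weight reaches m/(n+m) + t/(S−1)·(1 − m/(n+m)) at step t of S, with n source and m target instances. The function names and the bisection search for β are hypothetical choices, and the fitting of the base learner and the second stage are omitted.

```python
def adjusted_errors(y_true, y_pred):
    """Absolute errors scaled to [0, 1] by the largest error."""
    abs_err = [abs(a - b) for a, b in zip(y_true, y_pred)]
    largest = max(abs_err)
    if largest == 0.0:
        return [0.0] * len(abs_err)
    return [e / largest for e in abs_err]


def stage1_update(w_source, w_target, e_source, t, n_steps):
    """One first-stage update: shrink source weights, keep target ratios."""
    n, m = len(w_source), len(w_target)
    # Desired total weight of the target instances after this step (assumed schedule).
    goal = m / (n + m) + (t / (n_steps - 1)) * (1.0 - m / (n + m))
    target_mass = sum(w_target)

    def target_fraction(beta):
        source_mass = sum(w * beta ** e for w, e in zip(w_source, e_source))
        return target_mass / (source_mass + target_mass)

    # The target fraction grows as beta shrinks, so bisection on (0, 1] works.
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if target_fraction(mid) < goal:
            hi = mid  # source still too heavy: use a smaller beta
        else:
            lo = mid
    beta = (lo + hi) / 2.0

    new_source = [w * beta ** e for w, e in zip(w_source, e_source)]
    z = sum(new_source) + target_mass  # normalization constant
    return [w / z for w in new_source], [w / z for w in w_target], beta


# Toy example: 8 source and 2 target instances with uniform initial weights.
w_src = [0.1] * 8
w_tgt = [0.1] * 2
e_src = [0.1 * (i + 1) for i in range(8)]  # hypothetical adjusted errors
new_src, new_tgt, beta = stage1_update(w_src, w_tgt, e_src, t=1, n_steps=5)
```

Source instances that the current model fits poorly lose weight fastest, so the source data that disagree most with the new condition contribute least to the final model.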
The simulation data are generated by commercial ray-tracing software, Wireless InSite (Medeđović et al., 2012), which is an accurate and reliable urban wave propagation calculation tool for planning and designing mobile wireless networks. A typical urban propagation environment in Ottawa, Canada is considered (Fig. 4). It is an urban scene consisting of buildings and streets. The building heights range from 9 to 51 m, with an average of 19.38 m. The area of the considered scenario is 1000 m × 630 m. The BS is fixed in the center, 30 m above the ground. The propagation environment is divided into three sectors centered on the location of the BS. The Rx beamwidth of the BS is 60°/120° and the Tx beamwidth is 10°/30°. All UEs employ omnidirectional antennas for transmission and reception and they are distributed along the roads. A total of 14 routes, namely routes 1–14, are considered. The height of each UE is 1.5 m. We consider 5635 different UE locations with 1 m spacing on each road. The channel parameters including PL, PN, DS, AASA, EASA, AASD, and EASD are calculated at each position. The parameter configurations are listed in Table 2.
Table 1 Definitions of UL/DL parameters and environment features
Table 2 Parameter configurations of the simulation environment
Fig.4 Propagation environment and sector partition in an Ottawa urban area (BS: base station)
The weak learners of the ensemble learning model are decision trees.The maximum tree depth is 16,and the number of ensemble members is 500.The performance of the proposed prediction method based on ensemble learning and instance transfer is evaluated.In terms of prediction accuracy and calculation time,the prediction results of the proposed method are compared with those of the 3GPP TR 38.901 (3GPP,2020) channel model and the back propagation neural network (BPNN) (Gao et al.,2021).
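The mean absolute error (MAE) is used as a metric to measure the accuracy of the prediction results (Sterba and Kocur, 2009; Pan et al., 2019). It can be calculated as
$$\mathrm{MAE}=\frac{1}{N}\sum_{i=1}^{N}\left|y_i-\hat{y}_i\right|,$$
where $N$ is the number of test samples, and $y_i$ and $\hat{y}_i$ denote the true and predicted values of the $i$-th test sample, respectively.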
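As a small worked illustration of the metric, the following sketch computes the MAE of three hypothetical path loss predictions in dB; the function name and the numbers are illustrative only.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Absolute errors are 1, 2, and 0 dB, so the MAE is (1 + 2 + 0) / 3 = 1 dB.
example = mean_absolute_error([100.0, 110.0, 120.0], [101.0, 108.0, 120.0])
```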
A standardized model,the 3GPP TR 38.901 channel model,is used as one of the comparison methods.We calculate PL using its urban macro(UMa)propagation model with the non-line-of-sight(NLoS)condition.
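For reference, the UMa NLoS path loss of 3GPP TR 38.901 can be sketched as below. The formulas follow the UMa entries of the path loss table in that report as far as they are recalled here, without shadow fading, and with the effective environment height fixed at 1 m, which applies to UE heights below 13 m. The 30 m BS height and 1.5 m UE height come from the simulation setup, whereas the 3.5 GHz carrier frequency is only a placeholder, since the carrier frequency is not stated in this section.

```python
from math import log10, sqrt

SPEED_OF_LIGHT = 3.0e8  # m/s


def uma_los_path_loss(d2d, h_bs, h_ut, fc_ghz):
    """UMa LoS path loss in dB (no shadow fading), for 10 m <= d2d <= 5 km."""
    h_e = 1.0  # effective environment height, assumed for h_ut < 13 m
    d_bp = 4.0 * (h_bs - h_e) * (h_ut - h_e) * fc_ghz * 1e9 / SPEED_OF_LIGHT
    d3d = sqrt(d2d ** 2 + (h_bs - h_ut) ** 2)
    if d2d <= d_bp:
        return 28.0 + 22.0 * log10(d3d) + 20.0 * log10(fc_ghz)
    return (28.0 + 40.0 * log10(d3d) + 20.0 * log10(fc_ghz)
            - 9.0 * log10(d_bp ** 2 + (h_bs - h_ut) ** 2))


def uma_nlos_path_loss(d2d, h_bs, h_ut, fc_ghz):
    """UMa NLoS path loss in dB: never below the LoS value at the same point."""
    d3d = sqrt(d2d ** 2 + (h_bs - h_ut) ** 2)
    pl_nlos = (13.54 + 39.08 * log10(d3d) + 20.0 * log10(fc_ghz)
               - 0.6 * (h_ut - 1.5))
    return max(uma_los_path_loss(d2d, h_bs, h_ut, fc_ghz), pl_nlos)


# BS and UE heights from the simulation setup; carrier frequency is a placeholder.
pl_near = uma_nlos_path_loss(d2d=50.0, h_bs=30.0, h_ut=1.5, fc_ghz=3.5)
pl_far = uma_nlos_path_loss(d2d=200.0, h_bs=30.0, h_ut=1.5, fc_ghz=3.5)
```

Because the model depends only on distance, antenna heights, and frequency, it cannot follow the building-induced fluctuations along a street, which is consistent with the trend-only behavior reported for it in the comparison.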
BPNN is also used as a comparison method. The rectified linear unit function is selected as the activation function. A three-layer feed-forward structure is employed and the optimal number of neurons in the hidden layer is 15.
In the following analysis,we compare the prediction results of the proposed method with the 3GPP TR 38.901 channel model and the non-ensemble learning method BPNN.Both prediction accuracy and running time are evaluated.
To verify that the proposed two-step feature selection method can achieve balance between the prediction accuracy and efficiency of DL channel parameter prediction, we compare the prediction accuracy, number of features, and training time of the proposed two-step feature selection method with those of a variance filter (Wang F et al., 2018; Zhao et al., 2018; Huang et al., 2019) and Lasso regularization (Gallieri and Maciejowski, 2012; Yao et al., 2015). The variance filter is used to remove the features with low variance. The results of feature selection using the variance filter are obtained by setting the threshold at 0.16 and discarding all features with variance less than the threshold. Lasso regularization performs regularization and feature selection on the given data. It places a limit on the sum of the absolute values of the feature coefficients, and the selected features are those whose coefficients are not shrunk to zero. In the comparison simulation, the UL beamwidth is set to 120° and the DL beamwidth is set to 30°. The proposed models are trained using the feature sets obtained from different feature selection methods to predict multiple DL channel parameters. A comparison of different feature selection methods is shown in Table 3.
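The variance-filter baseline is simple enough to sketch directly. The 0.16 threshold is the value quoted above; the feature names and values are hypothetical, and whether the features are normalized before filtering is not stated in the text, so the sketch applies the threshold to the values as given.

```python
from statistics import pvariance


def variance_filter(columns, threshold=0.16):
    """Keep the features whose variance is not below the threshold."""
    return [name for name, values in columns.items()
            if pvariance(values) >= threshold]


# Hypothetical feature columns, one list of values per feature.
features = {
    "ul_path_loss": [0.0, 1.0, 0.0, 1.0],       # variance 0.25: kept
    "building_count": [0.5, 0.5, 0.6, 0.5],     # variance below 0.16: dropped
    "distance": [0.1, 0.9, 0.2, 0.8],           # variance 0.125: dropped
}
kept = variance_filter(features)
```

The filter never looks at the prediction target, which is why it returns the same feature set for every DL parameter.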
According to the comparison results of the predicted MAE,the number of selected features,and the model training time of different feature selection methods,it is shown that different numbers of features are selected,with the number ranging from 7 to 10.The feature selection based on the variance filter considers only the variance of the features,so the selection is the same for different prediction targets,and the number of features is 2.Although the reduction in the number of features reduces the model training time,the prediction accuracy is significantly reduced.For the feature selection based on Lasso regularization,different feature sets are selected depending on the prediction target,but the number of selected features is larger than that of the proposed method and is between 10 and 13.The increased number of features results in a longer training time.However,instead of improving the prediction accuracy of the model,the increase in the number of features enlarges the MAE compared to that of the proposed method.This indicates that the inclusion of less relevant features not only increases the training time of the model but also reduces the prediction accuracy.The feature set obtained by the proposed method can achieve accurate and efficient predictions.
To verify the proposed method for DL channel parameter prediction in the same condition, we use 2518 samples corresponding to different UE locations in sector 1. The collected data are randomly divided into two sets: 80% of the data are used as the training set, and the remaining 20% as the testing set. Table 4 shows the predicted MAEs of different methods. With each of the three boosting methods, the proposed method has higher prediction accuracy than BPNN. Among the three, XGBoost gives the worst prediction results. AdaBoost has higher prediction stability and gives accurate prediction results for each DL channel parameter, whereas LightGBM has slightly lower prediction accuracy than AdaBoost for some parameters, but faster model training and prediction.
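The random 80/20 division of the 2518 sector-1 samples can be sketched as follows. The fixed seed and the rounding of the training-set size are assumptions made for reproducibility of the example, not details taken from the simulations.

```python
import random


def train_test_split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and testing parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


# 2518 samples in sector 1 give 2014 training and 504 testing samples.
train_idx, test_idx = train_test_split_indices(2518)
```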
In addition,the Tx and Rx beamwidths of the BS have an impact on the model prediction accuracy.When the beamwidths of Tx and Rx differ greatly,the intrinsic relation between UL and DL channel parameters drops drastically,which causes increase of MAE.Finally,Table 4 illustrates the MAEs of the proposed method and BPNN for different DL parameters.
Table 3 Comparison of different feature selection methods
To compare the prediction performance of the proposed method, the 3GPP TR 38.901 channel model, and BPNN, we select the PL prediction results of route 4 and plot them in Fig. 5. The horizontal coordinate indicates the indices of the test samples, which are given in the order of their positions. The vertical coordinate indicates the PL values. The ray-tracing results in the figure show that the PL values on the street fluctuate greatly due to building obstruction in the urban scenario. The proposed method can accurately predict the PL fluctuation. The MAE of the proposed method is 1.01 dB. In contrast, 3GPP TR 38.901 can predict only the trend of PL based on the propagation distance. The MAE of the 3GPP TR 38.901 channel model is 26.91 dB. BPNN can also predict PL fluctuations, and the MAE of BPNN is 4.21 dB, which is larger than that of our proposed method.
Fig.5 Path loss prediction values of the proposed method with AdaBoost and other methods
In addition,we compare the running time of each method and list the results in Table 5.Ray tracing relies on the electromagnetic map,and the calculation time is almost hundreds of times that of our method with AdaBoost and almost thousands of times that of our method with LightGBM.In addition,when using the proposed method for DL parameter prediction of the channel in the same condition,offline training is available and the model needs to be trained only once.After the model is trained,the actual time required is only the prediction time.BPNN takes less time than ray tracing,but its training time and prediction time are much longer than those of our proposed method.Among our proposed method with different boosting methods,LightGBM takes the least time,whereas AdaBoost takes the longest time for training and prediction.
Table 4 MAEs of different methods for each downlink parameter
Table 5 Running time of different methods
In this subsection, we consider two transfer cases of new conditions. The first case is to predict the channel parameters with a new Tx beamwidth. To verify our proposed method, the source domain is selected as the channel with 120° Rx beamwidth and 30° Tx beamwidth in sector 1. The target domain is the channel with 120° Rx beamwidth and 10° Tx beamwidth in the same sector. The prediction target is the DL PL with the new Tx beamwidth. The total number of samples in the target domain is 2518. We choose only 10–250 samples for training. The training set is composed of all 2518 samples in the source domain and the chosen samples in the target domain. We compare the performance of the proposed algorithm with that of the method without the transfer learning algorithm.
Fig.6 shows a comparison of the results,showing that the participation of instance transfer improves the accuracy of DL channel parameter prediction.As the number of samples in the target domain increases,the prediction accuracy of the proposed model gradually increases.When the number of samples in the target domain reaches 150,the predicted MAE is only 3.54 dB.Meanwhile,the MAE of the method without transfer learning is 14.25 dB.Note that the proposed method can also be extended to the prediction of other channel parameters.
Fig.6 Comparison of the path loss prediction accuracy of the proposed method and the method without transfer learning when the downlink beamwidth changes from 30° to 10° (MAE: mean absolute error)
The second case is to predict the channel parameters in different sectors.We select the source domain as the channel with 120°Rx beamwidth and 30°Tx beamwidth in sector 1.The target domain is the channel with the same beamwidth pair in sector 2 or 3,and the prediction target is the DL PL.The total number of samples in sector 2 is 1703 and that in sector 3 is 1414.We choose only 50–200 samples of the target domain for training.The training set is composed of a total of 2518 samples in the source domain and the chosen samples in the target domain.
Fig.7 shows the comparison between the proposed method and the method without transfer learning.The results of the proposed method are always better than those of the method without transfer learning.In sector 2,when the number of samples in the target domain reaches 150,the MAE of the proposed method has fallen to 6.31 dB and is 19.01 dB less than that of the method without transfer learning.Moreover,the prediction accuracy is related to the similarity between the original sector and the new sector.As shown in Table 2,the distribution and the average height of the buildings in sector 2 are more similar to those in sector 1,and the prediction error of sector 2 is a bit lower than that of sector 3.
Fig.7 Comparison of the path loss prediction accuracy of the proposed method and the method without transfer learning when the sector changes (MAE:mean absolute error)
In this paper, an ensemble-transfer-learning-based channel parameter prediction method was proposed for asymmetric massive MIMO systems with different Tx/Rx beam patterns. The UL channel parameters and environment features participated in the model training. A two-step feature selection algorithm was designed to improve the prediction accuracy and efficiency of the model. In addition, the proposed method could predict DL parameters in the new propagation condition with the help of the instance transfer algorithm, even when the new training samples were insufficient. The proposed method could be used to predict PL, PN, DS, and angular spread. Its performance was compared with those of the 3GPP TR 38.901 channel model and BPNN. Simulation results showed that the prediction accuracy of the proposed method was better than those of the compared methods. When the Tx beamwidth or the sector changed, the instance-transfer-based DL parameter prediction method provided higher prediction accuracy than the method without the transfer learning algorithm with a small number of new samples. The proposed prediction method could be useful in analyzing the effect of beamwidth on channel parameters and for developing practical applications of asymmetric systems, thereby reducing the burden of data processing and hardware costs in 6G mobile networks.
Contributors
Zunwen HE,Yan ZHANG,and Wancheng ZHANG proposed the ideas and designed the simulations.Yue LI processed the data and completed the simulations.Zunwen HE,Kaien ZHANG,Liu GUO,and Haiming WANG drafted,revised,and finalized the paper.
Compliance with ethics guidelines
Zunwen HE,Yue LI,Yan ZHANG,Wancheng ZHANG,Kaien ZHANG,Liu GUO,and Haiming WANG declare that they have no conflict of interest.
Data availability
Due to the nature of this research,participants of this study did not agree for their data to be shared publicly,so supporting data are not available.
Frontiers of Information Technology & Electronic Engineering, 2023, Issue 2