Telephone Traffic Prediction Based on Modified Forecasting Model

This study presents a busy telephone traffic prediction model that combines the wavelet transform with the least squares support vector machine (LS_SVM). First, the preprocessed telephone traffic data are decomposed with the Mallat algorithm into a low frequency component and high frequency components. Second, each component is reconstructed separately and an LS_SVM model predicts each reconstructed component. The predicted traffic is then obtained by superimposing the component predictions. Experimental results show that the combined traffic prediction model achieves higher prediction accuracy and stability.


INTRODUCTION
With the continuous growth in the number of mobile users, telephone traffic is also increasing. During holidays, mobile communication networks come under heavy telephone traffic. To prevent network congestion and improve the utilization of network resources, accurate prediction of future telephone traffic is important.

Traditional linear time-series prediction models include the Poisson distribution model, the Markov model, the AR model and the ARMA model (Jie et al., 2007). The Poisson model cannot describe the characteristics of contemporary telephone traffic. The Markov model has high computational complexity. The AR and ARMA algorithms are simple, but their prediction error is high. The ARIMA (Wang and Shen, 2010) and FARIMA (Xiao-Tian and Jing-Xian, 2011) prediction models capture long-range and short-range dependence in the data better, but they do not forecast non-linear, non-stationary traffic accurately. Neural network models (Jun-Song, 2009; Xu-Qi and Zong-Tao, 2011) learn the inherent law of the data through training, but they are based on the empirical risk minimization principle and are prone to under-fitting or over-fitting during training, so their predictions are not accurate.

The Support Vector Machine (SVM) is based on structural risk minimization (Cristianini and Shawe-Taylor, 2004; Hai-Shen and Xiao-Ling, 2006; Hua and Hong, 2011; Rui and Zhen-Hong, 2011; Xiao-Dong et al., 2005; Xiao-Qiang and Xian-Min, 2011; Zhi-Hui et al., 2007) and handles small-sample, nonlinear and high-dimensional data better. The shortcoming of the traditional SVM lies in its quadratic programming problem: accumulated error in the kernel function iterations leads to inaccurate results. The least squares support vector machine (LS_SVM) replaces the inequality constraints of SVM with equality constraints, which transforms the quadratic programming problem into a system of linear equations, simplifying the model structure and improving the training speed. For this reason the LS_SVM model has been widely used. The wavelet transform has strong multi-scale analysis capability and can remove the correlation in traffic data. Combined forecasting models that use wavelets (Jun et al., 2008; Hua-Li and Yuan, 2011; Xiao-Tian and Jing-Xian, 2011) achieve good prediction. A single prediction model cannot capture the various characteristics of telephone traffic, so this study combines the prediction model with the wavelet transform.
To address the non-stationarity, self-similarity and multi-scale characteristics of telephone traffic, this study puts forward a busy telephone traffic prediction model that combines the wavelet transform with the least squares support vector machine. First, the busy traffic data are decomposed with a 3-layer Mallat wavelet decomposition to obtain the low frequency component and the high frequency components. Second, each component is reconstructed separately and an LS_SVM model predicts each single reconstructed component. Finally, the prediction results are superimposed. Each LS_SVM prediction carries some error, and the errors in the low frequency part and the high frequency part may be positive or negative. In the final traffic synthesis step, positive and negative errors can offset each other, so adding the wavelet transform can achieve better prediction results. The simulation experiment shows that this model has higher prediction accuracy and stability.

In the decomposition and reconstruction formulas of the Mallat algorithm, H is a low-pass filter and G is a high-pass filter. The Mallat algorithm decomposes the original signal into a low frequency part and a high frequency part. The low frequency component reflects the outline and changing tendency of the busy telephone traffic data, and the high frequency component reflects the impact of dynamic factors such as random disturbance. The low frequency part can be decomposed further to obtain a new low frequency component and a new high frequency component. Single reconstruction means that the low frequency part and the high frequency part are not reconstructed at the same time but separately: when one part is reconstructed, the other part is set to zero.
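The decomposition and single reconstruction described above can be illustrated with a minimal sketch. It is not the configuration used in this study: it assumes the Haar filter pair for H and G (chosen only because it is short to write out), assumes an input length that is a multiple of 2^3, and uses a hypothetical 16-sample series. All function names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)

def analysis_step(signal):
    """One Mallat decomposition step with Haar filters (an assumption):
    low-pass H and high-pass G, each followed by downsampling by 2."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]   # output of H
    detail = [(a - b) / SQRT2 for a, b in pairs]   # output of G
    return approx, detail

def synthesis_step(approx, detail):
    """One Mallat reconstruction step: merge both branches back together."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

def decompose(signal, levels=3):
    """Return the bands [a_L, d_L, ..., d_1] of a multi-level decomposition."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("length must be a multiple of 2**levels")
    approx, bands = list(signal), []
    for _ in range(levels):
        approx, detail = analysis_step(approx)
        bands.insert(0, detail)
    return [approx] + bands

def single_reconstruct(bands, keep):
    """Reconstruct band `keep` alone; every other band is set to zero."""
    zeroed = [b if i == keep else [0.0] * len(b) for i, b in enumerate(bands)]
    signal = zeroed[0]
    for detail in zeroed[1:]:
        signal = synthesis_step(signal, detail)
    return signal

# Hypothetical 16-sample series standing in for busy traffic data.
traffic = [52.0, 55.0, 61.0, 58.0, 63.0, 70.0, 68.0, 66.0,
           71.0, 75.0, 90.0, 84.0, 79.0, 77.0, 80.0, 83.0]
bands = decompose(traffic, levels=3)                      # [a3, d3, d2, d1]
A3, D3, D2, D1 = (single_reconstruct(bands, k) for k in range(4))
rebuilt = [sum(parts) for parts in zip(A3, D3, D2, D1)]
# The four single reconstructions add back up to the original series,
# apart from floating-point rounding.
print(max(abs(r - x) for r, x in zip(rebuilt, traffic)))
```

Because reconstruction is linear, the separately reconstructed components sum to the original series, which is what allows the component predictions to be superimposed in the final step. A series whose length is not a multiple of 2^3 would need padding or a boundary extension, which this sketch leaves out.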
The difficulty of wavelet decomposition and reconstruction lies in the selection of the wavelet basis. This study reconstructs the low frequency and high frequency data with different wavelet bases, superposes the reconstructions and compares the result with the original data. With bior1.3 as the wavelet basis, the error is 10⁻¹¹ (Fig. 1), whereas with wavelet bases such as dbN and sym the error is 10⁻⁸. This study therefore chooses bior1.3 as the wavelet basis.

The model of LS_SVM:
The main idea of nonlinear regression with the Least Squares Support Vector Machine (LSSVM) is that the input data are mapped to a high dimensional feature space H through the nonlinear mapping ψ(·), so that a low dimensional nonlinear regression problem is transformed into a linear regression problem in the high dimensional feature space. Assume that there are m samples (x₁, y₁), (x₂, y₂), …, (xₘ, yₘ), where xᵢ ∈ Rⁿ is the sample input and yᵢ ∈ R is the sample output. In the optimal regression function, W ∈ H is the weight vector, b ∈ R is the bias and ψ(x): Rⁿ → H is the kernel-space mapping function.

Fig. 2: 113 busy telephone traffic
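The equations of this section did not survive in this copy of the text. For orientation, the standard LS-SVM formulation that the description appears to follow is written out here. The symbols γ (regularization parameter), eᵢ (error variables), αᵢ (Lagrange multipliers) and σ (RBF kernel width) are assumptions taken from the usual LS-SVM notation, and the equation numbering of the original is not reproduced.

```latex
\begin{align*}
f(x) &= W^{T}\psi(x) + b \\
\min_{W,\,b,\,e}\; J(W, e) &= \tfrac{1}{2} W^{T} W + \tfrac{\gamma}{2}\sum_{i=1}^{m} e_i^{2}
  \quad \text{s.t.}\quad y_i = W^{T}\psi(x_i) + b + e_i,\; i = 1,\dots,m \\
L(W, b, e, \alpha) &= J(W, e) - \sum_{i=1}^{m} \alpha_i \left[ W^{T}\psi(x_i) + b + e_i - y_i \right] \\
\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & K + \gamma^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix} &= \begin{bmatrix} 0 \\ y \end{bmatrix},
  \qquad K_{ij} = \exp\!\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right) \\
f(x) &= \sum_{i=1}^{m} \alpha_i K(x, x_i) + b
\end{align*}
```

Setting the partial derivatives of L with respect to W, b, eᵢ and αᵢ to zero and eliminating W and e gives the linear system in the fourth line, and its solution (b, α) defines the regression model in the last line.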
Following the structural risk minimization principle, the optimization problem is solved with the Lagrange multiplier method, which gives the Lagrange function of the least squares support vector machine, Eq. (4). According to the KKT optimality conditions, the partial derivatives of Eq. (4) are taken and set to 0. The Radial Basis Function (RBF) kernel is used to avoid explicit computation in the high dimensional space. The optimization problem is thereby transformed into the solution of a system of linear equations, from which the final nonlinear regression model is obtained.

Generally speaking, the more data are selected, the more correctly the training results reflect the relationship between input and output and the higher the prediction precision. In practical applications, however, the sample data cannot be increased without limit, so representative samples should be selected. Considering the correlation between the four factors and telephone traffic, the data were preprocessed in this study to improve the efficiency of the algorithm without losing the characteristics of the original data. Traffic is larger during the May Day holiday and is largest on May 4th, so this experiment mainly predicts the busy traffic of May 4th.
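The LS_SVM training step described in this section, solving a system of linear equations under the RBF kernel, can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the implementation used in the experiments: the parameter names `gamma` (regularization) and `sigma` (kernel width), the plain Gaussian elimination solver and the toy data are all hypothetical.

```python
import math

def rbf_kernel(x, z, sigma):
    """RBF kernel K(x, z) = exp(-||x - z||^2 / (2 * sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq / (2.0 * sigma ** 2))

def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def lssvm_fit(X, y, gamma, sigma):
    """Build and solve [[0, 1^T], [1, K + I/gamma]] [b, alpha]^T = [0, y]^T."""
    m = len(X)
    A = [[0.0] + [1.0] * m]
    for i in range(m):
        row = [1.0]
        for j in range(m):
            k = rbf_kernel(X[i], X[j], sigma)
            row.append(k + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    solution = solve_linear(A, [0.0] + list(y))
    return solution[0], solution[1:]          # bias b, multipliers alpha

def lssvm_predict(X_train, alpha, b, x, sigma):
    """Regression model f(x) = sum_i alpha_i * K(x, x_i) + b."""
    return sum(a * rbf_kernel(x, xi, sigma) for a, xi in zip(alpha, X_train)) + b

# Toy one-dimensional data (hypothetical, not the traffic data of this study).
X = [[0.0], [0.5], [1.0], [1.5], [2.0], [2.5]]
y = [math.sin(v[0]) for v in X]
b, alpha = lssvm_fit(X, y, gamma=100.0, sigma=1.0)
print(lssvm_predict(X, alpha, b, [1.25], sigma=1.0))
```

Training reduces to one linear solve, which is the simplification over the quadratic programming problem of the standard SVM. In the combined model, one such regressor would be fitted per reconstructed component, each with its own `gamma` and `sigma`.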
As can be seen, when the four impact factors are considered, the prediction of the combined WT_LSSVM forecasting method is closer to the actual value. This is mainly because the prediction errors of the individual components may be positive or negative, so after the components are superposed the errors can cancel each other. The prediction error is thereby reduced to -0.0102, with a running time of 4.91 s. The LS_SVM prediction model is more efficient, taking only 1.16 s, but its accuracy is lower: its relative error is -0.0535. The Elman model has a prediction error of -0.0627 and takes 8.14 s, so both its precision and its efficiency are lower than those of the WT_LSSVM prediction method. The stability of the Elman model is also worse than that of the other models.
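The error cancellation argument can be made concrete with a small numerical sketch of the relative error ξ = (x' - x)/x applied to the superposed component predictions. All numbers below are hypothetical and are not taken from the experiments; the function names are illustrative.

```python
def relative_error(predicted, actual):
    """Relative error xi = (x' - x) / x."""
    return (predicted - actual) / actual

def combined_relative_error(component_predictions, actual):
    """Relative error of the superposed forecast A3' + D1' + D2' + D3'."""
    return relative_error(sum(component_predictions), actual)

# Hypothetical component values for the target day (A3, D1, D2, D3).
actual_components = [100.0, 4.0, -3.0, 2.0]
predicted_components = [98.0, 5.0, -2.5, 1.5]

S = sum(actual_components)                       # actual busy traffic
component_errors = [p - a for p, a in zip(predicted_components, actual_components)]

# Individual errors have mixed signs, so the error of the sum is smaller
# in magnitude than the sum of the individual error magnitudes.
print(component_errors)
print(sum(abs(e) for e in component_errors), abs(sum(component_errors)))
print(combined_relative_error(predicted_components, S))
```

Cancellation is not guaranteed: if all component errors happen to share the same sign, superposition adds them up instead.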

CONCLUSION
This study put forward an improved telephone traffic prediction algorithm based on the wavelet transform and the LSSVM model. First, the preprocessed telephone traffic data are decomposed with the Mallat algorithm into a low frequency component and high frequency components; each component is then reconstructed separately and an LS_SVM model predicts each single reconstructed component. In the final traffic synthesis step, positive and negative errors can offset each other, so the model improved with the wavelet transform achieves better and more stable prediction results. Future work will study a prediction model that combines the wavelet transform with an optimized LSSVM to further improve prediction accuracy.


Preprocess the original telephone traffic data to obtain x'(n).
Decompose x'(n) with a three-layer Mallat wavelet decomposition to obtain the low frequency component (a₃) and the high frequency components (d₁, d₂, d₃).
Use the Mallat algorithm to reconstruct the low frequency component (a₃) and the high frequency components (d₁, d₂, d₃) separately, obtaining A₃ and D₁, D₂, D₃.
Use the LS_SVM model to predict A₃, D₁, D₂, D₃, obtaining A₃' and D₁', D₂', D₃'.
Obtain the final forecast result: x'(n)' = A₃' + D₁' + D₂' + D₃'.

Data pretreatment: In mobile communication, telephone traffic changes every day and differs at different times of the day. Especially during holidays, the busy telephone traffic (peak value) has the largest effect on the mobile communication network. Traffic is sampled once every hour, giving 24 values per day, and the maximum of the 24 values is taken as the busy telephone traffic of that day. Collecting 113 days of telephone traffic gives 113×24 values and 113 busy values. Decomposing the 113 busy values and reconstructing each component gives the four components A₃, D₁, D₂ and D₃. Busy VLR users, VLR boot users, messages and the system connection rate all affect traffic, so these four factors are treated as impact factors of the telephone traffic; the four factors and each single reconstructed component are combined into four new 113×5 matrices. The first 112 rows are used as training samples to predict the final busy telephone traffic value. If the collected traffic data are xₜ(n), where n ∈ [1, 113] is the day and t ∈ [0, 23] is the sampling time within the day, and the largest traffic of day n is xₘₐₓ(n), then the busy data xₘₐₓ(1), xₘₐₓ(2), …, xₘₐₓ(112) are used to predict the busy telephone traffic of the 113th day.

In 1987, Mallat and others put forward the Mallat algorithm for fast decomposition and reconstruction. The literature (Hua-Li and Yuan, 2011; Xiao-Tian and Jing-Xian, 2011) uses a combined forecasting model based on the Mallat algorithm of the modified wavelet transform and obtains better prediction results.
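The data pretreatment described above (taking the daily maximum of 24 hourly samples, then pairing the four impact factors with one reconstructed component) can be sketched as follows. The column layout, with the four factors first and the component value last, and the reading of the factors as inputs and the component as output are assumptions, since the text only states the 113×5 shape. The function names and sample values are hypothetical.

```python
def busy_traffic(hourly_by_day):
    """x_max(n): the maximum of the hourly samples x_t(n) of each day n."""
    return [max(day) for day in hourly_by_day]

def build_matrix(impact_factors, component):
    """One row per day: four impact factors followed by the value of one
    single reconstructed component (assumed column order), 5 columns in all."""
    if len(impact_factors) != len(component):
        raise ValueError("need one factor row per component value")
    return [list(f) + [c] for f, c in zip(impact_factors, component)]

def split_last_day(matrix):
    """All rows but the last are training samples; the factor columns of the
    last row are the inputs for the day to be predicted."""
    return matrix[:-1], matrix[-1][:-1]

# Three hypothetical days of 24 hourly samples each.
hourly = [[100 + 10 * day + (7 * t) % 24 for t in range(24)] for day in range(3)]
busy = busy_traffic(hourly)

# Hypothetical impact factors (four per day) and a stand-in component series.
factors = [[0.8, 0.6, 0.3, 0.97], [0.9, 0.7, 0.4, 0.96], [1.0, 0.8, 0.5, 0.95]]
matrix = build_matrix(factors, busy)
train_rows, target_inputs = split_last_day(matrix)
print(len(train_rows), len(matrix[0]), target_inputs)
```

With the 113 days of this study the same calls would give a 113×5 matrix per component and 112 training rows. Here `busy` stands in for a reconstructed component only to keep the sketch short.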

Fig. 3: The decomposition of the busy telephone traffic

Fig. 9: The comparison of the prediction results

The method for predicting the high frequency components is similar to that for the low frequency component. Selecting different regularization and kernel function parameters for the four component predictions can improve prediction accuracy. The low frequency component (A₃) and the high frequency components (D₁, D₂ and D₃) are predicted separately, as shown in Fig. 5 to 8. The predicted values of the components are then superimposed to obtain the predicted busy telephone traffic of the last day.

Results analysis: Relative error is used for the analysis in this study. The error formula is ξ = (xᵢ' - xᵢ)/xᵢ, where xᵢ' is the predicted value and xᵢ is the actual value. If the busy telephone traffic on May 4 is S, and the predictions of the low frequency component (A₃) and the high frequency components (D₁, D₂, D₃) are A₃', D₁', D₂' and D₃', then the relative error of the WT_LSSVM prediction model is ξ = [(A₃' + D₁' + D₂' + D₃') - S]/S. This study chooses the LS_SVM and Elman neural network prediction models as contrast models. The prediction results are shown in Fig. 9. Experiments were conducted 10 times in this study. The error and prediction efficiency statistics of the prediction models are shown in Table 1.

Table 1: The error and prediction efficiency of the prediction model