Comparative Study of MLP and RBF Neural Networks for Estimation of Suspended Sediments in Pari River, Perak



INTRODUCTION
Over the last two decades, numerous studies have used soft computing techniques to estimate sediment concentrations and to understand the mechanisms of sediment movement in natural rivers and man-made water channels (Kakaei Lafdani et al., 2013; Liu et al., 2013; Mustafa et al., 2011b). In river engineering and river training in particular, considerable effort has gone into estimating suspended sediments precisely. This is because of their importance to water resources engineering projects, including river training, river management, and the design and planning of hydraulic structures and hydropower plant intakes. The topic is also gaining attention because of its environmental impact: industrial discharge and agricultural residues entering natural rivers contaminate the bed load material with poisonous constituents.
Artificial Neural Networks (ANNs) are among the most effective prediction techniques, particularly for establishing relationships between variables that exhibit highly nonlinear and complex patterns. In recent years, a number of comparative studies have been reported on combinations of neural networks with respect to training algorithms, basis functions and hybrid techniques (Kakaei Lafdani et al., 2013; Long and Pavelsky, 2013; Mustafa et al., 2013; Mustafa et al., 2012b; Mustafa et al., 2012c, d). In water resources engineering in particular, neural networks have been applied widely, including to the estimation of river suspended sediments. Jain (2001) predicted sediment concentration in the Mississippi River using multilayer perceptron (MLP) neural networks and compared the results with the conventional sediment rating curve method. Nagy et al. (2002) estimated sediment load in rivers using MLP neural networks with the backpropagation training algorithm. They compared the MLP model with conventional sediment load formulas and suggested that the MLP performed better than the conventional methods. Cigizoglu (2004) estimated suspended sediments in rivers using a multiple linear regression model, a stochastic autoregressive model, the sediment rating curve and MLP neural networks. He suggested that the MLP model was superior to the other models. Kisi (2005) investigated neuro-fuzzy, neural network, sediment rating curve and multiple linear regression approaches for estimating sediments in rivers. He suggested that the neuro-fuzzy and neural network models produced comparable results. Alp and Cigizoglu (2007) simulated suspended sediment loads in rivers using radial basis function (RBF) and MLP neural networks. They found that both models produced very similar results, but suggested that RBF networks may offer the user some advantages in application. Melesse et al. (2011) estimated suspended sediments in rivers using a multiple nonlinear regression model, MLP and the autoregressive integrated moving average method, and found that the MLP model was superior to the other models. Mustafa et al. (2011a) predicted river suspended sediment load using MLP and MATLAB built-in self-learning radial basis function neural networks, and proposed that the self-learning radial basis function can produce better results than the MLP model.
Although all of the above studies successfully estimated sediments in rivers, ANN models generally performed better than the other models. However, the most appropriate training algorithm for sediment prediction was investigated only by Mustafa et al. (2012c), who suggested that the Levenberg-Marquardt training algorithm performs better than other MLP training algorithms. No study was found in the literature that compares the best MLP training algorithm with RBF basis function models for estimating sediments in rivers. Therefore, the objective of this study is to identify the best-performing radial basis function among six basis functions for estimating suspended sediments, and then to compare the performance of the best MLP and RBF models for accurate estimation of river suspended sediments.

MATERIALS AND METHODS
Study area: The Pari River in Perak State, peninsular Malaysia, was selected for quantifying river sediments in this study. The Pari River is a sub-catchment of the Kinta River. It drains an area of about 284 km² and receives a mean annual precipitation of about 2250 mm. Nearly 45% of the catchment area is developed and the rest is covered by forest and some agricultural land. Relatively frequent floods have been observed in the past, possibly due to heavy sedimentation in the river caused by tin mining activities in the catchment area (Sinnakaudan et al., 2003). The time series data of water discharge and suspended sediments of the Pari River used in this study were obtained from the Department of Irrigation and Drainage (DID), Kuala Lumpur.
Artificial neural network modeling: Artificial neural networks are data-driven modeling techniques that are generally used for estimation, forecasting, pattern recognition, optimization and establishing relationships between variables with complex features. Radial basis function and multilayer perceptron networks are the most popular and commonly used types of neural networks for estimating relationships between hydrological parameters. This study is divided into two parts. The first part evaluates the performance of six different basis functions and identifies the appropriate basis function for prediction of suspended sediment in the Pari River. The second part compares the best RBF model with an MLP model trained with the most appropriate training algorithm, Levenberg-Marquardt (LM).
Prediction of suspended sediment discharge using RBF neural network: A program was written in MATLAB® to perform radial basis function neural network modeling for estimation of the suspended sediment time series. The current water discharge and its two antecedent values were used as inputs to predict the current suspended sediment value in the river. Six different basis functions were employed to identify the most appropriate basis function for the available data set. The spread of the basis functions was selected by a normalization method, whereas the number of hidden neurons was established by trial and error. The training and testing stages of the six basis functions were compared using three commonly used statistical measures: root mean square error (RMSE), mean absolute error (MAE) and coefficient of efficiency (CE). This comparative analysis is shown in Table 1.
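The input arrangement described above (current discharge plus two antecedent values mapped to the current suspended sediment value) can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB program used in the study; the function name `make_lagged_samples` and the sample values are hypothetical and are not Pari River data.

```python
def make_lagged_samples(discharge, sediment, n_lags=2):
    """Pair [Q(t), Q(t-1), ..., Q(t-n_lags)] with the target S(t).

    The first n_lags time steps are dropped because they lack
    a full set of antecedent discharge values.
    """
    if len(discharge) != len(sediment):
        raise ValueError("discharge and sediment series must have equal length")
    inputs, targets = [], []
    for t in range(n_lags, len(discharge)):
        # Current discharge first, then the antecedent values.
        inputs.append([discharge[t - k] for k in range(n_lags + 1)])
        targets.append(sediment[t])
    return inputs, targets


# Made-up values for illustration only (not observed data).
q = [10.0, 12.0, 15.0, 11.0, 9.0]
s = [100.0, 130.0, 180.0, 120.0, 90.0]
X, y = make_lagged_samples(q, s)
```

With two lags, a series of five observations yields three usable samples, each with three input values.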
The Thin Plate Spline produced the highest error during the training (RMSE = 424 and MAE = 234) and testing stages (RMSE = 281, MAE = 204) among all six basis functions. Its Coefficient of Efficiency (CE) is also low during the training (CE = 0.7676) and testing stages (CE = 0.7759). The high prediction error and low efficiency produced by the Thin Plate Spline basis function suggest that it did not learn the relationship between the input and output variables well.
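For reference, the three statistical measures can be computed as in the sketch below. The study does not print the formulas, so this assumes the usual definitions, with CE taken in the Nash-Sutcliffe form (one minus the ratio of the squared error to the variation of the observations about their mean). The sample values are made up for illustration.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def coefficient_of_efficiency(obs, pred):
    """CE in the Nash-Sutcliffe form (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


# Made-up values for illustration only (not observed data).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(rmse(observed, predicted))
print(mae(observed, predicted))
print(coefficient_of_efficiency(observed, predicted))
```

Under this definition, CE equals 1 for a perfect prediction and drops to 0 when the model is no better than predicting the mean of the observations.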
The prediction errors of the models are lower at the testing stage than at the training stage and the coefficient of efficiency is higher, except for the Cubic basis function. For the Cubic basis function, the prediction error is lower in the testing stage but the prediction efficiency is slightly reduced. The reason might be the difference between the means of the training and testing datasets, because the coefficient of efficiency is computed relative to the mean of the dataset. In contrast with the Thin Plate Spline function, the Cubic and Linear basis functions performed better. The Cubic basis function, during training, produced less error and higher efficiency (RMSE = 285, MAE = 168 and CE = 0.8951) compared
But during the testing stage, the Inverse Multiquadric (RMSE = 68, MAE = 37 and CE = 0.9870) and Gaussian (RMSE = 61, MAE = 41, CE = 0.9895) produced significantly better results and predicted the sediment with accuracy nearly equal to or even slightly better than the Multiquadric function (RMSE = 77, MAE = 51 and CE = 0.9833). The lowest prediction error and highest coefficient of efficiency during the testing stage were produced by the Gaussian function. All of the basis functions, however, produced better results in the testing stage than in the training stage. This might be because of a large difference between the ranges (maximum and minimum values) and the means of the training and testing datasets. The minimum and maximum values of the testing dataset fall within the range covered during training, so the networks had been trained on values even beyond the limits of the testing dataset.
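The remark that CE depends on the dataset it is computed against can be made concrete with a small example. The sketch below assumes the Nash-Sutcliffe form of CE and uses made-up numbers: the same prediction errors are applied to two sets of observations that share a mean but differ in how widely they vary about it.

```python
def coefficient_of_efficiency(obs, pred):
    """CE in the Nash-Sutcliffe form (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


# Identical errors in both cases, so RMSE and MAE are identical too.
errors = [10.0, -10.0, 10.0, -10.0]

# Made-up observations: same mean (100), different spread about it.
narrow_obs = [80.0, 100.0, 120.0, 100.0]
wide_obs = [0.0, 100.0, 200.0, 100.0]

narrow_pred = [o + e for o, e in zip(narrow_obs, errors)]
wide_pred = [o + e for o, e in zip(wide_obs, errors)]

ce_narrow = coefficient_of_efficiency(narrow_obs, narrow_pred)
ce_wide = coefficient_of_efficiency(wide_obs, wide_pred)
print(ce_narrow, ce_wide)
```

Here the dataset with the wider spread gives the higher CE even though the errors are unchanged, which is one way a lower error can coincide with a lower efficiency when training and testing datasets differ in their statistics.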
Overall, the results suggest that all six basis functions are able to predict suspended sediment discharge in rivers, but with significantly different prediction accuracy. The Gaussian, Multiquadric and Inverse Multiquadric functions are considerably more efficient at predicting the suspended sediment discharge than the Cubic, Linear and Thin Plate Spline functions. Based on the overall prediction error and efficiency during the testing stage of the models, the Gaussian function may be regarded as the most suitable function among those tested.
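The six basis functions compared in this section can be written compactly as functions of the distance r between an input vector and a hidden-neuron center. The study does not list the exact expressions, so the sketch below assumes the commonly used textbook forms with a spread parameter sigma; the names `BASIS_FUNCTIONS` and `rbf_output` are hypothetical, and the network is reduced to a single weighted sum plus bias.

```python
import math


def gaussian(r, sigma):
    return math.exp(-(r * r) / (2.0 * sigma * sigma))


def multiquadric(r, sigma):
    return math.sqrt(r * r + sigma * sigma)


def inverse_multiquadric(r, sigma):
    return 1.0 / math.sqrt(r * r + sigma * sigma)


def thin_plate_spline(r, sigma):
    # r^2 * ln(r), taken as 0 at r = 0 (its limit); sigma is unused.
    return 0.0 if r == 0.0 else r * r * math.log(r)


def cubic(r, sigma):
    return r ** 3  # sigma is unused


def linear(r, sigma):
    return r  # sigma is unused


BASIS_FUNCTIONS = {
    "Gaussian": gaussian,
    "Multiquadric": multiquadric,
    "Inverse Multiquadric": inverse_multiquadric,
    "Thin Plate Spline": thin_plate_spline,
    "Cubic": cubic,
    "Linear": linear,
}


def rbf_output(x, centers, weights, bias, basis, sigma):
    """Output of a single-output RBF network for one input vector x."""
    total = bias
    for center, weight in zip(centers, weights):
        r = math.dist(x, center)  # Euclidean distance to the center
        total += weight * basis(r, sigma)
    return total


# Illustrative call with made-up centers and weights.
example = rbf_output([0.0, 0.0], [[3.0, 4.0]], [2.0], 1.0, linear, 1.0)
```

The Gaussian and Inverse Multiquadric responses decay with distance from the center, whereas the other four grow with distance, which is one plausible reason the functions behave differently on the same data.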

RESULTS AND DISCUSSION
Comparison between the MLP and RBF models for the prediction of suspended sediments: Mustafa et al. (2012c)
The time series of suspended sediments predicted by the best RBF and MLP models are compared in Fig. 1, which shows the observed and predicted suspended sediments during the testing stage of both models. The models learned the nonlinear pattern of the suspended sediment discharge during training and generalized well during the testing stage. The observed and predicted suspended sediment discharge values are close to each other, and the difference between the two models is very small. Both models followed the pattern of the suspended sediment data and predicted it with an insignificant difference from the observed values. Figure 1 shows that both the RBF and MLP models predicted the suspended sediments very closely; most of the observed data are overlapped by the predicted values. A previous attempt by Alp and Cigizoglu (2007) at suspended sediment prediction from rainfall and river flow data using RBF and MLP neural networks produced some negative predicted values of suspended sediment discharge. In this study, time series of river discharge and sediment discharge data were used for training the models and no negative predicted values of suspended sediment discharge were observed.
A comparison of the predicted and observed suspended sediment discharge data with the line of perfect agreement is shown in Fig. 2. Both models agreed well with the line of perfect agreement and predicted the suspended sediment data very close to the observed values. This nearly perfect agreement suggests that both models are appropriate for prediction of suspended sediments. The comparison between the models showed that the RBF (Gaussian) model produced some values above or below the line of perfect agreement, particularly for data above 2500 tons/day. This discrepancy indicates that the Gaussian function is less efficient at predicting high suspended sediment values. Conversely, the MLP (LM) model behaved consistently throughout the data range and predicted all values close to the observed data, even at high suspended sediment values. The coefficient of determination of the MLP model (R² = 0.9929) is slightly better than that of the RBF model (R² = 0.9907). This small difference in the coefficient of determination may not be enough to select the superior model, but the more accurate and efficient prediction at high suspended sediment values makes the MLP model superior to the RBF model. The comparative analysis between the MLP and RBF models using statistical performance measures is shown in Table 2. The summary of the statistical measures shows that the MLP model was better trained during the training stage than the RBF model. There was a significant difference in training performance between the models. The RBF model showed high
Some previous studies on sediment prediction using different Artificial Intelligence techniques also reported good coefficients of determination, i.e., R² = 0.894 (Kisi, 2005), R² = 0.91 (Cigizoglu, 2004), R² = 0.94 (Kisi, 2008), R² = 0.958 (Cigizoglu and Alp, 2006), R² = 0.92 (Alp and Cigizoglu, 2007) and R² = 0.99 (Kisi et al., 2008). However, this study was intended to present the performance evaluation of the best MLP training algorithm and radial basis functions in order to enhance the efficiency of models built with ANN techniques. The results of this study strongly suggest that testing the available basis functions and training algorithms is advantageous when applying ANN to predict any water resources engineering variable.
The comparison between the MLP and RBF models in this section showed that the performance of both models is close in all cases. Both models captured the complex behavior of the suspended sediments well. However, the RBF model showed some inconsistency when predicting high suspended sediment values, and during the training stage it performed more poorly than the MLP model. Therefore, based on the performance of the models in this section, MLP could be the better option for capturing nonlinear patterns of river suspended sediments, particularly at high suspended sediment values.
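The comparison above relies on both the coefficient of determination and the line of perfect agreement, and the two capture different things. The sketch below assumes that R² is the squared Pearson correlation between observed and predicted values, which the study does not state explicitly, and uses made-up numbers to show that a systematically biased prediction can still give R² = 1 while lying far from the 1:1 line.

```python
def r_squared(obs, pred):
    """Squared Pearson correlation between obs and pred (assumed definition)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return (cov * cov) / (var_obs * var_pred)


def mean_deviation_from_1to1(obs, pred):
    """Mean absolute vertical distance from the line of perfect agreement."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


# Made-up values: predictions are exactly half of the observations.
observed = [100.0, 200.0, 300.0, 400.0]
biased = [0.5 * o for o in observed]

r2_biased = r_squared(observed, biased)
dev_biased = mean_deviation_from_1to1(observed, biased)
print(r2_biased, dev_biased)
```

This is why inspecting the scatter around the line of perfect agreement, especially at high sediment values, adds information that a small difference in R² alone cannot provide.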

CONCLUSION
ANN was successfully applied to the prediction of time series data of suspended sediment at the Pari River, Perak, Malaysia. The results showed the robust prediction ability of ANN for the selected time series of water resources variables. The study suggests that the appropriate application of ANN may help solve several problems of water resources engineering rather than only one. In this regard, the developed models predicted suspended sediments in the Pari River using only water discharge and sediment data. The examination of different basis functions showed that all of the basis functions can predict suspended sediments in rivers, but with different accuracy. Therefore, it is advantageous to carry out a comparative analysis between different basis functions to obtain the most appropriate function for the time series data. Furthermore, the comparison of the best basis function model with the best training algorithm model showed that the predicted values were close to each other, but that MLP neural networks are more user friendly to apply than RBF networks. Overall, the comparisons between the models established in this study suggest that appropriate selection of the basis function or training algorithm is of prime importance in establishing an efficient ANN model.
to the Linear (RMSE = 351, MAE = 183 and CE = 0.8402) basis function. The performance of the Cubic function during the testing stage (RMSE = 198, MAE = 143 and CE = 0.8882) is comparable to that of the Linear function (RMSE = 194, MAE = 131 and CE = 0.8929). However, the Multiquadric, Inverse Multiquadric and Gaussian functions performed robustly during both the training and testing stages. These three functions outperformed the other functions, particularly in the testing stage. The prediction error and efficiency of the Multiquadric during the training stage (RMSE = 124, MAE = 66 and CE = 0.9800) are better than those of the Inverse Multiquadric (RMSE = 314, MAE = 83 and CE = 0.8726) and the Gaussian (RMSE = 265, MAE = 57 and CE = 0.9091).

Fig. 1: Time series of observed and predicted suspended sediments discharge (LM and Gaussian)

error and low efficiency (RMSE = 265, MAE = 57 and CE = 0.9091) as compared to the MLP model (RMSE = 47, MAE = 29 and CE = 0.9971) during the training stage. However, both models performed equally well during the testing stage. The prediction error and coefficient of efficiency of the RBF (RMSE = 61, MAE = 41 and CE = 0.9898) and MLP (RMSE = 62, MAE = 40 and CE = 0.9895) models during the testing stage are similar. The inconsistency in the RBF model between the training and testing stages could be attributed to the maximum and minimum range of the dataset. As observed from Fig. 2, the RBF (Gaussian) model was not able to predict high suspended sediment values (above 2500 tons/day) efficiently. Because the training data contain many more values above 2500 tons/day than the testing dataset, and the RBF model did not learn these high suspended sediment values well during the training stage, the result was lower efficiency in training and relatively poor prediction of high values during testing.

Table 1: Statistical performance evaluation of different radial basis functions

concluded that the most appropriate MLP training algorithm for prediction of suspended sediments was the Levenberg-Marquardt (LM) training algorithm. Therefore, this study presents the comparison of the best RBF model, based on the Gaussian basis function (identified in the previous section), and the best MLP model, based on the LM training algorithm, for estimation of suspended sediments in the Pari River.