A Preliminary Study on a Hybrid Wavelet Neural Network Model for Forecasting Monthly Rainfall

In this paper, a hybrid wavelet neural network (HWNN) model is developed for forecasting rainfall from antecedent monthly rainfall data. The ant colony optimization (ACO) algorithm is combined with the particle swarm optimization (PSO) algorithm to improve the performance of the artificial neural network (ANN) model. ACO is adopted to initialize the connection weights and thresholds of the WNN, and PSO is used to update the parameters of ACO. In this way, HWNN can avoid falling into a local optimal solution, improve its convergence rate and obtain more accurate results. The simulations are based on monthly rainfall data from the city of Ningde in southeastern China. The forecasts are compared with observed rainfall values and evaluated with common statistics: relative absolute error, root mean square error and average absolute percentage error. The results show that the HWNN model improves the accuracy of monthly rainfall forecasting over Ningde and performs appreciably better than the reference models.


INTRODUCTION
Prediction is a challenging research topic because the world is full of complex nonlinear problems. These problems arise in concrete daily applications across many fields. For instance, prediction can be used to anticipate stock market prices, climatic conditions, soil moisture levels, and robot positioning and tracking (Chouikhi et al., 2017), and time series forecasting is especially prominent. Over the past few years, artificial intelligence techniques have frequently been used to predict nonlinear time series and have achieved good results (Kisi, 2008; Nourani et al., 2011). Recently, wavelet neural network models, which use multi-scale signals as input data, have been reported to predict better than models with a single-pattern input (Alizadeh et al., 2015; Nourani et al., 2009; Okkan, 2012; Rajaee et al., 2010). Generally, soft computing techniques such as artificial neural networks (ANN), the Adaptive Neuro-Fuzzy Inference System (ANFIS) and wavelet neural networks (WNN) have the potential to reduce computation time and effort and the possibility of calculation errors. Gazzaz et al. developed an ANN model for predicting the water quality index of the Kinta River, using more than 20 monitored parameters for model development (Gazzaz et al., 2012). Tryland et al. investigated the impact of rainfall on the hygienic quality of blue mussels and water in urban areas of the Inner Oslofjord, Norway (Tryland et al., 2014).
Traditional methods based on linear relationships are not good enough for solving these types of problems (Alizadeh and Kavianpour, 2015). The most common optimization techniques are the ant colony optimization algorithm (ACO) (Li et al., 2012), the grid algorithm (Shi and Zhou, 2012) and the genetic algorithm (Shi and Zhou, 2009). However, the ant colony algorithm suffers from initial pheromone scarcity, long search times and local best solutions; the grid algorithm is computationally intensive and time consuming and has low learning accuracy; and the genetic algorithm is complex to operate, and different problems require differently designed crossover or mutation operators. Particle swarm optimization (PSO) has been found to have extensive global optimization capability thanks to its simple concept, easy implementation and fast convergence (Shi and Eberhart, 1999; Boeringer and Werner, 2004; Subasi, 2012). Therefore, instead of using the above algorithms individually, this study proposes a new method, a hybrid wavelet neural network (HWNN), in which PSO and ACO are used to optimize the HWNN parameters.
The purpose of this paper is to investigate the performance of a model that combines a wavelet neural network with optimization algorithms for four-month-ahead rainfall forecasting, and to compare it with the WNN model and with the BP neural network based on PSO (PSOBP). The presented study is the first application in the literature of PSO, ACO and wavelet neural networks to precipitation forecasting. The paper is organized as follows. The second part analyzes the characteristics of the data. The third part introduces the methodologies, including the wavelet neural network, the ant colony optimization algorithm and the particle swarm optimization algorithm. The hybrid wavelet neural network model is described in the fourth part. The fifth part describes the applications and discusses the results. Conclusions are presented in the last part of the paper.

Study Area and Data Collection
Ningde city has a humid subtropical monsoon climate: winters are not very cold, summers are not very hot, and rainfall is abundant. Summer is the longest season and autumn the shortest; the climate is variable, and meteorological disasters occur frequently. Because there are four high-altitude mountain counties, meteorological elements differ geographically. The city's average annual temperature is 17.5 °C, the growth period is 327.9 days, the frost-free period is 270.4 days, the sunshine duration is 1637.7 hours, and the precipitation is 1811.1 mm, concentrated in two periods: June and July are the months of the rainy season (before the flood season), and July to September is the typhoon season (after the flood season). There are about 3.5 typhoons a year, the average number of rainstorm days is 5.7, and the average probability of heavy rain occurring is 80.3%; in particular, heavy rain may be caused by a strong typhoon.
Long-term daily rainfall data were taken from the high-quality daily rainfall dataset of the Ningde Weather Bureau. There are 37 stations in Ningde. The resulting stations, with data for 1985-2015, are distributed relatively uniformly across Ningde. Each monthly rainfall time series is transformed into standardized monthly rainfall anomaly values for rainfall forecasting.
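The standardization step can be illustrated with a minimal Python sketch. The paper does not state the exact formula, so the sketch assumes the common definition in which each value is centred and scaled by the mean and standard deviation of its own calendar month. The function name `standardized_monthly_anomaly` and the synthetic series are hypothetical and are not the Ningde data.

```python
import numpy as np

def standardized_monthly_anomaly(rain):
    """Standardize a monthly series by calendar month (assumed definition).

    rain: 1-D array whose length is a multiple of 12, starting in January.
    Returns (value - monthly mean) / monthly standard deviation.
    """
    rain = np.asarray(rain, dtype=float)
    by_year = rain.reshape(-1, 12)        # rows: years, columns: calendar months
    mean = by_year.mean(axis=0)           # long-term mean of each calendar month
    std = by_year.std(axis=0, ddof=1)     # sample standard deviation of each month
    return ((by_year - mean) / std).ravel()

# Synthetic example only: 31 years of monthly totals.
rng = np.random.default_rng(0)
rain = rng.gamma(shape=2.0, scale=80.0, size=31 * 12)
anom = standardized_monthly_anomaly(rain)
```

After this transformation every calendar month has zero mean and unit variance, so the strong annual cycle no longer dominates the series.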
In this study, statistical trend analysis was conducted using the R software package (R-3.4.1). Before a complex model is established, the time series data need to be described and visualized: the data are smoothed to explore the overall trend and decomposed to check whether the series has a seasonal component. The time series is smoothed by a simple moving average with different window lengths k, where k is 4, 6 and 12, which gives different levels of smoothness. The results are shown in Figure 1.
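The smoothing step can be sketched in a few lines of Python (the study itself used R). This is only an illustration under the assumption of a plain, equally weighted window; the helper name `simple_moving_average` and the toy series are hypothetical.

```python
import numpy as np

def simple_moving_average(x, k):
    """Equally weighted moving average with window k.

    Only full windows are used, so the output has len(x) - k + 1 values.
    """
    x = np.asarray(x, dtype=float)
    return np.convolve(x, np.ones(k) / k, mode="valid")

# Toy series 1, 2, ..., 24 smoothed with the three window lengths used in the text.
series = np.arange(1.0, 25.0)
smoothed = {k: simple_moving_average(series, k) for k in (4, 6, 12)}
```

A larger k averages over more months, which is why the curve becomes smoother as k grows.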

Contribution of this paper to the literature
• The extensive development of practical models will also promote the development of school teaching activities. In particular, teaching has now entered a modern stage and should focus on diversification to adapt to increasingly complex teaching activities.
• Many young people take up the model hobby to inspire wisdom, broaden their horizons and cultivate hands-on ability. Because models involve a wide range of knowledge, they also develop meticulous and patient habits, improve their aesthetic sense and cultivate their temperament.
Figure 1 shows that the curve becomes smoother as k increases; the time series has no significant long-term rise or fall, and its mean value is close to 160. Rainfall from 1995 to 2015 shows no significant change in trend, and the other changes are not easy to interpret. For seasonal data, which have more than one observation per year, seasonal fluctuations and overall trends need to be investigated through seasonal decomposition. The time series is decomposed in R-3.4.1 into a seasonal component, a trend component and a random component, as shown in Figure 2.
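The decomposition can be illustrated with a short Python sketch of classical additive decomposition. The paper does not say which decomposition variant was used in R, so the additive form, the centred moving-average trend and the function name `decompose_additive` are assumptions made for illustration.

```python
import numpy as np

def decompose_additive(x, period=12):
    """Classical additive decomposition: x = trend + seasonal + random.

    The trend is a centred moving average over one full (even) period;
    the seasonal part is the mean detrended value at each cycle position.
    """
    x = np.asarray(x, dtype=float)
    n, half = len(x), period // 2
    # Centred filter for an even period: half weight on the two end points.
    kernel = np.r_[0.5, np.ones(period - 1), 0.5] / period
    trend = np.full(n, np.nan)                 # edges stay undefined (NaN)
    trend[half:n - half] = np.convolve(x, kernel, mode="valid")
    detrended = x - trend
    seasonal_means = np.array(
        [np.nanmean(detrended[i::period]) for i in range(period)]
    )
    seasonal_means -= seasonal_means.mean()    # seasonal effects sum to zero
    seasonal = np.tile(seasonal_means, n // period + 1)[:n]
    random_part = x - trend - seasonal
    return trend, seasonal, random_part

# Synthetic series: constant level 160 plus a pure annual cycle.
t = np.arange(120)
x = 160.0 + 50.0 * np.sin(2 * np.pi * t / 12)
trend, seasonal, random_part = decompose_additive(x)
```

For this synthetic series the trend is flat at 160 and the random component vanishes, which is the pattern one would expect from a series with no long-term rise or fall.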
As Figure 2 shows, rainfall in Ningde has no linear trend over most of the period and changes randomly; the seasonal component shows a slight periodic oscillation. Monthly and quarterly plots are used to analyze these changes further (Figure 3).
Figure 3 shows that precipitation is highest in June and August and lowest in January and December. The quarterly plot leads to the same conclusion; its distribution pattern differs somewhat from the monthly one, but the overall trend is similar. The sequence therefore has nonlinear characteristics, and a nonlinear method is adopted for prediction.

Theory of Wavelet
Wavelet analysis was developed to address a shortcoming of the Fourier transform. The Fourier transform is the most widely used analysis tool in signal processing, but it has a serious drawback: its result cannot determine the time at which a signal component occurs, because time information is discarded during the transform. A wavelet is a waveform of limited length whose average value is 0. Its features include:
1. It has compact or approximately compact support in the time domain.
2. Its DC component is zero.
The wavelet transform shifts a basic wavelet function $\psi(t)$ by a distance $b$ and then takes its inner product with the signal $x(t)$ to be analyzed at different scales $a$.
The equivalent time-domain formula is:

$$W_x(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} x(t)\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt, \qquad a > 0$$

where $a$ is the scale factor, $b$ is the translation and $\psi^{*}$ is the complex conjugate of $\psi$.
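As a rough numerical illustration of the transform described above, the sketch below approximates one coefficient of the standard continuous wavelet transform, W(a, b) = a^(-1/2) ∫ x(t) ψ((t - b)/a) dt, by a Riemann sum. The choice of a real Morlet-type wavelet with constant 1.75, the sampling grid and the function names are assumptions made for this example only.

```python
import numpy as np

def morlet(t, c=1.75):
    """Real Morlet-type wavelet cos(c t) exp(-t^2 / 2) (assumed form)."""
    return np.cos(c * t) * np.exp(-t**2 / 2)

def cwt_coefficient(x, t, a, b):
    """Approximate W(a, b) = a^(-1/2) * integral of x(t) psi((t - b) / a) dt.

    x and t are equally spaced samples; the integral is a Riemann sum.
    The wavelet is real, so no complex conjugate is needed.
    """
    dt = t[1] - t[0]
    return np.sum(x * morlet((t - b) / a)) * dt / np.sqrt(a)

t = np.linspace(-20.0, 20.0, 4001)
x = np.cos(1.75 * t)                       # signal matching the wavelet at scale 1
w_scale1 = cwt_coefficient(x, t, a=1.0, b=0.0)
w_scale4 = cwt_coefficient(x, t, a=4.0, b=0.0)
```

The coefficient is large at the scale where the wavelet oscillates at the same rate as the signal and close to zero at a mismatched scale, which is how the transform localizes a component in both scale and time.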

Wavelet Neural Network
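WNN can be regarded as a functional link network based on the wavelet function. It is built on the BP neural network structure, with the wavelet basis function as the transfer function of the hidden layer nodes. The signal is transmitted by forward propagation, while the error is transmitted by back propagation. In Figure 4, $x_1, x_2, \ldots, x_k$ represent the inputs of the wavelet neural network, and $y_1, y_2, \ldots, y_m$ denote the predicted outputs. In this paper, the Morlet wavelet basis function is employed as the transfer function of the hidden layer nodes, namely:

$$\psi(x) = \cos(c\,x)\, e^{-x^{2}/2}$$

where $c$ is a constant; in this paper $c = 1.75$. The output of the WNN can be expressed as:

$$y(q) = \sum_{j=1}^{l} w_{jq}\, h(j), \qquad h(j) = \psi\!\left(\frac{\sum_{i=1}^{k} w_{ij}\, x_i - b_j}{a_j}\right), \qquad q = 1, 2, \ldots, m$$

where $h(j)$ is the output of hidden node $j$, $l$ is the number of hidden nodes, $w_{ij}$ and $w_{jq}$ are the connection weights from the input layer to the hidden layer and from the hidden layer to the output layer, and $a_j$ and $b_j$ are the scale and translation factors of the wavelet basis function.
The weight correction algorithm of the wavelet neural network is similar to that of the BP neural network: a gradient method corrects the network weights and the wavelet basis function parameters so that the predicted output approaches the expected output. The correction process is as follows:
1. Compute the network prediction error $e = \sum_{q=1}^{m} \left[ yn(q) - y(q) \right]$, where $yn(q)$ is the expected output and $y(q)$ is the predicted output.
2. Correct the network weights and the wavelet basis function coefficients based on the prediction error $e$.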
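The forward pass described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: it assumes the usual structure in which each hidden node applies the Morlet function to a weighted, shifted and scaled input sum, and all array names and sizes (12 inputs, 35 hidden nodes, 1 output) are chosen for the example.

```python
import numpy as np

def morlet(x, c=1.75):
    """Morlet wavelet basis function with constant c = 1.75."""
    return np.cos(c * x) * np.exp(-x**2 / 2)

def wnn_forward(x, w_in, a, b, w_out):
    """Forward pass of a single-hidden-layer wavelet neural network.

    x: (k,) inputs; w_in: (l, k) input-to-hidden weights;
    a, b: (l,) scale and translation factors; w_out: (m, l) output weights.
    """
    h = morlet((w_in @ x - b) / a)   # hidden-layer wavelet activations
    return w_out @ h                 # linear output layer

rng = np.random.default_rng(1)
k, l, m = 12, 35, 1                  # illustrative layer sizes
w_in = rng.normal(size=(l, k))
w_out = rng.normal(size=(m, l))
a = np.ones(l)
b = np.zeros(l)
x = rng.normal(size=k)
target = np.array([0.3])             # hypothetical expected output

y = wnn_forward(x, w_in, a, b, w_out)
error = np.sum(target - y)           # prediction error e used for the correction
```

In training, the gradient of this error with respect to `w_in`, `w_out`, `a` and `b` would drive the parameter correction in step 2.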

Ant Colony Optimization Algorithm
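Suppose there are $n$ cities and $m$ ants. Let $d_{ij}$ $(i, j = 1, 2, \ldots, n)$ denote the distance between cities $i$ and $j$, and let $\tau_{ij}(t)$ denote the residual pheromone intensity on the path between cities $i$ and $j$ at time $t$. During its tour, ant $k$ chooses its next path according to the residual pheromone intensity on each path. $p_{ij}^{k}(t)$ denotes the probability that ant $k$ moves from city $i$ to city $j$, namely:

$$p_{ij}^{k}(t) = \begin{cases} \dfrac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}]^{\beta}}{\sum_{s \in \mathrm{allowed}_k} [\tau_{is}(t)]^{\alpha}\,[\eta_{is}]^{\beta}}, & j \in \mathrm{allowed}_k \\ 0, & \text{otherwise} \end{cases}$$

where $\mathrm{allowed}_k = \{1, 2, \ldots, n\} - \mathrm{tabu}_k$ is the set of cities that ant $k$ is allowed to visit next, and $\mathrm{tabu}_k$ is the tabu table of ant $k$. Ant $k$ has completed a traversal when it has visited all the cities and returned to its starting point, that is, when all the cities have been put in the $\mathrm{tabu}_k$ table. The path travelled by the ant is then a feasible solution of the travelling salesman problem (TSP). $\eta_{ij}$ is the heuristic factor, the desirability of moving from city $i$ to city $j$, usually taken as the inverse of $d_{ij}$; $\alpha$ and $\beta$ denote the relative importance of the pheromone and of the heuristic factor, respectively. When all the ants have completed a traversal, the pheromone on each path is updated globally according to formula (12):

$$\tau_{ij}(t + n) = (1 - \rho)\,\tau_{ij}(t) + \Delta\tau_{ij}, \qquad \Delta\tau_{ij} = \sum_{k=1}^{m} \Delta\tau_{ij}^{k} \tag{12}$$

where $\rho$ is the pheromone evaporation rate and $\Delta\tau_{ij}^{k}$ is the pheromone deposited by ant $k$ on the path between cities $i$ and $j$ in the current cycle.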
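The two rules of the ant colony algorithm, the state transition probability and the global pheromone update, can be sketched in Python as below. The sketch assumes the standard Ant System form with evaporation factor (1 - rho) and an Ant-Cycle deposit of Q divided by the tour length; the function names, the parameter values and the four-city distance matrix are hypothetical.

```python
import numpy as np

def transition_probabilities(i, tau, dist, visited, alpha=1.0, beta=2.0):
    """Probabilities of moving from city i to each city not yet visited."""
    n = len(dist)
    allowed = np.array([j for j in range(n) if j not in visited], dtype=int)
    eta = 1.0 / dist[i, allowed]                  # heuristic factor 1 / d_ij
    weights = tau[i, allowed] ** alpha * eta ** beta
    return allowed, weights / weights.sum()

def update_pheromone(tau, tours, lengths, rho=0.5, Q=1.0):
    """Global update: evaporate, then add Q / L_k on every edge of tour k."""
    new_tau = (1.0 - rho) * tau
    for tour, length in zip(tours, lengths):
        for i, j in zip(tour, tour[1:] + tour[:1]):   # closed tour edges
            new_tau[i, j] += Q / length
            new_tau[j, i] += Q / length               # symmetric problem
    return new_tau

dist = np.array([[0.0, 1.0, 2.0, 4.0],
                 [1.0, 0.0, 1.0, 2.0],
                 [2.0, 1.0, 0.0, 1.0],
                 [4.0, 2.0, 1.0, 0.0]])
tau = np.ones((4, 4))

allowed, p = transition_probabilities(0, tau, dist, visited={0})
tour = [0, 1, 2, 3]                                   # tour length 1 + 1 + 1 + 4 = 7
tau_next = update_pheromone(tau, [tour], [7.0])
```

With equal pheromone everywhere, the nearest city receives the highest probability; after the update, only the edges on the completed tour gain pheromone.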

Particle Swarm Optimization
PSO, developed by Kennedy and Eberhart (Kennedy and Eberhart, 1995), is considered an excellent technique for solving combinatorial optimization problems and is very simple to implement (Shen et al., 2007; Wu et al., 2009; Bigiarini et al., 2011). It progresses towards the solution through the collective sharing of knowledge among the particles. In PSO, a population of particles with velocities $v_i^{t}$ is initially generated at random. Each particle's velocity is updated with respect to its old position $x_i^{t}$ using the local best $p_i^{t}$ and the global best particle $p_g^{t}$ (see Eqs. (13) and (14)) until the convergence criterion is satisfied. The global best particle is then the optimal solution.

$$v_i^{t+1} = w\, v_i^{t} + c_1 r_1 \left(p_i^{t} - x_i^{t}\right) + c_2 r_2 \left(p_g^{t} - x_i^{t}\right) \tag{13}$$

$$x_i^{t+1} = x_i^{t} + v_i^{t+1} \tag{14}$$

where $v_i^{t}$ is the old velocity, $v_i^{t+1}$ is the updated velocity, $x_i^{t}$ is the old particle, $x_i^{t+1}$ is the updated particle, $p_i^{t}$ is the local best particle, $p_g^{t}$ is the global best particle, $c_1$ and $c_2$ are two positive constants, $w$ is the inertia weight, and $r_1$ and $r_2$ are random numbers between 0 and 1.
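The velocity and position updates can be illustrated with a small, self-contained Python sketch. It is not the authors' implementation: the local best is taken to be each particle's own best position, and the test function, swarm size, search range and coefficient values (w = 0.7, c1 = c2 = 1.5) are assumptions chosen for the example.

```python
import numpy as np

def pso_minimize(f, dim, n_particles=20, iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f with a basic global-best particle swarm."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, (n_particles, dim))   # initial positions
    v = np.zeros((n_particles, dim))                 # initial velocities
    pbest = x.copy()                                 # best position of each particle
    pbest_val = np.apply_along_axis(f, 1, x)
    g = pbest[pbest_val.argmin()].copy()             # global best position
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)  # velocity update, Eq. (13)
        x = x + v                                              # position update, Eq. (14)
        val = np.apply_along_axis(f, 1, x)
        improved = val < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = val[improved]
        g = pbest[pbest_val.argmin()].copy()
    return g, pbest_val.min()

# Toy objective: the sphere function, whose minimum is 0 at the origin.
best_x, best_val = pso_minimize(lambda z: np.sum(z**2), dim=2)
```

In the hybrid model the same update would act on the ACO parameters rather than on a toy function, with the forecasting error as the objective.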

The Hybrid Wavelet Neural Networks Model
WNN combines the nonlinear mapping of neural networks with the time-frequency analysis of wavelets, and in application it outperforms a plain artificial neural network or wavelet analysis alone. However, it shares the difficulties of traditional neural networks in determining the number of nodes, the initial weights, the scaling factors, the translation factors and other parameters, which affect the convergence speed of the network, among other problems. The PSO algorithm is prone to premature convergence, especially in complex multi-peak search problems, and its local optimization is poor. In this paper, the ant colony algorithm is used to optimize the wavelet neural network. Because the ant colony algorithm performs global optimization, the optimal path it finds is taken as the parameters of the neural network. However, the performance of the ant colony algorithm can be greatly reduced because it is a typical probabilistic algorithm whose parameters α and β are usually determined by experiment. Therefore, in this paper, PSO is used to train the ant colony system parameters α and β to improve time efficiency and prediction accuracy.

The main idea of the algorithm is:
The parameters of the WNN are determined by the search of the artificial ants. The path taken by the ants in the next cycle depends on the adjustment of the pheromone; the ant colony algorithm optimizes the wavelet neural network parameters, and training continues until the end condition is satisfied. The pheromone adjustment of the ant colony algorithm follows the basic Ant-Cycle model: if ant $k$ chooses element $j$ of the set $C_i$ in the current cycle, the pheromone increment $\Delta\tau_j^{k}(C_i)$ equals $Q / e_k$; otherwise it equals zero. Here $e_k$ is the output error of the WNN when the set of parameters selected by ant $k$ is used as the weights and the dilation and translation coefficients of the WNN, that is, $e_k = O - O_q$, where $O$ is the actual output and $O_q$ is the desired output of the wavelet neural network.

Steps of HWNN:
The operational flowchart for HWNN is shown in Figure 5.
(1) Initialize the network. First determine, check and normalize the input samples; then set the maximum number of training cycles $N_{\max}$ and the total number of artificial ants $M$, and initialize the pheromone intensity to $c$, where $c$ is a constant.
(2) Set the parameters of ACO with PSO: initialize the information heuristic factor $\alpha$, the expected heuristic factor $\beta$, the pheromone evaporation rate $\rho$ and the pheromone intensity $Q$.
(3) Determine the network structure. The numbers of neuron nodes in the input and output layers of the WNN are determined according to the actual situation, and the number of hidden neurons is set after several adjustments. The initial weights $w_{ij}$, $w_{jk}$ and the thresholds of the hidden and output layers are generated randomly. The parameters $w_{ij}$, $w_{jk}$, $a_j$, $b_j$ form the path to be traversed by the ant colony, that is, $C = [w_{ij}, w_{jk}, a_j, b_j]$. Set the cycle counter $N = 1$.
(4) Start the artificial ants. Each ant selects its next path according to the probability given by the state transition probability formula.
(5) Update the tabu table $\mathrm{tabu}_k$ and move artificial ant $k$ to the element $C_{i+1}$.
(6) If all the elements in $C$ have been visited, go to step (7); otherwise return to step (4).
(7) Using the weights and wavelet parameters selected by each artificial ant, calculate the output value with the wavelet neural network formula, and calculate the corresponding mean square error.
(8) Update the pheromone on each path according to the pheromone adjustment formula.
(9) Set $N = N + 1$. If all the artificial ants converge to one optimal path or the number of cycles $N > N_{\max}$, stop the cycle and output the results; otherwise reset the tabu table $\mathrm{tabu}_k$ and return to step (4).
(10) After this training, the WNN optimized by the ant colony algorithm reaches a stable state, and the steady-state parameters are used for precipitation prediction in the later stage.
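The core loop of the steps above can be sketched in simplified form. The sketch is only an illustration under strong assumptions: each parameter has a finite set of candidate values, ants choose values in proportion to pheromone alone (the heuristic factor and the PSO tuning of α and β are omitted), the deposit follows the Ant-Cycle rule Q / e_k, and a toy squared-error function stands in for the WNN output error. All names are hypothetical.

```python
import numpy as np

def aco_select_parameters(error_fn, candidates, n_ants=20, n_cycles=50,
                          rho=0.3, Q=1.0, seed=0):
    """Pick one value per parameter from its candidate set with an ant colony.

    candidates: (n_params, n_values) array; row i is the candidate set C_i.
    error_fn maps a parameter vector to a positive error e_k.
    """
    rng = np.random.default_rng(seed)
    n_params, n_values = candidates.shape
    rows = np.arange(n_params)
    tau = np.ones((n_params, n_values))           # constant initial pheromone
    best_params, best_err = None, np.inf
    for _ in range(n_cycles):
        prob = tau / tau.sum(axis=1, keepdims=True)
        delta = np.zeros_like(tau)
        for _ in range(n_ants):
            # Each ant chooses one element of every set C_i.
            choice = np.array([rng.choice(n_values, p=prob[i]) for i in rows])
            params = candidates[rows, choice]
            err = error_fn(params)
            delta[rows, choice] += Q / (err + 1e-12)   # Ant-Cycle deposit Q / e_k
            if err < best_err:
                best_params, best_err = params.copy(), err
        tau = (1.0 - rho) * tau + delta           # evaporation plus new deposits
    return best_params, best_err

# Toy stand-in for the WNN error: squared distance to hypothetical "true" values.
target = np.array([0.53, -0.18, 0.04])
candidates = np.tile(np.linspace(-1.0, 1.0, 21), (3, 1))
best_params, best_err = aco_select_parameters(
    lambda p: float(np.mean((p - target) ** 2)), candidates
)
```

In the full method, `error_fn` would run the WNN forward pass on the training samples with the selected weights and wavelet coefficients and return the mean square error.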

Validation of model performance:
In this experiment, three commonly adopted error indexes are used to evaluate the different training algorithms: the MAE, the MAPE and the RMSE. These indexes are defined as follows.
Mean absolute error (MAE) is the average of the absolute errors over a given prediction horizon:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

Mean absolute percentage error (MAPE) expresses accuracy as a percentage and is defined by the formula:

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

Root mean square error (RMSE) is an estimate of the standard deviation of the random component in the data; the best model has the minimum RMSE:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}}$$

where $n$ is the number of data samples, $y_i$ is the raw rainfall data and $\hat{y}_i$ is the forecasted rainfall data.
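For reference, the three error indexes can be computed as in the following Python sketch, which assumes their standard definitions (mean of absolute errors, mean of absolute relative errors in percent, and square root of the mean squared error). The function names and the three-value example are illustrative only.

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error."""
    return np.mean(np.abs(y - y_hat))

def mape(y, y_hat):
    """Mean absolute percentage error, in percent (undefined where y is 0)."""
    return np.mean(np.abs((y - y_hat) / y)) * 100.0

def rmse(y, y_hat):
    """Root mean square error."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

y = np.array([100.0, 200.0, 400.0])       # hypothetical observed rainfall
y_hat = np.array([110.0, 180.0, 400.0])   # hypothetical forecasts
scores = {"MAE": mae(y, y_hat), "MAPE": mape(y, y_hat), "RMSE": rmse(y, y_hat)}
```

Note that MAPE divides by the observed value, so months with zero recorded rainfall would need separate handling.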

EXPERIMENTS AND RESULTS ANALYSIS
In this study, wavelet analysis was carried out using the MATLAB software package, and the ACO-based ANN model was implemented in MATLAB code (MATLAB 2016a). The data set was loaded and divided into two parts: training data (22 years) and testing data (9 years). PSO is used to update the parameters of ACO. In ACO, the population size is set to 50 and the target error to 0.001. ACO is used to compute the initial connection weights and thresholds, which yields the optimal initial connection weights and thresholds.
Although the neural network has been used for one-step prediction, its multi-step-ahead forecasting performance has not been investigated. Based on this consideration, the HWNN is selected as the rainfall predictor for comparison experiments with different lags. The results are shown in Table 1.
As shown in Table 1, the values for 12 steps ahead are the smallest for the corresponding indexes, so a lag of 12 is adopted.
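How a lag of 12 and the 22-year and 9-year split might translate into network inputs is sketched below. The paper does not spell out the sample construction, so the sketch assumes that the 12 preceding months form the input vector for each target month and that the split is chronological; the helper names and the placeholder series are hypothetical.

```python
import numpy as np

def make_lagged_samples(series, lags=12):
    """Build (inputs, target) pairs: the previous `lags` values predict the next one."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + lags] for i in range(len(series) - lags)])
    y = series[lags:]
    return X, y

def chronological_split(X, y, n_train):
    """Keep the first n_train samples for training and the rest for testing."""
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])

# Placeholder series of 31 years of monthly values (month index used as the value).
series = np.arange(31 * 12, dtype=float)
X, y = make_lagged_samples(series, lags=12)

# Targets in the first 22 years go to training; the last 9 years are held out.
n_train = 22 * 12 - 12
(train_X, train_y), (test_X, test_y) = chronological_split(X, y, n_train)
```

A chronological split keeps all test months strictly after the training months, which avoids leaking future information into the model.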
To identify the optimal number of hidden neurons, a trial and error procedure was initiated with a given number of hidden neurons, and the number of hidden neurons was then increased to 50 with a step size of 1 in each trial. For each number of hidden neurons, the network was trained in batch mode to minimize the mean square error of the output layer. To detect overfitting during training, a cross-validation step was performed by evaluating the efficiency of the fitted model. Training was stopped when the efficiency no longer improved significantly, and the model was then evaluated for its generalization properties. The number of neurons in the hidden layer was varied from 15 to 50, giving 8 sets of results, which are compared in Table 2. According to Table 2, network performance is best with 35 neurons in the hidden layer. Therefore, a network with 35 neurons in one hidden layer is employed in this study. PSO is iterated with a maximum of 1000 iterations. The connection weights of the WNN simulation model are then obtained, and the HWNN simulation model is established.
In the experiments, three models are compared. Table 3 shows the MAE and RMSE obtained with the three models after repeated training with PSOBP, WNN and HWNN, respectively. The MAE and RMSE obtained with HWNN are smaller than those obtained with PSOBP and WNN. The MAE obtained with HWNN is lower by 2.38 and 11.18 than that obtained with PSOBP and WNN, respectively, which effectively overcomes the shortcomings of the single-parameter optimization of WNN. These results demonstrate the effectiveness of the algorithm for the parameter optimization of WNN and the impact of the ACO initial parameter optimization on forecast accuracy. In addition, the RMSE of HWNN is smaller than 10 when the initial wavelet parameters are optimized with ACO. The RMSE values of PSOBP and HWNN are both less than 10, indicating that optimizing the initial parameters with ACO and PSO can accelerate network convergence and avoid local optima, so the curve-fitting ability is improved accordingly. Figure 6 indicates that the proposed model achieves satisfactory results, which are closer to the actual rainfall than those obtained with the other two models.

CONCLUSIONS AND DISCUSSIONS
A hybrid wavelet neural network model with mutual information and particle swarm optimization has been presented, tested and discussed for forecasting monthly rainfall in this study. The HWNN model is developed by incorporating into artificial neural network models the wavelet multiresolution analysis of both the predictand and the predictor variables, the partial mutual information algorithm for input identification, and particle swarm optimization for determining the number of hidden neurons. The efficiency of the method lies in the use of the discrete wavelet transform and mutual information to capture the impacts of the predictor variables on the rainfall anomaly at different time scales. The proposed method is applied to Ningde monthly rainfall forecasting to evaluate its performance. The HWNN model was calibrated with the 29-year data from 1985 to 2007, tested with the remaining 8-year data from 2007 to 2014, and compared with the reference BP and WNN models based on monthly rainfall. The results show that the forecast skill of HWNN appears to be significantly better than that of the two reference models, while the two reference models give very similar results in the study area. The HWNN model reduced the relative absolute errors and increased the prediction efficiency in comparison with the WNN model. The improvement is not significant for extreme wet and dry anomaly months. This may be because the HWNN cannot capture the impacts of the predictor variables on the rainfall anomaly at different time scales.
Factors influencing the prediction error of the neural network:
(1) Networks with different topological structures (numbers of input layer nodes and hidden layer nodes) each attain their own optimal prediction ability. There is an optimal network whose topological structure and convergence position are both optimal.
(2) A network with a given topological structure has its best predictive capability when it converges to a particular position. The target value of the allowable training error (MSE) is directly related to the convergence position of the network. Therefore, the search for the optimal network should determine not only the network topology but also the target value of the training error MSE.
(3) When the network converges to the global minimum, its ability to fit the training points is stronger, but its prediction ability is worse.
(4) The parameters that influence the predictive capability of the BP (back propagation) neural network are the number of input layer nodes N, the number of hidden layer nodes M and the MSE target value. To obtain the optimal predictive performance of the BP neural network, the optimal N, M and MSE target value are needed. This is clearly an optimization problem; if the search for the optimal values is done exhaustively, the amount of computation becomes too great to be feasible. Because of its implicit parallelism and powerful global search ability, the genetic algorithm can find the global best in a short time. Therefore, the structure of the BP neural network prediction model is optimized by using PSO and ACO.

Figure 1. A simple moving average sequence at different smoothing levels

Figure 2. Trend, seasonal and random components of the time series

Figure 3. Monthly and quarterly rainfall

Figure 4. Topological structure of the wavelet neural network

Table 1. Results of the experimental simulation based on different lags

Table 2. Performance comparison among 15-50 neurons in the hidden layer

Table 3. Performance comparison among 3 different algorithms in the case