Algorithm for Tree Growth Modeling Based on Random Parameters and ARMA

Abstract: Chapman-Richards function is used to model growth data of dahurian larch (Larix gmelinii Rupr.) from longitudinal measurements using nonlinear mixed-effects modeling approach. The parameter variation in the model was divided into random effects, fixed effects and variance-covariance structure. The values for fixed effects parameters and the variance-covariance matrix of random effects were estimated using NLME function in S-plus software. Autocorrelation structure was considered for explaining the dependency among multiple measurements within the individuals. Information criterion statistics (AIC, BIC and Likelihood ratio test) are used for comparing different structures of the random effects components. These methods are illustrated using the nonlinear mixedeffects methods in S-Plus software. Results showed that the Chapman-Richards model with three random parameters could typically depict the dahurian larch tree growth in northeastern China. The mixed-effects model provided better performance and more precise estimations than the fixed-effects model.


INTRODUCTION
Analysis of repeated measurement data is a recurrent challenge to statisticians engaged in biological and biomedical applications.A common type of repeated measurement data is longitudinal data.Longitudinal data can be defined as repeated measurement data where the observations within individuals could not have been randomly assigned to the levels of a treatment of interest.Repeated measurements data are data in which multiple individuals have multiple measurements over time or space.Thus the method of analysis for this type of data needs to recognize and estimate two distinct types of variability: between-individual variability and withinindividual variability.Nonlinear mixed effects models provide a tool for analyzing repeated measurements data and give an unbiased and efficient estimation of the fixed parameters of the model.Furthermore, nonlinear mixed models improve predictive ability if we are able to predict the value of the random parameters for an unsampled location.This is possible if complementary observations of the dependent variable are available.There has been a great deal of recent interest in mixed effects models for repeated measures data in forestry (Calama and Montero, 2004;Gregoire et al., 1995;Jiang and Li, 2008a, b, c).
Forest biometricians have been developing and adapting statistical techniques to improve tree growth prediction to meet particular objectives for forest growth and yield system.The aim of this study was to investigate the algorithm for dahurian larch (Larix gmelinii Rupr.) tree grow modeling using nonlinear mixed-effects modeling approach with random parameters and autocorrelation.

MODELING ALGORITHM
Algorithm for nonlinear mixed modeling: The general expression for a nonlinear mixed effects model can be written as Pinheiro and Bates (2000): where, yij = A diameter measurement at time j for the ith tree tij = Age (years) for the ith tree on time j m = The number of trees, ni is the total number of measurements on tree i f = A nonlinear function, βi is a parameter vector, εij is a normally distributed within-plot error term The parameter βi varies from tree to tree to account for between-individual variation.The parameter βi can be expressed as: where β is a p × 1 vector of fixed effects parameters, bi is a q × 1 vector of random effects parameters, p is the number of fixed parameters in the model, q is the number of random parameters in the model, D is the variance-covariance matrix for the random effects.In the basic assumptions, the within-tree errors are independent and the εi follow a N(0, Ri) distribution and are independent of bi, Ai and Bi are design matrices for the fixed and random effects respectively.These design matrices contain zeros and ones associated with the fixed and random effects.
After initial screening of some growth models, such as Chapman-Richards, Logistic, Weibull and Gompertz, the Chapman-Richards model was selected for this study due to its flexibility and biologically interpretable coefficients.The form of the Chapman-Richards model is: where, y is tree diameter (cm), t is age (years), β1, β2, β3 are regression coefficients.

Model development:
Commonly, three steps were involved for mixed-effects model development (Pinheiro and Bates, 2000).First, the parameters of the model were specified either as mixed (both random and fixed) or purely fixed.Then the structure of the amongtree variance-covariance matrix (D) was determined.Finally, the within-tree variance-covariance structure (Ri) was determined to account for heteroscedasticity and residual correlation.
Data from stem analysis of 40 trees were collected from Daxinganling region in Heilongjiang Province, northeastern China.This species is common and commercially important in northeast China.Sample trees were selected from the dominant and codominant crown class, undamaged and unsuppressed.Each sample tree was felled and disks were extracted at breast height (1.3 m).The number of annual rings and ring width were measured for each disk.The 40 sample trees yielded 528 diameter-age pairs.
Parameter effects were analyzed first.For the construction of a mixed model, the first question that should be addressed is determining which effects should be considered mixed and which should be considered purely fixed.An intuitive approach is to fit diameter-age models to each individual tree and assess the variability of estimated parameters by considering the individual confidence intervals (Lindstrom and Bates, 1990).The parameters with high variability and less overlap in confidence intervals across trees should be considered as mixed effects.This approach requires sufficient observations on each individual to give meaningful parameter estimates by the individual fits.In our case there are over 15 observations for each tree and thus it is possible to get the confidence intervals of the parameters for each individual fit.
Variance-covariance structure was further investigated.The among-tree variance-covariance matrix for the random-effects (D), common to all trees, defines the existing variability among tree.First, we assume that the random effects are independent of each other which make a diagonal random effects variancecovariance matrix to reduce the number of parameters in the model, then different random effects variancecovariance structures (e.g.diagonal and general positive-definite matrices) will be explored to determine whether a correlated variance-covariance structure is needed for random effects.
To specify the within-tree variance-covariance structure (Ri), two components of the heterosedasticity and the autocorrelation structure should be addressed (Pinheiro and Bates, 1998).The expression for the within-tree variance-covariance matrix is then having the following form: where, σ 2 is a scaling factor for the error dispersion given by the value of the residual variance of the model and Gi is a ni × ni diagonal matrix that specifies withintree variance, Гi is correlation structure，i is the tree number and ni indicating total number of diameter-age measurements in the tree.

Model evaluation and prediction:
To evaluate model performance, Bias, root mean square error (RMSE) and coefficient of determination (R 2 ) were employed.These evaluation statistics are defined as: where, yi is observed value for the i th observation,   ̂is predicted value for the i th observation,   ̅ is the mean of the yi, n is the total number of observations used to fit the model.The model with the smallest values of Bias, RMSE and higher R 2 was considered the best.Nonlinear mixed-effects models were fitted with the NLME procedure in S-Plus 2000 (Mathsoft, Seattle, WA).
The main purpose of final model involves diameter prediction for unmeasured trees.An important characteristic, compared to traditional regression, is that mixed-effects models allow for both mean response and calibrated prediction.If no sub-sample tree diameters have been measured for a particular year, the parameters of random effects is not predicted.Thus, the expected value 0 is used for all random parameters.The mean diameter prediction is obtained using only the fixed parameters estimates, where the mean behavior of the diameter variation for a certain tree is represented.
A calibrated response requires prior measured subsample diameter information for a tree.It could significantly increase the accuracy of tree diameter prediction.If both diameter and age have been measured for a sub-sample, the vector of random effects bk at the tree level can be estimated using this additional information.Calculation is carried out using an approximate Bayes estimator of bk (Vonesh and Chinchilli, 1997;Fang and Bailey, 2001): where  ̂ is the q × q variance-covariance matrix (q is number of random-effects parameters included in the model) for the among tree variability,   ̂ is the k × k variance-covariance matrix for within-individual variability,   ̂ is the design matrix for the random components specific to the additional observations,   ̂ is the difference between the observed diameter and the predicted diameter using the fixed effects parameters.

RESULTS AND DISCUSSION
Determining parameter effects: Individual fits approach was first used to determine parameter effect either as mixed or purely fixed.Confidence intervals were obtained on the parameters in Chapman-Richards model showed in Eq. (3) based on individual fits using nlsList function in S-Plus.Figure 1 gives the approximate 95% confidence intervals for three parameters of b1, b2 and b3 in Eq. (3) for each tree except that several trees failed to converge.It was noticed that, for parameters b1, b2 and b3, the confidence intervals of the thirty five trees showed more among-tree variability.Therefore, parameters b1, b2 and b3 were considered mixed effects.
In order to get reliable random parameters, The Chapman-Richards model with different combinations of fixed and random parameters was fitted using nonlinear mixed-effects modeling procedure with NLME function in S-Plus.Summary of fit statistics obtained from fitting the Chapman-Richards model are presented in Table 1.
It indicates that there are significant differences between models containing random parameters and model with only fixed parameters.These statistics show that addition of random-effects improves the model fit.Chapman-Richards model with three random parameters is the best with the smallest AIC and BIC values of 900.02 and 942.77, respectively.To avoid over-parameterization and illconditioning problems of variance-covariance matrix, the scatter-plot matrix of the estimated random parameters was produced (Fig. 2).Random parameters b1, b2 and b3 did not show high correlation among random effects.

Variance-covariance structure:
To get the among-tree variance-covariance structure, the Chapman-Richards model with three random parameters was fitted to the data using NLME function in S-Plus.Both diagonal (Diag) and general positive-definite symmetric (Sym) variance-covariance structures were examined for the random effects (D) using Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Likelihood ratio test (LRT) (Table 2).
It indicated that two variance-covariance structures are significant different (LRT = 84.6,p<0.0001).Chapman-Richards model with symmetric variancecovariance structure resulted in lower values of AIC (900.02) and BIC (942.77),indicating that a correlated variance-covariance structure was preferred in the model fitting.Correlation structures: Correlation structures are used to model dependence among observations.In the context of mixed-effects models, they are used to model dependence among the within-group errors.Historically, correlation structures have been developed for two main classes of data: time-series data and spatial data.The former is generally associated with observations indexed by an integer-valued time variable, while the latter refers primarily to observations indexed by a two-dimensional spatial location vector.In this study, the sample data is tree growth data in which serially correlated errors usually arise in the fitting of growth curves to the sample data.Correlation structures can be incorporated into mixedeffects models (Chi and Reinsel, 1989;Lindstrom and Bates, 1990).In this study, the AR (p), MA (q) and ARMA (p, q) correlation structures were tested for inclusion in the mixed-effects model (Table 3).The mixed model with different correlation structures produced significant improvements in model fitting except that several models failed to converge (p<0.0001).ARMA (2, 1) correlation structure was selected as the best one because the mixed model with ARMA (2, 1) correlation structure resulted in smaller AIC and BIC values, the ARMA (2, 1) could explain more dependency among repeated measurements within the individual.The final model is mixed-effects model with autocorrelation structure ARMA (2, 1).Parameters estimates for the final model are presented in Table 4.
Scatter plots of Studentized residuals were constructed for mixed-effects and fixed-effects models (Fig. 3).The Scatter plots showed that the mixed-effects model showed more homogeneous residual variance over the full range of the predicted values and no systematic pattern in the variation of the residuals.It indicated that mixed-effects model significantly improve the model performances compared to fixedeffects model.In this case, the within-tree variancecovariance matrix is the diagonal matrix: where Ini is the identity matrix (ni × ni) and other variables as previously defined.
Based on the above considerations, the resulting mixed diameter-age model was: Model diagnosis: Before making inferences about a fitted mixed-effects model, we should check whether the underlying distributional assumptions are valid for the data.There are two basic distributional assumptions for the mixed-effects Richards model considered in this study: The within-group errors are independent and identically distributed, with mean zero and variance σ 2 and they are independent of the random effects; The random effects are normally distributed, with mean zero and variance σ 2 i (not depending on the group) and are independent for different groups.
Scatter plots of Studentized residuals is showed in Fig. 3.It indicates that the residuals are approximately centered at zero, though we observe several outlying observations and large residuals.
Normal plot of estimated random effects was used to investigate departures for the assumption of normality of random effects (Fig. 4).
Parameter estimates and fit statistics for basic and mixed models are presented in Table 4. Compared to the basic model, the mixed-effects model showed the better performance with the lower bias, RMSE and higher R 2 .The performance of the mixed model was also visualized by displaying the predicted and observed values in the same tree (Fig. 5).Both the fixed-effects model with random effects set to zero and mixed-effects model are compared.Diameter growth predictions for all trees obtained from the fixed-effects model differ substantially from the actual diameter.The mixed-effects model more closely followed the actual values for most trees and indicated that mixed effects model described the diameter growth data well.

CONCLUSION
In this study, a nonlinear mixed-effects diameter growth model was developed for dahurian larch in northeastern China.Nonlinear mixed-effects modeling techniques were used to estimate fixed and randomeffect parameters for Chapman-Richards model.The results showed that the Chapman-Richards model with three random parameters was found to be the best in terms of goodness-of-fit criteria.The mixed-effects model provided better model fitting and more precise estimations than the fixed-effects model.The fixedeffects parameters alone can be used to obtain the mean diameter curve for dahurian larch plantation based on all trees sampled from the population of plantation in Dailing forest bureau.The random parameters for a tree of interest could be predicted with an approximate Bayes estimator of bk using the diameter and age measurements from that tree together with estimates of the fixed-effects parameters, the residual variance and the estimated variance-covariance matrix for the random-effects parameters.
The developed models can be implemented in forest inventories by measurement of two tree heights per plot.The process of calibration apparently accounted indirectly for effects of stand density on the height-diameter relationship.The estimate of randomeffects parameters makes the inclusion of additional predictor variables in the model unnecessary for many forest inventory applications that consider measurement of sub-sample trees.In addition, the use of the mixedeffects model through a sub-sample of trees for height measurement allows maintenance of a simple model structure without including additional predictor variables.
It should be recognized that the data used in this study were collected from a narrow geographic range, severely restricting the use of the model.Thus, caution should be used when extrapolating beyond the natural range of the data on which the model is based.

Fig. 1 :
Fig. 1: Ninety-five percent confidence intervals on the parameters in the Chapman-Richards model based on individual fits

Fig. 5 :
Fig. 4: Normal plot for the estimated random effects

Table 1 :
AIC, BIC, log-likelihood, and Likelihood Ratio Tests (LRT) with different fixed and random effects components

Table 3 :
Comparisons of mixed-effects model performance with different correlation structures.