The Evaluation Model of Grape Wine Quality Based on Multivariate Statistical Methods

The purpose of this study is to establish the relationship between the physicochemical indexes of wine grapes and those of the wines made from them. Because the number of physicochemical indexes is large, principal component analysis is first used to select principal components, and a correlation matrix is established from the variables corresponding to those components. Stepwise regression is then used to fit functions relating the physicochemical indexes of the wines to those of the wine grapes; the fitted functions show a strong correlation between the two sets of indexes.


INTRODUCTION
When the quality of a wine is determined, a panel of qualified wine tasters is usually employed. After tasting, each taster scores the wine on a set of classification indexes, and the scores are summed to give the total score that determines the quality of the wine. There is a direct relationship between the quality of wine grapes and the quality of the wines made from them, so the physicochemical indexes of wines and wine grapes reflect their quality to some extent (Li et al., 2011). The example data give the composition of a set of wines and wine grapes in a given year. This study attempts to establish a mathematical model of the connection between the physicochemical indexes of wines and wine grapes (Gao, 2004).
We want to know which physicochemical indexes have a significant impact on wines and wine grapes, and to model the relationship between these important indexes of wine grapes and of wines. There are too many physicochemical indexes to model wine grapes and wines directly. Therefore, to obtain a clearer regression equation between the physicochemical indexes of wine grapes and wines, principal component analysis is first used to extract the principal components; regression equations are then fitted on the corresponding physicochemical indexes by stepwise regression, which yields the connection between the physicochemical indexes of wine grapes and wines.

Model assumptions:
It is assumed that the data used in this study are real and valid, and are worth analyzing systematically.
Obviously erroneous data are corrected manually; the remaining data are taken to be accurate and objective, that is, observation error is not considered.
The sample data can be treated as drawn from a normal or near-normal distribution.

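Model establishment:
The basic principles of principal component analysis: Assume that there are n samples, each with p variables, so that the data form an n×p matrix:

$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$$

When p is large, it is difficult to investigate the problem in p-dimensional space. To overcome this difficulty, the dimension is reduced: a relatively small number of comprehensive indexes replace the larger number of original variables, such that these comprehensive indexes reflect as much as possible of the information carried by the original variables while being uncorrelated with each other (Fang and Pan, 1982).
Denote the original variables by $x_1, x_2, \ldots, x_p$ and the comprehensive indexes by $z_1, z_2, \ldots, z_m$ ($m \le p$), where

$$z_i = l_{i1} x_1 + l_{i2} x_2 + \cdots + l_{ip} x_p, \quad i = 1, 2, \ldots, m$$

The coefficients $l_{ij}$ are determined by the following principles:
1. $z_i$ and $z_j$ ($i \ne j$) are uncorrelated.
2. $z_1$ has the largest variance among all linear combinations of $x_1, x_2, \ldots, x_p$; $z_2$ is uncorrelated with $z_1$ and has the second largest variance among all such linear combinations; and so on, until $z_m$, which is uncorrelated with $z_1, z_2, \ldots, z_{m-1}$ and has the $m$-th largest variance among all linear combinations of $x_1, x_2, \ldots, x_p$ (Yu, 1993).
From the above analysis, the essence of principal component analysis is to determine the loadings $l_{ij}$. It can be proved mathematically that they are the eigenvectors corresponding to the m largest eigenvalues of the correlation matrix.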

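The calculation steps of principal component analysis:
• Standardize the raw data: Because the indexes have different dimensions (units), the data must first be standardized. The standardization formula is $x^*_{ij} = (x_{ij} - \bar{x}_j) / s_j$, where $\bar{x}_j$ and $s_j$ are the sample mean and sample standard deviation of the j-th variable.
• Establish the correlation coefficient matrix of the variables: $R = (r_{ij})_{p \times p}$, where $r_{ij}$ is the correlation coefficient between the i-th and j-th variables.
• Find the eigenvalues of R and their corresponding eigenvectors (He, 2004): solve $|\lambda I - R| = 0$ and order the eigenvalues so that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$.
• Write the principal components: The first m components are retained. Generally the cumulative contribution rate, that is, the share of the total variance explained by the retained components, is required to be above 80%.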
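The calculation steps above can be sketched in code. The following is a minimal illustration, assuming NumPy is available; the function name `pca_correlation` and the `threshold` parameter are hypothetical names introduced here and are not taken from the original study, whose software is not stated.

```python
import numpy as np


def pca_correlation(X, threshold=0.80):
    """PCA on the correlation matrix of X (n samples x p variables).

    Illustrative sketch only; names and defaults are assumptions.
    """
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    # Step 1: standardize each column (zero mean, unit sample variance).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: correlation coefficient matrix R (p x p).
    R = np.corrcoef(Z, rowvar=False)
    # Step 3: eigenvalues and eigenvectors of the symmetric matrix R,
    # reordered so that the eigenvalues are descending.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 4: contribution rate and cumulative contribution rate.
    contribution = eigvals / eigvals.sum()
    cumulative = np.cumsum(contribution)
    # Step 5: keep the smallest m whose cumulative rate reaches the threshold.
    m = min(int(np.searchsorted(cumulative, threshold)) + 1, p)
    scores = Z @ eigvecs[:, :m]  # principal component scores
    return eigvals, contribution, cumulative, m, scores
```

Because R is a correlation matrix, its eigenvalues sum to p, so each contribution rate is simply the eigenvalue divided by the number of variables.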

RESULTS AND DISCUSSION
First, the principal components of the physicochemical index data of the wines (red and white) and of the wine grapes (red and white) are extracted. The contribution rate and the cumulative contribution rate are calculated, and plots of the contribution rates are shown below (Itamar et al., 1994). Because of the large amount of data, red wine is taken as the example. The contribution rate of each principal component of red wine is given in Table 1.
Fig. 1 and Table 1 show that the contribution rate of the first principal component is 45.7917%, that of the second is 20.7300% and that of the third is 14.9704%, so the cumulative contribution rate of the first three principal components is 81.4921%. Because the cumulative contribution rate exceeds 80% (Zhou et al., 2010), the first three principal components are chosen as the new factors.
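The selection rule used for red wine can be checked with a short calculation. The sketch below uses only the three contribution rates reported in the text; the function name `components_needed` is a hypothetical helper introduced for illustration.

```python
from itertools import accumulate

# Contribution rates (%) of the first three principal components of red wine,
# as reported in the text.
rates = [45.7917, 20.7300, 14.9704]


def components_needed(rates, threshold=80.0):
    """Return (m, cumulative): the smallest m whose cumulative rate >= threshold."""
    cumulative = list(accumulate(rates))
    for m, total in enumerate(cumulative, start=1):
        if total >= threshold:
            return m, cumulative
    return None, cumulative  # threshold not reached with the given rates


m, cumulative = components_needed(rates)
print(m, [round(c, 4) for c in cumulative])
```

The first two components together account for 66.5217%, which is below the 80% threshold, so the third component is needed to reach 81.4921%.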
For white wine (Fig. 2), the cumulative contribution rate of the first three principal components also exceeds 80%, so the first three new factors are chosen.
For red wine grapes (Fig. 3), the first nine new factors are chosen, because their eigenvalues are greater than 1 while the remaining eigenvalues are less than 1.
For white wine grapes (Fig. 4), the first twelve new factors are chosen by the same eigenvalue criterion as for Fig. 3.
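The eigenvalue criterion used for the grapes can be illustrated as follows. Since each standardized variable has variance 1, a component with an eigenvalue above 1 explains more variance than any single original variable. The eigenvalues in this sketch are hypothetical and are not the data of this study; `kaiser_count` is an illustrative name.

```python
def kaiser_count(eigenvalues):
    """Count the principal components whose eigenvalue exceeds 1."""
    return sum(1 for value in eigenvalues if value > 1.0)


# Hypothetical eigenvalues of a correlation matrix, in descending order.
# These are NOT the eigenvalues shown in Fig. 3 or Fig. 4.
example_eigenvalues = [4.2, 2.9, 1.6, 1.1, 0.8, 0.3, 0.1]
print(kaiser_count(example_eigenvalues))
```

Applied to the eigenvalues plotted in Fig. 3 and Fig. 4, the same rule gives the nine and twelve factors reported above.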
The representative variables extracted for the other three targets are as follows.
To analyze the relationship between the main variables of red wine and red grapes, the Pearson correlation coefficient is calculated. The correlation matrix shows that the anthocyanins and tannins of the wine grapes are significantly positively correlated with the anthocyanins and tannins of the wines. According to the regression equations established, the color index (HD65) of red wine is only weakly related to the physicochemical indexes of the wine grapes, with a goodness of fit of only 38.72% on anthocyanins, tannins, sugar and fruit color a*. The color index (CD65) shows a moderate goodness of fit of only 58.87% on anthocyanins, skin color a* and skin color C. The five other physicochemical indexes of red wine mostly show a strong correlation with the physicochemical indexes of the wine grapes (Li et al., 2011).
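The correlation and regression analysis can be sketched as below. This is a simplified illustration under stated assumptions: it uses a greedy forward selection on adjusted R^2, whereas the entry and removal criteria of the stepwise regression in this study are not given; NumPy is assumed; the names `r_squared`, `forward_stepwise` and `min_gain` are hypothetical; and the data are synthetic, not the wine data.

```python
import numpy as np


def r_squared(X, y):
    """R^2 (goodness of fit) of a least-squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    total = y - y.mean()
    return 1.0 - (residual @ residual) / (total @ total)


def forward_stepwise(X, y, min_gain=0.01):
    """Greedy forward selection: add the predictor that raises adjusted R^2 most."""
    n, p = X.shape
    selected, best_adj = [], 0.0
    while len(selected) < min(p, n - 2):
        candidates = []
        for j in range(p):
            if j in selected:
                continue
            cols = selected + [j]
            r2 = r_squared(X[:, cols], y)
            adj = 1.0 - (1.0 - r2) * (n - 1) / (n - len(cols) - 1)
            candidates.append((adj, j))
        adj, j = max(candidates)
        if adj - best_adj < min_gain:  # stop when the improvement is negligible
            break
        selected.append(j)
        best_adj = adj
    return selected, best_adj


# Synthetic example: a "wine" index that depends on grape indexes 0 and 2 only.
rng = np.random.default_rng(1)
grape = rng.normal(size=(30, 4))
wine = 2.0 * grape[:, 0] - 1.5 * grape[:, 2] + rng.normal(scale=0.1, size=30)
print(np.corrcoef(grape[:, 0], wine)[0, 1])  # Pearson correlation coefficient
print(forward_stepwise(grape, wine))
```

A goodness of fit such as the 38.72% reported for the color index corresponds to an R^2 of 0.3872 in this notation.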
The scatter plots of the anthocyanin and tannin contents of the wines and wine grapes are shown in Fig. 5 and 6. They show that both the anthocyanin content and the tannin content are higher in red wine than in red wine grapes.
For anthocyanin content (Fig. 5), red wine is significantly higher than red grapes at points 1 and 8, where the gap between red wine and red grapes reaches 600 or more. At the other points, the gap between the anthocyanin content of red wine and red grapes fluctuates around 150, and the fluctuations are small.
For tannin content (Fig. 6), the gap between red wine and red grapes fluctuates widely, and the tannin content of red wine is significantly higher than that of red grapes at eight points, namely points 1, 2, 3, 8, 9, 10, 14 and 22.

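• Calculate the contribution rate and the cumulative contribution rate of the principal components: for the eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$ of the correlation matrix, the contribution rate of the i-th principal component is $\lambda_i / \sum_{k=1}^{p} \lambda_k$ and the cumulative contribution rate of the first i components is $\sum_{k=1}^{i} \lambda_k / \sum_{k=1}^{p} \lambda_k$.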

Fig. 1: The distribution of eigenvalues of red wine

Fig. 5: The anthocyanin content of red grapes and red wine

Table 1: The contribution rate of each main component

The results obtained through regression analysis are: