Relation between Grape Wine Quality and Related Physicochemical Indexes

The aim of this study is to evaluate grape wine quality more objectively by reducing the error of traditional grape wine quality evaluation. By combining grape wine quality with the physicochemical indexes of the grapevine, we propose a model that evaluates grape wine quality from the grapevine's physicochemical indexes. First, the tasters' evaluations are analyzed to eliminate the disturbance caused by their individual differences. Then, the relationship between grape wine and grapevines is analyzed, and the inherent mechanism that affects grape wine quality is identified by describing grape wine quality in terms of the grapevine's physicochemical indexes. Finally, grape wine quality is evaluated from these indexes. The rationality of the model is verified by statistical tests, while the accuracy of the results is verified by comparison with the evaluations made by the tasters.


INTRODUCTION
At present, the classification of grape wine differs from country to country, while the ways of distinguishing grape wine quality are similar and depend mainly on sensory quality (Wen-Jing, 2007). Since aroma is a pivotal index in grape wine quality evaluation, the main components of grape wine aroma have been summarized in a study on the aromatic substances of grape wine (Yu et al., 2005). The 12 main aromatic sources in grape wine, their sensory characteristics and their influence on grape wine quality have also been described (Ji-Ming, 2005). In fact, besides aroma, the evaluation of grape wine quality also covers appearance, taste and other aspects. By building a regression equation between grape wine quality and four factors, including aging time, alcohol content and residual sugar, the relationship between grape wine sensory quality and each factor has been determined (Li et al., 2005).
Sensory evaluation by tasters is commonly used to assess the sensory quality of grape wine. During the evaluation, tasters grade several indexes of each wine after tasting it, and the quality of the wine is judged from the sum of these index grades. The drawback of this method is the evaluation error caused by individual differences among the tasters, which reduces the objectivity of the results. For reducing the differences in the tasters' grading scales, the confidence interval method performs better than the standardization method and reflects the differences in grape wine quality more objectively (Li et al., 2006).
Because grapevine and grape wine quality are directly related, the physicochemical indexes of grapevines reflect the quality of grape wines to some extent. The relationship between grape wine quality and the physicochemical indexes of grapevines is therefore investigated in this study. Based on the data from http://www.mcm.edu.cn/, a model that estimates grape wine quality from the physicochemical indexes of grapevines was established.

SELECTION OF THE TASTERS BASED ON SIGNIFICANT DIFFERENCE TESTING MODEL
Two groups of tasters were chosen to evaluate the 27 samples of red wine. The evaluation covers four dimensions: appearance analysis, aromatic analysis, taste analysis and overall assessment. Appearance analysis contains clarity and hue, while aromatic analysis and taste analysis contain purity, concentration and quality.
Variance analysis can address the question of whether there are significant differences between the two groups of tasters. However, a test for a significant difference in mean values presumes that the variances of the samples are equal. A further calculation is therefore needed to judge whether the variance of the grades given to one index of one sample differs between the two groups.
Based on this analysis, the grades given by each taster to the same index across the different samples were first standardized, to avoid the influence of individual differences. Then each standardized index of the two groups was tested by the F-test and the t-test at a significance level of 0.05, to judge whether the grades given by the two groups differ significantly for that index. Finally, the grades of the more reliable group were selected as the evaluation standard of red wine, following the rule that the group with the smaller variation is better.
Standardization of the data: First, the data are standardized by the standard deviation, as in Formula (1):

$$A_{ijkn} = \frac{P_{ijkn} - \bar{R}_{ijn}}{\sigma_{ijn}} \qquad (1)$$

In the formula, $P_{ijkn}$ = the original mark of Index n by Taster No. j of Group i in Sample k. $A_{ijkn}$ = the standardized result of $P_{ijkn}$. $\bar{R}_{ijn}$ = the mean value, over the samples k, of the original marks of Index n by Taster No. j of Group i. $\sigma_{ijn}$ = the standard deviation, over all the samples, of the original marks of Index n graded by Taster No. j of Group i.
The standardized grades of red wine given by the tasters of Group 2 are shown in Table 1.
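The standardization in Formula (1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the marks are made-up values, the function name is hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def standardize_taster(marks):
    """Standardize one taster's marks for one index across all samples (Formula 1)."""
    r_bar = mean(marks)    # mean of the taster's marks for this index
    sigma = stdev(marks)   # sample standard deviation (assumed form)
    return [(p - r_bar) / sigma for p in marks]


# Hypothetical marks given by one taster to the same index in five samples.
marks = [7.0, 8.0, 6.0, 9.0, 5.0]
standardized = standardize_taster(marks)
```

After this step every taster's marks for an index have mean 0 and standard deviation 1, so a habitually strict or lenient taster no longer shifts the group result.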
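F-test: significant difference test on the standard deviation of grades: First, set up the null hypothesis $H_0: \sigma_{2kn} = \sigma_{1kn}$, which means that there is no significant difference between the dispersion of the grades given by the two groups of tasters to Index n of Sample k. The test statistic is the ratio of the sample variances of the two groups, as in Formula (2):

$$F = \frac{S_{1kn}^2}{S_{2kn}^2} \qquad (2)$$

Under $H_0$, F follows the F distribution with 9 degrees of freedom in both the numerator and the denominator. With $\alpha = 0.05$, the critical values are $F_{0.975} = 0.2484$ and $F_{0.025} = 4.0260$.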
After calculating the F statistic of each index, we checked whether its value lies between $F_{0.975}$ and $F_{0.025}$. If the value falls outside this interval, a significant difference exists.
Table 2 shows the results of the F-test for the two groups of tasters on Red Wine Sample 1.
Since each sample has 10 indexes, 270 F-tests were carried out for the 27 red wine samples, of which 58 showed a significant difference.
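The F-test decision rule can be illustrated with the critical values quoted in the text. The sketch below is hypothetical: the grades are invented, the function names are not from the source, and the statistic is taken as the ratio of the two sample variances, which is the usual form and is assumed here.

```python
from statistics import variance

# Critical values quoted in the text for 9 and 9 degrees of freedom, alpha = 0.05.
F_LOWER, F_UPPER = 0.2484, 4.0260


def f_statistic(group1, group2):
    """Ratio of the sample variances of the two groups' grades."""
    return variance(group1) / variance(group2)


def variances_differ(group1, group2, lower=F_LOWER, upper=F_UPPER):
    """True when the F statistic falls outside the acceptance interval."""
    return not (lower <= f_statistic(group1, group2) <= upper)


# Hypothetical standardized grades of one index of one sample, 10 tasters per group.
g1 = [-1.2, 0.3, 0.8, -0.5, 1.1, 0.0, -0.9, 0.6, 1.4, -1.6]
g2 = [-0.2, 0.1, 0.3, -0.1, 0.2, 0.0, -0.3, 0.1, 0.2, -0.3]
differ = variances_differ(g1, g2)
```

Because both groups have the same size, the two critical values are reciprocals of each other (1 / 4.0260 is about 0.2484), so the decision does not depend on which group's variance is placed in the numerator.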
t-test: testing whether the mean values of the indexes with no significant difference in variance are equal: The t statistic is chosen as in Formula (3):

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left(S_1^2 + S_2^2\right)/N}} \qquad (3)$$

In Formula (3), $S_1^2$ and $S_2^2$ are the sample variances and N is the sample size of each group. In this study, N = 10, while $\bar{X}_1$ and $\bar{X}_2$ represent the mean values of the same index graded by the two groups of tasters.
The t-test was applied to the indexes whose variances were judged equal by the F-test. The results are shown in Table 3.
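For two groups of equal size N, the pooled-variance t statistic reduces to the difference of the means divided by the square root of (S1² + S2²)/N. The following sketch assumes that form; the grades are invented, and the critical value 2.101 (two-sided 5% level, 2N - 2 = 18 degrees of freedom) is a standard table value that is not quoted in the source.

```python
from math import sqrt
from statistics import mean, variance

# Two-sided 5% critical value of t with 18 degrees of freedom (table value, assumed).
T_CRIT_DF18 = 2.101


def t_statistic(group1, group2):
    """Equal-variance two-sample t statistic for groups of the same size."""
    n = len(group1)
    if n != len(group2):
        raise ValueError("both groups must have the same number of tasters")
    pooled = (variance(group1) + variance(group2)) / n
    return (mean(group1) - mean(group2)) / sqrt(pooled)


def means_differ(group1, group2, t_crit=T_CRIT_DF18):
    """True when the absolute t statistic exceeds the critical value."""
    return abs(t_statistic(group1, group2)) > t_crit


# Hypothetical grades: group_b is group_a shifted down by two points.
group_a = [7, 8, 6, 9, 7, 8, 6, 7, 8, 7]
group_b = [v - 2 for v in group_a]
t_value = t_statistic(group_a, group_b)
```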
For the indexes whose mean values were judged equal, we compared the variances of the two groups and took the group with the smaller variance as the better evaluator of that index, as shown in Table 4.
According to this comparison of variances, the tasters in Group 1 give the more reliable evaluation results for red wine.
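One simple way to turn the "smaller variance is better" rule into a group-level decision is to count, index by index, which group has the smaller variance. The tally below is only an illustrative assumption, since the source does not state how the per-index comparisons were aggregated; the index names and grades are hypothetical.

```python
from statistics import variance


def variance_wins(grades_by_index):
    """Count the indexes on which each group has the smaller grade variance.

    grades_by_index maps an index name to a pair (group 1 grades, group 2 grades).
    """
    wins = {1: 0, 2: 0}
    for group1, group2 in grades_by_index.values():
        v1, v2 = variance(group1), variance(group2)
        if v1 < v2:
            wins[1] += 1
        elif v2 < v1:
            wins[2] += 1
    return wins


# Hypothetical grades for three indexes.
grades = {
    "clarity": ([1, 2, 3], [1, 3, 5]),
    "hue": ([2, 2, 3], [0, 4, 8]),
    "aroma purity": ([1, 5, 9], [4, 5, 6]),
}
tally = variance_wins(grades)
```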

PHYSICOCHEMICAL INDEX EXTRACTION OF RED WINE BASED ON PRINCIPAL COMPONENT ANALYSIS
Because the number of physicochemical indexes is large and the relationship between each pair of indexes is uncertain, the indexes were classified and processed, and strongly related indexes were then merged according to their characteristics.
First, a correlation analysis was carried out on the standardized indexes to judge whether multicollinearity exists among them. Then, principal component analysis was used to merge the remaining strongly related indexes, in order to simplify the calculation and eliminate the multicollinearity among the indexes.

Correlation coefficient matrix:
The correlation analysis of the standardized physicochemical indexes was carried out with SPSS 17.0. The results are shown in Table 5.
The correlation coefficient matrix of the grapevines' physicochemical indexes shows that reducing sugar, total sugar and soluble solids are strongly positively correlated, which means that the information they carry overlaps heavily.
When multicollinearity is too strong in a multiple linear regression model, testing the coefficients becomes difficult. The F-test of the model may pass while the t-tests of the individual coefficients fail, and the estimated coefficients may then take values that contradict common sense.
Therefore, the principal components of the n grapevine physicochemical indexes were extracted by principal component analysis, giving mutually independent principal components F1, F2, …, Fm and simplifying the calculation. Principal component analysis not only simplifies the regression equation but also eliminates the influence of the correlation among the variables.
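The correlation screening described above was done in SPSS; a plain Python equivalent of the Pearson correlation check might look like the sketch below. The index values and the 0.8 threshold for a "strong" correlation are assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def strongly_correlated_pairs(columns, threshold=0.8):
    """Return the pairs of indexes whose absolute correlation reaches the threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                pairs.append((a, b))
    return pairs


# Hypothetical measurements of three indexes on five cultivars.
indexes = {
    "reducing_sugar": [200, 210, 190, 220, 205],
    "total_sugar": [205, 215, 195, 225, 210],
    "acid": [5, 3, 4, 3, 5],
}
pairs = strongly_correlated_pairs(indexes)
```

Pairs flagged this way carry overlapping information and are the natural candidates for merging through principal component analysis.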

Extraction of principal component:
The variance contribution rate and the cumulative contribution rate of each component were computed with SPSS 17.0, as shown in Table 6. Since the cumulative contribution of the first 13 principal components is 88.87%, these components were retained as new comprehensive indexes that are independent of each other and together give a well-rounded reflection of grapevine quality.
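The choice of how many components to keep follows from the cumulative contribution rate. The sketch below shows the calculation on hypothetical eigenvalues; the 0.85 threshold is an assumption, as the source only reports that 13 components reach 88.87%.

```python
from itertools import accumulate


def cumulative_contribution(eigenvalues):
    """Cumulative share of total variance, in the order the eigenvalues are given."""
    total = sum(eigenvalues)
    return [c / total for c in accumulate(eigenvalues)]


def components_needed(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative share reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    for m, share in enumerate(cumulative_contribution(ordered), start=1):
        if share >= threshold:
            return m
    return len(ordered)


# Hypothetical eigenvalues of a correlation matrix with five indexes.
eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]
shares = cumulative_contribution(eigenvalues)
kept = components_needed(eigenvalues, threshold=0.85)
```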

Coefficient matrix of principal component:
The coefficient matrix of the principal components was computed with SPSS 17.0, as shown in Table 7.
The data in Table 7 represent the loadings of the principal components on the variables. Based on these data, each principal component is expressed as in Formula (4):

$$F_i = \sum_{j=1}^{n} C_{ij} x_j, \quad i = 1, 2, \ldots, m \qquad (4)$$

In the formula, $C_{ij}$ = the coefficient of physicochemical index j in principal component i. $x_j$ = physicochemical index j after standardization.
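Formula (4) is a weighted sum of the standardized indexes, one sum per component. A minimal sketch with a made-up 2 x 3 coefficient matrix (the real coefficients are in Table 7 and are not reproduced here):

```python
def component_scores(coefficients, x):
    """Compute F_i = sum_j C_ij * x_j for every principal component i."""
    return [sum(c * v for c, v in zip(row, x)) for row in coefficients]


# Hypothetical coefficients for two components over three standardized indexes.
coefficients = [
    [0.5, 0.5, 0.0],
    [0.0, -1.0, 1.0],
]
x = [1.0, 2.0, 4.0]  # standardized indexes of one cultivar (made up)
scores = component_scores(coefficients, x)
```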

GRAPE WINE QUALITY EVALUATION MODEL BASED ON MULTIPLE LINEAR STEPWISE REGRESSIONS
Stepwise regression extracts a large amount of information because the influence of each factor is assessed after the combined influence of the other factors has been taken into account. It is therefore suitable for describing the factors that influence grape wine quality, since the influences among the factors are controlled. Stepwise regression has two main advantages. One is that it extracts the factors that affect grape wine quality from a large number of candidates. The other is that it expresses the significance of each factor, which makes comparison and selection easy.

Establishment of multiple stepwise regression equation:
Multiple stepwise regression was mainly used to select indexes in this study. Since many grapevine factors can contribute to one characteristic of grape wine, the factors with a remarkable influence must be extracted. First, the influence on grape wine quality of all the independent variables, that is, the principal components, was considered. Then the principal components were introduced into the stepwise regression equation according to their significance. Components with high significance are introduced first, while components with low significance may never be introduced. In addition, a component already in the equation may lose its significance when a new component is introduced, in which case it is eliminated from the equation.
Grape wine quality was chosen as the dependent variable and the physicochemical indexes as the independent variables of the regression before the F thresholds were set. Since the evaluation reliability of the tasters in Group 2 is higher, the evaluation of Group 2 was chosen as the dependent variable. The independent variables were represented by the 13 principal components obtained by principal component analysis.
At each step of the stepwise regression, every variable was checked against the F-test thresholds to ensure that the regression equation contains only principal components with a strong influence. The significance level was set to $\alpha = 0.05$. The critical value of the F-test is F1 when a variable is introduced and F2 when a variable is eliminated, with F1 > F2 as the standard for introducing or eliminating a principal component.
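The enter-and-remove procedure can be sketched in pure Python. This is an illustrative implementation under stated assumptions, not the SPSS routine used in the study: the data are synthetic, the thresholds `f_enter=4.0` and `f_remove=3.0` are made-up values satisfying F1 > F2, and the component names `PC1`..`PC3` are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(columns, y):
    """Residual sum of squares of a least-squares fit of y on an intercept plus columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def partial_f(names, name, components, y):
    """Partial F of `name` inside the model formed by `names` (which contains `name`)."""
    full = rss([components[k] for k in names], y)
    reduced = rss([components[k] for k in names if k != name], y)
    df = len(y) - len(names) - 1
    return (reduced - full) / (full / df)


def stepwise(components, y, f_enter, f_remove, max_steps=100):
    """Alternate forward entry (F > f_enter) and backward removal (F < f_remove)."""
    selected = []
    for _ in range(max_steps):
        changed = False
        scored = [(partial_f(selected + [k], k, components, y), k)
                  for k in components if k not in selected]
        if scored:
            best_f, best = max(scored)
            if best_f > f_enter:
                selected.append(best)
                changed = True
        if selected:
            worst_f, worst = min((partial_f(selected, k, components, y), k)
                                 for k in selected)
            if worst_f < f_remove:
                selected.remove(worst)
                changed = True
        if not changed:
            break
    return selected


# Synthetic example: quality depends on PC1 and PC2 only; PC3 is irrelevant.
components = {
    "PC1": [1, 1, 1, 1, -1, -1, -1, -1],
    "PC2": [1, 1, -1, -1, 1, 1, -1, -1],
    "PC3": [1, -1, 1, -1, 1, -1, 1, -1],
}
noise = [0.5, -0.5, -0.5, 0.5, -0.5, 0.5, 0.5, -0.5]
y = [70 + 3 * a - 2 * b + e
     for a, b, e in zip(components["PC1"], components["PC2"], noise)]
selected = stepwise(components, y, f_enter=4.0, f_remove=3.0)
```

Keeping the entry threshold above the removal threshold prevents a component from being introduced and eliminated in an endless cycle, which is why the text requires F1 > F2.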
Rationality test of the model: The statistical results of the stepwise regression model were analyzed. The related parameters, namely the constant term of the multiple regression equation, the partial regression coefficients of the variables (B) and the sampling error, are shown in Table 8.
According to the regression coefficients, grape wine quality is mainly affected by three principal components, each of which is composed of the 53 physicochemical indexes of the grapevine. It is therefore the 53 physicochemical indexes of the grapevine that influence grape wine quality.
The Sig. values of both the constant and the independent variables are far smaller than 0.05, and the p value of the model from the variance analysis is 0.001, which means that the model and its variables are statistically significant. The established multiple linear regression equation is therefore taken as the optimal equation for the problem.
Accuracy of the model: Because the grapevines considered here are used only for wine making and grape wine quality is directly related to the physicochemical indexes of the grapevine, we classified the grapevines according to the quality of the grape wine.
First, the physicochemical indexes of each grapevine cultivar were substituted into the regression equation, and the 27 cultivars of red grapevine were ranked by the resulting values. Based on this ranking, the first 7 cultivars were defined as First Rate Grapevine, the next 7 as Second Rate Grapevine, and so on for the rest, as shown in Table 9.
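The grading rule (rank by predicted quality, then cut the ranking into groups of seven) can be written as a short function. The predicted scores below are placeholders, since the regression coefficients are not reproduced in this text; the function name is hypothetical.

```python
def grade_cultivars(predicted, per_grade=7):
    """Map each cultivar number to a grade (1 = best) from its predicted quality."""
    ranking = sorted(predicted, key=predicted.get, reverse=True)
    return {cultivar: position // per_grade + 1
            for position, cultivar in enumerate(ranking)}


# Placeholder predictions for 27 cultivars: cultivar 1 scores highest, 27 lowest.
predicted = {k: 100 - k for k in range(1, 28)}
grades = grade_cultivars(predicted)
```

With 27 cultivars and seven per grade, the last grade holds the remaining six cultivars.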
Then grape wine quality was evaluated with the regression equation, which gives the 5 best grapevine cultivars as numbers 23, 9, 3, 21 and 22. Because the grapevine directly affects grape wine quality, we also ranked the cultivars by the quality of their wines as evaluated by the tasters. The top 5 cultivars in this ranking are numbers 23, 9, 22, 3 and 19. Four of the five cultivars appear in both lists, so the results of the two methods are highly similar, which supports the rationality and accuracy of the multiple linear regression.
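The agreement between the two top-five lists reported above can be checked directly. The cultivar numbers are those given in the text; the overlap ratio is just one simple similarity measure chosen for illustration.

```python
# Top five cultivars by the regression model and by the tasters, as reported in the text.
model_top5 = [23, 9, 3, 21, 22]
taster_top5 = [23, 9, 22, 3, 19]

shared = sorted(set(model_top5) & set(taster_top5))
overlap_ratio = len(shared) / len(model_top5)
```

The two lists share cultivars 3, 9, 22 and 23, and they agree on the first two positions.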

CONCLUSION
This study connects grape wine quality with the physicochemical indexes of the grapevine and builds a model that evaluates grape wine quality from those indexes.
First, the F-test and the t-test were used to determine whether the two groups of tasters differ significantly. The t-test was applied only after the F-test had passed, so that the equal-variance assumption was checked rather than taken for granted. Then, regression was used to describe the relationship between grape wine quality and the physicochemical indexes, because the chemical reactions during wine making are too complex to describe by mechanism analysis, and regression makes the abstract problem concrete. As multiple linear regression is strongly affected by multicollinearity, principal component analysis was used to reduce this effect. Finally, the rationality of the model was verified by statistical tests, while its accuracy was verified by comparing the model results with the tasters' data.
Heteroscedasticity cannot be avoided despite the standardization in this study. The model would therefore be more accurate if the influence of heteroscedasticity were taken into account.

Table 1: Standardization result of red wine sample 1 by tasters of group 2

Table 3: Results of the significant difference test

Table 5: Correlation matrix of the physicochemical indexes

Table 7: Coefficient matrix of principal component

Table 8: Partial-regression-coefficient significance test of independent variable