The Research on Volleyball Trajectory Simulation Based on Cost-Sensitive Support Vector Machine

The aim of this study is to help solve the classification problem for imbalanced data sets, for which the machine learning field has proposed many effective algorithms. In Seabrook's view, current classification methods for the class-imbalance problem can be broadly divided into two categories: one creates new methods, or improves existing ones, based on the characteristics of class imbalance; the other reduces the effect of class imbalance through resampling techniques so that existing methods can be reused. Resampling, however, has drawbacks: over-sampling increases the training set and may lead to over-fitting, while under-sampling may discard useful information from the training set. By dividing the training set instead, subsets with a certain degree of balance can be obtained without increasing the number of training samples and without losing useful information in the samples.


INTRODUCTION
Support vector machine theory has developed through a process of continuous improvement. The study of statistical learning theory began in the 1960s with V. Vapnik, who can be called the founder of the SVM. In 1971, in the article "The Necessary and Sufficient Conditions for the Uniform Convergence of Averages to Expected Values", V. Vapnik and A. Chervonenkis proposed the VC-dimension theory, an important theoretical basis of the SVM. In his 1982 book "Estimation of Dependences Based on Empirical Data", Vapnik put forward the theory of structural risk minimization; this theory is the cornerstone, of epoch-making significance, of the SVM algorithm. Boser, Guyon and Vapnik proposed the optimal margin classifier in 1992. Cortes and Vapnik further discussed the nonlinear optimal-boundary classification problem in 1993. In 1995 Vapnik published "The Nature of Statistical Learning Theory", which presented the complete SVM theory.
The core of the SVM algorithm was established between 1992 and 1995. The SVM is so far the most successful realization of statistical learning theory and is still in a stage of continuous development. Although the development of support vector machines has been comparatively brief, the method rests on a solid theoretical foundation because it is based on statistical learning theory, and the large body of theoretical research that has emerged in recent years has also laid a solid foundation for applied research (Maloof, 2003; Barandela et al., 2003).

BAYESIAN DECISION THEORY AND INSPIRATION
Bayesian decision theory is an important part of the subjective Bayesian school. It was founded by the British mathematician Bayes.
In Bayesian decision-making, when the data information is incomplete, the unknown part can be estimated with a subjective probability; Bayes' formula is then used to revise the probability of occurrence, and finally the expected values computed with the revised probabilities are used to make the optimal decision.
The basic idea of Bayesian decision theory can be described as follows: given the parametric expressions of the class-conditional probability densities and the prior probabilities, Bayes' formula converts them into the posterior probabilities, and the relative sizes of the posterior probabilities are then used to make the optimal prediction decision (Chawla et al., 2002; Japkowicz and Stephen, 2002; Barandela et al., 2004).
Bayes' notion of inverse probability, and its use as a universal method of reasoning, is a significant contribution to statistical inference. Bayes' formula is the mathematical expression of Bayes' theorem.
Assume that B_1, B_2, ..., B_n are the possible preconditions of a process and that P(B_i) is the prior probability, an estimate of the likelihood of each precondition. After the process has produced a result A, Bayes' formula provides a method for re-evaluating the preconditions: the posterior probability P(B_i | A) re-estimates the probability that B_i was the precondition, given the appearance of A (Akbai et al., 2004; Han et al., 2005; Tang and Chen, 2008).
The set of methods and theory based on Bayes' formula has a very wide range of applications in real life. The formula can be stated mathematically as follows. Let D_1, D_2, ..., D_n be a partition of the sample space S with P(D_i) > 0 for each i. Then for any event x with P(x) > 0,

P(D_i | x) = P(x | D_i) P(D_i) / Σ_{j=1}^{n} P(x | D_j) P(D_j)

Bayesian decision theory is analyzed from the following aspects. The decision rule that minimizes the conditional risk is said to attain the Bayes risk.
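As a minimal numerical sketch of Bayes' formula above (the function name is my own, not from the paper), the posterior of each partition element can be computed from its prior and likelihood:

```python
def bayes_posterior(priors, likelihoods):
    """Posterior P(D_i | x) from priors P(D_i) and likelihoods P(x | D_i).

    The denominator is the total probability P(x) = sum_j P(x | D_j) P(D_j).
    """
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return [p * l / evidence for p, l in zip(priors, likelihoods)]

# Two classes with priors 0.8 / 0.2 and likelihoods 0.1 / 0.9 for an observation x:
post = bayes_posterior([0.8, 0.2], [0.1, 0.9])
```

Although the second class has a small prior, its large likelihood makes its posterior dominate, which is exactly the re-evaluation of the preconditions described above.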
Let {1, 2, ..., m} be the set of classes, where m is the number of categories, and let C(i, j) denote the risk (cost) of classifying a sample of class j as class i; i = j is a correct classification and i ≠ j a misclassification. The conditional risk of assigning a sample x to class i is:

R(i | x) = Σ_{j=1}^{m} C(i, j) P(j | x)     (1)

Under the accuracy-oriented 0-1 loss, where C(i, j) = 0 for i = j and C(i, j) = 1 for i ≠ j, the classification task reduces to finding the class with the maximum posterior probability.
To solve the cost-sensitive data-mining problem, relying only on the posterior probability of x to determine the class of the sample is not appropriate. Given the misclassification cost of each class of samples, the cost matrix can be constructed, and the decision is made by minimizing the expected cost in the conditional-risk formula. The above analysis shows that, by embedding the different misclassification costs of the different categories of samples into the decision, Bayesian decision theory can be used for cost-sensitive data mining that minimizes the cost of misclassification (Wang et al., 2007; Zhou and Liu, 2006; Rehan et al., 2004).
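The minimum-expected-cost decision described above can be sketched as follows (a hypothetical helper, not code from the paper), computing R(i | x) for every class and picking the argmin:

```python
def min_expected_cost_class(posteriors, cost):
    """Bayes decision under a cost matrix.

    posteriors[j] = P(j | x); cost[i][j] = C(i, j), the cost of predicting
    class i when the true class is j. Returns (best_class, expected_costs).
    """
    m = len(posteriors)
    risks = [sum(cost[i][j] * posteriors[j] for j in range(m)) for i in range(m)]
    return min(range(m), key=lambda i: risks[i]), risks

# With an asymmetric cost matrix, the decision can differ from the
# maximum-posterior decision: class 0 is more probable, but predicting it
# risks the expensive C(0, 1) = 5 error, so class 1 is chosen.
best, risks = min_expected_cost_class([0.7, 0.3], [[0, 5], [1, 0]])
```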

Domingos proposed a new method, called MetaCost, that converts an ordinary classification model into a cost-sensitive one. Through a meta-learning process it estimates the posterior class probability P(j | x) of each sample and then modifies the sample's class label according to the minimum expected cost.
Meta-learning performs secondary learning in the global scope over the knowledge obtained by local learning. An appropriate learning procedure is applied to each individual, dispersed data set; these procedures are executed together and produce relatively independent classifiers; the predictions of these classifiers are merged to form a new data set; a new learning procedure then operates on the new data set and finally generates the meta-knowledge. The biggest feature of meta-learning is that any suitable algorithm can be used in the training phase to produce the relatively independent classifiers. Because meta-learning combines a variety of models built in the initial stage, the meta-classifier generated at the end has higher prediction accuracy.

MetaCost is a cost-sensitive classification learning algorithm based on Bayesian decision theory. The algorithm first draws several samples from the training set and builds multiple models. For each sample x in the training set it estimates the posterior probability P(j | x) of belonging to each class, then computes, according to formula (1), the expected cost R(i | x) of assigning x to each class i, and finally relabels x with the class of minimum expected cost, obtaining a new data set from which a new model is trained. The relabeled class, which incorporates the misclassification-cost information of the sample, is called the "true" class label of the sample.
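The relabeling step of this algorithm can be sketched in a few lines, assuming the bagged models' class-probability estimates have already been averaged into one matrix (a simplification of the full MetaCost procedure):

```python
import numpy as np

def metacost_relabel(posteriors, cost):
    """Assign each sample the minimum-expected-cost ('true') class label.

    posteriors: (n, m) array of averaged estimates of P(j | x) per sample.
    cost[i][j]: cost C(i, j) of predicting class i when the true class is j.
    """
    # risks[n, i] = sum_j C(i, j) * P(j | x_n)
    risks = posteriors @ np.asarray(cost, dtype=float).T
    return risks.argmin(axis=1)
```

A confident sample keeps its maximum-posterior label, while an uncertain one is pushed toward the class that is cheap to predict wrongly; the relabeled set is then used to train the final model.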

COST-SENSITIVE SUPPORT VECTOR MACHINES
Different samples carry different misclassification costs. The cost-sensitive support vector machine (CS-SVM) (Weiss, 2004; Kubat and Matwin, 1997; Yin et al., 2011; Zhao et al., 2012) integrates the different misclassification costs of the samples into the design of the SVM. Considering that each sample has its own misclassification cost, the original sample set

(x_1, y_1), (x_2, y_2), ..., (x_l, y_l),  y_i ∈ {+1, −1},  i = 1, 2, ..., l

is reconstructed as

(x_1, y_1, c_1), (x_2, y_2, c_2), ..., (x_l, y_l, c_l),  y_i ∈ {+1, −1},  c_i > 0,  i = 1, 2, ..., l

where c_i, the misclassification cost of the i-th sample, is a positive number that may depend on x_i or on y_i. Assuming the sample set can be separated by the hyperplane w·x + b = 0, the following optimization problem is constructed:

min_{w, b, ξ}  (1/2)‖w‖² + C Σ_{i=1}^{l} c_i ξ_i
s.t.  y_i (w·x_i + b) ≥ 1 − ξ_i,  ξ_i ≥ 0,  i = 1, 2, ..., l

Here (1/2)‖w‖² is the structural cost, Σ_i c_i ξ_i the empirical cost, ξ_i the slack variables, and C the factor whose role is to balance the structural cost against the empirical cost. To solve this optimization problem the Lagrange equation is constructed, with the Lagrange multipliers chosen to minimize formula (7).
The CS-SVM shifts the classification boundary towards the class with the relatively small misclassification cost, so that samples with a high misclassification cost can be correctly classified, reducing the overall misclassification cost.
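The paper solves the CS-SVM through its Lagrangian dual. As an illustrative alternative (my own sketch, not the paper's method), a linear CS-SVM can be trained directly on the cost-weighted hinge-loss objective by subgradient descent:

```python
import numpy as np

def cs_svm_train(X, y, costs, C=1.0, lr=0.01, epochs=200):
    """Linear cost-sensitive SVM via subgradient descent on
    0.5 * ||w||^2 + C * sum_i c_i * max(0, 1 - y_i * (w . x_i + b)).

    y in {-1, +1}; costs c_i > 0 are the per-sample misclassification costs,
    so costly samples pull the boundary harder when they violate the margin.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                          # margin-violating samples
        grad_w = w - C * ((costs * y)[viol] @ X[viol])
        grad_b = -C * np.sum((costs * y)[viol])
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

Raising c_i for the positive class reproduces the behavior described above: the boundary shifts toward the cheap class so that expensive samples end up on the correct side.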

COST-SENSITIVE SUPPORT VECTOR MACHINE BASED ON THE DATA SET DECOMPOSITION
When sampling methods are used to solve the classification problem for imbalanced data sets, over-sampling increases the number of training samples and may lead to over-fitting, while down-sampling reduces the number of samples and loses classification information. To avoid these problems, one method is to divide the training set into a number of subsets that each have a certain degree of balance, train a machine learning method on each subset, and then combine the results by ensemble learning. This method neither increases the number of training samples nor loses the classification information of the samples.
To solve the cost-sensitive data-mining problem, when C(i, j) ≠ C(j, i) for i ≠ j, the class of a sample cannot be determined from the maximum posterior probability of x alone. When the sample misclassification costs are given, the cost matrix can be constructed and the different misclassification costs embedded into the decision. Bayesian decision theory provides an implementation framework for such cost-sensitive problems: on its basis, cost sensitivity can be achieved so that the overall cost is globally minimal.
When Bayesian decision theory is used to handle a cost-sensitive data set, consider a subclass i whose samples are costly to misclassify relative to the other subclasses. Samples that did not originally belong to subclass i, but whose minimum expected cost is attained at class i, are reassigned to subclass i; that is, the class labels of the samples are changed so as to reconstruct the sample set.
Based on the foregoing analysis, a cost-sensitive support vector machine based on data-set decomposition (KCS-SVM) is presented. In this algorithm, the sample set is first decomposed into several negative-class subsets. Each negative subset L_i is then combined with the positive class; the minimum misclassification cost min(R) is taken, and the "true" class label of each sample in L' is determined according to the conditions, so that the misclassification cost is integrated into the samples. After every sample in the sample set has received its new class label, the sample set is reconstructed. Because the reconstructed samples carry misclassification costs, the cost-sensitive support vector machine can then be applied to obtain a decision function with misclassification costs, which minimizes the overall misclassification cost of the classification.
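The decomposition step can be sketched as follows. The paper does not spell out its exact decomposition rule, so this assumes a simple one: split the negative class into roughly positive-sized chunks and pair each chunk with the full positive class to obtain balanced subsets.

```python
import numpy as np

def decompose_negatives(X, y, rng=None):
    """Decompose an imbalanced two-class set (y in {+1, -1}) into balanced
    subsets, each pairing the whole positive class with one negative chunk.

    Hypothetical decomposition rule for illustration only.
    """
    rng = np.random.default_rng(rng)
    pos, neg = np.flatnonzero(y == 1), np.flatnonzero(y == -1)
    neg = rng.permutation(neg)
    k = max(1, len(neg) // max(1, len(pos)))    # number of negative subsets
    subsets = []
    for chunk in np.array_split(neg, k):
        idx = np.concatenate([pos, chunk])
        subsets.append((X[idx], y[idx]))
    return subsets
```

Each subset is then used to train one posterior-outputting sub-classifier, and the sub-classifiers' outputs feed the relabeling and final cost-sensitive training described above.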

SIMULATIONS
Experiment data: The data sets used in the experiments are two public data sets commonly used in the study of imbalanced-data classification, obtained from the standard UCI repository (http://www.ics.uci.edu/mlearn/MLRepository.html): the hepatitis and soybean data sets. Hepatitis is a two-class problem; soybean is originally a multi-class problem and, for convenience of computation, was first converted into a two-class problem by taking one class as the positive class and combining all the other classes into the negative class. The number of samples of each kind in each data set is shown in Table 1.
The misclassification cost matrix is assumed known, as shown in Table 2.
Experimental results and analysis: Each data set was randomly split into a training set and a test set by cross-validation: all samples were randomly divided into 8 parts, with the imbalance rate of each part kept equal to the overall imbalance rate. Seven parts were then randomly selected as the training set and the remaining part used as the test set.
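The stratified 8-fold split described above can be sketched as follows (my own implementation of the stated protocol): samples of each class are shuffled and dealt round-robin into folds, so every fold keeps approximately the overall imbalance rate.

```python
import numpy as np

def stratified_folds(y, n_folds=8, rng=0):
    """Assign each sample to one of n_folds folds, preserving the class
    proportions (imbalance rate) of y in every fold."""
    rng = np.random.default_rng(rng)
    folds = np.empty(len(y), dtype=int)
    for cls in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == cls))
        folds[idx] = np.arange(len(idx)) % n_folds
    return folds

# Usage: hold out fold 0 as the test set, train on the other seven.
# train_mask, test_mask = folds != 0, folds == 0
```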
The standard SVM and KCS-SVM were compared, with the classification results shown in Tables 3 and 4. The misclassification cost was calculated for each data set; each experiment was repeated 10 times and the average of the results taken, as shown in Table 5.
As shown in Fig. 5, the experimental results on the two data sets indicate that, compared with SVM, KCS-SVM raises the classification accuracy on the positive-class samples and lowers their misclassification cost, while its classification accuracy on the negative-class samples is somewhat lower. Because the cost saved on the positive class outweighs the cost incurred on the negative class, the average misclassification cost is reduced. The experiments therefore show that the KCS-SVM method effectively reduces the misclassification cost of the samples.

CONCLUSION
Based on Bayesian decision theory and the inspiration of risk minimization, this study decomposes the training set and presents a new cost-sensitive support vector machine based on data-set decomposition. First, the sample set is broken down into several subsets according to certain decomposition rules, and a support vector machine that can output posterior probabilities is trained on each subset, using the sigmoid mapping proposed by Platt to convert the SVM output into a posterior probability. Here n denotes the number of positive-class samples in the training set and m − n the number of negative-class samples.

The proportion of positive- and negative-class samples in each training subset can be controlled by adjusting the decomposition. For each sample x_i in the training set, the posterior probabilities obtained from the sub-classifiers are combined, through meta-learning, with the misclassification costs of the training samples to obtain the "true" class label of each sample. After this is completed for every training sample, the data set is reconstructed; the new data set is the one with the misclassification costs integrated. A cost-sensitive support vector machine is then trained on the new sample set, producing a new classifier that minimizes the misclassification cost. Simulation results show that the new cost-sensitive support vector machine effectively reduces the misclassification cost and achieves good results.

Fig. 1: Classification accuracy of the positive-class samples on the hepatitis data set
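Platt's sigmoid mapping mentioned in the conclusion can be sketched as follows. In practice the parameters A and B are fitted by maximum likelihood on held-out decision values; here they are taken as given, for illustration only:

```python
import numpy as np

def platt_posterior(decision_values, A, B):
    """Platt's sigmoid mapping of SVM decision values f(x) to posterior
    probabilities: P(y = +1 | x) = 1 / (1 + exp(A * f(x) + B)).

    With A < 0, larger decision values give posteriors closer to 1.
    """
    f = np.asarray(decision_values, dtype=float)
    return 1.0 / (1.0 + np.exp(A * f + B))
```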

Table 1: Positive and negative sample sizes of the data sets

Data set | Total samples | Positive samples | Negative samples | Imbalance rate

Table 3: Classification accuracy (%) of the two classes of samples on the hepatitis data set