A Study on Automated Semantic Analysis of Customer Satisfaction Comments: A Case Study on Service Quality of Hotels on a Chinese Tourism Website

This study aims to present a model of comment semantic vocabulary built on ontological theories to perform semantic conversion of customer comments on hotel services to ratings through the process of word matching. Ultimately, ratings given by the customers and ratings derived from the conversion of semantic analysis would be used for the analysis of customers’ satisfaction on hotel service quality. The method proposed in this study has proven that semantic analysis is capable of delivering results of substantial accuracy. Automated semantic analysis would not only allow corporate managers to boost their efficiency in data collection and processing through cloud information services but also help them better understand contents that customers really care about. By paying more attention and making more effort to improve upon shortcomings identified by customers, corporate managers would be able to raise customer satisfaction and improve the image of their businesses at the same time.


INTRODUCTION
Given the prevalence of internet access and the emergence of cloud computing services today, the rapid flow of information has become a reality. Channels such as commercial websites, blogs, chat rooms, email, forums and social media such as Facebook often serve as platforms of communication where consumers share their opinions, experience, knowledge and thoughts with one another (Bickart and Schindler, 2001). Consequently, ratings or comments from customers on specific services or products have evolved into a common format of opinion expression on the Internet (Bickart and Schindler, 2001). For corporate managers, appropriate utilization of cloud customer opinion data not only facilitates the process of service quality improvement but also renders it more efficient. However, processing massive amounts of textual information, such as qualitative comments or response feedback, calls for substantial labor and time.
The adoption of automated semantic analysis for the processing of relevant information would effectively shorten the time businesses need to assess customers' satisfaction with service quality, allowing them to provide better and faster services. In the environment of business management, service quality not only plays a vital role but also serves as a crucial element of successful corporate management (Majda and Slavka, 2012). Reichheld and Sesser (1990) pointed out that a company only has to lower its customer churn rate by 5% to boost its profit by 25 to 85%. Therefore, the key to successful business management lies in the improvement and elevation of service quality to retain customers.
When it comes to the acquisition of customer feedback, comments, ratings and responses posted by consumers online are exempt from temporal and spatial constraints and offer a higher degree of privacy. However, although they are an important means of opinion feedback (Tanimoto and Fuji, 2003), the collected opinions might suffer from inadequate objectivity if the number of feedback entries is insufficient. Scholars tend to use quantified ratings as the basis of their research and analyses. But as far as corporate managers are concerned, quantified data does not give them a sufficient understanding of what customers really care about or which aspects of their services require improvement. Qualitative data, such as the contents of customer comments, is required to provide concrete directions for improving corporate service quality. However, processing massive amounts of qualitative data (such as customer comments) would not only consume substantial human resources but also significantly increase costs.
In the domain of management science, the questionnaire survey is a commonly used method of data collection (Sudman and Bradburn, 1982). But despite the obvious advantages of statistical analysis for surveys and the prevalence of Internet access, Parker (1992) felt that the responses gathered from online questionnaires might be subject to discrepancies due to personal factors of individual customers. Questionnaires also suffer from the following drawbacks:
• Discrepancies in word use, word order and question design could easily lead to misunderstanding or dishonest answers from customers. This would affect the reliability and validity of research results (Schuman and Presser, 1981).
• The process of data collection requires a relatively long period of time to ensure sufficient collection of data.
• As the questions on the questionnaire are broken into parts, there would be no analysis or discussion of qualitative customer opinion in the survey.
Fig. 1: Customer comments and ratings on a Chinese tourism website
In light of these shortcomings, this research uses data available online: comments written by customers (Fig. 1) are processed through semantic analysis and converted into ratings for the dimensions of service quality. Semantic data of customer comments, including vocabulary relating to the dimensions of service quality, tone vocabulary and emotion expressing vocabulary, can be organized in an ontological structure to establish a semantic knowledge base (Hui and Yu, 2005; Gillam et al., 2005) with definitions for vocabulary concept relationships and logical rules for semantic deduction. In their study, Berners-Lee and Fischetti (1999) adopted ontological theories to convey the relationship of vocabulary definitions with specific domains and presented a Semantic Web for users to query for data through semantic search. The structure of semantic vocabulary relationships can serve as the basis of ontology in this regard. Such a structure would not only render the semantic descriptions of vocabulary clearer but also allow expansion of knowledge presentation in the semantic web, and it is even applicable to the description of knowledge in different domains (Asuncion and Oscar, 2002).
In addition, the concept of the Delphi Method (Murry and Hammons, 1995) has been adopted in this study to enlist the assistance of experts in the appraisal of commonly used vocabulary for the dimensions of service quality, tone and emotion expression. The resulting vocabulary is organized in an ontological structure that serves as the database of comment semantic vocabulary. Finally, Tree bank (Wang and Jin, 2008) technology was utilized to break the sentences of customer comments into words for matching. The matched words would then be converted into ratings using the semantic conversion table presented in this study.
In an effort to validate the framework presented in this study, hotel comments provided by customers on a Chinese tourism website have been chosen as the basis for the research, since the website offers both the customer comments and the ratings required for this study (Fig. 1). Analyses of satisfaction and service quality satisfaction correlation were performed on the ratings converted from comments and on customers' original ratings to validate the precision of the automated comment semantic analysis presented in this study. More specifically, this study has been designed to accomplish the following objectives:
• To utilize customers' comments on hotels as the data source for semantic analysis
• To accurately analyze customers' satisfaction with service quality based on semantic conversion of customer comments
• To help corporate managers attain a concrete grasp of what customers really care about through semantic analysis of customer comments
• To expand the scope of application for the method of converting qualitative comment data into quantified ratings so that it could be used in communities such as Facebook and blogs, and even in different domains such as teaching evaluation, news comments and so forth

LITERATURE REVIEW
Ontology: Early applications of ontology mainly focused on the domain of artificial intelligence. With the development of Internet technology, especially the maturation of XML technologies, ontology has been adopted for fundamental applications that extend to the presentation of semantic web knowledge (Asuncion and Oscar, 2002). In order to analyze the characteristics of patients' medical history, Jiang et al. (2003) combined ontology and natural language processing to achieve effective categorization of massive amounts of textual data. Knublauch et al. (2004) built a special semantic structure using ontological concepts to interconnect resources and create an open query structure that effectively overcomes specific issues arising from keyword-based information search. Hui and Yu (2005) and Gillam et al. (2005) noted that the construction of an ontological knowledge base requires prudent and meticulous implementation, since the quality of the knowledge base directly determines the performance of the knowledge system in question. A semantic knowledge base adopts ontology as its underlying structure to define vocabulary concept relationships and logical rules for semantic deduction.
An ontological structure would allow clearer semantic description of words and phrases. In other words, a structure of semantic vocabulary relationships could function as the ontological basis for the purpose of this study. Ontological comparison could therefore be perceived as a means of resolving semantic heterogeneity (Choi et al., 2006) and be used to identify consistent semantics across relevant ontologies.
That is, given a heterogeneous system, one could first establish ontological structures for different domains and then establish an environment of high interoperability through the comparison of the different ontologies (Su and Gullah, 2005; Islam and Piasecki, 2008; Castano et al., 2008).
Related studies on semantic analysis: Since it is impossible to achieve comprehensive and precise interpretation of word usage and human emotions, analyses of Chinese words and phrases by computers are usually hindered by undeterminable phrases, words and phrases not found in the word bank, or misinterpretation. The Tsinghua Chinese Treebank (TCT) developed by Tsinghua University (Wang and Jin, 2008) makes use of word and phrase analysis to interpret the meaning of Chinese sentences, thus rendering the expression and application of knowledge possible. Pang and Lee (2004) utilized graphic techniques in their analysis of comment articles, whereas Dave et al. (2003) categorized portions of emotional description in articles into two groups (positive and negative). Pang and Lee (2004) defined five grades for semantic categorization and Hagedorn et al. (2007) presented five levels for the definition of semantic intensity. Little and Rubin (2002) pointed out that people tend to use specific words and phrases to express their emotions in their speech and conversations. Zagibalov and Carroll (2008) configured "good" as the basic word and combined it with affirmative words; to create negative basic vocabulary, a negative particle would be added before positive basic vocabulary. Wang and Jin (2008) used Chinese Tree Bank technology to break sentences into words and phrases in order to understand the content of an article. Ku and Chen (2007) compiled lists of positive and negative vocabulary in their study. Since words represent the most basic unit of semantic expression, the task of word string matching is crucial for any computer built with language processing capabilities. Typical documents can be perceived as unstructured or semi-structured data. Therefore, prior to document matching, important information must be extracted and converted into expressions of the vector space model (Salton and McGill, 1983). Berners-Lee and Fischetti (1999) proposed a semantic web built on online information structure that expresses the relationship between the vocabulary of specific domains and their definitions through ontology, so that users can search for data through semantics. However, the categorization of semantics takes on different means of expression in different domains and would have lower accuracy (Engstrom, 2007). Swoogle (http://swoogle.umbc.edu/) is the main database for the semantic web; it is designed to collect definitions from individual users and to be accessible to them.

DISCUSSION ON SERVICE QUALITY
Service quality: Service quality is now universally perceived as one of the key factors for corporate success, and the level of service quality is known to have a direct effect on customer satisfaction. Most businesses have adopted service quality as a key indicator for the improvement of competitiveness (Lin, 2010). Levitt (1972) was the first author to link quality and service together and promote the idea in the service sector, with the belief that service quality represents the standard that customers set for the outcome of services they anticipate (Bitner et al., 1994; Churchill and Surprenant, 1982; Crosby, 1979; Gronroos, 1983). Parasuraman et al. (1988) broke down service quality into five dimensions (Parasuraman et al., 1993): tangibles, reliability, responsiveness, assurance and empathy, which serve as components of the Service Quality framework (SERVQUAL), otherwise known as the PZB model. Recognized for its outstanding reliability and validity, SERVQUAL is not only the most representative service quality model but also the most commonly used one, and it can be applied to different service industries.

Dimensions of service quality:
The main concept that the PZB model aims to convey is that customers are the sole determiner of service quality. Since most academic research on this subject also separates service quality into five dimensions, the SERVQUAL items (Table 1) presented by Parasuraman et al. (1993) have been chosen as the basis of the hotel service quality and customer satisfaction analysis.

Customer satisfaction:
The term "customer satisfaction" was brought up for the very first time by Cardozo (1965), who also introduced the idea of marketing scope. Cardozo believes that the degree of satisfaction customers derive from products provided by vendors would affect their inclination to make repeated purchases. Customer satisfaction is a common goal for many businesses and stands as one of the most important indicators for corporate operation management. Customer satisfaction involves multiple dimensions and can only be measured through multiple items of importance and satisfaction for a specific product or service (Woodside et al., 1989; Oliver, 1980). In recent years, businesses are not the only ones that value customer satisfaction; the service industry has grown to place special emphasis on the significance of customer satisfaction.


RESEARCH METHODOLOGY
Research process: The research flow chart is illustrated in Fig. 2. Comments and ratings posted on a Chinese tourism website by customers are utilized to conduct the analysis.
1. Tree bank (Wang and Jin, 2008) technology would be deployed to break sentences down into words, which would then be matched against dimensional vocabulary, tone vocabulary and emotion expressing vocabulary for the analysis. Finally, the results would be converted into service quality dimension ratings on the 7-point Likert scale using the semantic conversion table presented in this study.
2. The module of quantitative analysis: Directly acquire the hotel rating data posted by tourists on a Chinese tourism website: hygiene, service, facilities and environment. Based on the items of the service quality dimensions listed in Table 1, these items could be replaced by the corresponding dimensions of service quality (reliability, empathy, tangibles and assurance). Since the data source is limited by the nature of the information provided by customers, the items of service quality analysis would only cover four dimensions. Ratings for each dimension would be obtained from the conversion of the corresponding data items into their respective service quality dimensions.
3. Since only four dimensions of service quality would be covered for the conversion of comments on hotel service quality into ratings, the analysis of comment semantics would also cover the vocabulary of just four dimensions and the relevant vocabulary of tone and emotion expression. For the sections covered in (1) and (2), customer satisfaction correlation analyses have been performed to validate the precision of the comment semantic analysis presented in this study.

Comment analysis:
Frequently used vocabulary in the dimensions of service quality: According to the five dimensions presented by Parasuraman et al. (1993) for their SERVQUAL framework, the dimensions of hotel service quality would include responsiveness, reliability, tangibles, empathy and assurance. Due to the lack of ratings on responsiveness on the tourism website in question, only four dimensions of service quality will be examined for the purpose of this study.
In order to ensure consistency in research prudence, all commonly used expressions and words for each dimension have been constructed in the framework of semantic ontology based on the concepts of the Delphi method (Murry and Hammons, 1995). The Delphi method is a structured communication technique involving a panel of experts to achieve convergence of predictions. The aim of this technique is to attain basic consensus among experts to derive a uniform opinion on a given subject. The technique offers the advantage of brainstorming without hindering the experts from making independent judgments. Thus, a panel of hotel management experts was enlisted to evaluate a collection of frequently used vocabulary in the dimensions of service quality, colloquial expressions and frequently used descriptions of satisfaction in the comments collected for this study. In the event of conflicting opinions, the panel's opinions would be summarized to identify the cause of the split, and the experts would be asked to make judgments again until a consensus was reached on the frequently used dimension words for the ontological construction.
It is worth pointing out that the frequently used vocabulary also includes commonly used words that carry negative meanings. For example, words such as 「價位」 (price), 「聲音」 (sound) and 「費用」 (fee) all represent negative meanings. If a comment contains descriptive phrases like 「價位」 (price) 「高」 (high), 「聲音」 (sound) 「大」 (loud) or 「費用」 (fee) 「高」 (expensive), one should easily identify the feeling of dissatisfaction experienced by the customer. The frequently used vocabulary established by the experts can be further separated into positive vocabulary and negative vocabulary.
Generally speaking, each comment sentence should contain a positive or negative dimension word based on the rules of common usage. This linguistic characteristic has been used in this study to match the dimension vocabulary in customer comments against the vocabulary of the semantic ontology in order to determine the items of service quality that customers emphasize.

Frequently used vocabulary for tone and emotion expression:
Based on the common expressions taken from customer comments, customers use words and phrases such as 「很到位」 (spot on), 「很遠」 (far from) and 「不差」 (not bad) to describe their preferences, or the lack thereof, for the quality of hotel service they have experienced. From the analysis of sentence structure, a recurring model of phrase combination becomes apparent: tone vocabulary followed by positive (or negative) emotion expression vocabulary. Take Fig. 1 as an example: the tone word 「很」 (very) reveals the extent of tone emphasis and 「恐怖」 (terrible), being a negative word of emotional expression, makes the phrase a combination of a tone emphasis word and a negative emotional expression word. The possible combinations of words can be summarized as:
In this research, tone vocabulary and emotion expressive vocabulary frequently used by the general public have been evaluated and selected by a panel of linguistic experts using the Delphi Method for the construction of an ontological structure. Complemented with the collection of vocabulary in the dimensions of service quality, the vocabulary forms the structure of semantic ontology (Fig. 3). In this structure, both PZB dimension vocabulary and emotion expressive vocabulary comprise positive and negative vocabulary.
Conversion of comments into ratings: For the process of comment-to-rating conversion, a panel of experts was asked to reach consensus on the vocabulary through the Delphi method. The commonly used service quality dimension items, tone vocabulary and emotional expression vocabulary were then constructed in an ontological structure, leading to the creation of a semantic ontological module for the comments. Comments from customers were processed by means of Tree bank technology to break down the sentence structure and extract the related words, which were matched against the vocabulary in the semantic ontological module. Finally, using the semantic conversion table presented in this study, the results of semantic matching were converted to their corresponding 7-point Likert scale points and compiled into individual ratings for the corresponding dimensions to derive customers' satisfaction ratings on a hotel's service quality. The process of data construction and conversion is covered as follows:
• Data definition: Based on the dimensions of SERVQUAL presented by Parasuraman et al. (1993), the dimensions of hotel service quality chosen for this study include reliability, tangibles, empathy and assurance. The first step involves the definition of Comment Semantics ($C\_S$) for the service quality dimensions (PZB). Let $C\_S$ be the combination of service quality dimension vocabulary ($PZB_i$), tone vocabulary ($P_k$) and emotional expression vocabulary ($Q_n$), where $i$ represents the dimension item ($i = 1, 2, 3, 4$), corresponding to reliability ($PZB_1$), tangibles ($PZB_2$), empathy ($PZB_3$) and assurance ($PZB_4$); $k$ represents the category of tone vocabulary, including emphasis ($P_1$) and strong emphasis ($P_2$) vocabulary; and $n$ represents the category of emotional expression, including positive ($Q_1$) and negative ($Q_2$) vocabulary.
Since the vocabulary of service quality dimensions comes in both positive and negative forms, the dimension vocabulary $PZB_i$ encompasses all vocabulary represented by $PZB_{ij}$, where $j$ represents the category (positive or negative) of the dimension vocabulary. Thus let $PZB_i = \{PZB_{ij},\ j = 1, 2\}$, where $PZB_{i1}$ represents the positive vocabulary for dimension $i$ and $PZB_{i2}$ represents the negative vocabulary for dimension $i$. And thus, $C\_S = \{PZB_{ij} \cup P_k \cup Q_n;\ i = 1, 2, 3, 4;\ j, k, n = 1, 2\}$. The relationships between the vocabulary of $PZB_{ij}$, $P_k$ and $Q_n$ are represented in Table 2. Finally, the service quality dimension rating is defined as $PZB_i\_CS$, where $i$ represents the dimension of service quality. Since a 7-point Likert scale has been chosen in this research for the conversion of comment semantics to ratings, the range of the rating is 1 to 7 points.
• Data construction: The frequently used vocabulary of SERVQUAL dimensions, tone and emotional expression compiled by the panel of experts is constructed in the collection $C\_S = \{PZB_{ij} \cup P_k \cup Q_n\}$, where $i = 1, 2, 3, 4$ and $j, k, n = 1, 2$. The following example on the dimension of "assurance" ($PZB_4$) will be used to illustrate the process of data construction in the comment semantic ontological method:
o The construction of dimension item vocabulary would come first: $PZB_{41}$ represents the collection of positive dimension item vocabulary. $PZB_{41}$ = {交通
• Construction of semantic ontological module: The construction of the three major components of comment semantic vocabulary (frequently used vocabulary on service quality, tone and emotion expression) is achieved through the ontological structure designed for the purpose of this study, as shown in Fig. 4.
$PZB_{42}$ serves as the collection of negative vocabulary (such as 「價格」 (pricing), 「價位」 (price), 「費用」 (fees) and 「房費」 (room charge)), all carrying negative connotations, since a high "price" would be dissatisfactory for customers. When paired with a positive emotion expressive word, such a word would denote a negative emotion, as in 「價位」 (price) 「高」 (high). However, when paired with a negative emotion expressive word, it would denote a positive emotion, as in 「價位」 (price) 「低」 (low).
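The polarity rule just described can be illustrated with a minimal Python sketch. The dictionary layout and the function name `felt_emotion` are assumptions made for illustration only, not the authors' implementation; the sample words are the ones quoted in the text, and the positive set for assurance is truncated in the source, so only 「交通」 appears.

```python
# Hypothetical in-memory layout of part of the comment semantic vocabulary.
# Keys are (i, j): i = dimension (4 = assurance), j = 1 positive / 2 negative.
PZB = {
    (4, 1): {"交通"},                        # positive dimension words (list truncated in the source)
    (4, 2): {"價格", "價位", "費用", "房費"},    # negative dimension words
}
# Emotion-expressive vocabulary Q_n: n = 1 positive, n = 2 negative.
Q = {
    1: {"高"},  # "high", treated as a positive emotion word in the text's example
    2: {"低"},  # "low", treated as a negative emotion word in the text's example
}


def felt_emotion(dimension_word, emotion_word):
    """Return +1 for a positive feeling and -1 for a negative feeling."""
    j = next(j for (_, j), words in PZB.items() if dimension_word in words)
    n = next(n for n, words in Q.items() if emotion_word in words)
    dimension_sign = 1 if j == 1 else -1
    emotion_sign = 1 if n == 1 else -1
    # A negative dimension word flips the polarity of the emotion word.
    return dimension_sign * emotion_sign


print(felt_emotion("價位", "高"))  # price + high -> -1 (dissatisfied)
print(felt_emotion("價位", "低"))  # price + low  -> 1 (satisfied)
```

Representing polarity as a sign keeps the rule to a single multiplication, which matches the later use of $z$ as a running product in the matching steps.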
Sentence structure analysis: As for the sentence types and formats of comments, the data have been grouped into four categories: Although the arrangement of words in the sentences is not consistent, the meaning obtained from the combination of words should adequately and accurately convey the feeling that customers wish to express through their comments. In light of the varying location of words in the collection of comment sentences, words would be matched in a one-by-one manner to prevent the rate of successful matching from falling. Finally, the words taken from comment sentences would be matched against the sentences from the result using the first sentence format seen in the frequently used sentence type.

Conversion of customer comment matching vocabulary and rating:
In this research, the dimensions of hotel service quality are broken down into reliability, tangibles, empathy and assurance. Each dimension has its specific commonly used positive (and negative) vocabulary, complete with frequently used combinations of tone and emotion expressive vocabulary that render the analysis of comment sentences possible. After matching the words of customer comments against the vocabulary derived from the semantic ontological structure, it becomes possible to analyze the semantics of customer comments. Coupled with the comment semantic conversion table (Table 3), one can proceed to work out customer satisfaction ratings on service quality. The 7-point Likert scale (7-very satisfied; 6-satisfied; 5-somewhat satisfied; 4-barely satisfied; 3-somewhat unsatisfied; 2-not satisfied and 1-very unsatisfied) is chosen to represent the degree of customers' satisfaction with hotels' service quality.
The first step of the process involves the use of Tree bank technology to break down the structure of customer comment sentences and segment each sentence into words, which would be matched with vocabulary from the semantic ontological structure. Customers' degree of satisfaction would be determined through their use of tone and emotion expressive vocabulary until all sentences have been matched. Finally, the number of occurrences of each service dimension item would be used to compute the mean of ratings to complete the process of comment semantic conversion to ratings.
As for a service quality dimension that has been omitted in the contents of customer comments, such information would be collectively acknowledged as "missing data" in this study. Traditionally, deletion is the standard approach when dealing with missing data, where missing entries would simply be deleted (Little and Rubin, 2002; Musil et al., 2002; Scheffer, 2002). However, deletion of data could easily lead to significant loss of data samples, insufficient samples and so forth. Scheffer (2002), Acock (2005) and Schafer and Graham (2002) proposed that the average value can be used to replace the missing data. Such an approach is not only simple but also easy to understand and deploy. In other words, if any service quality item has no description after comparison (resulting in a missing rating), the service quality item would be interpreted as "barely satisfied", which is rated 4 points on the scale. The following section will illustrate the steps involved in the matching and conversion of words from comment sentences into their corresponding ratings: First, import the data of customer comments and determine the number of sentences ($d$) contained in a given comment by examining the punctuation (",", ";" or "."). The comment would then be broken into $d$ sentences. This step can be represented as comment $= \{S_1, S_2, S_3, \ldots, S_d\}$, alternatively: comment $= \sum_{a=1}^{d} S_a$.
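The sentence-splitting step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `split_sentences` is hypothetical, and the full-width Chinese punctuation marks are added on the assumption that comments on the website use them alongside the ASCII marks named in the text.

```python
import re

# Delimiters named in the text (",", ";", "."), plus their full-width
# Chinese counterparts (an assumption about the website's comments).
_DELIMITERS = r"[,;.，；。]"


def split_sentences(comment):
    """Break a comment into its d sentences S_1..S_d on punctuation."""
    parts = re.split(_DELIMITERS, comment)
    # Drop empty fragments produced by trailing or repeated punctuation.
    return [part.strip() for part in parts if part.strip()]


sentences = split_sentences("設施很好，價位高。")
d = len(sentences)
print(d, sentences)
```

Each element of the returned list plays the role of one $S_a$; segmenting a sentence into words is a separate step that the study performs with Tree bank technology and is not reproduced here.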
Next, utilizing Tree bank technology, each sentence will be decomposed into words, or $S_a = \{t_1, t_2, t_3, \ldots, t_e\}$, where $e$ represents the number of words obtained from the sentence. This can be otherwise represented as $S_a = \sum_{f=1}^{e} t_f$. The combination of positive and negative emotional expression vocabulary must be accurately determined, and there are four possible combinations in total:
• Positive word and positive word
• Positive word and negative word
• Negative word and positive word
• Negative word and negative word
These four combinations correspond to a feeling of positive emotion, negative emotion, negative emotion and positive emotion respectively. Let $z = 1$ represent a positive emotional expression, which is also the initial value of $z$, and let $s = 0$ be the initial count of emotion expressive words. Should a customer mention vocabulary of two service quality dimensions in the same sentence (e.g., 「設施」 (facility) and 「環境」 (environment)), assume both share the same emotional expression and tone expression. Let $h$ represent the number of dimension words found in sentence $S_a$, with $h = 0$ initially. In addition, the arrays in which successfully matched words are stored must be configured as: dimension: $B$; tone: $C$; emotion expression: $D$; and rating: $E$. $G_i$ is the counter of occurrences of service quality dimension words and should be set to 0 by default. The matching would be performed as the final part of the process.
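The four combinations listed above behave like a product of signs, which is how $z$ is updated in the matching steps. A minimal Python sketch, where the function name `combined_polarity` and the constants `POS` and `NEG` are illustrative assumptions:

```python
POS, NEG = 1, -1


def combined_polarity(word_signs):
    """Multiply the signs of the emotion words found in one sentence."""
    z = 1  # initial value: positive emotional expression
    for sign in word_signs:
        z *= sign
    return z


# Enumerate the four combinations described in the text.
for pair in [(POS, POS), (POS, NEG), (NEG, POS), (NEG, NEG)]:
    feeling = "positive" if combined_polarity(pair) == 1 else "negative"
    print(pair, "->", feeling)
```

The last case is what would let an expression such as 「不差」 (not bad) come out positive, assuming its two parts are both listed as negative words in the vocabulary.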
Step 1: Determine the presence of service quality dimension word by matching word t f of sentence S a with PZB ij (i = 1~4, j = 1~2).
Step 1.1:If the match were successful, set h = h + 1, which represents the occurrence of service quality dimensions in word S a thus far.Store word t f into the slot of B ah in the dimension vocabulary array B and i into the slot of B' ah in the temporary array of B'.
If j = 1, the word would be a positive one If j = 2, the word would be a negative one Let G i = G i + 1 to increase the count of dimension i by 1 Let f = f + 1 and return to step 1 to continue with the comparison until f = e.
Step 1.2: If the match is unsuccessful, determine the presence of a tone word by matching t_f against P_k.
Step 1.2.1: If the match is successful: if k = 1, the word is a tone-emphasizing word, so let y = 2; if k = 2, the word is a strongly emphasizing word, so let y = 3.
Store t_f in slot C_a of the tone vocabulary array C. Let f = f + 1 and return to Step 1 to continue with the comparison until f = e.
Step 1.2.2: If the match is unsuccessful, determine the presence of an emotion expressive word by matching t_f against Q_n.
Step 1.2.2.1: If the match is successful, set s = s + 1 to represent the s-th emotion expressive word in sentence S_a.
If n = 1, the word is a positive emotional expression; let z = z*1.
If n = 2, the word is a negative emotional expression; let z = z*(-1).
Store t_f in slot D_as of the emotional expression vocabulary array D. Let f = f + 1 and return to Step 1 to continue with the comparison until f = e.
Step 1.2.2.2: If the match is unsuccessful, let f = f + 1 and return to Step 1 to continue with the comparison until f = e.
Step 2: In sentence S_a, if f = e, the matching for sentence S_a has been completed.
If h = 0, sentence S_a contains no service quality dimension word, so remove the words stored in:
• slot C_a of tone vocabulary array C
• slot D_a of emotional expression vocabulary array D
since one cannot determine the nature of a customer's comment if the sentence contains no word relating to a service quality dimension.
Step 3: The rating value for sentence S_a is computed based on Table 3 (comment sentence semantic to rating value conversion table) and stored in slot E_a of rating array E. If h > 1, there is more than one service quality dimension word in sentence S_a, so retrieve i from slot B'_ah of array B' (from B'_a1 to B'_ah-1).
If j = 1: E_a = 4 + y*z
If j = 2: E_a = 4 + y*(-z)
PZB_i-CS = PZB_i-CS + E_a
Step 4: Let a = a + 1. Reset the following values: f = 1, h = 0, z = 1 and s = 0. Return to Step 1 and continue with the matching until a = d.
Step 5: After all sentences have been matched, the results of the automated semantic analysis of customer comments and their converted ratings are available in service quality dimension vocabulary array B, tone vocabulary array C, emotional expression vocabulary array D and rating array E. The resulting satisfaction rating for service quality dimension i is expressed as PZB_i-CS / G_i.
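Steps 1 to 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: the toy English vocabularies stand in for the Chinese vocabularies of the semantic ontology, sentences are assumed to be tokenized already, the tone weight defaults to y = 1 when no tone word occurs (the text does not state a default), and each matched dimension word is rated with its own polarity j. The names `rate_sentence`, `rate_comment` and `TONE_WEIGHT` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy vocabularies (assumption; the real ones are Chinese).
# PZB[(i, j)]: dimension i (1..4), polarity j (1 = positive, 2 = negative).
PZB = {
    (1, 1): {"clean"}, (1, 2): {"dirty"},
    (2, 1): {"friendly"}, (2, 2): {"rude"},
    (3, 1): {"modern"}, (3, 2): {"broken"},
    (4, 1): {"quiet"}, (4, 2): {"noisy"},
}
P = {1: {"quite"}, 2: {"extremely"}}  # tone words: k = 1 emphasis, k = 2 strong emphasis
Q = {1: {"really"}, 2: {"not"}}       # emotion words: n = 1 positive, n = 2 negative
TONE_WEIGHT = {1: 2, 2: 3}            # y values from Step 1.2.1


def lookup(vocab, token):
    """Return the key of the vocabulary entry containing token, or None."""
    return next((key for key, words in vocab.items() if token in words), None)


def rate_sentence(tokens):
    """Return (dimension i, rating E) pairs for one tokenized sentence."""
    matched = []  # (i, j) per dimension word, standing in for arrays B and B'
    y = 1         # assumed default tone weight (not stated in the text)
    z = 1         # emotion sign, starts positive
    for token in tokens:
        hit = lookup(PZB, token)
        if hit is not None:                # Step 1.1: dimension word
            matched.append(hit)
            continue
        k = lookup(P, token)
        if k is not None:                  # Step 1.2.1: tone word
            y = TONE_WEIGHT[k]
            continue
        n = lookup(Q, token)
        if n is not None:                  # Step 1.2.2.1: emotion word
            z *= 1 if n == 1 else -1
    ratings = []
    for i, j in matched:                   # empty when h = 0 (Step 2)
        e = 4 + y * z if j == 1 else 4 + y * (-z)   # Step 3
        ratings.append((i, e))
    return ratings


def rate_comment(sentences):
    """Average the ratings per dimension, i.e. PZB_i-CS / G_i (Step 5)."""
    totals, counts = defaultdict(int), defaultdict(int)
    for tokens in sentences:               # Step 4: state is reset per sentence
        for i, e in rate_sentence(tokens):
            totals[i] += e
            counts[i] += 1
    return {i: totals[i] / counts[i] for i in totals}


comment = [["room", "extremely", "clean"], ["staff", "not", "friendly"], ["nice", "view"]]
print(rate_comment(comment))
```

With these toy vocabularies, the first sentence rates dimension 1 at 4 + 3*1 = 7, the second rates dimension 2 at 4 + 1*(-1) = 3, and the third is discarded because it contains no dimension word.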
Rating analysis: Generally speaking, tourist websites that display customer satisfaction ratings for hotels usually use a 5-point Likert scale (5-very satisfied; 4-somewhat satisfied; 3-neutral; 2-somewhat dissatisfied; 1-very dissatisfied), so ratings fall between 1 and 5 points. Items rated by customers include hygiene, service, facilities and environment. Referring to the items in Table 1, these items are transformed into the corresponding dimensions of service quality (reliability, empathy, tangibles and assurance) before ratings are assigned. These ratings are ultimately used to validate the accuracy of the customer satisfaction analysis and of the semantic conversion of comments.
Table notes: comment ratings are the ratings converted from customer comments through the automated semantic analysis; the mean of customer satisfaction is on a 7-grade scale for comments and on a 5-grade scale for ratings.

RESULT ANALYSIS
For this research, ratings and comments provided by customers on hotels listed on a Chinese tourist website from January 2011 through July 2012 were collected as the data (Fig. 1) for the analyses. Customers rated items including hygiene, service, facilities and environment. Referring to the framework of the PZB model (Table 1), these items correspond to the dimensions of reliability, empathy, tangibles and assurance respectively. Since the ratings available on the website covered only these four dimensions, the automated semantic analysis in this study was designed to analyze vocabulary for these four dimensions together with vocabulary of tone and emotional expression. Because the data for semantic analysis come from comments posted by customers (rather than formally written articles in a specific format), entries that contained no comments, repetitive comments or bilingual (English and Chinese) comments were filtered out at the initial stage of data collection. A total of 1,235 valid samples were collected. In addition to the conversion of comments into ratings through the automated semantic analysis, the ratings provided by the customers were used for the following purposes:
• Comparison of ratings
• Comparison of customer satisfaction
• Comparison of the correlation between service quality and customer satisfaction
These analyses were chosen as the means of validating the accuracy of the semantic analysis presented in this study.
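The initial filtering of entries described above might look like the following sketch. The text does not say how repetitive or bilingual comments were detected, so the sketch assumes that a repetitive comment is an exact duplicate of an earlier one (the first copy is kept) and that a bilingual comment is one containing both CJK characters and Latin letters; `filter_entries` is a hypothetical name.

```python
import re

CJK = re.compile(r"[\u4e00-\u9fff]")   # common Chinese characters
LATIN = re.compile(r"[A-Za-z]")        # English letters


def filter_entries(comments):
    """Drop empty, duplicate and mixed Chinese/English comments (assumed rules)."""
    seen = set()
    kept = []
    for text in comments:
        text = text.strip()
        if not text:
            continue                   # entry without a comment
        if text in seen:
            continue                   # repeated comment (exact duplicate assumed)
        seen.add(text)
        if CJK.search(text) and LATIN.search(text):
            continue                   # bilingual comment
        kept.append(text)
    return kept


print(filter_entries(["房間很乾淨", "", "房間很乾淨", "服務 good", "   "]))
```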

Comment and rating analysis:
In this study, customers' comments (qualitative data) were converted into quantitative ratings by means of the automated semantic analysis presented herein. Since a 7-point Likert scale was chosen to represent the level of customer satisfaction in the semantic analysis, the resulting ratings ranged from 1 to 7 points. The tourist website, however, used a 5-point Likert scale, so its ratings ranged from 1 to 5 points. To validate the accuracy of the semantic conversion of comments to ratings, the results of the conversion were compared against the ratings provided by customers: both the comment and the rating from the same customer represent that customer's satisfaction with service quality, so the two ought to be consistent.
From Table 4, one can see some discrepancy between the average of the ratings from semantic conversion and the average of the ratings provided by customers; this is primarily due to the difference in the range of the scales used. It is important to note that, in the comparison of customer satisfaction, the converted ratings turned out to be consistent with the actual customer ratings. Thus, despite the different scale ranges, the two sets of ratings convey fairly similar levels of satisfaction.
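To see why means on the two scales differ numerically while still expressing a similar level of satisfaction, one can map both onto a common 0 to 1 range. The min-max rescaling below is an illustrative assumption and is not a step described in the study; `rescale` is a hypothetical helper, and the input values are scale midpoints rather than results from the data.

```python
def rescale(mean, low, high):
    """Map a mean on a low..high Likert scale onto the 0..1 range."""
    return (mean - low) / (high - low)


# The neutral midpoints of the two scales coincide after rescaling.
print(rescale(4, 1, 7))  # 7-point scale used for converted comments
print(rescale(3, 1, 5))  # 5-point scale used for website ratings
```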

Regression analysis on service quality and customer satisfaction:
In this research, the dimensions of tangibles, reliability, assurance and empathy of service quality have been configured as the independent variables with "customer satisfaction" as the dependent variable for the regression analysis of semantically converted ratings and original customer ratings.Apart from validating the accuracy of the semantic conversion, the analysis would also allow further discussion of the impact of these dimensions on customer satisfaction.Results of the analysis are presented in Table 5.
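The regression set-up can be written out explicitly. The form below assumes an ordinary multiple linear regression fitted by least squares, which the text does not state; the coefficients $\beta_0, \ldots, \beta_4$, the error term $\varepsilon$ and the observation index $n$ are notation introduced here for illustration.

```latex
\begin{align}
CS &= \beta_0 + \beta_1\,\mathit{Tangibles} + \beta_2\,\mathit{Reliability}
      + \beta_3\,\mathit{Assurance} + \beta_4\,\mathit{Empathy} + \varepsilon \\
R^2 &= 1 - \frac{\sum_{n} \bigl(CS_n - \widehat{CS}_n\bigr)^2}
               {\sum_{n} \bigl(CS_n - \overline{CS}\bigr)^2}
\end{align}
```

Here $\widehat{CS}_n$ is the fitted satisfaction for observation $n$ and $\overline{CS}$ is the sample mean, so $R^2$ is the share of the variance in customer satisfaction accounted for by the four dimensions.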

Results of analysis:
• With "customer satisfaction" as the criterion variable, the four independent variables "tangibles", "reliability", "assurance" and "empathy" were chosen to analyze their significance for customer satisfaction. From Table 5, one can see that service quality explains 90.5% of the variance in customer satisfaction ($R^2 = 0.905$) and that the overall regression is significant (the significance values of reliability, empathy, tangibles and assurance are all smaller than 0.001).
• From Table 5, for both comments and ratings, the service quality dimensions (reliability, empathy, tangibles and assurance) each have a significant, positive relationship with customer satisfaction.
Although the ratings converted from comments through semantic analysis and the original customer ratings are presented in two different ways, they turned out to be consistent in terms of satisfaction categorization and in the dimensions (reliability, tangibles and empathy) that were most significant in the regression analysis after the semantic conversion. One could say that the automated semantic analysis presented in this study offers substantial accuracy when compared against the original ratings. This research also shows that qualitative comment data can be processed into a data source for statistical analysis. With adequate understanding of the semantic analytic process and the methods of matching, one could turn comments into quantifiable ratings through the automated semantic analysis for future research and applications.

CONCLUSION
The automated semantic analysis of qualitative comment data presented in this study was designed to help businesses improve their services more efficiently. Analyzing customer comments helps businesses better understand what customers truly care about. If service providers pay special attention to these aspects of service and improve them, they should be able to raise customer satisfaction, which would in turn lift sales performance. The analysis of ratings alone, however, reveals only the extent of customers' satisfaction; it offers no further insight into which aspects of service customers expect to see improved. In this study, we have shown that automated semantic analysis is more effective than manual processing at generating ratings, and that from the content that received low ratings businesses can quickly understand the opinions and thoughts of their customers. The semantic analysis establishes three components (service quality dimension vocabulary, tone vocabulary and emotional expression vocabulary) through a semantic ontology structure. Customers' comments are then fed to Treebank technology, which structurally decomposes each sentence to extract key vocabulary, and this vocabulary is matched against the vocabulary of the semantic ontology structure. Finally, successfully matched vocabulary is converted into the corresponding ratings using a semantic conversion table. The results of the analysis showed that the ratings converted from the comments were not only fairly similar to the original customer ratings but also fell in the same range of satisfaction level. In light of this, this research has presented a model of data analysis that differs from those presented in previous studies.
In an environment driven by cloud computing technology, corporate service quality has improved at an accelerated pace, and it is easy to foresee that most corporate websites will provide customer comment information in the near future. Since the Chinese tourist website chosen for this study offers a vast amount of hotel information together with comment and rating data from customers, its comments were chosen as the source of data for the automated semantic analysis. The model of automated semantic analysis presented here may be applied in the future to social media such as Facebook, to blogs and to research in other fields. In addition, corporate managers should pay more attention to the results of semantic analysis and treat them as a crucial basis for improving service quality, so that their businesses can benefit from higher customer satisfaction.

Table 2
1m represent negative emotional expression word

Fig. 4: Diagram of the semantic ontology, with PZB 4, P n and Q k as an example

Table 3: Comment sentence semantic to rating value conversion table (evaluation value)

Table 4: Comparison of customer satisfaction between comments and ratings

Table 5: Regression analysis and comparison of service quality and customer satisfaction for comments and ratings