Research Journal of Applied Sciences, Engineering and Technology


2014 (Vol. 7, Issue 5)

Article Information:
Linear Reranking Model for Chinese Pinyin-to-Character Conversion
Xinxin Li, Xuan Wang, Lin Yao and Muhammad Waqas Anwar
Corresponding Author: Xinxin Li
Submitted: January 31, 2013
Accepted: February 25, 2013
Published: February 05, 2014
Abstract:
Pinyin-to-character conversion is an important task in Chinese natural language processing. Previous work has mainly relied on n-gram language models and machine learning approaches, sometimes with additional hand-crafted or automatic rule-based post-processing. Word n-gram language models cannot solve two problems: out-of-vocabulary word recognition and long-distance grammatical constraints. In this study, we propose a linear reranking model that attempts to address these problems. Our model uses a minimum error learning method to combine different sub-models, which include word and character n-gram LMs, a part-of-speech tagging model and a dependency model. The impact of the different sub-models on conversion is examined experimentally and analyzed. Results on the Lancaster Corpus of Mandarin Chinese show that our new model outperforms the word n-gram language model.
Key words: Dependency model, minimum error learning method, part-of-speech tagging, word n-gram model
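As a rough illustration of the linear reranking idea described in the abstract, the sketch below scores each conversion candidate as a weighted sum of sub-model scores and sorts the candidates by that sum. It is a minimal sketch, not the authors' implementation: the feature names, the candidate labels, the scores and the weights are all invented for illustration, and the weights are set by hand here, whereas the abstract states that a minimum error learning method is used to combine the sub-models.

```python
from typing import Dict, List

# Hypothetical names for the four sub-models listed in the abstract.
SUB_MODELS = ["word_lm", "char_lm", "pos", "dependency"]


def linear_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the sub-model scores for a single candidate."""
    return sum(weights[name] * features[name] for name in SUB_MODELS)


def rerank(candidates: List[dict], weights: Dict[str, float]) -> List[dict]:
    """Return the candidates sorted from highest to lowest combined score."""
    return sorted(
        candidates,
        key=lambda cand: linear_score(cand["features"], weights),
        reverse=True,
    )


# Invented log-scores for two placeholder candidates (assumption, not real data).
CANDIDATES = [
    {"text": "candidate_a",
     "features": {"word_lm": -10.0, "char_lm": -12.0, "pos": -8.0, "dependency": -9.0}},
    {"text": "candidate_b",
     "features": {"word_lm": -10.5, "char_lm": -11.0, "pos": -6.0, "dependency": -7.0}},
]

# Hand-set weights (assumption); a tuned system would learn these from data.
WEIGHTS = {"word_lm": 1.0, "char_lm": 0.5, "pos": 0.5, "dependency": 0.5}

if __name__ == "__main__":
    for cand in rerank(CANDIDATES, WEIGHTS):
        print(cand["text"], linear_score(cand["features"], WEIGHTS))
```

In this toy input, the word n-gram score alone prefers candidate_a, while the combined score prefers candidate_b, which is the kind of reordering a reranker is meant to allow.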

Cite this Reference: Xinxin Li, Xuan Wang, Lin Yao and Muhammad Waqas Anwar, 2014. Linear Reranking Model for Chinese Pinyin-to-Character Conversion. Research Journal of Applied Sciences, Engineering and Technology, 7(5): 975-980.

ISSN (Online): 2040-7467
ISSN (Print): 2040-7459

Copyright © 2024. MAXWELL Scientific Publication Corp., All rights reserved