2013 (Vol. 5, Issue: 07)
Article Information:

A Grammatical Evolution Approach for Content Extraction of Electronic Commerce Website

Wei Qing-jin and Peng Jian-sheng
Corresponding Author:  Wei Qing-jin 

Key words:  DOM, grammatical evolution, web content extraction, XPath
Vol. 5 (07): 2426-2432
Submitted: July 26, 2012
Accepted: September 12, 2012
Published: March 11, 2013

Web content extraction, the problem of identifying and extracting information of interest from Web pages, plays an important role in integrating data from different sources for advanced information-based services. This paper investigates an approach, based on Grammatical Evolution (GE), for extracting electronic commerce information from Web pages without any given template. Although much prior research has used XPath to extract the content of Web pages, the complexity of the XPath grammar makes it too difficult for evolutionary tools to process automatically. Hence, a reduced language integrating XPath and DOM techniques is defined in BNF grammar form and used by the GE to generate parsing solutions. Moreover, a fitness evaluation method is proposed, based on the fuzzy membership of the two parts of the chromosome. Finally, empirical results on several real Web pages show that the proposed technique can segment data records and extract data from them accurately, automatically and flexibly.
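To illustrate the idea of evolving extraction paths over a reduced grammar, the following is a minimal sketch of the standard GE genotype-to-phenotype mapping. It is not the grammar or fitness function from the paper: the toy BNF grammar, the names `GRAMMAR`, `decode`, `extract` and `PAGE`, and the sample page are all assumptions made for illustration. Each integer codon selects a production for the leftmost nonterminal by taking the codon modulo the number of available productions.

```python
import xml.etree.ElementTree as ET

# Hypothetical reduced grammar (an assumption, not the paper's BNF):
# each nonterminal maps to a list of productions.
GRAMMAR = {
    "<path>": [["//", "<tag>", "<pred>"],
               ["//", "<tag>", "<pred>", "/", "<tag>"]],
    "<tag>": [["div"], ["span"], ["td"], ["li"]],
    "<pred>": [[""], ["[@class='", "<cls>", "']"]],
    "<cls>": [["price"], ["title"], ["item"]],
}


def decode(genome, start="<path>", max_wraps=2):
    """Map a list of integer codons to a path string, or None on failure."""
    symbols = [start]
    out = []
    used = 0
    # The genome may be reused (wrapped) at most max_wraps times.
    limit = len(genome) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)  # always expand the leftmost symbol
        if sym not in GRAMMAR:
            out.append(sym)  # terminal: emit as-is
            continue
        if used >= limit:
            return None  # ran out of codons: invalid individual
        choices = GRAMMAR[sym]
        # Simplification: one codon is consumed for every nonterminal.
        production = choices[genome[used % len(genome)] % len(choices)]
        used += 1
        symbols = list(production) + symbols
    return "".join(out)


# Illustrative well-formed page fragment (an assumption).
PAGE = ("<html><body><div><span class='title'>Lamp</span>"
        "<span class='price'>19.90</span></div></body></html>")


def extract(genome, page=PAGE):
    """Decode a genome and return the text of the nodes it selects."""
    xpath = decode(genome)
    if xpath is None:
        return []
    root = ET.fromstring(page)
    # ElementTree supports only a limited XPath subset and needs a relative path.
    return [node.text for node in root.findall("." + xpath)]


print(decode([0, 1, 1, 0]))
print(extract([0, 1, 1, 0]))
```

In a full GE run, a population of such genomes would be varied by crossover and mutation and ranked by a fitness function; the fuzzy-membership fitness described in the abstract is not reproduced here.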
  Cite this Reference:
Wei Qing-jin and Peng Jian-sheng, 2013. A Grammatical Evolution Approach for Content Extraction of Electronic Commerce Website.  Research Journal of Applied Sciences, Engineering and Technology, 5(07): 2426-2432.
ISSN (Online):  2040-7467
ISSN (Print):   2040-7459
Copyright © 2015. MAXWELL Scientific Publication Corp., All rights reserved