Automatic Structured Data Extraction from a Webpage

  • Samiah Jan Nasti, Ed Gowhar, Shoaib Mohd Nasti, Neerendra Kumar
Keywords: Data Records, Data Items, Extraction, Segmentation

Abstract

Considerable effort is being put into developing fully automatic extraction methods, and many researchers have proposed solutions for extracting structured data from the Web. In this paper, we propose a method, called Extraction of Structured Data (ESD), for automatically extracting structured data from a web page. When a structured web page is requested, the web server queries the underlying database for the information and returns it to the browser. Because the structured data records are retrieved from a database and presented on web pages using a predetermined template, and because website developers present their data so that humans can quickly identify and distinguish each data record, our method uses both structural and visual features of a web page to extract its data records.
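
The structural half of this idea can be illustrated with a minimal sketch. It is not the ESD algorithm itself, whose details the abstract does not give; it only assumes that records generated from one template appear as sibling elements with the same tag structure. The names `Node`, `TreeBuilder`, `extract_records`, and `SAMPLE_HTML` are hypothetical, and visual features (position, size, font) are not modelled here.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so we must not descend into them.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class Node:
    """A tiny DOM node; items holds child Nodes and text strings in order."""

    def __init__(self, tag, parent=None):
        self.tag, self.parent, self.items = tag, parent, []

    @property
    def children(self):
        return [i for i in self.items if isinstance(i, Node)]

    def signature(self):
        # Structural fingerprint: own tag plus the tags of direct children.
        return (self.tag, tuple(c.tag for c in self.children))

    def texts(self):
        out = []
        for i in self.items:
            out.extend(i.texts() if isinstance(i, Node) else [i])
        return out


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.items.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the matching open element; ignore stray closing tags.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            self.current.items.append(data.strip())


def walk(node):
    yield node
    for child in node.children:
        yield from walk(child)


def extract_records(html, min_repeats=2):
    """Return the text items of the largest group of same-shaped siblings."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    best, best_count = [], 0
    for node in walk(builder.root):
        # Only siblings with sub-elements can be records made of data items.
        counts = Counter(c.signature() for c in node.children if c.children)
        if not counts:
            continue
        sig, count = counts.most_common(1)[0]
        if count >= min_repeats and count > best_count:
            best_count = count
            best = [c for c in node.children if c.signature() == sig]
    return [record.texts() for record in best]


# Hypothetical template-generated page used only for illustration.
SAMPLE_HTML = """
<html><body>
<h1>Books</h1>
<div id="list">
  <div class="item"><span>Dune</span><b>$9</b></div>
  <div class="item"><span>Emma</span><b>$7</b></div>
  <div class="item"><span>Ulysses</span><b>$12</b></div>
</div>
<p>Footer <a href="#">link</a></p>
</body></html>
"""

if __name__ == "__main__":
    for record in extract_records(SAMPLE_HTML):
        print(record)
```

On the sample page, the three `div` elements under `list` share one signature and are returned as three data records, each split into its data items. Real pages are noisier than this, which is presumably why the method described above combines structural cues with visual ones rather than relying on tag structure alone.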

Published
2021-10-04
How to Cite
Neerendra Kumar, S. J. N. E. G., S. M. N. (2021). Automatic Structured Data Extraction from a Webpage. Design Engineering, 1097-1110. Retrieved from http://www.thedesignengineering.com/index.php/DE/article/view/5008
Section
Articles