Prehanto, Dedy Rahman and Indriyanti, Aries Dwi and Prismana, I Gusti Lanang Eka and Permadi, Ginanjar Setyo and Prastyo, Edwin Hari Agus. Implementation of Web Scraping on News Sites Using the Supervised Learning Method. Ilkogretim Online, 20 (3). pp. 432-441.
Abstract
Indonesia has one of the highest numbers of internet users in the world, and this extends to the reach of information through online news media. News sites, however, generally display more than news: most also show advertisements and navigation elements that distract readers and reduce their reading comfort. To address this problem, this study implements web scraping techniques with a supervised learning method and analyzes the DOM tree and XPath structure of news sites. Supervised learning, a machine learning method, is the approach used in this study. Combining web scraping with supervised learning is intended to implement and optimize web scraping for gathering news information from various sites. Basic web scraping requires knowing the DOM patterns and XPath structure that serve as the data model or selector for each site. The result of the research is a web scraping application that retrieves news site content without manual copying and pasting, stores the data in a database, and displays it to the reader in the application without distracting advertisements or navigation.
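The idea of using per-site XPath selectors over the DOM tree can be illustrated with a minimal sketch. It is not the authors' implementation: the sample page, the `SITE_SELECTORS` table, and the `extract_article` function are hypothetical, the selectors are written by hand rather than produced by a supervised learning model, and the page is assumed to be well-formed so that Python's standard `xml.etree.ElementTree` (which supports only a subset of XPath) can parse it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page. Real news pages are rarely valid
# XML and would usually need a lenient HTML parser first.
SAMPLE_PAGE = """
<html>
  <body>
    <nav><a href="/">Home</a><a href="/sports">Sports</a></nav>
    <div class="ad">Buy now!</div>
    <article>
      <h1>Example headline</h1>
      <p>First paragraph of the story.</p>
      <p>Second paragraph of the story.</p>
    </article>
    <div class="ad">Another advertisement</div>
  </body>
</html>
"""

# Hypothetical per-site selectors (an assumption for illustration, not taken
# from the paper). Each site gets its own XPath expressions.
SITE_SELECTORS = {
    "example-news": {
        "title": ".//article/h1",
        "paragraphs": ".//article/p",
    },
}


def extract_article(page_source, selectors):
    """Return the title and body text matched by the given XPath selectors."""
    root = ET.fromstring(page_source.strip())
    title_node = root.find(selectors["title"])
    title = "".join(title_node.itertext()).strip() if title_node is not None else ""
    # Only nodes matched by the selector are kept, so ads and navigation
    # outside the article element are never collected.
    paragraphs = [
        "".join(node.itertext()).strip()
        for node in root.findall(selectors["paragraphs"])
    ]
    return {"title": title, "body": "\n".join(paragraphs)}


article = extract_article(SAMPLE_PAGE, SITE_SELECTORS["example-news"])
print(article["title"])
print(article["body"])
```

Because only the selected nodes are read, the advertisement and navigation text never reaches the result. The returned dictionary could then be stored in a database, as the abstract describes.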
Item Type: | Article |
---|---|
Subjects: | T Technology > T Technology (General) |
Divisions: | Faculty of Engineering, Science and Mathematics > School of Electronics and Computer Science |
Depositing User: | Mr Rizal Sakhur |
Date Deposited: | 10 Apr 2023 20:34 |
Last Modified: | 10 Apr 2023 20:34 |
URI: | http://eprints.unhasy.ac.id/id/eprint/248 |