Google Extension for Ad Hoc Building / Website Cleaning
Several steps can help when developing an automatic website scraper. The first is reducing the noise on the website. The second is automatic XPath extraction, where an XPath defines the path to a specific element within the page.
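As a rough illustration of the noise-reduction step, the sketch below walks an element tree and removes nodes whose tag is on a block list. The tag list, the `PrunableNode` interface, and the `removeNoise` function are all assumptions made for this example; the rules the actual extension used are not described here. The interface is kept minimal so that a browser `Element` should satisfy it structurally.

```typescript
// Hypothetical set of tags treated as "noise"; the real criteria are not given in the text.
const NOISE_TAGS = new Set(["script", "style", "nav", "footer", "aside", "iframe"]);

// Minimal view of a DOM element needed for pruning (assumed shape).
interface PrunableNode {
  tagName: string;
  children: ArrayLike<PrunableNode>;
  remove(): void;
}

// Removes every noise element below `root` and returns how many were removed.
function removeNoise(root: PrunableNode): number {
  let removed = 0;
  // Snapshot the children first: removing from a live collection while
  // iterating over it would skip nodes.
  for (const child of Array.from(root.children)) {
    if (NOISE_TAGS.has(child.tagName.toLowerCase())) {
      child.remove(); // drops the element together with its whole subtree
      removed++;
    } else {
      removed += removeNoise(child);
    }
  }
  return removed;
}
```

In a content script this could be called as `removeNoise(document.body)`, although a production cleaner would likely need site-specific rules as well.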
I was tasked with developing a Google Extension that automatically removes the noise and calculates the XPaths for all the required fields on the website. These fields are then passed to a system that starts the automatic scraping.
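To make the XPath calculation concrete, here is a minimal sketch of one common way to derive an absolute, position-based XPath for an element. It is not necessarily the method the extension used; `ElementLike` and `computeXPath` are hypothetical names introduced for this example. The function climbs from the element to the root and, at each level, counts preceding siblings with the same tag to get the 1-based position.

```typescript
// Minimal view of a DOM element (assumed shape); a browser Element should satisfy it.
interface ElementLike {
  tagName: string;
  parentElement: ElementLike | null;
  previousElementSibling: ElementLike | null;
}

// Builds an absolute XPath such as /html[1]/body[1]/div[2] for the given element.
function computeXPath(el: ElementLike): string {
  const steps: string[] = [];
  let node: ElementLike | null = el;
  while (node !== null) {
    const tag = node.tagName.toLowerCase();
    // XPath positions are 1-based and count only siblings with the same tag.
    let index = 1;
    let sib: ElementLike | null = node.previousElementSibling;
    while (sib !== null) {
      if (sib.tagName.toLowerCase() === tag) index++;
      sib = sib.previousElementSibling;
    }
    steps.unshift(`${tag}[${index}]`);
    node = node.parentElement;
  }
  return "/" + steps.join("/");
}
```

Position-based paths like this are simple to compute but brittle when the page layout changes; paths anchored on stable attributes such as an `id` tend to survive redesigns better, at the cost of more logic.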