With scraping data from the internet becoming more difficult, and with new technologies being developed faster than ever, I decided to build a new system (for personal use) called Prober, which is designed to overcome the difficulties of scraping.
Prober scrapes websites (if robots.txt allows it) without code or the use of XPaths. Because websites change constantly, XPaths need frequent updates, even though often nothing too drastic changes visually. Prober can predict visually where certain elements are placed on a page, so you don’t need to update any XPaths. The system is trained, via visual annotation, on what an element looks like within a webpage, so it doesn’t matter if a site’s layout changes slightly. It also does not interact with the DOM at all, which is what most scrapers rely on at the moment.
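
To make the XPath problem concrete, here is a minimal sketch of how a position-based path breaks after a small markup change. This is not Prober's code: the two page snippets, the path, and the `extract_price` helper are all made up for illustration, and it uses the limited XPath support in Python's standard library rather than a full scraping stack.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same page. The "redesign" only adds
# a wrapper <div>; the price would look identical in a browser.
PAGE_V1 = "<html><body><div><span class='price'>9.99</span></div></body></html>"
PAGE_V2 = "<html><body><div><div><span class='price'>9.99</span></div></div></body></html>"

# A position-based path of the kind scrapers often hard-code (hypothetical).
BRITTLE_PATH = "./body/div/span"


def extract_price(page_source, path):
    """Return the text of the first element matching `path`, or None."""
    root = ET.fromstring(page_source)
    node = root.find(path)
    return node.text if node is not None else None


if __name__ == "__main__":
    print(extract_price(PAGE_V1, BRITTLE_PATH))  # matches in the original markup
    print(extract_price(PAGE_V2, BRITTLE_PATH))  # no match after the wrapper is added
```

The same path finds the price in the first version and nothing in the second, even though a visitor would see no difference. That gap between what changes in the markup and what changes on screen is what a visual approach tries to sidestep.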
Prober is still in development, but I can see the benefits and I’m loving it!