Scraping Web Application and Bot using Machine Learning and Natural Language Processing

This is an ongoing contract in which I am developing a bespoke scraping infrastructure and user interface for managing a bot that scrapes data from across the internet for analysis.

The system is built with modern frameworks and technologies: Bootstrap, React, TensorFlow and spaCy. When a scrape is started, the system initialises and executes a job for the bot. The bot then scrapes the data it needs and carries out related tasks, such as preparing the data for storage in the system.

While it's scraping, any problems that occur, such as captchas, are dealt with automatically by the bot. It also logs these events and sends notifications for debugging purposes.
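To give a feel for that flow, here is a minimal sketch of a job loop that catches a captcha, logs the event, sends a notification and retries. It is an illustration only, not the production code: `ScrapeJob`, `CaptchaEncountered`, `run_job` and the `fetch`, `solve_captcha` and `notify` callables are all hypothetical names assumed for this example.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable, Dict, List

logger = logging.getLogger("scraper.bot")


class CaptchaEncountered(Exception):
    """Hypothetical error a fetcher raises when a page presents a captcha."""


@dataclass
class ScrapeJob:
    """Hypothetical job record: what to scrape, what came back, what went wrong."""
    urls: List[str]
    results: Dict[str, str] = field(default_factory=dict)
    events: List[str] = field(default_factory=list)


def run_job(
    job: ScrapeJob,
    fetch: Callable[[str], str],
    solve_captcha: Callable[[str], None],
    notify: Callable[[str], None],
    max_attempts: int = 2,
) -> ScrapeJob:
    """Fetch every URL in the job, handling captchas without stopping the job."""
    for url in job.urls:
        for attempt in range(1, max_attempts + 1):
            try:
                job.results[url] = fetch(url)
                break  # this URL is done, move on to the next one
            except CaptchaEncountered:
                message = f"captcha on {url} (attempt {attempt})"
                logger.warning(message)     # log the event
                job.events.append(message)  # keep it on the job for debugging
                notify(message)             # e.g. email or chat alert (assumed)
                solve_captcha(url)          # assumed to clear the captcha
    return job
```

Passing the fetcher, solver and notifier in as callables keeps the loop easy to test with fakes, and leaves the choice of captcha-handling and notification services open.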

After the data has been scraped, the bot runs a number of machine learning algorithms, such as classification and image recognition, and identifies the objects required for analysis. The models improve with every iteration. Learning is supervised by controlling the data that the models learn from.
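The idea of supervising the machine by controlling its training data can be sketched in a few lines. This is a deliberately simple stand-in, a word-count classifier in plain Python rather than the TensorFlow and spaCy models used in the system; `Example`, `approved`, `train` and `classify` are hypothetical names assumed for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Example:
    """A labelled piece of scraped text; `approved` marks human-reviewed data."""
    text: str
    label: str
    approved: bool


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def train(examples: Iterable[Example]) -> Dict[str, Counter]:
    """Build per-label word counts from approved examples only."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for example in examples:
        if example.approved:  # supervision: unreviewed data never reaches the model
            counts[example.label].update(tokenize(example.text))
    return dict(counts)


def classify(model: Dict[str, Counter], text: str) -> str:
    """Return the label whose training vocabulary best matches the text."""
    tokens = tokenize(text)

    def score(label: str) -> float:
        total = sum(model[label].values())
        # Share of the label's training words that also appear in the text.
        return sum(model[label][t] for t in tokens) / total if total else 0.0

    return max(model, key=score)
```

Retraining with a larger approved set after each scrape is one way the "gets better with every iteration" loop could work, under these assumptions.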

Like what you see? Fancy a chat?

Email me | Phone me

Testimonials

BAAM has worked with Dean for the last 2 years. He helped us set up a website for BAAMfest 2015 and it was a great experience. After the festival Dean was more than kind in helping us revamp the website and personalise it. After initial brainstorming he successfully translated our sketchy ideas into a vibrant website. He has always made time to accommodate our requests and assist with hiccups. He is very easy to communicate with and quick to respond. I particularly like his work because he has a strong understanding of how to illustrate and communicate ideas through design. Last but not least, Dean has worked voluntarily with us all this time, which speaks volumes for his ethos and character.

Konstantina Samara - Bude Arts and Music