Someone asked me recently: why would I want to web scrape?
So I started thinking. In a perfect world, you grab your dataset as you walk out of the lift, go to your desk, plug it in and press the button… *sarcasm*. We didn’t come here for that, so here’s why you should think about webscraping.
The data is not always provided
A lot of businesses and organisations have the information, but it’s not in a ready-to-use format and they won’t give it to you without a whole lot of process. Even when you do eventually get the data, you find that you need a fresh extract or the latest version, and it’s back through the whole process again.
In many cases, they’ve simply published it on their website, and it is easier to get it straight from that one source of truth.
Can I just use an API instead?
Not every website has an API, but most websites, especially dynamic ones, build their pages in a structured, repeatable way, and a scraper can rely on that structure to pull the data out.
You want the data now, don’t you? You can see it on a website, but you want it in a digestible, tabular, filterable format. In this case, webscraping is a good plan.
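To make that idea concrete, here is a minimal sketch in Python of turning a page's structure into tabular, filterable rows. It uses only the standard library. The HTML snippet, the `TableParser` class and the `table_to_records` helper are made up for illustration; a real scraper would first download the page and would need to cope with messier markup.

```python
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this HTML first.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>4.50</td></tr>
  <tr><td>Gadget</td><td>12.00</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # the row currently being read, if any
        self._cell = None  # text fragments of the current cell, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(html):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = table_to_records(SAMPLE_HTML)

# Once the data is tabular, filtering is a one-liner.
cheap = [r for r in records if float(r["Price"]) < 10]
print(cheap)
```

The point is that the same `<tr>` and `<td>` pattern repeats for every row, so a few lines of parsing logic can cover the whole table, however long it is.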
OK, that’s all I’ve got for the moment, but I’ll add more reasons as they come to me. Have a look at my other post about using an online webscraping tool or, if you are a developer, you can begin scraping with Python.