
How to scrape dynamic websites with Scraper API

We'll be rendering some JavaScript and parsing the text response. If you've done much web scraping, you'll have come across websites which are predominantly JavaScript, and you will have hit a brick wall when you've tried to scrape them. What I'm going to show here is an example of such a site. This is an online betting site, and quite often you will need to scrape something like an online banking site or a financial website with stocks and shares, where there's lots of dynamic information. I'm going to show you how to do it using Scraper API. I'm going to use Scrapy, and I'll also include Beautiful Soup just for a bit of a bonus.

From my experience, Scraper API is the best API I've come across. If you want to render JavaScript, let's read this: Scraper API enables you to customize the API's functionality by adding additional parameters to your requests. For example, you can tell Scraper API to render any JavaScript on the target website by adding render=true to your request. There's also an SDK available for Python, which is what I'm going to use, so without further ado let's go and have a look at that.

If you're using Python, you just do pip install scraperapi-sdk. Then, as you can see, when you want to use it in your code you do from scraper_api import ScraperAPIClient, and then client = ScraperAPIClient, and where it says token, that is where you need to put your API key, which is what you get when you register with Scraper API.

Okay, so this is not a tutorial as such. What this is is an example of how to render JavaScript through Scraper API. You can see import scrapy: I'm doing this example with Scrapy, and I'm also using Beautiful Soup. The key point is that we are rendering JavaScript, not which framework we're using. So: import scrapy, from scrapy.crawler import CrawlerProcess, and from scraper_api import ScraperAPIClient, as we just saw in the documentation.

The first bit that's relevant, or that we need to pay attention to, is the fourth line: client = ScraperAPIClient, and then we need to put our API key in there. I've put some Xs at the end, and yours will be different, because your API key is unique and you get it from Scraper API when you register with them. Next, from bs4 import BeautifulSoup. That's just because I was going to go on and use some selectors to thin out the response.

Then we have the class Bet365, a scrapy.Spider. If you've ever written a Scrapy spider, this should all be very familiar to you. We do start_requests, and that holds the URL, or multiple URLs, that we want to visit. One of the benefits of using Scrapy is that you can easily visit multiple URLs. Then we say for url in urls, and we go off and do scrapy.Request. In the brackets we do client.scrapyGet with url=url, and then we say render=True. That's the important bit, because that is what does the JavaScript rendering. Then we do callback=self.parse, which obviously then calls parse. parse creates a variable called html from the response body, and then we create a Beautiful Soup object, which we get the text from. We then export that to an HTML file just so that we can view it.

I thought I'd show you this in real time, because otherwise it's a bit of a con. Just to give you an appreciation of it: it does take a minute or so to fully complete this spider. This is standard Scrapy output, and Scraper API is doing the clever stuff, rendering the JavaScript. We will get a file at the end of it, which should hopefully have some text that we can then parse to get the results we actually need. It's still running, obviously, and I'm very impressed with this. Here we go, it's already done, so that wasn't too long.

Now I'm just going to grep for "Judd", because I know that will be one of the results, and there we have it: we've got the text that we wanted. From that we can go off and do some parsing and whatever else we need to do. Without Scraper API and the JavaScript render feature, we would not have been able to get this, and we would have had to do something much more elaborate.

So please register. You can register without a credit card, so you could just get the trial account and see how you go. If you like it, then register, and please use my coupon code. Thank you.
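
To make the render=true idea concrete, here is a minimal sketch of the same flow. It is not the code shown on screen. Instead of the SDK's client.scrapyGet, it builds the request URL by hand, assuming the plain HTTP endpoint (api.scraperapi.com with api_key, url and render query parameters) that Scraper API documents. The names render_url, build_spider, RenderedPageSpider and the output file rendered.txt are made up for illustration, and the spider part needs scrapy and beautifulsoup4 installed.

```python
from urllib.parse import urlencode

# Assumed Scraper API HTTP endpoint; check the current docs before relying on it.
API_ENDPOINT = "http://api.scraperapi.com/"


def render_url(api_key: str, target_url: str) -> str:
    """Wrap target_url so it is fetched through the API with render=true."""
    query = urlencode({"api_key": api_key, "url": target_url, "render": "true"})
    return f"{API_ENDPOINT}?{query}"


def build_spider(api_key, urls, out_path="rendered.txt"):
    """Return a spider class that fetches each URL with JavaScript rendered."""
    # Third-party imports are kept local so render_url works without them.
    import scrapy
    from bs4 import BeautifulSoup

    class RenderedPageSpider(scrapy.Spider):  # hypothetical name
        name = "rendered_page"

        def start_requests(self):
            # One request per target URL, each routed through the API.
            for url in urls:
                yield scrapy.Request(render_url(api_key, url), callback=self.parse)

        def parse(self, response):
            # Reduce the rendered HTML to its visible text and append it to a file.
            soup = BeautifulSoup(response.body, "html.parser")
            with open(out_path, "a", encoding="utf-8") as f:
                f.write(soup.get_text())

    return RenderedPageSpider
```

To run it, you would pass the class from build_spider("YOUR_API_KEY", [your_url]) to a scrapy.crawler.CrawlerProcess via crawl() and then call start(). After that, a grep on rendered.txt plays the same role as the grep for "Judd" in the video.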