jmillsjmills
Frequent Visitor

Web Scrape: Worked before but now timing out...

Hello - can anyone find a way of scraping Bet365 without it timing out? https://www.bet365.com/#/AC/B1/C1/D8/E100842460/F3/I8/

 

I had code that worked for this URL previously, but several months later it now times out instead of returning the data from that page. Does anyone know whether the site is deliberately blocking web scraping somehow? That is all I can think of to explain why the initial load now stalls when it previously worked.

 

This also appears to connect in Excel Power Query but not in Power BI Desktop, and I need it to work in the latter because of Power BI Desktop's extra functionality for manipulating the data afterwards.

 

Source = Web.BrowserContents("https://www.bet365.com/#/AC/B1/C1/D8/E100842460/F3/I8/", [WaitFor=[Timeout=#duration(0, 0, 0, 2)]])
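One thing worth checking in the call above: when WaitFor contains only a Timeout, Web.BrowserContents waits that long once the page has loaded and then downloads the HTML; it does not wait for any particular element. A minimal sketch that waits for a specific element instead is below. The selector is an assumption based on the class names mentioned later in this thread, and the 30-second limit is arbitrary. If the site refuses the embedded browser altogether, this will still fail with an error.

```powerquery
let
    Url = "https://www.bet365.com/#/AC/B1/C1/D8/E100842460/F3/I8/",
    // Assumption: this element only exists once the odds have rendered.
    ReadySelector = ".srb-ParticipantLabelWithTeam",
    Source = Web.BrowserContents(
        Url,
        // With both fields set, an error is raised if the selector
        // has not appeared before the timeout elapses.
        [WaitFor = [Selector = ReadySelector, Timeout = #duration(0, 0, 0, 30)]]
    )
in
    Source
```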

 

Thanks very much!

4 REPLIES
v-angzheng-msft
Community Support

Hi, @jmillsjmills 

 

I seem to have found a workaround: try changing Web.BrowserContents to Web.Contents.

I tried the following M code, and Power Query returned the correct HTML:

let
    url="https://www.bet365.com/#/AC/B1/C1/D8/E100842460/F3/I8/",
    web=Text.FromBinary(Web.Contents(url))
in
    web

For reference:

https://community.powerbi.com/t5/Desktop/Web-Scraping-Web-BrowserContents-Html-Table-Errors/m-p/1499...

 

Hope this helps.

 

Best Regards,
Community Support Team _ Zeon Zheng
If this post helps, please consider accepting it as the solution to help other members find it more quickly.

Thank you so much for your reply! That's very useful to know and I appreciate the workaround, given I had given up!!

 

It looks like the HTML it is now pulling in is a little different from before. I was using CSS selectors to build an HTML table that pulled in the odds (for example, I was looking for .srb_ParticipantLabelWithTeam_Name, .srb_ParticipantLabelWithTeam_Team and .gl-MarketGroupButton_Text, all with the row selector set to .srb-ParticipantLabelWithTeam). You can see these CSS selectors if you inspect the odds elements on the website directly.
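For readers unfamiliar with this approach, here is a minimal sketch of how those selectors would typically be passed to Html.Table. The markup in SampleHtml is hypothetical and heavily simplified, as are the names and odds in it; the real page structure is not shown in this thread and may well differ. The class names are copied as written above.

```powerquery
let
    // Hypothetical, simplified markup with made-up values (not the real page).
    SampleHtml =
        "<div class=""srb-ParticipantLabelWithTeam"">"
        & "<span class=""srb_ParticipantLabelWithTeam_Name"">Player A</span>"
        & "<span class=""srb_ParticipantLabelWithTeam_Team"">Team X</span>"
        & "<span class=""gl-MarketGroupButton_Text"">2.50</span>"
        & "</div>"
        & "<div class=""srb-ParticipantLabelWithTeam"">"
        & "<span class=""srb_ParticipantLabelWithTeam_Name"">Player B</span>"
        & "<span class=""srb_ParticipantLabelWithTeam_Team"">Team Y</span>"
        & "<span class=""gl-MarketGroupButton_Text"">3.10</span>"
        & "</div>",
    // One output column per {column name, CSS selector} pair;
    // RowSelector picks the elements that each become a row.
    Odds = Html.Table(
        SampleHtml,
        {
            {"Name", ".srb_ParticipantLabelWithTeam_Name"},
            {"Team", ".srb_ParticipantLabelWithTeam_Team"},
            {"Market", ".gl-MarketGroupButton_Text"}
        },
        [RowSelector = ".srb-ParticipantLabelWithTeam"]
    )
in
    Odds
```

If the HTML returned by the query no longer contains elements with these classes, Html.Table has nothing to match, which would fit the symptom of the table no longer being populated.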

 

However, the HTML it's pulling in for me (not sure if it's the same for you) appears to be one large bootstrapping JavaScript function of some sort. I'm not the best at diagnosing these things, but it certainly doesn't seem to contain the full page content in the way it did before. Is this the case for you? Do you have any more ideas?
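A likely explanation for this kind of response: everything after the # in these URLs is a fragment, and fragments are handled by the browser rather than sent to the server in the HTTP request. Web.Contents would therefore receive only the site's root page, which relies on script to build the odds view on the client. The sketch below uses Uri.Parts to show how the URL splits; it only inspects the URL text and makes no request.

```powerquery
let
    Url = "https://www.bet365.com/#/AC/B1/C1/D8/E100842460/F3/I8/",
    Parts = Uri.Parts(Url),
    // Path is what the server is asked for; the fragment stays on the client.
    Summary = [Host = Parts[Host], Path = Parts[Path], Fragment = Parts[Fragment]]
in
    Summary
```

Since Web.Contents does not execute scripts, CSS selectors for elements that the script would create will not match anything in the downloaded HTML.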

 

Thank you so much! The link may now have expired but this URL is a new match:

https://www.bet365.com/#/AC/B1/C1/D8/E100693610/F3/I8/

 

Really appreciate all your efforts!

v-angzheng-msft
Community Support

Hi, @jmillsjmills 

 

I think the problem may not be on your side. Some websites take measures to prevent their pages from being crawled. I also tried the URL above and got the same timeout error, whereas other websites work normally. Getting data back may require some crawler knowledge so that the website recognizes Power BI as a browser.
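One way to experiment with the "recognize Power BI as a browser" idea is to send a browser-like User-Agent through the Headers option of Web.Contents. This is only a sketch: the header value is an arbitrary example, and nothing in this thread indicates that it is sufficient for this site. Because the page content appears to be built by script, the response may still be only the bootstrap HTML.

```powerquery
let
    Url = "https://www.bet365.com/#/AC/B1/C1/D8/E100693610/F3/I8/",
    // Assumption: an example browser-style User-Agent string,
    // not a value the site is known to require or accept.
    Response = Web.Contents(
        Url,
        [Headers = [#"User-Agent" = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"]]
    ),
    Html = Text.FromBinary(Response)
in
    Html
```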

 

For reference:

https://community.powerbi.com/t5/Desktop/Website-scraping-advice/m-p/762012

https://community.powerbi.com/t5/Desktop/Web-scraping-with-Power-Bi/m-p/927014

 

 

Hope this helps.

 

Best Regards,
Community Support Team _ Zeon Zheng
If this post helps, please consider accepting it as the solution to help other members find it more quickly.

Hi @v-angzheng-msft - thanks very much for your reply! I had a feeling it would be something along these lines and that they are simply set up to resist web scraping. Thank you for clarifying.
