Anonymous
Not applicable

How to use a web link from Kaggle to extract a .csv file using Web extractor?

Hello dear experts,

 

I'm trying to extract the COVID-19 real-time dataset from the URL below. Unfortunately, I don't see a table in the Power BI extract data pane; instead I see an .html input. Why is this, and how can I resolve it?

 

https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset?select=covid_19_data.csv

 

Kind regards,

Ambareesh

1 ACCEPTED SOLUTION

There are a few ways of doing this, as far as I'm aware. You can retrieve the download with something like a PHP or Python script and extract the .csv to a location where Power BI can subsequently pick it up. I think this is the only way, other than downloading it yourself or finding a live stream of the data. I do see that the dataset's author on Kaggle references a source from which he retrieves and edits the data. Looking at that source may give you a link that you can in fact use in Power BI.
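As a rough illustration of the script approach, here is a minimal Python sketch using only the standard library. The function names `download_file` and `extract_csv` are made up for this example, and the download URL in the usage comment is a placeholder, not a real Kaggle endpoint. Kaggle downloads may also require authentication, which this sketch does not handle.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_file(url: str, dest: Path) -> Path:
    """Download url to dest. Assumes the URL needs no login (hypothetical helper)."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        out.write(response.read())
    return dest


def extract_csv(zip_path: Path, csv_name: str, out_dir: Path) -> Path:
    """Extract one named CSV from the archive into out_dir and return its path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        if csv_name not in archive.namelist():
            raise FileNotFoundError(f"{csv_name} not found in {zip_path}")
        return Path(archive.extract(csv_name, path=out_dir))


# Example usage (placeholder URL, replace with a download link you can access):
# zip_path = download_file("https://example.com/dataset.zip", Path("data/dataset.zip"))
# csv_path = extract_csv(zip_path, "covid_19_data.csv", Path("data"))
```

Power BI can then read the extracted file from the output folder with the regular Text/CSV source, and the script can be scheduled to refresh the file.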


8 REPLIES
Anonymous
Not applicable

Alternatively, I managed to use the Google Docs source directly, which is linked on the same webpage.

 

https://docs.google.com/spreadsheets/d/e/2PACX-1vQU0SIALScXx8VXDX7yKNKWWPKE1YjFlWc6VTEVSN45CklWWf-uW...

 

Thanks,

Ambareesh

Do test whether this still works after publishing your report. I believe this only works while you are logged in to Google Docs, so I'm not convinced it will work after publishing.

Anonymous
Not applicable

@dzuurman , I used Anonymous login and it still worked.

 

Thanks, 

Ambareesh

That's great, good to know! Glad I could be of help, and do share your COVID-19 dashboard/report when it's done!

themistoklis
Community Champion

@Anonymous 

 

It seems that you are trying to use COVID-19 datasets. Judging by the content of the web link you sent, the Kaggle dataset is based on the Johns Hopkins COVID-19 datasets, which are the best source for up-to-date data.

The source of the information is this one:

https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series

 

You can connect to the dataset by using the following statement in Power Query Editor:

 

 

= Csv.Document(Web.Contents("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_recovered_global.csv"),[Delimiter=",", Encoding=65001, QuoteStyle=QuoteStyle.None])

 

 

 

Alternatively, select 'New Source' --> 'Web' and enter the link above (https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_recovered_global.csv)
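For anyone who wants to see what the statement above is doing outside Power Query, here is a rough Python sketch of the same idea: take the raw bytes, decode them as UTF-8 (code page 65001 is UTF-8), and split on commas. The function name `parse_csv_bytes` is made up for this example, and quote handling follows Python's `csv` defaults rather than trying to reproduce `QuoteStyle.None` exactly.

```python
import csv
import io
from typing import Dict, List


def parse_csv_bytes(raw: bytes) -> List[Dict[str, str]]:
    """Decode UTF-8 bytes and return one dict per data row, keyed by the header row."""
    text = raw.decode("utf-8-sig")  # also tolerates a leading byte-order mark
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=",")
    return list(reader)


# Example usage with the raw GitHub link from this thread (needs network access):
# import urllib.request
# url = ("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
#        "csse_covid_19_data/csse_covid_19_time_series/"
#        "time_series_covid19_recovered_global.csv")
# with urllib.request.urlopen(url) as response:
#     rows = parse_csv_bytes(response.read())
```

This only works because the raw link returns the file contents themselves rather than a webpage around them.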

dzuurman
Helper I

Hi,

 

This is because that URL is in fact a link to a webpage and not directly to a .csv file. I cannot find a direct link to the CSV file on Kaggle, and the download link on the page your link currently points to only retrieves a .zip.
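If you want to check up front whether a link returns a webpage or a data file, a small Python sketch like the one below can help. The helper names `is_html_response` and `probe` are made up for this example, and the check is only a heuristic based on the Content-Type header and the first bytes of the body.

```python
import urllib.request


def is_html_response(content_type: str, body_start: bytes) -> bool:
    """Return True if the response looks like an HTML page rather than a data file."""
    if "text/html" in content_type.lower():
        return True
    head = body_start.lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


def probe(url: str) -> bool:
    """Fetch the first bytes of url and report whether it looks like HTML."""
    with urllib.request.urlopen(url) as response:
        content_type = response.headers.get("Content-Type", "")
        return is_html_response(content_type, response.read(512))


# Example usage (needs network access):
# probe("https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset?select=covid_19_data.csv")
```

A link that comes back as HTML is what shows up as an .html input in Power BI instead of a table.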

 

 

Anonymous
Not applicable

OK @dzuurman, now I understand why this happened.

 

But how do I resolve this? How can I use this real-time data in my report?

 

Thanks,

Ambareesh.

