cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
Advocate I
Advocate I

Download CSV files from Web then query file(s) automatically

Hello

 

I would like to use Web.Contents() to download all CSV files located on a website. (i.e http://www.nemweb.com.au/REPORTS/CURRENT/Daily_Reports/).

 

With Web.Contents(), I'm able to generate a list which, unfortunately, doesnt save the url path.

It's very basic. Here is an extract of the list generated with Web.Contents() from PowerBi.

webpage data.PNG

 

Do you know a way to save the path and to automaticaly import all of them in my Pbi file (desktop) please ?

7 REPLIES 7
Highlighted
Anonymous
Not applicable

Re: Download CSV files from Web then query file(s) automatically

I dont believe Power Query has the ability to download zip files. I know it has the ability to open them and do transformations once they are downloaded, but as far as just actually downloading and opening the zip file is not supported as far as I can tell. 

Highlighted
Advocate I
Advocate I

Re: Download CSV files from Web then query file(s) automatically

Thanks for your reply, you specified "ZIP". Is it the same with "CSV" files ?

 

I'm usually work on onedrive/sharepoint folders and it transforms automatically. It's pretty cool and i'm a bit surprised that we cannot do the same from different sources even when we are pro-coders.

Highlighted
Anonymous
Not applicable

Re: Download CSV files from Web then query file(s) automatically

That's a good question. I don't believe so because I'm sure Power Query has a way to open the file, but I could be wrong. I know that if the is already downloaded and in a folder (like onedrive) it can definitely be open and used like you mentioned. I will look into a little bit more because this piqued my interest as well now. 

Highlighted
Super User II
Super User II

Re: Download CSV files from Web then query file(s) automatically

Hi @Bryan75 ,

You could copy the function from this blog/

 

Then copy the following as a blank query:

 

let
URL = "http://www.nemweb.com.au/REPORTS/CURRENT/Daily_Reports/",
Source = Web.Page(Web.Contents(URL)),
Data0 = Source{0}[Data],
Children = Data0{0}[Children],
Children1 = Children{1}[Children],
Children2 = Children1{2}[Children],
#"Removed Other Columns" = Table.SelectColumns(Children2,{"Children"}),
#"Expanded Children" = Table.ExpandTableColumn(#"Removed Other Columns", "Children", {"Text"}, {"Children.Text"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Children", each ([Children.Text] <> null and [Children.Text] <> "[To Parent Directory]")),
#"Added Custom" = Table.AddColumn(#"Filtered Rows", "ZIP", each Web.Contents(URL & [Children.Text])),
#"Filtered Rows1" = Table.SelectRows(#"Added Custom", each ([Children.Text] = "PUBLIC_DAILY_201901260000_20190127040502.zip")),
#"Invoked Custom Function" = Table.AddColumn(#"Filtered Rows1", "ZipContents", each UnzipContents([ZIP])),
#"Expanded ZipContents" = Table.ExpandTableColumn(#"Invoked Custom Function", "ZipContents", {"FileName", "Content"}, {"ZipContents.FileName", "ZipContents.Content"}),
#"ZipContents Content" = #"Expanded ZipContents"{0}[ZipContents.Content],
#"Imported CSV" = Csv.Document(#"ZipContents Content",[Delimiter=",", Columns=120, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Removed Other Columns1" = Table.SelectColumns(#"Imported CSV",{"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25"}),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Other Columns1", [PromoteAllScalars=true]),
#"Promoted Headers1" = Table.PromoteHeaders(#"Promoted Headers", [PromoteAllScalars=true]),
#"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers1",{{"I", type text}, {"DISPATCH", type text}, {"CASESOLUTION", type text}, {"1", Int64.Type}, {"SETTLEMENTDATE", type datetime}, {"RUNNO", Int64.Type}, {"INTERVENTION", Int64.Type}, {"CASESUBTYPE", type text}, {"SOLUTIONSTATUS", Int64.Type}, {"SPDVERSION", type text}, {"NONPHYSICALLOSSES", Int64.Type}, {"TOTALOBJECTIVE", type number}, {"TOTALAREAGENVIOLATION", Int64.Type}, {"TOTALINTERCONNECTORVIOLATION", Int64.Type}, {"TOTALGENERICVIOLATION", Int64.Type}, {"TOTALRAMPRATEVIOLATION", type number}, {"TOTALUNITMWCAPACITYVIOLATION", type number}, {"TOTAL5MINVIOLATION", type text}, {"TOTALREGVIOLATION", type text}, {"TOTAL6SECVIOLATION", type text}, {"TOTAL60SECVIOLATION", type text}, {"TOTALASPROFILEVIOLATION", Int64.Type}, {"TOTALFASTSTARTVIOLATION", Int64.Type}, {"TOTALENERGYOFFERVIOLATION", Int64.Type}, {"LASTCHANGED", type datetime}})
in
#"Changed Type1"

 

Note: I modified the script from, and learned about the unzip function from a previous thread.

 

Hope this helps!

Nathan

Highlighted
Super User II
Super User II

Re: Download CSV files from Web then query file(s) automatically

One other note: I filtered to a single file, so that it wouldn't take so long to process.

Highlighted
Advocate I
Advocate I

Re: Download CSV files from Web then query file(s) automatically

Ow ! What an amazing work !
Does it mean that every day or month, you copy and generate a new request ? It's harsh
Highlighted
Super User II
Super User II

Re: Download CSV files from Web then query file(s) automatically

It depends on what you want to do.

It will refresh on your schedule.

It will load everything that is not filtered out. Potentially, you could create a filter to satisfy your needs. For instance: Compare the dates of the files with the current date and only keep those within the last week.

Helpful resources

Announcements
Community Blog

Community Blog

Visit our Community Blog for articles, guides, and information created by fellow community members.

Using the Community

Using the Community

Need help with the Power BI Community? Our 'Using the Community' support articles are a great place to start.

Community Summit North America

Community Summit North America

Innovate, Collaborate, Grow. The top training and networking event across the globe for Microsoft Business Applications

Power Platform 2020 release wave 2 plan

Power Platform 2020 release wave 2 plan

Features releasing from October 2020 through March 2021

Top Solution Authors
Top Kudoed Authors