Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Download CSV files from Web then query file(s) automatically

Hello

 

I would like to use Web.Contents() to download all CSV files located on a website. (i.e http://www.nemweb.com.au/REPORTS/CURRENT/Daily_Reports/).

 

With Web.Contents(), I'm able to generate a list which, unfortunately, doesnt save the url path.

It's very basic. Here is an extract of the list generated with Web.Contents() from PowerBi.

webpage data.PNG

 

Do you know a way to save the path and to automaticaly import all of them in my Pbi file (desktop) please ?

7 REPLIES 7
Anonymous
Not applicable

Hi @Anonymous ,

You could copy the function from this blog/

 

Then copy the following as a blank query:

 

let
URL = "http://www.nemweb.com.au/REPORTS/CURRENT/Daily_Reports/",
Source = Web.Page(Web.Contents(URL)),
Data0 = Source{0}[Data],
Children = Data0{0}[Children],
Children1 = Children{1}[Children],
Children2 = Children1{2}[Children],
#"Removed Other Columns" = Table.SelectColumns(Children2,{"Children"}),
#"Expanded Children" = Table.ExpandTableColumn(#"Removed Other Columns", "Children", {"Text"}, {"Children.Text"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Children", each ([Children.Text] <> null and [Children.Text] <> "[To Parent Directory]")),
#"Added Custom" = Table.AddColumn(#"Filtered Rows", "ZIP", each Web.Contents(URL & [Children.Text])),
#"Filtered Rows1" = Table.SelectRows(#"Added Custom", each ([Children.Text] = "PUBLIC_DAILY_201901260000_20190127040502.zip")),
#"Invoked Custom Function" = Table.AddColumn(#"Filtered Rows1", "ZipContents", each UnzipContents([ZIP])),
#"Expanded ZipContents" = Table.ExpandTableColumn(#"Invoked Custom Function", "ZipContents", {"FileName", "Content"}, {"ZipContents.FileName", "ZipContents.Content"}),
#"ZipContents Content" = #"Expanded ZipContents"{0}[ZipContents.Content],
#"Imported CSV" = Csv.Document(#"ZipContents Content",[Delimiter=",", Columns=120, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Removed Other Columns1" = Table.SelectColumns(#"Imported CSV",{"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25"}),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Other Columns1", [PromoteAllScalars=true]),
#"Promoted Headers1" = Table.PromoteHeaders(#"Promoted Headers", [PromoteAllScalars=true]),
#"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers1",{{"I", type text}, {"DISPATCH", type text}, {"CASESOLUTION", type text}, {"1", Int64.Type}, {"SETTLEMENTDATE", type datetime}, {"RUNNO", Int64.Type}, {"INTERVENTION", Int64.Type}, {"CASESUBTYPE", type text}, {"SOLUTIONSTATUS", Int64.Type}, {"SPDVERSION", type text}, {"NONPHYSICALLOSSES", Int64.Type}, {"TOTALOBJECTIVE", type number}, {"TOTALAREAGENVIOLATION", Int64.Type}, {"TOTALINTERCONNECTORVIOLATION", Int64.Type}, {"TOTALGENERICVIOLATION", Int64.Type}, {"TOTALRAMPRATEVIOLATION", type number}, {"TOTALUNITMWCAPACITYVIOLATION", type number}, {"TOTAL5MINVIOLATION", type text}, {"TOTALREGVIOLATION", type text}, {"TOTAL6SECVIOLATION", type text}, {"TOTAL60SECVIOLATION", type text}, {"TOTALASPROFILEVIOLATION", Int64.Type}, {"TOTALFASTSTARTVIOLATION", Int64.Type}, {"TOTALENERGYOFFERVIOLATION", Int64.Type}, {"LASTCHANGED", type datetime}})
in
#"Changed Type1"

 

Note: I modified the script from, and learned about the unzip function from a previous thread.

 

Hope this helps!

Nathan

Anonymous
Not applicable

One other note: I filtered to a single file, so that it wouldn't take so long to process.

Anonymous
Not applicable

Ow ! What an amazing work !
Does it mean that every day or month, you copy and generate a new request ? It's harsh
Anonymous
Not applicable

It depends on what you want to do.

It will refresh on your schedule.

It will load everything that is not filtered out. Potentially, you could create a filter to satisfy your needs. For instance: Compare the dates of the files with the current date and only keep those within the last week.

Anonymous
Not applicable

I dont believe Power Query has the ability to download zip files. I know it has the ability to open them and do transformations once they are downloaded, but as far as just actually downloading and opening the zip file is not supported as far as I can tell. 

Anonymous
Not applicable

Thanks for your reply, you specified "ZIP". Is it the same with "CSV" files ?

 

I'm usually work on onedrive/sharepoint folders and it transforms automatically. It's pretty cool and i'm a bit surprised that we cannot do the same from different sources even when we are pro-coders.

Anonymous
Not applicable

That's a good question. I don't believe so because I'm sure Power Query has a way to open the file, but I could be wrong. I know that if the is already downloaded and in a folder (like onedrive) it can definitely be open and used like you mentioned. I will look into a little bit more because this piqued my interest as well now. 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors