Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi
I am connecting to a csv file at the following URL with a standard "web" connection : https://www.at.govt.nz/media/1975723/patronage-data-dec-2017.csv
The connection works fine, and I can extract the data. However the structure of the URL includes a Month (dec) , and Year (2017) reference. I would expect next month the URL I want to connect to will be updated to :
https://www.at.govt.nz/media/1975723/patronage-data-jan-2018.csv
I have tried using * and ? wildcards to the Year and Month "URL parts" within the Advanced settings component of the web connection, without any luck (see below)- the connection fails.
Can anyone suggest a way to connect to dynamic URLs ?
Cheers Steve
Solved! Go to Solution.
@Anonymous,
Add a new blank query in Power BI Desktop, then paste the following code to the Advanced Editor of the blank query.
(Month as text,year as number)=>
let
Source = Csv.Document(Web.Contents("https://www.at.govt.nz/media/1975723/patronage-data-" & ""&Month&"" & "-" & ""&Number.ToText(year)&"" & ".csv"),[Delimiter=",", Columns=8, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}, {"Column8", type text}})
in
#"Changed Type"
Input month and year in the function and invoke the function to get required table.
Regards,
Lydia
@Anonymous,
Add a new blank query in Power BI Desktop, then paste the following code to the Advanced Editor of the blank query.
(Month as text,year as number)=>
let
Source = Csv.Document(Web.Contents("https://www.at.govt.nz/media/1975723/patronage-data-" & ""&Month&"" & "-" & ""&Number.ToText(year)&"" & ".csv"),[Delimiter=",", Columns=8, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}, {"Column8", type text}})
in
#"Changed Type"
Input month and year in the function and invoke the function to get required table.
Regards,
Lydia
Thanks for this Lydia.
Out of interest do you know if there is any way to avoid needing user input? Ie extract any URL that has a 3 character month and 4 character year, followed by a CSV suffix?
https://www.at.govt.nz/media/1975723/patronage-data-???-????.csv
Cheers
Steve
@Anonymous,
Another method is to create two parameters(Month and year) listing all the possible values, then users are able to choose values to return different queries.
let Source = Csv.Document(Web.Contents("https://www.at.govt.nz/media/1975723/patronage-data-" & ""&Month&"" & "-" & ""&year&"" & ".csv"),[Delimiter=",", Columns=8, Encoding=1252, QuoteStyle=QuoteStyle.None]) in Source
Regards,
Lydia
I'm trying to scrape Amazon and go thru a list of ASINs and an unknown amount of pages per ASIN.
I've tried the multiple parameters option, which works, but I don't have a set number of pages that I know for each instance so I would end up with a million rows and have to delete duplicates.
I've tried another approach that i can get to loop thru all pages without knowing the page number using..
https://www.mattmasson.com/2014/11/iterating-over-an-unknown-number-of-pages-in-power-query/
with my Code looking like..
(Page as number) as table =>
let
Source = Web.BrowserContents("https://www.amazon.com/gp/offer-listing/B01MQWUXZS/ref=olp_page_next?ie=UTF8&f_all=true&startIndex=" & Number.ToText(Page)),
#"Extracted Table From Html" = Html.Table(Source, {{"Seller", ".olpSellerName"}, {"Price", ".a-color-price"}}, [RowSelector=".olpOffer"])
in
#"Extracted Table From Html"
let
PageRange = {0,10,20,30,40,50},
Source = List.Transform(PageRange, each try {_, GetData(_)} otherwise null),
First = List.FirstN(Source, each _ <> null),
Table = Table.FromRows(First, {"Page", "Column1"}),
Expanded = Table.ExpandTableColumn(Table, "Column1", {"Seller", "Price"}, {"Seller", "Price"})
in
Expanded
BUT I'm having trouble combining the two, I need it to also go thru all the different ASINs and I want my code to look more like
(Page as number, Asin as text) as table =>
let
Source = Web.BrowserContents("https://www.amazon.com/gp/offer-listing/" & Asin & "/ref=olp_page_next?ie=UTF8&f_all=true&startIndex=" & Number.ToText(Page)),
#"Extracted Table From Html" = Html.Table(Source, {{"Seller", ".olpSellerName"}, {"Price", ".a-color-price"}}, [RowSelector=".olpOffer"])
in
#"Extracted Table From Html"
How do I adjust the next Query to look at a list of Page numbers (that can go to a crazy amount -but will stop once it hits a error) and a list of Asins at the same time?
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
109 | |
98 | |
80 | |
64 | |
57 |
User | Count |
---|---|
145 | |
111 | |
92 | |
84 | |
66 |