Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi,
I have a Desktop file which is using the web connector and a custom function to iterate through the contents of a web page:
Function:
let
MetaCritic = (page as number) as table =>
let
Source = Web.Page(Web.Contents("https://www.metacritic.com/browse/games/score/metascore/all/all/all?view=condensed&sort=desc&page=" & Number.ToText(page))),
Data0 = Source{0}[Data]
in
#"Data0"
in
MetaCritic
This function is applied to the following query like this:
let
Source = {1..1000},
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Page"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "MC", each MC_PageFx_1([Page]))
in
#"Added Custom"
If I connect directly to one of the URL pages, eg. "https://www.metacritic.com/browse/games/score/metascore/all/all/all?view=condensed&sort=desc&page=1" I get prompted through a GUI to choose between a number of pages, the one with the table I need is called "Table1". Checking Table1 and loading will give me the desired table, albeit only page 1.
While the above function does indeed return nested tables for each page number 0 - 1000, the contents of these tables, even when expanded through multiple layers does not contain the Table1 data I am looking for (go to page https://www.metacritic.com/browse/games/score/metascore/all/all/all?view=condensed&sort=desc&page=1 to see what table I need).
I think there must be some way of making it pick specifically the content it calls "Table1" under that URL for each page?
Also, I have a very similar example which is working perfectly fine using this function and query:
Function:
let
MetaCritic = (page as number) as table =>
let
Source = Web.Page(Web.Contents("https://www.metacritic.com/browse/games/score/metascore/all/all/?view=detailed&sort=desc&page=" & Number.ToText(page))),
Data0 = Source{0}[Data]
in
#"Data0"
in
MetaCritic
Query:
let
Source = {1..1000},
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Page"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "MC", each MC_PageFx_2([Page]))
in
#"Added Custom"
For this one I get all 176 pages worth of data.
I really appreciate any help on how to make this work, even alternative approaches.
Thanks.
/RSK
Solved! Go to Solution.
Your on the right track.
I usually try using the gui then look at the advanced editor and make it into a function. Web sites can find the table or use Web by example mode. This the example mode can work better on complex sites. This is the funciton I used to get the table using by example. Though double check it has the correct data for the columns.
Notice there is a delay that gives the page time to load and it can also help slow the requests if a site stops returning data if you have too many to quick.
let MetaCritic = (page as number) as table => let Source = Web.BrowserContents("https://www.metacritic.com/browse/games/score/metascore/all/all/all?view=condensed&sort=desc&page=" & Number.ToText(page), [WaitFor=[Timeout=#duration(0, 0, 0, 0)]]), #"Extracted Table From Html" = Html.Table(Source, {{"Rating", ".small"}, {"Game", ".basic_stat:nth-last-child(3) > A:nth-child(1):nth-last-child(1)"}, {"User", ".product_avguserscore:nth-child(1) > .data"}, {"Date", ".full_release_date:nth-child(2) > .data"}}, [RowSelector=".game_product"]) in #"Extracted Table From Html" in MetaCritic
let Source = {1..3}, #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error), #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Page"}}), #"Added Custom" = Table.AddColumn(#"Renamed Columns", "MC", each MC_PageFx_4([Page])), #"Expanded MC" = Table.ExpandTableColumn(#"Added Custom", "MC", {"Date", "Game", "Rating", "User"}, {"MC.Date", "MC.Game", "MC.Rating", "MC.User"}) in #"Expanded MC"
Your on the right track.
I usually try using the gui then look at the advanced editor and make it into a function. Web sites can find the table or use Web by example mode. This the example mode can work better on complex sites. This is the funciton I used to get the table using by example. Though double check it has the correct data for the columns.
Notice there is a delay that gives the page time to load and it can also help slow the requests if a site stops returning data if you have too many to quick.
let MetaCritic = (page as number) as table => let Source = Web.BrowserContents("https://www.metacritic.com/browse/games/score/metascore/all/all/all?view=condensed&sort=desc&page=" & Number.ToText(page), [WaitFor=[Timeout=#duration(0, 0, 0, 0)]]), #"Extracted Table From Html" = Html.Table(Source, {{"Rating", ".small"}, {"Game", ".basic_stat:nth-last-child(3) > A:nth-child(1):nth-last-child(1)"}, {"User", ".product_avguserscore:nth-child(1) > .data"}, {"Date", ".full_release_date:nth-child(2) > .data"}}, [RowSelector=".game_product"]) in #"Extracted Table From Html" in MetaCritic
let Source = {1..3}, #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error), #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Page"}}), #"Added Custom" = Table.AddColumn(#"Renamed Columns", "MC", each MC_PageFx_4([Page])), #"Expanded MC" = Table.ExpandTableColumn(#"Added Custom", "MC", {"Date", "Game", "Rating", "User"}, {"MC.Date", "MC.Game", "MC.Rating", "MC.User"}) in #"Expanded MC"
I am not sure I understand all that is going on within your extract html table part, but for now I can live with that, as it works perfectly. Thank you!
Good.
I didn't write the code for the extract I just used web by example then took the code and made a function.
It trys to map text to css tags in the web page and then looks for other occurences.
This page shows how to use web by example.
https://docs.microsoft.com/en-us/power-bi/desktop-connect-to-web-by-example
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
114 | |
100 | |
81 | |
70 | |
62 |
User | Count |
---|---|
148 | |
116 | |
104 | |
90 | |
65 |