Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Iterate through web url problems

Hi,

 

I have a Desktop file which is using the web connector and a custom function to iterate through the contents of a web page:

 

Function:

let
   MetaCritic = (page as number) as table =>
let
   Source = Web.Page(Web.Contents("https://www.metacritic.com/browse/games/score/metascore/all/all/all?view=condensed&sort=desc&page=" & Number.ToText(page))),
   Data0 = Source{0}[Data]
 in
   #"Data0"
 in 
   MetaCritic

 

This function is applied to the following query like this: 

 

let
    Source = {1..1000},
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Page"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns", "MC", each MC_PageFx_1([Page]))
in
    #"Added Custom"

 

If I connect directly to one of the URL pages, eg. "https://www.metacritic.com/browse/games/score/metascore/all/all/all?view=condensed&sort=desc&page=1" I get prompted through a GUI to choose between a number of pages, the one with the table I need is called "Table1". Checking Table1 and loading will give me the desired table, albeit only page 1.

 

While the above function does indeed return nested tables for each page number 0 - 1000, the contents of these tables, even when expanded through multiple layers does not contain the Table1 data I am looking for (go to page https://www.metacritic.com/browse/games/score/metascore/all/all/all?view=condensed&sort=desc&page=1 to see what table I need).

 

I think there must be some way of making it pick specifically the content it calls "Table1" under that URL for each page?

 

Also, I have a very similar example which is working perfectly fine using this function and query:

 

Function:

let
    MetaCritic = (page as number) as table =>
let
    Source = Web.Page(Web.Contents("https://www.metacritic.com/browse/games/score/metascore/all/all/?view=detailed&sort=desc&page=" & Number.ToText(page))),
    Data0 = Source{0}[Data]
 in
    #"Data0"
 in 
    MetaCritic

 

Query:

let
    Source = {1..1000},
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Page"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns", "MC", each MC_PageFx_2([Page]))
in
    #"Added Custom"

 

For this one I get all 176 pages worth of data.

 

I really appreciate any help on how to make this work, even alternative approaches.

 

Thanks.

 

/RSK

1 ACCEPTED SOLUTION
stretcharm
Memorable Member
Memorable Member

Your on the right track.

 

I usually try using the gui then look at the advanced editor and make it into a function. Web sites can find the table or use Web by example mode. This the example mode can work better on complex sites. This is the funciton I used to get the table using by example. Though double check it has the correct data for the columns.

 

Notice there is a delay that gives the page time to load and it can also help slow the requests if a site stops returning data if you have too many to quick.

 

let
    MetaCritic = (page as number) as table =>
   let
      Source = Web.BrowserContents("https://www.metacritic.com/browse/games/score/metascore/all/all/all?view=condensed&sort=desc&page=" & Number.ToText(page), [WaitFor=[Timeout=#duration(0, 0, 0, 0)]]),
      #"Extracted Table From Html" = Html.Table(Source, {{"Rating", ".small"}, {"Game", ".basic_stat:nth-last-child(3) > A:nth-child(1):nth-last-child(1)"}, {"User", ".product_avguserscore:nth-child(1) > .data"}, {"Date", ".full_release_date:nth-child(2) > .data"}}, [RowSelector=".game_product"])
   in
      #"Extracted Table From Html"
 in 
    MetaCritic

 

let
    Source = {1..3},
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Page"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns", "MC", each MC_PageFx_4([Page])),
    #"Expanded MC" = Table.ExpandTableColumn(#"Added Custom", "MC", {"Date", "Game", "Rating", "User"}, {"MC.Date", "MC.Game", "MC.Rating", "MC.User"})
in
    #"Expanded MC"

 

 

View solution in original post

3 REPLIES 3
stretcharm
Memorable Member
Memorable Member

Your on the right track.

 

I usually try using the gui then look at the advanced editor and make it into a function. Web sites can find the table or use Web by example mode. This the example mode can work better on complex sites. This is the funciton I used to get the table using by example. Though double check it has the correct data for the columns.

 

Notice there is a delay that gives the page time to load and it can also help slow the requests if a site stops returning data if you have too many to quick.

 

let
    MetaCritic = (page as number) as table =>
   let
      Source = Web.BrowserContents("https://www.metacritic.com/browse/games/score/metascore/all/all/all?view=condensed&sort=desc&page=" & Number.ToText(page), [WaitFor=[Timeout=#duration(0, 0, 0, 0)]]),
      #"Extracted Table From Html" = Html.Table(Source, {{"Rating", ".small"}, {"Game", ".basic_stat:nth-last-child(3) > A:nth-child(1):nth-last-child(1)"}, {"User", ".product_avguserscore:nth-child(1) > .data"}, {"Date", ".full_release_date:nth-child(2) > .data"}}, [RowSelector=".game_product"])
   in
      #"Extracted Table From Html"
 in 
    MetaCritic

 

let
    Source = {1..3},
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Page"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns", "MC", each MC_PageFx_4([Page])),
    #"Expanded MC" = Table.ExpandTableColumn(#"Added Custom", "MC", {"Date", "Game", "Rating", "User"}, {"MC.Date", "MC.Game", "MC.Rating", "MC.User"})
in
    #"Expanded MC"

 

 

Anonymous
Not applicable

I am not sure I understand all that is going on within your extract html table part, but for now I can live with that, as it works perfectly. Thank you!

Good.

I didn't write the code for the extract I just used web by example then took the code and made a function.

It trys to map text to css tags in the web page and then looks for other occurences.

 

This page shows how to use web by example.

https://docs.microsoft.com/en-us/power-bi/desktop-connect-to-web-by-example

 

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.