Turning "Web by Example" into Power Query Functions

by Super User ‎02-09-2019 09:03 PM - edited ‎02-09-2019 09:10 PM

Introduction

With over 50 quick measures in the Quick Measure Gallery, you could say that I am a fan. Now, I'm not going to lie, a large part of that fandom comes from sheer laziness. While I was answering questions on the forums I started to notice that I was often solving what was essentially the same problem over and over again. It's way more efficient therefore to solve the problem once, post it to the Quick Measure Gallery and then simply refer to that quick measure when answering forum posts.

 

Efficiency, I'm a big fan. Which is also why I am a big fan of "Web by Example" feature recently added to Power BI. For anyone that has futiley struggled expanding their way through the DOM of a web page in Query Editor, "Web by Example" is the best thing since sliced virtual network architectures over 5G. It's the lazy way to get what you need from websites.

 

But, what is not efficient is the whole Quick Measure Gallery user interface. It's not so great honestly. Limited sorting, no search features, pagination all the way at the bottom. It's inefficient and kind of a pain to work with. So, I decided to combine my two loves of efficiency and laziness into a Data Stories Gallery project called "Quick Measures". The idea is simple, create a Power BI Report that ingests all of the information from the Quick Measures Gallery and presents it in a more intuitive and helpful way. Along the way I used the "Web by Example" feature heavily and used its output to create Power Query functions that efficiently and lazily collected all of the information that I needed. Here's how I did it.

 

The Setup

The first thing that I needed was a list of all of the Quick Measures. To do this, I created a new Web query (Get Date | Web) and pointed it to https://community.powerbi.com/t5/Quick-Measures-Gallery/bd-p/QuickMeasuresGallery?sortby=postdate. Once there I clicked on Document and then the Web View tab. At the bottom I then clicked "Add table using examples". This brings up the interface below.

 

qm1.png

 

What's nifty is that I can see the web page and scroll around on it to find the information I am looking for and then start typing the information into the columns. Suggestions come up and I can pick which suggestion fits what I am looking for. After filling in one or two rows, Power BI figures out what I am looking for.

 

qm2.png

 

Once Power BI figures out the first column, it gets REALLY smart as you start to fill in additional columns. However, there is one limitation that I found, Web by Example has some problems pulling out more "embedded" information. For example, when I tried to pull out the URL for each Quick Measure, I got this:

 

qm3.png

 

No matter, Title, Kudos and Author is a start! The generated M code is pretty short.

 

let
    Source = Web.BrowserContents("https://community.powerbi.com/t5/Quick-Measures-Gallery/bd-p/QuickMeasuresGallery?sortby=postdate"),
    #"Extracted Table From Html" = Html.Table(Source, {{"Title", ".message-subject"}, {"Author", ".lia-users-user-login"}, {"Kudos", ".lia-message-stats"}}, [RowSelector=".lia-messages-message-card"]),
    #"Changed Type" = Table.TransformColumnTypes(#"Extracted Table From Html",{{"Title", type text}, {"Author", type text}, {"Kudos", Int64.Type}})
in
    #"Changed Type"

 

Getting Tricky

We can see that the Html.Table command is the key piece that identifies what we are looking for by CSS labels. Nifty. But, I have four pages of Quick Measures to collect. What to do, what to do? Luckily, I can use a technique I developed, "Merge Queries with M" to quickly hack up the M code like this:

 

let
    Source = Web.BrowserContents("https://community.powerbi.com/t5/Quick-Measures-Gallery/bd-p/QuickMeasuresGallery?sortby=postdate"),
    Source1 = Web.BrowserContents("https://community.powerbi.com/t5/Quick-Measures-Gallery/bd-p/QuickMeasuresGallery/page/2?sortby=postdate"),
    Source2 = Web.BrowserContents("https://community.powerbi.com/t5/Quick-Measures-Gallery/bd-p/QuickMeasuresGallery/page/3?sortby=postdate"),
    Source3 = Web.BrowserContents("https://community.powerbi.com/t5/Quick-Measures-Gallery/bd-p/QuickMeasuresGallery/page/4?sortby=postdate"),
    #"Extracted Table From Html" = Html.Table(Source, {{"Title", ".message-subject"}, {"Author", ".lia-users-user-login"}, {"Kudos", ".lia-message-stats"}}, [RowSelector=".lia-messages-message-card"]),
    #"Extracted Table From Html1" = Html.Table(Source1, {{"Title", ".message-subject"}, {"Author", ".lia-users-user-login"}, {"Kudos", ".lia-message-stats"}}, [RowSelector=".lia-messages-message-card"]),
    #"Extracted Table From Html2" = Html.Table(Source2, {{"Title", ".message-subject"}, {"Author", ".lia-users-user-login"}, {"Kudos", ".lia-message-stats"}}, [RowSelector=".lia-messages-message-card"]),
    #"Extracted Table From Html3" = Html.Table(Source3, {{"Title", ".message-subject"}, {"Author", ".lia-users-user-login"}, {"Kudos", ".lia-message-stats"}}, [RowSelector=".lia-messages-message-card"]),
    #"Appended Query" = Table.Combine({#"Extracted Table From Html",#"Extracted Table From Html1",#"Extracted Table From Html2",#"Extracted Table From Html3"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Appended Query",{{"Title", type text}, {"Author", type text}, {"Kudos", Int64.Type}}),
    #"Removed Duplicates" = Table.Distinct(#"Changed Type", {"Title"})
in
    #"Removed Duplicates"

Basically, I just copy-pasta the the "Source" and "Extracted Table From Html" lines and create a Source line for each of my URL's and a corresponding "Extacted Table from Html" line referring to each source in turn. Then I can use a Table.Combine to Append all of these tables together. A couple cleanup operations and POOF! a dynamic table of all of the Quick Measures culled directly from the Power BI Community site!

 

But what about those pesky URL's? We are going to need those. Unfortunately this is the hard work portion of the exercise. Because I couldn't get those URL's dynamically, I had to go get them manually. Here is what I did. Within Desktop (not Query Editor) I right-clicked on my new table of quick measures and selected "Copy Table". I could then copy this data into an "Enter Data" query, delete all of the columns except Title and then create a new URL column in which I manually went through and pasted the URL's of each individual quick measure. Not super fancy or my first choice of how to go about things but, oh well. I saved this as a new query called "GetDetails". And now that I have this, here is the where the real fun begins.

 

Creating a Function

Start by creating a new Web query and point it to an individual page of a quick measure, like Measure Totals The FInal Word. Using the same Web by Example process as before, we can end up with a query that looks like:

 

    let
        Source = Web.BrowserContents(url),
        #"Extracted Table From Html" = Html.Table(Source, {{"Title", ".final-crumb"}, {"Date", ".lia-message-dates:nth-last-child(1) > SPAN.DateTime.lia-message-posted-on.lia-component-common-widget-date:nth-child(1):nth-last-child(2)"}, {"Views", ".lia-message-metrics-views"}, {"Text", "P:nth-last-child(31)"}, {"Thumbnail", ".lia-attachment.attachment-1"}, {"PBIX", ".attachment-4"}, {"Labels", ".label:nth-child(1)"}}, [RowSelector=".final-crumb"]),
        #"Changed Type" = Table.TransformColumnTypes(#"Extracted Table From Html",{{"Title", type text}, {"Date", type datetime}, {"Views", type text}})
    in
        #"Changed Type"

Again, super simple. Here we are grabbing the Title, Date and other information from the individual quick measure page. Now, to turn this into a function, we simply add a few lines to the beginning and end of our query like this:

 

let
    fnGetDetail = (url) =>

    let
        Source = Web.BrowserContents(url),
        #"Extracted Table From Html" = Html.Table(Source, {{"Title", ".final-crumb"}, {"Date", ".lia-message-dates:nth-last-child(1) > SPAN.DateTime.lia-message-posted-on.lia-component-common-widget-date:nth-child(1):nth-last-child(2)"}, {"Views", ".lia-message-metrics-views"}, {"Text", "P:nth-last-child(31)"}, {"Thumbnail", ".lia-attachment.attachment-1"}, {"PBIX", ".attachment-4"}, {"Labels", ".label:nth-child(1)"}}, [RowSelector=".final-crumb"]),
        #"Changed Type" = Table.TransformColumnTypes(#"Extracted Table From Html",{{"Title", type text}, {"Date", type datetime}, {"Views", type text}})
    in
        #"Changed Type"
in
fnGetDetail

So, essentially, we "wrap" our original query in a new "let" statement. We specify the name of our function and that it expects a parameter to be passed in. At the end we return the output of our function. Once we have created this function and named it appropriately (fnGetDetail) then we can use it in an "Invoke Custom Function" custom column from our original "GetDetails" table. 

 

qm4.png

 

This then adds all of the information from that function into our existing table for each row, using the URL from that row as the parameter passed into the function!

 

Conclusion

By using the "Web by Example" feature along with just a little bit of code, we can turn Web by Example queries into powerful functions to collect data for multiple URLs that have similar layouts, such as the Quick Measures Gallery on the Power BI Community Site. This can allow us to quick and efficiently collect data into a report with the least amount of effort possible.

 

Check out the attachment to see exactly what was done or check out my post in the Data Stories Gallery.

Attachment
Comments
by Super User
Friday - last edited Sunday
Web Analytics Made Easy -
StatCounter

Tracking counter