Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
gotmike
Frequent Visitor

how to create a query that paginates?

I'm working with the Hubspot CRM API and when you query for a list of all deals, you only get 100 records at a time, and they want you to send subsequent queries with an "offset" to paginate the results.

 

For instance, if you send:

https://api.hubapi.com/deals/v1/deal/all?hapikey=demo

 

at the very end of the query, you see the following JSON:

 

"hasMore":false
"offset":27939158

so, if hasMore is true, the NEXT query should look like this:

https://api.hubapi.com/deals/v1/deal/all?hapikey=demo&offset=27939158

 

and then, we would want to repeat the process until hasMore comes back with false.

 

i'm completely new to power bi, so would love to know how to handle this type of query process.

 

in another language, this would just be do { } while (hasMore == false);

or something like that...

214 REPLIES 214
Anonymous
Not applicable

Hi,

 

Please see below. Sorry if my code is way off but I am new to this type of language and PBI.

 

let
    Source = Table.FromColumns({{1..500/100}}),
    GetData = Table.AddColumn(Xml.Tables(Web.Contents("http://example.co.uk/feedback-details?from=2016-01-01%2009:00:00&to=2017-05-03%2009:30:00&page="&Text.From([Column1]))))

in
    GetData

Thanks for your help

 

Mike

let
    Source = Table.FromColumns({{1..500/100}}),
    GetData = Table.AddColumn(Source, "New", each Xml.Tables(Web.Contents("http://example.co.uk/feedback-details?from=2016-01-01%2009:00:00&to=2017-05-03%2009:30:00&page="&Text.From([Column1]))))

in
    GetData

You've handcoded it - using the UI would have given you the additional values 😉

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Anonymous
Not applicable

Thank you! That's worked perfectly! 🙂

 

Regards

 

Mike

moritz1
Regular Visitor

Hello!

After reading most of the Thread and a couple of hours of try and error later, frustration and mind block starts to kick in for me.

 

I am trying to adapt the code samples to request data from Airtable.com, which uses the same Pagination with an offset-Parameter like in the last example. Maybe someone can help me?? Smiley Frustrated

 

To make the reproduction easier, I created a sample database with five entries.

 

Example how the API works, I am requesting only two records (with the pageSize=2 Parameter) to allow testing with minimal data...

 

 

// 20170424221134
// https://api.airtable.com/v0/appiXnlh3PAZ46394/Room%20Assessment?api_key=keyi0umQebbJneGDO&pageSize=2

{
  "records": [
    {
      "id": "rec9xyewCK97T01kP",
      "fields": {
        "Room": "Garage",
        "Condition": "Poor",
        "Priority": "Low"
      },
      "createdTime": "2017-04-24T19:53:01.511Z"
    },
    {
      "id": "recZmk4a5kmxdfK7O",
      "fields": {
        "Condition": "Poor",
        "Priority": "Medium",
        "Room": "Kitchen"
      },
      "createdTime": "2015-11-16T22:48:35.000Z"
    }
  ],
  "offset": "itrANA53fD5J9bdLa/recZmk4a5kmxdfK7O"
}

 

 

The offset-Parameter can be inserted in the next call, to get the next two records:

 

 

 

https://api.airtable.com/v0/appiXnlh3PAZ46394/Room%20Assessment?api_key=keyi0umQebbJneGDO&pageSize=2&offset=itrANA53fD5J9bdLa/recZmk4a5kmxdfK7O

 

 

 

To request the whole dataset (without pagination), I would perform the following in Power BI

 

 

let
    Source = Json.Document(Web.Contents("https://api.airtable.com/v0/appiXnlh3PAZ46394/Room%20Assessment?api_key=keyi0umQebbJneGDO")),
    records = Source[records],
    #"Converted to Table" = Table.FromList(records, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"id", "fields", "createdTime"}, {"Column1.id", "Column1.fields", "Column1.createdTime"}),
    #"Expanded Column1.fields" = Table.ExpandRecordColumn(#"Expanded Column1", "Column1.fields", {"Condition", "Priority", "Notes", "Room", "Started?", "Projects"}, {"Column1.fields.Condition", "Column1.fields.Priority", "Column1.fields.Notes", "Column1.fields.Room", "Column1.fields.Started?", "Column1.fields.Projects"})
in
    #"Expanded Column1.fields"

 

Now I tried to bring together this coding and the example coding from earlier:

 

let
Pagination = List.Skip(List.Generate( () => [Last_Key = "", Counter=0], // Start Value
   		each  [Last_Key] <> null and [Last_Key] <> "", // Condition under which the next execution will happen
   		each [ Last_Key = try if [Counter]<=1 then "" else [WebCall][offset] otherwise null,// determine the LastKey for the next execution
                       WebCall = Json.Document(Web.Contents("https://api.airtable.com/v0/appiXnlh3PAZ46394/Room%20Assessment?api_key=keyi0umQebbJneGDO&pageSize=2&offset="&Last_Key&"")), // retrieve results per call
    		       Counter = [Counter]+1// internal counter
                      ],
   		each [WebCall]
),1)
, #"Converted to Table" = Table.FromList(Pagination, Splitter.SplitByNothing(), null, null, ExtraValues.Error)
, #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"id", "fields", "createdTime"}, {"Column1.id", "Column1.fields", "Column1.createdTime"})
, #"Expanded Column1.fields" = Table.ExpandRecordColumn(#"Expanded Column1", "Column1.fields", {"Condition", "Priority", "Notes", "Room", "Started?", "Projects"}, {"Column1.fields.Condition", "Column1.fields.Priority", "Column1.fields.Notes", "Column1.fields.Room", "Column1.fields.Started?", "Column1.fields.Projects"})
in
  #"Expanded Column1.fields"

But I get an error the column "Column1" of the table could not be found. It seems that my "Pagination" is already empty. 

 

Edit: It seems that the errors comes up because the offset parameter cannot be empty on the first call and can only be filled with valid values, otherwise the API call will fail. I removed the and [Last_Key] <> "" from the third row and now I finally get data (hooray!!) but two records are fetched twice, so instead of five entries it returns seven. Is there any way I can initialize the "Last_Key" and only pass it to the offset-Parameter when it is not initial?

 

Can anyone help me? Thank you very much!! Smiley Happy

 

Best regards

Moritz

 

 

Hi @gotmike,

 

there is no concept of loops in M (Power Query), but you can use recursive functions, to reach your goals. The following article by Chris Webb uses this concept to flat a parent/child hierarchy. I am not sure how to use this in combination with your API call, but maybe this is one more step into the right direction for you 🙂

 

Regards,

Lars

You may also want to avoid recursion as well since we don't do tail recursion optimization. It depends how many results you need to iterate over. If the result is large, look into using functions such as List.Accumulate or List.Generate.

This article contains of very good explanation of how List.Generate works: http://blog.crossjoin.co.uk/2014/06/25/using-list-generate-to-make-multiple-replacements-of-words-in...

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

EDIT:  I posted this to the wrong thread(!).  I should have replied here, and I'll continue and close the thread there.

 

Thanks ImkeF

 

My apologies for taking so long to acknowledge your support.  I've finally got around to having a go at List.Generate() and I reckon that I'm on the right path.  However, I think that I've created an infinite loop and thus launch a small-scale DoS attack against the Web API I'm trying to hit 🙂 

 

Would you mind taking a look at my query text and giving me your opinion?  When I run it, I receive no errors, but the query appears to run forever.  At this point all that I'm trying to do is generate a list of URIs where I supply the first, and generate the rest based on the initial return.

 

let
    URIList = List.Generate(
    ()=> [SourceURI="https://api.dovico.com/TimeEntries/?version=5"],
    each Text.PositionOf([SourceURI],"N/A") = -1,
    each [
        Source = Web.Contents([SourceURI],[Headers=[#"Authorization"="WRAP access_token=""client=XXX&user_token=YYY"""]]),
        ImportedXML = Xml.Tables(Source,null,1252),
        ChangedType = Table.TransformColumnTypes(ImportedXML,{{"PrevPageURI", type text

I cannot spot anything obvious.

Have never used it without the counter, but cannot see why this shouldn't work.

Do you know how many loops to expect?

 

Although it should be lazy evaluating - in case it doesn't, you can shorten the command like this:

 

Source = Web.Contents([SourceURI],[Headers=[#"Authorization​"="WRAP access_token=""client=XXX&user_token=YYY"""]]),
        ImportedXML = Xml.Tables(Source,null,1252)[NextPageURI]{0}

  ],
    each [SourceURI]

 

This should only retrieve the value from column "NextPageURI" in the first row {0}

 

Well actually, maybe it's this: Text.PositionOf([SourceURI],"N/A") = -1 : You're searching for something that shouldn't be there. Could you replace it by a positive identification?

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

nakia
Frequent Visitor

Hi,

 

Just wondering if you ever got this to work?

 

I have almost the exact same need right now that I would like to resolve.

 

The API I am talking to has a starting URI and if it has more that 10 records to return it will paginate and provide in the initial response the "NextURI" for me to retrieve the next 10 records.

 

Appreciate any help I could get!

 

 

Hi there

 

Sorry for the delay in getting back to you.  I did get it to work, but it is slow.  I'm unsure where the bottleneck is (my code, or the service I'm hitting).  Either way, here is the code I pulled together:

 

let
    /*
      Purpose: Request all timesheet records from Dovico      
    */

    /*
      Generate a list of XML objects based on the Dovico's pagination of its return set
      The initial URL is:  https://api.dovico.com/TimeEntries/?version=5
      Pagination is controlled by return randomly-generated values in URI attributes PrevPageURI and NextPageURI
      The limit in each direction is denoted by the value "N/A"
      Date filtering can be introduced by using:  https://api.dovico.com/TimeEntries/?daterange=2016-04-01%202016-04-15&version=5
      See the Dovico API for more information:  http://apideveloper.dovico.com/Time+Entries
    */
    DataList = List.Generate(
                    ()=> [SourceURI="https://api.dovico.com/TimeEntries/?version=5",ImportedXML=""],
                    each Text.PositionOf([SourceURI],"N/A") = -1,
                    each [
        Source = Web.Contents([SourceURI],[Headers=[#"Authorization"="WRAP access_token=""client=CLIENT_TOKEN&user_token=USER_TOKEN"""]]),
        ImportedXML = Xml.Tables(Source,null,1252),
        ChangeType = Table.TransformColumnTypes(ImportedXML,{{"PrevPageURI", type text}, {"NextPageURI", type text}}),
        SourceURI = Record.Field(Table.First(ChangeType),"NextPageURI"),
        TimeEntries = ChangeType{0}[TimeEntries],
        TimeEntry = TimeEntries{0}[TimeEntry]
        ],
    each [[SourceURI],[ImportedXML]]
    ),

    /*
      Now expand and shape the list of XML objects into a single table of data
    */
    ConvertToTable = Table.FromList(DataList, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(ConvertToTable, "Column1", {"ImportedXML"}, {"ImportedXML"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded Column1", each ([ImportedXML] <> "")),
    #"Expanded ImportedXML" = Table.ExpandTableColumn(#"Filtered Rows", "ImportedXML", {"TimeEntries"}, {"TimeEntries"}),
    #"Expanded TimeEntries" = Table.ExpandTableColumn(#"Expanded ImportedXML", "TimeEntries", {"TimeEntry"}, {"TimeEntry"}),
    #"Expanded TimeEntry" = Table.ExpandTableColumn(#"Expanded TimeEntries", "TimeEntry", {"Sheet", "Client", "Project", "Task", "Employee", "Date", "TotalHours", "Description"}, {"Sheet", "Client", "Project", "Task", "Employee", "Date", "TotalHours", "Description"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Expanded TimeEntry",{{"Date", type date}, {"TotalHours", type number}, {"Description", type text}}),
    #"Expanded Project" = Table.ExpandTableColumn(#"Changed Type1", "Project", {"Name"}, {"Project.Name"}),
    #"Expanded Task" = Table.ExpandTableColumn(#"Expanded Project", "Task", {"Name"}, {"Task.Name"}),
    #"Expanded Employee" = Table.ExpandTableColumn(#"Expanded Task", "Employee", {"Name"}, {"Employee.Name"}),
    #"Expanded Client" = Table.ExpandTableColumn(#"Expanded Employee", "Client", {"Name"}, {"Client.Name"}),
    #"Expanded Sheet" = Table.ExpandTableColumn(#"Expanded Client", "Sheet", {"Status"}, {"Sheet.Status"}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded Sheet",{{"Client.Name", "Customer"}, {"Sheet.Status", "Approval status"}, {"Project.Name", "Project"}, {"Task.Name", "Task"}, {"Employee.Name", "Raw Name"}, {"TotalHours", "Effort (hrs)"}}),
    #"Added Week Ending" = Table.AddColumn(#"Renamed Columns", "Week ending", each Date.EndOfWeek([Date],Day.Saturday), type date),
    #"Added Name" = Table.AddColumn(#"Added Week Ending", "Name", each Text.Combine({List.Last(Text.Split([Raw Name],",")), List.First(Text.Split([Raw Name],","))}, " "), type text),
    #"Removed Columns" = Table.RemoveColumns(#"Added Name",{"Raw Name"}),
    #"Trimmed Text" = Table.TransformColumns(#"Removed Columns",{{"Name", Text.Trim}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Trimmed Text",{"Name", "Date", "Week ending", "Customer", "Project", "Task", "Effort (hrs)", "Description", "Approval status"}),
    #"Removed Columns1" = Table.RemoveColumns(#"Reordered Columns",{"Description", "Approval status"}),
    #"Sorted Rows" = Table.Sort(#"Removed Columns1",{{"Date", Order.Descending}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Sorted Rows",{{"Customer", type text}, {"Project", type text}, {"Task", type text}})
in
    #"Changed Type"

Hi @tempranello,

thanks for getting back and providing such a nice code.

 

Nothing really obvious in the List.Generate part, although: What about "TimeEntries" and "TimeEntry" - cannot see where you use them there and you don't return them in the output. So could you omit them? (Although I wouldn't believe that this is a performance killer).

 

The next step to localize the performance-problem would be to skip all steps after #"Expanded ImportedXML" This seems to be the step where the first "real" value from the procedure is shown and before further transformations take place.

 

You can try to put a buffer here:

ConvertToTable = Table.FromList(List.Buffer(DataList), Splitter.SplitByNothing(), null, null, ExtraValues.Error),

making sure to avoid multiple web-calls.

 

Or you use Fiddler to check how long the web-calls actually take.

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi ImkeF,

 

I'm trying to link to a dictionary provided by Investopedia.com with a slicer of 26 English letters plus a "#" mark. Under each option there are hundreds of terms available with a hyperlink to blow by blow explanations correspondingly. In order to reiterate the multiple pages over the web data it seems to require two parameters for alphabet and page number respectively in M language. I put a pbix here for your reference. Hopefully you would like to guide me to complete it. Thanks.

 

Julian Chen 

Hi @julianchen,

please have a look at this file: https://1drv.ms/u/s!Av_aAl3fXRbehbJKWOtDP86f5R726A

 

Does the table "Parameter Table_Rows" contain what you're after?

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi ImkeF,

 

It's amazing you've solved my major issues in a few seconds with only one tiny problem left as follows:

There are 67 terms with non-alphbet prefix missing. Could you please add them back and assign a "#" index to them for slicer selection? 

 

Julian Chen

Yes, it was late yesterday, so I didn't check it. Please have a look at it again:

1) the "#" is actually a "1" in the URL, so I've added a "TechnicalIndex"-column to your table

2) Also the URL had to be adjusted for all categories with just 1 page, as they don't use a page number as a URL parameter

 

Hope this is what you need now?

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi ImkeF,

 

I found the query Parameter Table_Rows just returned 2248 rows covering TechnicalIndex 1, A and B only. The total number is supposed to be 15,937 indeed. Could you please check it again to retrieve all the rows as a result? Sorry for bothering you so much. 

By the way, for those categories with just 1 page, it's still valid to include the page number  = 0 (like https://www.investopedia.com/terms/a/?page=0) as a URL parameter.

 

 

Julian  

Have you refreshed the query? I've used a filter which I might have forgotten to take out.

For learning purposes I'd suggest that you step through the queries: Pretty sure you find the filters (just delete them) and the answer to your URL question in there 😉

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

After removing the filter as you alerted, I also modified the Parameter Table_Rows query as shown below (starting from page = 0, and taking out if clause. Finally, I got 19,561 rows - excess 24 records of gap. I will check technicalIndex one after one to dig out what's the gap if possible. That's the learning purposes as you guided me. Thanks again for your great help. I've been expecting to have this "homework" to be done for quite a long time. 

 

let
    Source = #"Parameter Table",
    #"Added Custom" = Table.AddColumn(Source, "Pages", each {0..[Page]}),
    #"Expanded Pages" = Table.ExpandListColumn(#"Added Custom", "Pages"),
    #"Added Custom1" = Table.AddColumn(#"Expanded Pages", "Custom", each fnGetPages([TechnicalIndex], Text.From([Pages]))),
 // #"Added Custom1" = Table.AddColumn(#"Expanded Pages", "Custom", each if [Pages] = 1 then fnGetPages([TechnicalIndex], null) else fnGetPages([TechnicalIndex], Text.From([Pages]))),
    #"Removed Errors" = Table.RemoveRowsWithErrors(#"Added Custom1", {"Custom"}),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Removed Errors", "Custom", {"Column1", "Term", "Link", "Index"}, {"Column1", "Term", "Link", "Index.1"}),
    #"Inserted Text After Delimiter" = Table.AddColumn(#"Expanded Custom", "Text After Delimiter", each Text.AfterDelimiter([Term], "/", 2), type text),
    #"Renamed Columns" = Table.RenameColumns(#"Inserted Text After Delimiter",{{"Term", "Term_1"}, {"Text After Delimiter", "Term"}})
in
    #"Renamed Columns"

Anonymous
Not applicable

Hi all,

 

I have been struggling to paginate JIRA's rest API where the max results can be obtained for only 1000 rows. The pagination technique in this page so far provides a custom code which is really easy in an ideal scenario for a given scenario but when there is no ideal scenario or use different source with custom code which was my scenario and I stumbled on a very easy and an intiutive manner in which it was explained using Invoke Function out of the box instead of creating custom codes. Please refer to this page for more information and this helped to resolve the issue.

 

https://stackoverflow.com/questions/46904641/play-with-paginated-data-with-power-bi

 

Kind regards,

A!

Anonymous
Not applicable

Hi,

 

I'm trying to understand this code and manipulate it for my own query, but I'm having difficulty. Do you have any thoughts on how I would edit this in order to display all pages on this query?

 

I've attached a screen shot of what I see when I first bring the query in. There are 1132 pages and 56580 records, but when I expand it out, I only get 50 records.

 

Initial Query.jpg

 

Screen shot above and here is the final code below (the URL is bogus for security purposes):

 

 

 

let
Source = Json.Document(Web.Contents("https://api.thewebsite.com/stats/leads?&start=2016-01-01&end=2016-11-25api_key=myapikey")),
#"Converted to Table" = Record.ToTable(Source),
Value = #"Converted to Table"{2}[Value],
#"Converted to Table1" = Table.FromList(Value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table1", "Column1", {"bad", "campaign_id", "campaign_name", "converted", "created_at", "delivered", "delivery_summary_id", "error_delivery_summary_id", "good", "id", "invalid", "margin", "marked_for_wipe_at", "pickled_at", "profile_id", "profit", "purchase_price", "recipient_id", "rejected", "returned", "sale_price", "source_id", "source_lead_id", "values", "wiped_at"}, {"bad", "campaign_id", "campaign_name", "converted", "created_at", "delivered", "delivery_summary_id", "error_delivery_summary_id", "good", "id", "invalid", "margin", "marked_for_wipe_at", "pickled_at", "profile_id", "profit", "purchase_price", "recipient_id", "rejected", "returned", "sale_price", "source_id", "source_lead_id", "values", "wiped_at"})
in
#"Expanded Column1"

 

 

 

Thanks for any advise!!

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.