gotmike
Frequent Visitor

how to create a query that paginates?

I'm working with the Hubspot CRM API and when you query for a list of all deals, you only get 100 records at a time, and they want you to send subsequent queries with an "offset" to paginate the results.

 

For instance, if you send:

https://api.hubapi.com/deals/v1/deal/all?hapikey=demo

 

at the very end of the query, you see the following JSON:

 

"hasMore":false
"offset":27939158

so, if hasMore is true, the NEXT query should look like this:

https://api.hubapi.com/deals/v1/deal/all?hapikey=demo&offset=27939158

 

and then, we would want to repeat the process until hasMore comes back with false.

 

I'm completely new to Power BI, so I would love to know how to handle this type of query process.

 

In another language, this would just be do { } while (hasMore);

or something like that...
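In M (the Power Query formula language), that do/while maps onto List.Generate. A rough, untested sketch against the endpoint above — the deals field name is an assumption about the HubSpot v1 response shape, and error handling is omitted:

```
let
    BaseUrl = "https://api.hubapi.com/deals/v1/deal/all?hapikey=demo",
    // One call; pass null for the very first request (no offset yet)
    GetPage = (offset as nullable number) =>
        Json.Document(Web.Contents(
            if offset = null then BaseUrl else BaseUrl & "&offset=" & Text.From(offset))),
    // Each generated state is one parsed page; stop once a page says hasMore = false
    Pages = List.Generate(
        () => GetPage(null),                                  // seed: first call
        each _ <> null,                                       // condition: keep going while we have a page
        each if [hasMore] then GetPage([offset]) else null,   // next: follow the returned offset
        each [deals]),                                        // selector: keep only the records of each page
    AllDeals = List.Combine(Pages)
in
    AllDeals
```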

214 REPLIES
Kalyankumar
Regular Visitor

Hi Mike, I am also looking for a solution to the same problem.

Anonymous
Not applicable

Getting the total number of results and paginating are two different operations. For the sake of this example, let's assume that the query you're dealing with is

SELECT * FROM Orders WHERE OrderDate >= '1980-01-01' ORDER BY OrderDate

In this case, you would determine the total number of results using:

SELECT COUNT(*) FROM Orders WHERE OrderDate >= '1980-01-01'

...which may seem inefficient, but is actually pretty performant, assuming all indexes etc. are properly set up.

Next, to get actual results back in a paged fashion, the following query would be most efficient:

SELECT  *
FROM    ( SELECT    ROW_NUMBER() OVER ( ORDER BY OrderDate ) AS RowNum, *
          FROM      Orders
          WHERE     OrderDate >= '1980-01-01'
        ) AS RowConstrainedResult
WHERE   RowNum >= 1
    AND RowNum < 20
ORDER BY RowNum

This will return rows 1-19 of the original query. The cool thing here, especially for web apps, is that you don't have to keep any state, except the row numbers to be returned.

Anonymous
Not applicable

Thank you very much for your answer @Anonymous !

 

I am so sorry, but I have no clue how your answer helps me.

 

The JIRA REST API is limited to 1000 rows. As I understand it, the problem is that I need some looping mechanism to get all the data. List.Generate seems to act like a loop, but unfortunately I can't get it to work...
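List.Generate is indeed M's loop construct: it takes a seed, a continue-condition, a next-state function and an optional selector. A minimal illustration, independent of any API:

```
// Emits the page numbers a do/while loop would visit
List.Generate(
    () => 0,        // seed: start at page 0
    each _ < 5,     // condition: continue while true
    each _ + 1      // next: advance the state
)
// -> {0, 1, 2, 3, 4}
```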

 

So I hope somebody already had the same problem.

 

freiburgc

Anonymous
Not applicable

Hi @Anonymous ,

 

This can be sorted by following the steps below.

You'll need four building blocks: one parameter and three functions. Replace the information to suit your organization's URL.

 

  1. Parameter: URL

e.g., https://[Replace with your organization URL]/jira

 

  2. Function 01: FetchPage

 

let
    FetchPage = (url as text, pageSize as number, skipRows as number) as table =>
        let
            // Here is where you run the code that will return a single page
            contents = Web.Contents(URL & "/rest/api/2/search", [Query = [maxResults = Text.From(pageSize), startAt = Text.From(skipRows)]]),
            json = Json.Document(contents),
            Value = json[issues],
            table = Table.FromList(Value, Splitter.SplitByNothing(), null, null, ExtraValues.Error)
        in
            table meta [skipRows = skipRows + pageSize, total = 500]
in
    FetchPage

 

  3. Function 02: FetchPages

 

let
    FetchPages = (url as text, pageSize as number) =>
        let
            Source = GenerateByPage(
                (previous) =>
                    let
                        skipRows = if previous = null then 0 else Value.Metadata(previous)[skipRows],
                        totalItems = if previous = null then 0 else Value.Metadata(previous)[total],
                        table = if previous = null or Table.RowCount(previous) = pageSize then
                                    FetchPage(url, pageSize, skipRows)
                                else
                                    null
                    in
                        table,
                type table [Column1])
        in
            Source
in
    FetchPages

 

  4. Function 03: GenerateByPage

 

(getNextPage as function, optional tableType as type) as table =>
    let
        listOfPages = List.Generate(
            () => getNextPage(null),
            (lastPage) => lastPage <> null,
            (lastPage) => getNextPage(lastPage)
        ),
        tableOfPages = Table.FromList(listOfPages, Splitter.SplitByNothing(), {"Column1"}),
        firstRow = tableOfPages{0}?,
        keys = if tableType = null then Table.ColumnNames(firstRow[Column1])
               else Record.FieldNames(Type.RecordFields(Type.TableRow(tableType))),
        appliedType = if tableType = null then Value.Type(firstRow[Column1]) else tableType
    in
        if tableType = null and firstRow = null then
            Table.FromRows({})
        else
            Value.ReplaceType(Table.ExpandTableColumn(tableOfPages, "Column1", keys), appliedType)

 

Once the above is created, the next step is to use the function and parameter to retrieve the content. You can start with the query below and expand the fields of your choice to build your report.

 

Main query:

 

let
    Source = FetchPages("", 500),
    #"Expanded Column1" = Table.ExpandRecordColumn(Source, "Column1", {"key", "fields"}, {"key", "fields"})
in
    #"Expanded Column1"

 

Hope this helps you to build the report. Please note this query only retrieves the search information; if you need changelog information, you can modify that in the second function listed above.

 

Good luck.

 

Cheers,

Anand

Anonymous
Not applicable

Hi @Anonymous ,

 

Thanks a lot for your answer!!!

 

Unfortunately I still have a few questions, because this topic is a bit over my head at the moment.

I added the parameter URL and it works.

 

The next step is to paste all the functions to my query, right?

 

I did it and got a failure on the FetchPages function: there is a problem with “let”.

I don't know why.

 

To understand a bit of what your code does, I just added Function 01 (FetchPage) to my query.

I got three parameters: URL, skipRows and pageSize.

 

Do I understand the first function right, that it gives me my table as one page, depending on the pageSize? If I set this parameter to 2000, I get 2000 rows on one page. However, it's just static, not dynamic?

 

Sorry for the stupid questions, but the M language is difficult for me and I'm trying to understand it.

 

thanks a lot for your support and hopefully your patience!

freiburgc

Anonymous
Not applicable

Hi @Anonymous ,

 

I have tried to work with Multi-step function.

 

Unfortunately I got a problem - Expression.Error: The Name "GenerateByPage" wasn't recognized.

 

I tried "Table.GenerateByPage"; that doesn't work either.

 

let
    FetchPages = (url as text, pageSize as number) =>
        let
            Source = GenerateByPage(
                (previous) =>
                    let
                        skipRows = if previous = null then 0 else Value.Metadata(previous)[skipRows],
                        totalItems = if previous = null then 0 else Value.Metadata(previous)[total],
                        table = if previous = null or Table.RowCount(previous) = pageSize then
                                    FetchPage(url, pageSize, skipRows)
                                else
                                    null
                    in
                        table,
                type table [Column1])
        in
            Source
in
    FetchPages

Maybe you or someone else could have a look at it; that would be great!!

 

Thanks again!

freiburgc

Anonymous
Not applicable

Hi @Anonymous ,

 

Instead of having them all listed in one query, could you please create them as I indicated, i.e. as three separate queries, and give it a try?

 

Regards,

A!

Anonymous
Not applicable

Hi @Anonymous ,

 

I have created different queries and it almost works now.

 

Unfortunately another problem has emerged.

I have extracted a custom field in the query and I got a failure:

"Expression.Error: The value "null" cannot be converted to type List.

Details:

Value=

Type=[Type]

= Table.TransformColumns(#"Erweiterte Column1.fields.customfield_13101", {"Column1.fields.customfield_13100", each Text.Combine(List.Transform(_, Text.From)), type text})

Do you have an idea how to solve it?

 

Thanks alot!

freiburgc

Anonymous
Not applicable

Here is a sample that I used on a similar issue. You can use List.Generate in a similar way to loop through all pages.

 

 

let
    BaseUrl = "http://xxxx",
    Token = [Headers = [#"key" = "xxx"]],
    EntitiesPerPage = 1000,
    WebCall = try Json.Document(Web.Contents(BaseUrl, Token)),
    Value = WebCall[Value],
    count = Value[totalHits],
    countMax = Number.RoundUp(List.Max({1, count / EntitiesPerPage})),

    nextURL = (counter, sid) =>
        let
            url = BaseUrl & "&pageNumber=" & Text.From(counter) & "&scrollId=" & Text.From(Record.Field(sid, "Value")),
            call = Web.Contents(url, Token)
        in
            call,

    FnGetOnePage = (url) as record =>
        let
            Source = Json.Document(url),
            data = try Source[featureMatchEvents],
            next = try Source[scrollId],
            res = [Data = data, Next = next]
        in
            res,

    GeneratedList = List.Generate(
        () => [i = 1, res = FnGetOnePage(nextURL(1, [Value = ""]))],
        each [i] < countMax,
        each [i = [i] + 1, res = FnGetOnePage(nextURL(i, [res][Next]))],
        each [res][Data])
in
    GeneratedList

Anonymous
Not applicable

Hi,

I'm despairing a little over my problem. I have worked my way back and forth through this thread, but haven't found the right solution.

 

I want to import all the data from my JIRA database via the REST API. I get a connection to JIRA, but my data is limited to 1000 rows. Unfortunately I have 13,300, and it's getting more every day.

 

I have tried almost every solution in this thread, but nothing seems to work for me, or I am already so confused that I don't see my mistakes.

 

This is my basic code with the limitation of 1000:

 

let
    Quelle = Json.Document(Web.Contents("https://XXXX.XXXXXX.com/rest/api/2/search?jql=project%20in%20(10001%2C10001%2C13560)&maxResults=2000&fields=aggregatetimeoriginalestimate,timeestimate,timespent,customfield_13100,customfield_13101"))
in
    Quelle

 

Does anybody has a solution for it or can give some hints?

 

Thanks a lot for your help!

freiburgc

Anonymous
Not applicable

@ImkeF 

After reading the thread, I think this is what I need to fix the API pull issue we have been trying to figure out for a long time.

So basically we have a link given by the vendor, and we can pull data.

The issue is it's limited to 20,000 rows per call.

 

[screenshot: 222.PNG]

 

I tried to use some of the code from here and failed (still new to coding).

Here's what I have tried.

let
    Pagination = List.Skip(
        List.Generate(
            () => [Last_Key = "init", Counter = 0], // start value
            each [Last_Key] <> null, // condition under which the next execution will happen
            each [
                Last_Key = try if [Counter] < 1 then "" else [WebCall][Value][offset] otherwise null, // determine the Last_Key for the next execution
                WebCall = try if [Counter] < 1 then Json.Document(Web.Contents("Weblink+tokenkey"))
                          else Json.Document(Web.Contents("Weblink+tokenkey&offset=" & Last_Key & "")), // retrieve results per call
                Counter = [Counter] + 1 // internal counter
            ],
            each [WebCall]
        ), 1)
in
    Pagination

 

The JSON pull looks like this.

 

[screenshot: 4444.PNG]

Anonymous
Not applicable

Hi all (& specifically @ImkeF as he is clearly a superhero!)

 

I'm very new to Power BI, a couple of weeks at best, and very new to querying APIs. I appreciate that's a recipe for trouble, but I'm a relatively quick learner, so bear with me 😉

 

I'm querying a REST API (specifically GoCardless, if anyone else needs a solution) which returns results in JSON. I'm trying to get a list of items to build a dataset.

 

Unfortunately, a single call is limited to 500 records, and I need in the region of 50,000.

 

I've researched the API Documentation and it's possible to fetch the next set of 500 by passing an ID from the first response.

 

The initial request is just a standard URL with no parameters - something like:

 

https://www.blah.com/events

 

Subsequent calls must include the 'after' parameter and the id so would look something like this:

 

https://www.blah.com/events?after=X0001

 

The JSON response from the first call looks like this:

 

    {
    "events": [
        {
            "id": "X0001",
            "created_at": "2020-03-24T08:04:27.593Z",
            "resource_type": "mandates",
            "action": "cancelled",
            "links": {
                "mandate": "MD0005CS4YKG3P"
            },
            "details": {
                "origin": "bank",
                "cause": "mandate_cancelled",
                "scheme": "bacs",
                "reason_code": "ADDACS-1",
                "description": "The mandate was cancelled at a bank branch."
            },
            "metadata": {}
        }
    ],
    "meta": {
        "cursors": {
            "before": null,
            "after": "X0001"
        },
        "limit": 500
    }
}

 

I've done lots of reading so my assumption is I need to write a query that does something like:

 

  • Initial API call is as standard
  • Pick after value from JSON and write to a temp table
  • Additional API call adding parameter including value in table
  • Loop through this until 'after' value becomes 'null'

Does that sound about right, and could someone help me? I've got a bit lost in all the comments although I think I grasp the principle.

Thanks,

Chris

 

Hi @Anonymous 

you can check after each step whether a new call should be made by checking if the "after" value isn't null:

[Result][meta][cursors][after] <> null 

For the development/testing phase, I've added another condition that prevents this query from running into an infinite loop:

[Counter] < 1000. You should delete that condition once the query runs as you expect:

let
    Source = "    {#(cr)#(lf)    ""events"": [#(cr)#(lf)        {#(cr)#(lf)            ""id"": ""X0001"",#(cr)#(lf)            ""created_at"": ""2020-03-24T08:04:27.593Z"",#(cr)#(lf)            ""resource_type"": ""mandates"",#(cr)#(lf)            ""action"": ""cancelled"",#(cr)#(lf)            ""links"": {#(cr)#(lf)                ""mandate"": ""MD0005CS4YKG3P""#(cr)#(lf)            },#(cr)#(lf)            ""details"": {#(cr)#(lf)                ""origin"": ""bank"",#(cr)#(lf)                ""cause"": ""mandate_cancelled"",#(cr)#(lf)                ""scheme"": ""bacs"",#(cr)#(lf)                ""reason_code"": ""ADDACS-1"",#(cr)#(lf)                ""description"": ""The mandate was cancelled at a bank branch.""#(cr)#(lf)            },#(cr)#(lf)            ""metadata"": {}#(cr)#(lf)        },#(cr)#(lf)#(tab)],#(cr)#(lf)    ""meta"": {#(cr)#(lf)        ""cursors"": {#(cr)#(lf)            ""before"": null,#(cr)#(lf)            ""after"": ""X0001""#(cr)#(lf)        },#(cr)#(lf)        ""limit"": 500#(cr)#(lf)    }#(cr)#(lf)}",
    #"Parsed JSON" = Json.Document(Source),
    Custom2 = List.Generate( () =>
        [Result = #"Parsed JSON", Counter = 0],
        each [Result][meta][cursors][after] <> null and [Counter] < 1000,
        each [
            Result = Json.Document(Web.Contents("https://www.blah.com/events?after=" & [NextPage])),
            NextPage = Result[meta][cursors][after],
            Counter = [Counter] + 1
        ]
    )
in
    Custom2

 

You can also check out this video to learn more about the technique: https://www.youtube.com/watch?v=vhr4w5G8bRA

 

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Thank you very much @ImkeF, your answer greatly helped me get the list of pagination tokens for Twitter API v2!

If I may, for your code to work and based on what you wrote, it should look like this:

 

    #"Parsed JSON" = Json.Document(Source),
    Custom2 = List.Generate( () =>
        [Result = #"Parsed JSON", Counter = 0],
        each [Result][meta][cursors][after] <> null and [Counter] < 1000,
        each [
            Result = Json.Document(Web.Contents("https://www.blah.com/events?after=" & Result[meta][cursors][after])),
            Counter = [Counter] + 1
        ]
    )
in
    Custom2

 

Personally, my code looks like this:

    Source = 
        List.Generate( 
        () =>
        [Result = Fx_GetData(P_URL)],
        each [Result]<>null,
        each [Result = try Fx_GetData(P_URL&"&pagination_token="&[Result][meta][next_token]) otherwise null]
    ),

This way it handles error cases.

Thank you again 🙏

Anonymous
Not applicable

Thanks very much @ImkeF, I think that makes sense to me.

 

However, isn't this assuming I start with the JSON? I actually get the JSON as an API response; the existing query (that returns the JSON) just looks like this:

 

let
    Source = Json.Document(Web.Contents("https://api.gocardless.com/events", [Headers=[Authorization="Bearer ???accesstoken???", #"GoCardless-Version"="2015-07-06"]])),
    events = Source[events],
    #"Converted to Table" = Table.FromList(events, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"id", "created_at", "resource_type", "action", "links", "details", "metadata"}, {"id", "created_at", "resource_type", "action", "links", "details", "metadata"}),
    #"Expanded links" = Table.ExpandRecordColumn(#"Expanded Column1", "links", {"mandate"}, {"mandate"}),
    #"Expanded details" = Table.ExpandRecordColumn(#"Expanded links", "details", {"origin", "cause", "scheme", "reason_code", "description"}, {"origin", "cause", "scheme", "reason_code", "description"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded details",{"metadata"})
in
    #"Removed Columns"

 

Obviously this has me converting the data into tables too.
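One way the two pieces could be combined into a single query — a hedged, untested sketch against GoCardless; the headers are copied from the query above and the cursor logic follows Imke's reply:

```
let
    BaseUrl = "https://api.gocardless.com/events",
    Options = [Headers = [Authorization = "Bearer ???accesstoken???", #"GoCardless-Version" = "2015-07-06"]],
    GetPage = (after as nullable text) =>
        Json.Document(Web.Contents(
            if after = null then BaseUrl else BaseUrl & "?after=" & after, Options)),
    // Loop: follow meta.cursors.after until it comes back null
    Pages = List.Generate(
        () => GetPage(null),
        each _ <> null,
        each let next = [meta][cursors][after]
             in if next = null then null else GetPage(next),
        each [events]),
    events = List.Combine(Pages),
    #"Converted to Table" = Table.FromList(events, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1",
        {"id", "created_at", "resource_type", "action", "links", "details", "metadata"},
        {"id", "created_at", "resource_type", "action", "links", "details", "metadata"})
in
    #"Expanded Column1"
```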

Hi @Anonymous 

Have you considered adjusting your query?

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Anonymous
Not applicable

@ImkeF  I absolutely have 😆

 

I've rewritten it about 15 times today in Notepad++, but the bit I can't seem to figure out is how to capture the JSON output and feed it back into the query.

 

I understand that in the query you've written, you're feeding in the JSON as the source directly, then parsing it, then using part of it to build the next URL, looping through until done.

 

And in mine I'm just calling the API, capturing the JSON results and converting them to a table.

 

The bit I don't understand is how I'd get the results from my query and feed them into what you've written, in one query, because the JSON has to be received from the original call. Unless I'm missing something really obvious, which, as I say, could well be the case since I'm very new to this.

 

btw, side note, is there any reliable syntax highlighting for this?

Anonymous
Not applicable

Scratch that, found a Power Query XML import for Notepad++


Happy to share if anyone requires.

@Anonymous thanks for the posts, did you manage to get your code working? I am using the same source as you (GC) and I'm having trouble manipulating the code so that it refreshes in Power BI online. I have managed to create my report so that I can refresh it in Power BI Desktop, but obviously this is not the ideal solution.
