Solved: Handling a Paginated API with Prebuilt Next Page U...

Anonymous · ‎04-01-2021

I am trying to access data from the Asana API.

When making a call to the api for say, "projects,"

GET /projects?limit=5&workspace=xxxxxxx

Or in PBI,

Json.Document(Web.Contents("https://app.asana.com/api/1.0" & "/projects?" & "limit=5" & "&workspace=xxxxxxx", [Headers=[#"Content-Type"="application/json", Authorization="Bearer somePAT"]]))

The api will respond with a list of projects and their attributes, along with a "next_page" object like so:

{
  "data": [
    {
      "gid": "12345678910111212",
      "name": "Project 1",
      "resource_type": "project"
    },
    {
      "gid": "12345678910111213",
      "name": "Project 2",
      "resource_type": "project"
    }

.
.
.

],
  "next_page": {
    "offset": "someSequence",
    "path": "/projects?limit=5&workspace=xxxxxxxx&offset=someSequence",
    "uri": "https://app.asana.com/api/1.0/projects?limit=5&workspace=xxxxxxxx&offset=someSequence"
  }
}

Obviously I want to extract the list of projects, but since pagination is compulsory with a maximum of 100 records per call, I need a way to successively make calls based on the contents of "next_page." The "uri" is a fully formed api, so theres no need to do any calculations with offsets or limits. I just need to be able to access the projects, pull the next uri, and then make another call using that next uri. This needs to happen until a "next_page" is no longer returned in the response.

I've done a decent bit of research and have yet to find a workable solution.

What's the best way to go about this in Power BI? Any tips or useful code snippets or references to appropriate documentation would be greatly appreciated.

=================================================

EDIT:

I have since played around a bit and am trying to use List.Generate to solve my problem. In pseudocode, this is my intention

records = {set of records returned from first api call}
while that same call hasNextPage():
    build the next api call from the original uri 
    append the returned records to our existing set of records

Here's what I've come up with so far

let
    uri = "https://app.asana.com/api/1.0/projects?limit=100&workspace=xxxxxxx",
    headers = [Headers=[#"Content-Type"="application/json", Authorization="somePAT"]],

    //Loop over pages
    alldata = List.Generate(
        () =>  Json.Document(Web.Contents(uri, headers))[data],
        each hasNextPage(uri),
        //how do I change the uri here, to apply the new offset each iteration? 
        each getRecords(uri & "&offset=" & getNextOffset(uri & "&offset=" & Json.Document(Web.Contents(uri, headers))[next_page][offset]))
    ),
    output = Table.FromList(alldata, Splitter.SplitByNothing(), null, null, ExtraValues.Error)
in output

Where hasNextPage(uri) makes a call to the uri, then checks if the next_page object is just null, signifying the end of available pages of records. getRecords just returns raw data from source.

let
    Source = (uri) => let
        Source = Json.Document(Web.Contents(uri, [Headers=[#"Content-Type"="application/json", Authorization="Bearer somePAT"]])),
        data = Source[data]
    in data
in Source

This now gives me a single column filled with lists, each of which contains 100 records. The problem is that it runs forever and returns hundreds of thousands of records (far more than SHOULD be returned). It never actually even loads all the rows if I apply changes.

Can someone help me debug the List.Generate function I'm running above?

Anonymous · ‎04-09-2021

Solved it using a custom recursive function which grabs the offset, builds the next page's uri, appends the data to a rolling total of the data, then makes another call with the new uri, provided a next page exists to be called upon.

(baseuri as text) =>
let
    headers = [Headers=[#"Content-Type"="application/json", Authorization="Bearer APIKEY"]],

    initReq = Json.Document(Web.Contents(baseuri, headers)),
    initData = initReq[data],
    //We want to get data = {lastNPagesData, thisPageData}, where each list has the limit # of Records, 
    //then we can List.Combine() the two lists on each iteration to aggregate all the records. We can then
    //create a table from those records
    gather = (data as list, uri) =>
        let
            //get new offset from active uri
            newOffset = Json.Document(Web.Contents(uri, headers))[next_page][offset],
            //build new uri using the original uri so we dont append offsests
            newUri = baseuri & "&offset=" & newOffset,
            //get new req & data
            newReq = Json.Document(Web.Contents(newUri, headers)),
            newdata = newReq[data],
            //add that data to rolling aggregate
            data = List.Combine({data, newdata}),
            //if theres no next page of data, return. if there is, call @gather again to get more data
            check = if newReq[next_page] = null then data else @gather(data, newUri)
        in check,
    //before we call gather(), we want see if its even necesarry. First request returns only one page? Return.
    outputList = if initReq[next_page] = null then initData else gather(initData, baseuri),
    //then place records into a table. This will expand all columns available in the record.
    expand = Table.FromRecords(outputList)
in
    expand

This returns a fully expanded table of records from from all pages of data.

Extensions of functionality or efficieny modifications are more than welcome!

View solution in original post

Syndicate_Admin · ‎01-22-2024

I want to do the exact same thing as you want to do. But with other data (i.e. Portfolios, which are not part of the existing power bi asana connector).
The solution I came up with is the following:

A function called fGetAsanaData:

= (startPageUri as text, nextPageUri as text, accessToken as text) =>
let
    targetPage = if nextPageUri = "-" then startPageUri else nextPageUri,
    source = Json.Document(Web.Contents(targetPage, [Headers=[authorization="Bearer " & accessToken]]))
in
    source

it checks if a nextPageUri has been passed (I used - as default because it was easy with typecasting and null etc.). If no nextPageUri ist defined it uses the startPageUri. It also adds the accessToken to the authorization header (PAT).

A second function called fPopulateAsanaData:

= (uri as text, accessToken as text) =>
let
    source = List.Generate( () => 
    [ result = fGetAsanaData( uri, "-", accessToken) , nextPage = result[next_page][uri] , stop = 0 ],
    each [stop] = 0,
    each [result = fGetAsanaData("-", [nextPage], accessToken), nextPage = [result][next_page][uri], stop = if [result][next_page] <> null then 0 else 1 ],
    each [result]
)
in
    source

this uses List.Generate and loops trough all the data received. when the result does not have a next_page field it sets the stop switch, so that the condition on the next run is false and it will stop retrieving the results.

you can then easily test it like so:

= fPopulateAsanaData ("https://app.asana.com/api/1.0/projects?limit=5&workspace=123456789","yourfanceprivateaccesstoken")

I hope it will help someone someday 🙂

KJanssens · ‎06-08-2022

For anyone looking for a way to limit the calls, when there are a lot of pages:

you can add a rowcount limit like below:

... SNIP ...
            //add that data to rolling aggregate
            data = List.Combine({data, newdata}),
            datacount = List.Count(data),
            //if theres no next page of data, return. if there is, call @gather again to get more data
            check = if newReq[links] = null or datacount >= 10000 then data else @gather(data, newUri)
... SNIP ...

in this case it will stop when 10000 rows have been fetched...

shriti1995 · ‎02-07-2022

(baseurl as text)=>
let
initReq = Json.Document(Web.Contents(baseurl)),
nextUrl = initReq[#"@odata.nextLink"],
initValue= initReq[value],
gather=(data as list, url)=>
let
newReq=Json.Document(Web.Contents(url)),
newNextUrl = newReq[#"@odata.nextLink"],
newData= newReq[value],
data=List.Combine({data,newData}),
Converttotable = Record.ToTable(newReq),
Pivot_Columns = Table.Pivot(Converttotable, List.Distinct(Converttotable[Name]), "Name", "Value", List.Count),
Column_Names=Table.ColumnNames(Pivot_Columns),
Contains_Column=List.Contains(Column_Names,"@odata.nextLink"),
check = if Contains_Column = true then @gather(data, newNextUrl) else data
in
check,
Converttotable = Record.ToTable(initReq),
Pivot_Columns = Table.Pivot(Converttotable, List.Distinct(Converttotable[Name]), "Name", "Value", List.Count),
Column_Names=Table.ColumnNames(Pivot_Columns),
Constain_Column=List.Contains(Column_Names,"@odata.nextLink"),
outputList= if Constain_Column= true then @gather(initValue,nextUrl) else initValue,
expand=Table.FromRecords(outputList)

in
expand

@Anonymous Your code helped me a lot to get the values. Here is another way to get the data, when the last API page doesn't bring null but has Blank.

AmandaMulryan · ‎06-01-2022

@Syndicate_Admin @rwaanders @Anonymous @wes2015

I am using this code and very close to getting it correct. Is there anyone that could help me if I provided some more details? Please & thank you, I have been stuck for over a week trying to figure this out and can't find any good examples or documentation that fit my exact scenario.

rwaanders · ‎07-09-2021

Hi there,

I'm having the same kind of API (https://manual.yesplan.be/en/developers/rest-api/#pagination), but i'm getting stuck with the "loop".

See below my code, which gets stuck on different pages each time. Most of the time on page 2, but sometimes on page 3.

When I try the URL (for example: https://odeon.yesplan.nl/api/events?page=2&book=4758791937&api_key=********) which throws error 404 via the internet, I get more detailed information:

{"contents":{},"message":"Page key \"2\" is not the current page for the book, which is \"3\"."}

When I then change the page=2 to page=3, i do get the correct results. If i then refresh the page again, it throws the above error once again stating that the current page should be 4. Then editing the code to page=4 gives me the correct information.

It's alsmost like as if the query makes a double request, which causes the page index to change to 3 where the query still tries to get page 2.

Do you know how to fix this?

See code:

(baseuri as text) =>
let
initReq = Json.Document(Web.Contents(baseuri)),
initData = initReq[data],
gather = (data as list, uri) =>
let
//get new offset from active uri
newOffset = Json.Document(Web.Contents(uri))[pagination][next],
//build new uri using the original uri so we dont append offsests
newUri = newOffset&"&api_key=**********",
//get new req & data
newReq = Json.Document(Web.Contents(newUri)),
newdata = newReq[data],
//add that data to rolling aggregate
data = List.Combine({data, newdata}),
//if theres no next page of data, return. if there is, call @gather again to get more data
check = if newReq[pagination][next] = null then data else @gather(data, newUri)
in check,
//before we call gather(), we want see if its even necesarry. First request returns only one page? Return.
outputList = if initReq[pagination][next] = null then initData else gather(initData, baseuri),
//then place records into a table. This will expand all columns available in the record.
expand = Table.FromRecords(outputList)
in
expand

Syndicate_Admin · ‎07-09-2021

Note, if i change this part in the code:

check = if newReq[pagination][next] = null then data else @gather(data, newUri)

To this:

check = if newReq[pagination][next] = null then data else data

It will only loop once, since I removed the recurring loop. I will then get the first 200 rows (which is 2 pages, 100 rows per page). So it looks like in the loop part it's using the wrong/old url[pagination][next].

Any idea on how to fix this?

rwaanders · ‎07-09-2021

Note, if i change this part in the code:

check = if newReq[pagination][next] = null then data else @gather(data, newUri)

To this:

check = if newReq[pagination][next] = null then data else data

It will only loop once, since I removed the recurring loop. I will then get the first 200 rows (which is 2 pages, 100 rows per page). So it looks like in the loop part it's using the wrong/old url[pagination][next].

Any idea on how to fix this?

Anonymous · ‎04-09-2021

Solved it using a custom recursive function which grabs the offset, builds the next page's uri, appends the data to a rolling total of the data, then makes another call with the new uri, provided a next page exists to be called upon.

(baseuri as text) =>
let
    headers = [Headers=[#"Content-Type"="application/json", Authorization="Bearer APIKEY"]],

    initReq = Json.Document(Web.Contents(baseuri, headers)),
    initData = initReq[data],
    //We want to get data = {lastNPagesData, thisPageData}, where each list has the limit # of Records, 
    //then we can List.Combine() the two lists on each iteration to aggregate all the records. We can then
    //create a table from those records
    gather = (data as list, uri) =>
        let
            //get new offset from active uri
            newOffset = Json.Document(Web.Contents(uri, headers))[next_page][offset],
            //build new uri using the original uri so we dont append offsests
            newUri = baseuri & "&offset=" & newOffset,
            //get new req & data
            newReq = Json.Document(Web.Contents(newUri, headers)),
            newdata = newReq[data],
            //add that data to rolling aggregate
            data = List.Combine({data, newdata}),
            //if theres no next page of data, return. if there is, call @gather again to get more data
            check = if newReq[next_page] = null then data else @gather(data, newUri)
        in check,
    //before we call gather(), we want see if its even necesarry. First request returns only one page? Return.
    outputList = if initReq[next_page] = null then initData else gather(initData, baseuri),
    //then place records into a table. This will expand all columns available in the record.
    expand = Table.FromRecords(outputList)
in
    expand

This returns a fully expanded table of records from from all pages of data.

Extensions of functionality or efficieny modifications are more than welcome!

wes2015 · ‎10-08-2021

Hi @Anonymous , this looks like something I could use in my case.

I have borrowed your script and applied it to the API im working with, but I'm hitting a wall when the looping is finished.

I suspect this has something to do with the fact that when there are no more results the API does not return the "next_page" part of the JSON, its just not there.

When running the API with a limit of 1, the paging is there

When running with a limit of 100 (there are 33 rows in the demo database) the "paging" part is gone

I have played around with a try/otherwise solution, but it just wont work.

Did you have the same issue?

Wes

Anonymous · ‎10-08-2021

This isnt something I had to deal with because the API I was using returns "next_page: null" once we've exhausted all the data instead of just returning nothing. That said, all you should need to do is modify my stop condition

check = if newReq[next_page] = null then data else @gather(data, newUri)

and the initial request checker

outputList = if initReq[next_page] = null then initData else gather(initData, baseuri),

to check if the paging key is contained in the json. You might be able to do it with error handling in the way youve been trying, but off the top of my head, you could also try and cast the json to regular text then use Text.Contains to see if ""paging": {" exists as a substring.

wes2015 · ‎10-11-2021

Thanks, yes I mended the stop condition to look for the "paging" column as shown below, and the query run without errors now.

But its not returning more than 100 rows still.

The List.Contains ( Table.ColumnNames() part works by itselft in a standalone query, but not sure how to check if it works inside the stop condition inside the "gather" function.

Do you see any major flaws?

let
    baseuri = "https://api.hubapi.com/crm/v3/objects/contacts
limit=100&archived=false&hapikey="  & apikey ,

    //headers = [Headers=[#"Content-Type"="application/json", Authorization="Bearer " & apikey ]],

    initReq = Json.Document(Web.Contents(baseuri)),
    #"Converted to Table" = Record.ToTable(initReq),
    initData = initReq[results],
    
    //We want to get data = {lastNPagesData, thisPageData}, where each list has the limit # of Records, 
    //then we can List.Combine() the two lists on each iteration to aggregate all the records. We can then
    //create a table from those records
    
    gather = (data as list, uri) =>
        
        
        let
            
            //get new offset from active uri
            newOffset = Json.Document(Web.Contents(uri))[paging][next][after],
            
            //build new uri using the original uri so we dont append offsests
            newUri = baseuri & "&after=" & newOffset,
            
            //get new req & data
            newReq = Json.Document(Web.Contents(newUri)),
            newdata = newReq[results],
            
            //add that data to rolling aggregate
            data = List.Combine({data, newdata}),
            
            //if theres no next page of data, return. if there is, call @gather again to get more data
            check = if List.Contains ( Table.ColumnNames(newReq as table), "paging" ) = true  then @gather(data, newUri) else data
        in 
        
        check,
    
    //before we call gather(), we want see if its even necesarry. First request returns only one page? Return.
    outputList = if List.Contains (Table.ColumnNames( #"Converted to Table" as table) , "paging" ) = true then gather(initData, baseuri) else initData,
    
    //then place records into a table. This will expand all columns available in the record.
    expand = Table.FromRecords(outputList)


in
    
    
    expand

JackSelman · ‎03-08-2023

Hi there @wes2015 dide you manage to solve this? I'm trying to build a paginated call to the HubSpot API too. I've tried your code but only returned 100 records.

wes2015 · ‎03-08-2023

Hi @JackSelman , was just working with a project on the Hubspot API.

This is the code currently working for me. Let me know if it works for you.

let
baseuri = baseurlContacts & apikey,

initReq = Json.Document(Web.Contents(baseuri)),

initData = initReq[results],

//We want to get data = {lastNPagesData, thisPageData}, where each list has the limit # of Records,

//then we can List.Combine() the two lists on each iteration to aggregate all the records. We can then

//create a table from those records

gather = (data as list, uri) =>

let
//get new offset from active uri
newOffset = Json.Document(Web.Contents(uri))[paging][next][after],

//build new uri using the original uri so we dont append offsests
newUri = baseuri & "&after=" & newOffset,

//get new req & data
newReq = Json.Document(Web.Contents(newUri)) ,
newdata = newReq[results] ,

//add that data to rolling aggregate
data = List.Combine({data, newdata}),

//if theres no next page of data, return. if there is, call @gather again to get more data

check = if Table.Contains ( Record.ToTable(newReq) , [Name = "paging"] ) = true then @gather (data , newUri) else data

in check,

//before we call gather(), we want see if its even necesarry. First request returns only one page? Return.
outputList = if Table.Contains ( Record.ToTable (initReq) , [Name = "paging"] ) = true then gather( initData , baseuri ) else initData ,

//then place records into a table. This will expand all columns available in the record.
expand = Table.FromRecords(outputList)



in 



expand

@

Ginoc · ‎03-20-2024

Brilliant wes2015. I've been beating my head against a wall trying to use the List.Generate() function but not matter what I did I could not find a method that tested the existance of the "next page" content so it fell over after retrieving the last page and there was no more pages to follow. The List.Generate() just does not work in this circumstance where the next page field does not exist.

The structure of your code does the job nicely. I used the Record.HasFields() function call instead to test for the existance of the "paging" field instead as a minor point of difference.

// process the JSON response
apiResults = if Record.HasFields(apiResponse,"results") = true then List.Combine( {apiResults, apiResponse[results]} ) else apiResults,

// Get the next page of results if there are more to fetch
apiHasNextPage = if Record.HasFields(apiResponse,"paging") then
@GetDealsRecursive(apiResults,apiResponse[paging][next][after])
else
apiResults

JackSelman · ‎03-08-2023

Thanks for sharing this @wes2015! I'm struggling to get this to work as I'm not using an API key, but a private access token, so there's no key parameter to include. I'm getting stick when specifying the Header Authorization = Bearer, and so I can't even authenticate

let
baseuri = "https://api.hubapi.com/crm/v3/objects/line_items?limit=100&archived=false&properties=name&properties=folder&properties=sku&properties=product_type&properties=hs_sku&properties=amount&properties=quantity",
    headers = [Headers=[#"Content-Type"="application/json", Authorization="Bearer PAT"]],

initReq = Json.Document(Web.Contents(baseuri)),

initData = initReq[results],

//We want to get data = {lastNPagesData, thisPageData}, where each list has the limit # of Records,

//then we can List.Combine() the two lists on each iteration to aggregate all the records. We can then

//create a table from those records

gather = (data as list, uri) =>

let
//get new offset from active uri
newOffset = Json.Document(Web.Contents(uri))[paging][next][after],

//build new uri using the original uri so we dont append offsests
newUri = baseuri & "&after=" & newOffset,

//get new req & data
newReq = Json.Document(Web.Contents(newUri)) ,
newdata = newReq[results] ,

//add that data to rolling aggregate
data = List.Combine({data, newdata}),

//if theres no next page of data, return. if there is, call @gather again to get more data

check = if Table.Contains ( Record.ToTable(newReq) , [Name = "paging"] ) = true then @gather (data , newUri) else data

in check,

//before we call gather(), we want see if its even necesarry. First request returns only one page? Return.
outputList = if Table.Contains ( Record.ToTable (initReq) , [Name = "paging"] ) = true then gather( initData , baseuri ) else initData ,

//then place records into a table. This will expand all columns available in the record.
expand = Table.FromRecords(outputList)



in 



expand

"PAT" part.

I don't suppose this is an issue you've encountered?

wes2015 · ‎03-08-2023

Hi, Nope, i do all the work with the APIkey, would definately reccomend you to do the same.

JackSelman · ‎03-08-2023

Thanks @wes2015! Sorry if I'm looking at the wrong one, but I thought API keys were being sunsetted?

biancag · ‎09-13-2021

Hi,

I am a newbie to API paginations. This code is the easiest to follow on pagination i have found.
When i copy the code and update the credentials, i get the below error. Please help!!

Thanks, Bianca

Anonymous · ‎10-08-2021

Hey Bianca, not all APIs are built the same! Depending on the API, the output json will have a different structure and may not have a "next_page" field. You need to figure out how your API handles pagination and modify the code accordingly. Add some details about the API youre using and maybe I can help

igorop · ‎05-14-2021

@Anonymous thank you so much!! this code helped me a lot!

Thanks to you, I learned a new way to paginate API requests in Power Query!

Superb work! Congratulations!!

Handling a Paginated API with Prebuilt Next Page URI in Response [EDITED]

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024