Paginated REST API - Reduce number of requests and...

ZekO · ‎02-20-2024

Hi everyone,

I am using a REST API with a paginated method to retreive data from a system handling all the loading/unloading activity at a wharehouse.

The pagination is working fine, I am also able to use the relative path method to allow scheduled update on PowerBI service.

Now, I want to improve my query because for now I am requesting all the dataset from the system.

I want to retreive only the data modified since my last update and merge the result to my previous dataset. That way, I will reduce the number of request and it will be much faster too.

Each row of my dataset is representing a "Visitor" with a unique "visitor ID". (a visitor is a truck)

I am updating my dataset one time per day.

Here are my API parameters :

Page number - integer (zero based)

Page size - integer from 0 to 100

Update since - YYYY-MM-DDThh:mm:ss

If provided, only visitors that have changed since the given datetime are returned.

With the "Update since" header I have the solution to retreive the data modified since my last update, Ok.

How am I supposed to handle the data now ?

I can have new Visitors and updated one (A truck can "wait for loading/unloading" on day 1 and being "checked out" on day 2) I am not able to append the queries because I will create duplicates with the updated visitors. how can I manage that ?
Even if I am able to merge the queries (add new visitors and update existing visitor with new data), How can I handle the next call without erasing the data from the previous update ? If I touch the source, the table related to it will be affected too.

Can I do all these steps in a single request ? I have an intermediate level with M language.

Thank you for your time !

ZekO · ‎02-28-2024

Hi @lbendlin,

Thank you for your answer. That is very useful !

I am indeed only interested in the latest status.

Based on the join kind schema from Power Query :

If we consider my update table as the left one (Query name : Day1-Update) and the right one as the initial table (query name : Day1-Initial).

Should I do as follow ?

A Right anti join as you said to retreive all the visitors without updates from "Day1-Initial"
Merge "Day1-Update" with "Day1-Initial" (query name : Day1-Result)

I am missing something, How am i supposed to keep the data for the next day ?

On day 2, "Day2-Initial" must be equal to "Day1-Result" otherwise, the result of "Day1-update" will be lost and erased by "Day2-Update". Does that make sens ?

Thank you for your time !

lbendlin · ‎02-28-2024

I am missing something, How am i supposed to keep the data for the next day ?

You could take a risk and self reference your semantic model for that. Personally I would rather store my snapshots in a safe place.

ZekO · ‎03-05-2024

Hi @lbendlin

Here is the model that I try to implement thanks to your insights :

Can I apply an incremental refresh to this case in order to have the following result

The parameters RangeStart and RangeEnd (parameters requested for the Incremental refresh in PowerBI) would be :

RangeStart = Last modified Date

RangeEnd = Timestamp of refresh

Is it tricky with a paginated REST API ?

Thank you

lbendlin · ‎03-05-2024

Incremental refresh won't help you with that as the partition ranges are managed automatically by the service. You can only choose between day, month, quarter or year.

I would instead use an external storage.

lbendlin · ‎02-20-2024

That's a great challenge.

You will have to decide if you are only interested in the latest status for each visitor/truck, or if you are interested in the status change history as well.

For the former option you use an anti join (all the rows from the "updated" table plus all non-matching rows from the other table/self reference). For the latter option you would have to consult Nyquist/Shannon and then create your own storage solution with the change tracker.

Paginated REST API - Reduce number of requests and handling updated data

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024