Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
ZekO
Frequent Visitor

Paginated REST API - Reduce number of requests and handling updated data

Hi everyone,

 

I am using a REST API with a paginated method to retreive data from a system handling all the loading/unloading activity at a wharehouse.

The pagination is working fine, I am also able to use the relative path method to allow scheduled update on PowerBI service.

Now, I want to improve my query because for now I am requesting all the dataset from the system.

I want to retreive only the data modified since my last update and merge the result to my previous dataset. That way, I will reduce the number of request and it will be much faster too.

 

Each row of my dataset is representing a "Visitor" with a unique "visitor ID". (a visitor is a truck)

I am updating my dataset one time per day.

 

Here are my API parameters :

Page number - integer (zero based)

Page size - integer from 0 to 100

Update since YYYY-MM-DDThh:mm:ss

If provided, only visitors that have changed since the given datetime are returned.

 

With the "Update since" header I have the solution to retreive the data modified since my last update, Ok.

How am I supposed to handle the data now ?

 

  1. I can have new Visitors and updated one (A truck can "wait for loading/unloading" on day 1 and being "checked out" on day 2) I am not able to append the queries because I will create duplicates with the updated visitors. how can I manage that ?
  2. Even if I am able to merge the queries (add new visitors and update existing visitor with new data), How can I handle the next call without erasing the data from the previous update ? If I touch the source, the table related to it will be affected too.

Can I do all these steps in a single request ? I have an intermediate level with M language.

 

Thank you for your time !

 

5 REPLIES 5
ZekO
Frequent Visitor

Hi @lbendlin,

 

Thank you for your answer. That is very useful !

I am indeed only interested in the latest status.

 

Based on the join kind schema from Power Query :

If we consider my update table as the left one (Query name : Day1-Update) and the right one as the initial table (query name : Day1-Initial).

ZekO_0-1709124066214.png

 

Should I do as follow ?

  1. A Right anti join as you said to retreive all the visitors without updates from "Day1-Initial"
  2. Merge "Day1-Update" with "Day1-Initial" (query name : Day1-Result)

I am missing something, How am i supposed to keep the data for the next day ?

On day 2,  "Day2-Initial" must be equal to "Day1-Result" otherwise, the result of "Day1-update" will be lost and erased by "Day2-Update". Does that make sens ?

 

Thank you for your time !

 

 

I am missing something, How am i supposed to keep the data for the next day ?

 

You could take a risk and self reference your semantic model for that. Personally I would rather store my snapshots in a safe place.

ZekO
Frequent Visitor

Hi @lbendlin 

 

Here is the model that I try to implement thanks to your insights :

ZekO_0-1709653341964.png

 

Can I apply an incremental refresh to this case in order to have the following result

 

ZekO_1-1709653488923.png

 

The parameters RangeStart and RangeEnd (parameters requested for the Incremental refresh in PowerBI) would be :

RangeStart = Last modified Date

RangeEnd = Timestamp of refresh

 

Is it tricky with a paginated REST API ?

 

Thank you

Incremental refresh won't help you with that as the partition ranges are managed automatically by the service. You can only choose between day, month, quarter or year.  

 

lbendlin_0-1709655763844.png

 

 

I would instead use an external storage.

lbendlin
Super User
Super User

That's a great challenge.

 

You will have to decide if you are only interested in the latest status for each visitor/truck, or if you are interested in the status change history as well.

 

For the former option you use an anti join (all the rows from the "updated"  table plus all non-matching rows from the other table/self reference). For the latter option you would have to consult Nyquist/Shannon and then create your own storage solution with the change tracker.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors