Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi everyone,
I am using a REST API with a paginated method to retreive data from a system handling all the loading/unloading activity at a wharehouse.
The pagination is working fine, I am also able to use the relative path method to allow scheduled update on PowerBI service.
Now, I want to improve my query because for now I am requesting all the dataset from the system.
I want to retreive only the data modified since my last update and merge the result to my previous dataset. That way, I will reduce the number of request and it will be much faster too.
Each row of my dataset is representing a "Visitor" with a unique "visitor ID". (a visitor is a truck)
I am updating my dataset one time per day.
Here are my API parameters :
Page number - integer (zero based)
Page size - integer from 0 to 100
Update since - YYYY-MM-DDThh:mm:ss
If provided, only visitors that have changed since the given datetime are returned.
With the "Update since" header I have the solution to retreive the data modified since my last update, Ok.
How am I supposed to handle the data now ?
Can I do all these steps in a single request ? I have an intermediate level with M language.
Thank you for your time !
Hi @lbendlin,
Thank you for your answer. That is very useful !
I am indeed only interested in the latest status.
Based on the join kind schema from Power Query :
If we consider my update table as the left one (Query name : Day1-Update) and the right one as the initial table (query name : Day1-Initial).
Should I do as follow ?
I am missing something, How am i supposed to keep the data for the next day ?
On day 2, "Day2-Initial" must be equal to "Day1-Result" otherwise, the result of "Day1-update" will be lost and erased by "Day2-Update". Does that make sens ?
Thank you for your time !
I am missing something, How am i supposed to keep the data for the next day ?
You could take a risk and self reference your semantic model for that. Personally I would rather store my snapshots in a safe place.
Hi @lbendlin
Here is the model that I try to implement thanks to your insights :
Can I apply an incremental refresh to this case in order to have the following result
The parameters RangeStart and RangeEnd (parameters requested for the Incremental refresh in PowerBI) would be :
RangeStart = Last modified Date
RangeEnd = Timestamp of refresh
Is it tricky with a paginated REST API ?
Thank you
Incremental refresh won't help you with that as the partition ranges are managed automatically by the service. You can only choose between day, month, quarter or year.
I would instead use an external storage.
That's a great challenge.
You will have to decide if you are only interested in the latest status for each visitor/truck, or if you are interested in the status change history as well.
For the former option you use an anti join (all the rows from the "updated" table plus all non-matching rows from the other table/self reference). For the latter option you would have to consult Nyquist/Shannon and then create your own storage solution with the change tracker.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.