Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
CL7777
Helper III
Helper III

incremental refresh with an API call in power query

I have been really struggling with this issue. I have a really large data set with limited quota on API calls/cores. The API call has a parameter called "lastOrderDate" that I pass it in m code that I wrote. The m code then only goes and fetches with the API call only data whose date is on or after the lastOrderDate. This keeps the API calls small enough where I dont use too many resources in the call. The issue is that I then want to update the full data set with all orders fetched from the API call (orders after lastOrderDate). I tried doing this by combining this m code with incremental refresh in power BI.. but what happens is that the API call that gets the data since the lastOrderDate ends up replacing the full data set in BI,  it does not append to this data set. So, for example, if I have a data set with 1000 rows or order data and there are 15 new orders since lastOrderDate, I want the data set to end up with 1015 rows of orders. Instead what happens is that I end up with a data set that has just 15 orders. The data obtained from the API call replaces the full data set instead of adding to it. I cannot figure out a way to have the full data set remain intact and have the API call for all orders since lastOrderDate accumulate at the bottom of the full data set. (its almost like I need to do imcremental refresh in power query, not in BI). When I use incremental refresh in power BI itself, it doesnt solve my problem because the API is still seeking out the full data set, even if its only refreshing incrementally in the BI report. So it doesnt save me API resources

 

Here is the m code I use to fetch the new data after lastOrderDate... 

 

let
URL = "https://stoneage-dev-orders-api.us-w1.cloudhub.io/api/orders?lastOrderDate="&Date.ToText(Date.From(S...") ,
// URL = "https://stoneage-dev-orders-api.us-w1.cloudhub.io/api/orders?lastorderdate="&Text.From(StartDate) ,

Source = Json.Document(Web.Contents(URL, [Headers=[Source="EPICOR", Authorization="***************"]])) ,

// Web.Contents("http://api.weatherapi.com/v1/history.xml",

// [Query=[key="0f44dea1e8d74e8d8f8160719200112", q="32212", lastOrderDate=StartDate]])),

// Source = Json.Document(Web.Contents("https://stoneage-dev-orders-api.us-w1.cloudhub.io/api/orders?lastOrderDate=2019-11-10T00:00:00-07:00", [Headers=[Source="EPICOR", Authorization="Basic cmFqZXNoOjEyMzQ1"]])),

#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"OrderOpen", "OrderHed_ChangeDate", "OrderCompany", "OrderNumber", "OrderCustomerKey", "OrderDate", "DiscountPercent", "OrderBranch", "BTCustNumCustID", "BTCustNumName", "CustomerCustID", "CustomerName", "CurrencyCodeCurrencyID", "TerritoryID", "ShipToTerritoryID", "OrderLine"}),

#"Sorted Rows" = Table.Sort(#"Expanded Column1",{{"OrderDate", Order.Descending}}),
#"Changed Type" = Table.TransformColumnTypes(#"Sorted Rows",{{"OrderDate", type datetime}}),
#"Expanded OrderLine" = Table.ExpandListColumn(#"Changed Type", "OrderLine"),
#"Expanded OrderLine1" = Table.ExpandRecordColumn(#"Expanded OrderLine", "OrderLine", {"OrderLine", "PartNum", "LineDesc", "RevisionNum", "DocUnitPrice", "OrderQty", "OrderLineAmount", "ProdCode"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded OrderLine1",{{"OrderLine", type text}, {"OrderNumber", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type1", "OLR", each [OrderNumber]&[OrderLine])
in
#"Added Custom"

 

Any help would be much appreciated, I have been working at this for quite some time

 

 

5 REPLIES 5
ImkeF
Super User
Super User

Hi @CL7777 ,

yes, it's all workarounds.
My latest one includes storing the data in CDS/Dataverse. At least that can be refreshed programatically.

But not sure how it deals with large datasets.

Create a load history or stage in CDS instead of incremental load in Power BI – The BIccountant

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

CL7777
Helper III
Helper III

that would work, but its not practical to manually do this with hundreds of data sets. Im looking for a power query m code solution to this problem that would take the manual work out of it. maybe not possible?

mahoneypat
Employee
Employee

Not exactly what I meant.  Let me clarify.

 

- the excel file would have data up through the 10th

- one query in PBI pulls in the Excel file

- another query pulls data from the API from the 10th until today

- the two queries are appended

 

Over time, it will slow down as the API query is pulling more data.  Periodically, you add the rows from the API query into the Excel file and update the API query to start pulling from the more recent date (where the Excel now leaves off).

 

Regards,

Pat





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


CL7777
Helper III
Helper III

This solution is a good idea but Im not sure it will work. the reason is as follows: (am I missing something?)

 

Say Excel file is loaded with data up until Dec 9th, 2020.. on Dec 10th and 11th the following things happen (if I understand this correctly):

 

1. Dec 10th, API gets the new data for 12/10 and appends it to the Excel file (all good so far)

2. Dec 11th, API gets the new data for 12/11 and appends it to the Excel file (so since 12/10 was not originally in the Excel file, the new data set wont have 12/10 data in it anymore.

 

Please let me know if I am missing something in understanding.. thank you

mahoneypat
Employee
Employee

You can simulate an incremental refresh as follows.

 - Get all your data and export it to an Excel file up to a recent date

- make a query that brings in the Excel data and disable load on that query

- make a query that gets the API data from the recent date and disable load

- append the two queries and load that table

- periodically update the two queries to a more recent date

 

This way the bulk of your refresh is against the Excel file and is more efficient and doesn't make API calls.

 

Regards,

Pat





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors