Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
ddfreedie
Helper I
Helper I

PowerBi Incremental load issue

Hi

 

I am facing issue with Incremental loading in PBI.

 

Basically i am working on setting up a table on my own and checking how to enable incremental loads.

 

As part of that, i created a table with 

ID, StartDate, EndDate, Name

 

StartDate is the day the record is entered in DB. End date is NULL for all active records. For historical records end date will have value (similar to SCD2)

 

I used Startdate as a date for incremental and EndDate for Data capture.

 

By PBI incremental logic, i select 120 months data to be retained, refresh data for 120 months (this is just for working assumption for now)

 

When i run the incremental, it queries 120 max(enddate) for each monthly partitions. Works perfect.

 

But the problem comes when the actual partition is queried. I am seeing top 1000 for each of the 120 partition. Ideally only when the max(enddate) is changed from the previous load, for a partition, that partition is queried.

Now all the paritions are queried with top 1000 and reads 1000 records everytime. Even though it is 1000, it still uses the database resources and tends to slow the loads if i need 10y of data (for some reason)..

 

Now my questions as i said,

PBI should only query the partition whose max(enddate) is changed from its previous load. Why so i get a query for all partitions?

Can anyone help me on this.

 

Thanks..

 

Edited:

I also see another query which says top 1000 from table with no filters...

And the order of execution 

It runs in random that it checks top 1000 for that partition first and then max(enddate). Inbetween i see the top 1000 with no filters.

 

Pasting the first 20 queries just for reference based with execution time in desc order

 

  • select max([rows].[enddate]) as [enddate] from ( select [_].[enddate] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2014-06-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2014-07-01 00:00:00') ) as [rows] 2021-05-20 16:40:33.047
  • select max([rows].[enddate]) as [enddate] from ( select [_].[enddate] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2012-01-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2012-02-01 00:00:00') ) as [rows] 2021-05-20 16:40:32.990
  • select max([rows].[enddate]) as [enddate] from ( select [_].[enddate] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2016-06-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2016-07-01 00:00:00') ) as [rows] 2021-05-20 16:40:32.950
  • select max([rows].[enddate]) as [enddate] from ( select [_].[enddate] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2020-05-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2020-06-01 00:00:00') ) as [rows] 2021-05-20 16:40:32.883
  • select max([rows].[enddate]) as [enddate] from ( select [_].[enddate] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2020-11-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2020-12-01 00:00:00') ) as [rows] 2021-05-20 16:40:32.877
  • select top 1000 [_].[id], [_].[startdate], [_].[enddate], [_].[name] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2012-01-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2012-02-01 00:00:00') 2021-05-20 16:40:32.833
  • select max([rows].[enddate]) as [enddate] from ( select [_].[enddate] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2018-10-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2018-11-01 00:00:00') ) as [rows] 2021-05-20 16:40:32.830
  • select top 1000 [_].[id], [_].[startdate], [_].[enddate], [_].[name] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2014-06-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2014-07-01 00:00:00') 2021-05-20 16:40:32.783
  • select max([rows].[enddate]) as [enddate] from ( select [_].[enddate] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2012-12-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2013-01-01 00:00:00') ) as [rows] 2021-05-20 16:40:32.757
  • select max([rows].[enddate]) as [enddate] from ( select [_].[enddate] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2013-06-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2013-07-01 00:00:00') ) as [rows] 2021-05-20 16:40:32.757
  • select max([rows].[enddate]) as [enddate] from ( select [_].[enddate] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2015-06-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2015-07-01 00:00:00') ) as [rows] 2021-05-20 16:40:32.643
  • select top 1000 [_].[id], [_].[startdate], [_].[enddate], [_].[name] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2016-06-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2016-07-01 00:00:00') 2021-05-20 16:40:32.630
  • select top 1000 [_].[id], [_].[startdate], [_].[enddate], [_].[name] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2020-05-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2020-06-01 00:00:00') 2021-05-20 16:40:32.617
  • select top 1000 [_].[id], [_].[startdate], [_].[enddate], [_].[name] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2020-11-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2020-12-01 00:00:00') 2021-05-20 16:40:32.573
  • select top 1000 [_].[id], [_].[startdate], [_].[enddate], [_].[name] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2018-10-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2018-11-01 00:00:00') 2021-05-20 16:40:32.507
  • select top 1000 [_].[id], [_].[startdate], [_].[enddate], [_].[name] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2012-12-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2013-01-01 00:00:00') 2021-05-20 16:40:32.500
  • select top 1000 [_].[id], [_].[startdate], [_].[enddate], [_].[name] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2013-06-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2013-07-01 00:00:00') 2021-05-20 16:40:32.437
  • select max([rows].[enddate]) as [enddate] from ( select [_].[enddate] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2020-12-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2021-01-01 00:00:00') ) as [rows] 2021-05-20 16:40:32.433
  • select top 1000 [$Table].[id] as [id], [$Table].[startdate] as [startdate], [$Table].[enddate] as [enddate], [$Table].[name] as [name] from [dbo].[emp] as [$Table] 2021-05-20 16:40:32.340
  • select max([rows].[enddate]) as [enddate] from ( select [_].[enddate] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2015-05-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2015-06-01 00:00:00') ) as [rows] 2021-05-20 16:40:32.250
  • select max([rows].[enddate]) as [enddate] from ( select [_].[enddate] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2019-10-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2019-11-01 00:00:00') ) as [rows] 2021-05-20 16:40:32.210
  • select top 1000 [_].[id], [_].[startdate], [_].[enddate], [_].[name] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2015-06-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2015-07-01 00:00:00') 2021-05-20 16:40:31.537
  • select top 1000 [_].[id], [_].[startdate], [_].[enddate], [_].[name] from [dbo].[emp] as [_] where [_].[startdate] >= convert(datetime2, '2020-12-01 00:00:00') and [_].[startdate] <= convert(datetime2, '2021-01-01 00:00:00') 2021-05-20 16:40:31.463
5 REPLIES 5
GilbertQ
Super User
Super User

Yeah I would not think that should happen.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

GilbertQ
Super User
Super User

Hi @ddfreedie 

 

It should only do that the first time it ever creates the partitons





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

Ok i tried doing it. I again see queries for all paritions for all the incremental loads not just for the first time...Strange!!!

GilbertQ
Super User
Super User

Hi @ddfreedie 

 

The reason for the top 1000 is to make sure that the table matches the definitions before loading the data.

 

It will only do this for the first incremental loading of the data. 

 

If your incremental refresh is also set to be the last 120 months, then it will have to do that for each and every month. This is how it works by design.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

Thank you for the response. 
but I tried doing a few incremental and every time it does 120 queries to get max date + 120 queries to get data top 1000 from all partitions + 1 query to get top 1000 with no filters

 

Also does this 1000 record load into the dataset/data flow or is it only for some validation purpose. 

It could just be once top 1 * as the partitions are PBI own and end of the day it is the same table. 🙂 this makes no sense to me unfortunately!! 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors