cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
TeeGee
Helper II
Helper II

DirectQuery (bug?): all possible Date filter values are being output to SQL

I have a report with multiple visuals and 4 or 5 slicers on it, accessing an Azure SQL database via DirectQuery.

 

Performance is worse than I would expect, and in debugging I'm noticing that several SQL statements seem to be outputting a very suboptiomal where clause related to a Date filter.  It seems like every possible date value is being included in a huge IN() statement, rather than no filter (if none has been specified), or just a concise less than or equal or bwteen statement in the case of a range.

 

What might be going on here?

 

Here is just a very tiny excerpt of some SQL that I see:

 

 

WHERE 
(
([t1].[DateKey] IN (CAST( '20140315 00:00:00' AS datetime),CAST( '20140206 00:00:00' AS datetime),CAST( '20140620 00:00:00' AS datetime),CAST( '20141118 00:00:00' AS datetime),
CAST( '19000101 00:00:00' AS datetime),CAST( '20140531 00:00:00' AS datetime),CAST( '20141012 00:00:00' AS datetime),CAST( '20140424 00:00:00' AS datetime),CAST( '20140905 00:00:00' AS datetime),
CAST( '20140318 00:00:00' AS datetime),CAST( '20140816 00:00:00' AS datetime),CAST( '20141228 00:00:00' AS datetime),CAST( '20140209 00:00:00' AS datetime),CAST( '20140710 00:00:00' AS datetime),
CAST( '20141121 00:00:00' AS datetime),CAST( '20141208 00:00:00' AS datetime),CAST( '20140120 00:00:00' AS datetime),CAST( '20140603 00:00:00' AS datetime),CAST( '20141101 00:00:00' AS datetime),
CAST( '20140427 00:00:00' AS datetime),CAST( '20140514 00:00:00' AS datetime),CAST( '20140925 00:00:00' AS datetime),CAST( '20140407 00:00:00' AS datetime),CAST( '20140819 00:00:00' AS datetime),
CAST( '20140301 00:00:00' AS datetime),CAST( '20140713 00:00:00' AS datetime),CAST( '20140730 00:00:00' AS datetime),CAST( '20141211 00:00:00' AS datetime),CAST( '20140123 00:00:00' AS datetime),
CAST( '20140623 00:00:00' AS datetime),CAST( '20141104 00:00:00' AS datetime),CAST( '20140103 00:00:00' AS datetime),CAST( '20140517 00:00:00' AS datetime),CAST( '20141015 00:00:00' AS datetime),
CAST( '20140410 00:00:00' AS datetime),CAST( '20140908 00:00:00' AS datetime),CAST( '20140321 00:00:00' AS datetime),CAST( '20140802 00:00:00' AS datetime),CAST( '20141231 00:00:00' AS datetime),
CAST( '20140212 00:00:00' AS datetime),CAST( '20140626 00:00:00' AS datetime),CAST( '20141124 00:00:00' AS datetime),CAST( '20140106 00:00:00' AS datetime),CAST( '20140606 00:00:00' AS datetime),
CAST( '20141018 00:00:00' AS datetime),CAST( '20140430 00:00:00' AS datetime),CAST( '20140911 00:00:00' AS datetime),CAST( '20140928 00:00:00' AS datetime),CAST( '20140324 00:00:00' AS datetime),
CAST( '20140822 00:00:00' AS datetime),CAST( '20140215 00:00:00' AS datetime),CAST( '20140304 00:00:00' AS datetime),CAST( '20140716 00:00:00' AS datetime),CAST( '20141214 00:00:00' AS datetime),
CAST( '20140126 00:00:00' AS datetime),CAST( '20140609 00:00:00' AS datetime),CAST( '20141107 00:00:00' AS datetime),CAST( '20140520 00:00:00' AS datetime),CAST( '20141001 00:00:00' AS datetime),
CAST( '20140413 00:00:00' AS datetime),CAST( '20140825 00:00:00' AS datetime),CAST( '20140307 00:00:00' AS datetime),CAST( '20140805 00:00:00' AS datetime),CAST( '20141217 00:00:00' AS datetime),
CAST( '20140129 00:00:00' AS datetime),CAST( '20140629 00:00:00' AS datetime),CAST( '20141110 00:00:00' AS datetime),CAST( '20141127 00:00:00' AS datetime),CAST( '20140109 00:00:00' AS datetime),
CAST( '20140523 00:00:00' AS datetime),CAST( '20141021 00:00:00' AS datetime),CAST( '20140416 00:00:00' AS datetime),CAST( '20140503 00:00:00' AS datetime),CAST( '20140914 00:00:00' AS datetime)

 

 

 

 

 

11 REPLIES 11
srinivt
Microsoft
Microsoft

We are making some small improvements in the product to handle cases where the large IN slice can be eliminated. However, there could still be a valid use-case where we have large IN slice.

Have you tried rewriting the SQL query with the large IN slice into an equivalent range filter? Does that improve the query performance for you on the SQL side? 

Running the SQL in SSMS, if I change:

 

WHERE 
(
([t1].[DateKey] IN (CAST( '20140315 00:00:00' AS datetime),CAST( '20140206 00:00:00' AS datetime),CAST( '20140620 00:00:00' AS datetime),CAST( '20141118 00:00:00' AS datetime),

 

to:

 

WHERE t1.Year = '2015'

The execution time changes from 4m52s to 1 second.

 

([t1] is the Date table)

Thanks for confirming that.

I'm thinking the root cause may be a Measure containing ALL(), for example:

 

 

AP Balance =
CALCULATE (
    -sum('3 Accounts Payable'[Amount Master]),
    FILTER (
        ALL ( 'Date' ),
       'Date'[DateKey] <= MAX ( 'Date'[DateKey] )
    )
)

 

If you think about it, perhaps this makes sense:  when I set a slicer filter on [Date].[Year], the engine has no way of knowing that [Date].[Year] = YEAR([Date].[DateKey]), at least not in my case as the data is physically populated in the underlying SQL table.

 

So the question now becomes:  is there a way for me to re-architect this such that in can infer that fact and properly optimize this query?

 

 

We are considering a general optimization to deal with big OR slices on a date column. Hopefully, that would help you. But, currently, running-total like calculations will have big OR slices by design. Having said that, I would have expected SQL engine to have handled it better. 

True, MSSQL should handle that statement better (at the very least the huge IN() statement should be used to create a temp table which is then joined to) but even so, Power BI shouldn't be sending this statement in the first place.

 

Should I open a bug report for this?

I will internally forward your feedback here to relevant folks internally within the team and that should help prioritize the optimization higher on the backlog. 

@srinivt  Do you know if Microsoft Team was able to research this more?  This thread is over 2 years old but it's the best piece of information I've found reseraching my issue, which is exactly the same as what this thread was created for.

 

A 'date between' filter on my Date table sends over to sql every distinct date separately in the where clause.   Modifying that line in the sql query it generates takes my result down from a few mintues to less than 2 seconds. 

This could be the biggest item to make Direct Query more efficient with dates!

 

Well, I tried adding a new calculated (in M) column and using that in the slicer, same problem.

 

= Table.AddColumn(dbo_DimDate, "YEAR_CALC", each Date.Year([DateKey]), Int64.Type)

 

Also tried adding a new calculated column in the model and putting a slicer on that, same problem.

 

Thinking a bit more though....my cannot optimize theory above is actually wrong.  The slicer options are basically a "select distinct [SlicerColumn] from the underlying table, the fact that it is (or is not) YEAR(DateKey) is completely irrelevant; all [Date] rows that match can be selected simply by: WHERE [Date].[Year] = {selected slicer value}

 

Any ORM has to handle this scenario, so I can't understand why it can't be optimized here.  What it is doing now is actually far more complicated than the efficient statement would be to generate, so something weird must be going on????

v-ljerr-msft
Microsoft
Microsoft

Hi @TeeGee,

 

What type(Between, Before, After, List, Dropdown, Relative) of Date Slicer are you using on the report? Could you try a different slicer type to see if there is any improvement?

 

type1.PNG

 

In addition, if the issue persists, I would suggest you create a support ticket on Power BI Support page for better assistance on this issue. Smiley Happy

 

Regards

The slicer is on the [Year] column in the [Date] table, and it is a simple single select dropdown slicer.

Helpful resources

Announcements
PBI_User Group Leader_768x460.jpg

Manage your user group events

Check out the News & Announcements to learn more.

Get Ready for Power BI Dev Camp

Microsoft named a Leader in The Forrester Wave

Microsoft received the highest score of any vendor in both the strategy and current offering categories.

Get Ready for Power BI Dev Camp

Power BI Dev Camp - September 30th, 2021

Mark your calendars and join us for our next Power BI Dev Camp!

PowerPlatform 768x460.png

Microsoft Learn

Check out our new Discover Your Career Path blog post series and get all the details.

Top Solution Authors