Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Dave_Gugg
Helper I
Helper I

Distinct Count of Dimension Attribute in Matrix taking forever to calculate

I have a model set up as follows:

  • Demand Fact Table
  • Product Dimension Table
    • Price Range Bucket
    • Product Division
    • Item Number
  • Order Shipped Date Dimension
    • Order Shipped Year

I've left out all other tables and columns that aren't important to the questions.

 

I'm trying to set up a matrix that will use the Price Range bucket as the row headers and the Product Division and Order Shipped Year as column headers.  This should give me the number of distinct items that are sold in each price range and year, broken up by Product Division.  

I set up a new metric in the Demand Fact Table as Item Count2 = CALCULATE(DISTINCTCOUNT('Product'[Item]), Demand).  The calculation appears to work, but it takes forever to run.  If I let it run for 20+ minutes, I evetually get the visual to load correctly.

 

If I run Profiler while it is loading, I get hundreds of statements like this being run:

 

SELECT 
(COUNT_BIG(DISTINCT [t1].[Item]) + MAX(CASE WHEN [t1].[Item] IS NULL THEN 1 ELSE 0 END))
 AS [a0]
FROM 
(
(SELECT  ProductKey
      , Style AS Item
      , StyleDescription AS [Item Description]
      , SkuCode AS [SKU Code]
      , SkuDescription AS [SKU Description]
      , ColorCode AS [Color Code]
      , ColorName AS [Color Name]
      , SizeCode AS [Size Code]
      , SizeName AS [Size Name]
      , ItemClass AS [Item Class]
      , LongSkuDivision AS [Long SKU Division]
      , LongSkuDepartment AS [Long SKU Department]
      , SkuGroup AS [SKU Group]
      , Skud AS [Sku'd Product]
      , CurrentItemStatus AS [Current Item Status]
      , CurrentSkuStatus AS [Current SKU Status]
      , LatestVendor AS [Current Vendor]
      , OriginalRetailPrice AS [List Price]
      , CurrentWebPrice AS [Current Web Price]
      , CreatedDate AS [Created Date]
      , CountryOfOrigin AS [Country Of Origin]
      , Brand
      , CamoBrand AS [Camo Brand]
      , Camo
      , Pattern
      , PrimaryColor AS [Primary Color]
      , SecondaryColor AS [Secondary Color]
      , UPC
      , ItemSkuCount AS [Item SKU Count]
      , ItemColorCount AS [Item Color Count]
      , SetHeader AS [Is Set Header]
      , BigAndTall AS [Big and Tall]
      , LongSkuItemClass AS [Long SKU Item Class]
      , Buyer
      , PriceSpread AS [Price Range]
FROM    databasename.dbo.dimProduct)
)
 AS [t1]
WHERE 
(
(
([t1].[ProductKey] IN (10691,4845,15332,4787,12218,15321,622,11814,1538,7582,4481,13842,12780,5057,969,3006,6683,6605,5735,4982,5492,15796,14528,9141,17527,10338,13930,15176,14957,...))
)
 AND 
(
([t1].[ProductKey] IN (6638,5265,5035,3560,15046,10251,6657,14057,13635,4928,4609,6234,9141,7086,3279,9778,12065,7513,17151,15066,8282,16578,11281,10739,4873,7136,8650,352,4263,12382,...))
 OR 
([t1].[ProductKey] IN (7986,17205,15993,6630,14303,9687,11663,1029,12545,11977,10061,13539,6591,5724,457,7022,15293,4976,1131,11715,1894,10868,10930,7346,7122,15707,12533,8932,16272,...))
 OR 
([t1].[ProductKey] IN (6996,13110,6548,313,9317,5436,2163,2251,11001,13250,17150,14734,17334,17528,3134,2123,6177,9835,15792,9779,479,8184,9116,7942,3181,16351,15798,6837,17486,6346,...))
 OR 
([t1].[ProductKey] IN (755,2618,3157,10672,12277,5900,2532,4171,2828,10847,1969,10694,7939,5234,13326,14389,13711,13020,1656,15795,13333,3425,432,15609,3161,11942,4419,17614,2972,...))
 OR 
([t1].[ProductKey] IN (14701,5927,16099,14945,5334,13396,8109,17015,15611,17053,13683,3797,2329,4429,15065,11070,10484,8468,11192,1728,3239,11484,8370,3461,12184,15658,1175,6199,...))
)
)

 

Any ideas on how I can either create a better performing function or get Power BI to use a different method to get the data?

1 ACCEPTED SOLUTION

The VALUES function will give you the unique list of ProductID in your Demand fact table (under the current filter context).  Then, rather than use the CountX function, you use the SUMX function, but just add "1's" for each.  I have found it to be very performant.

 

CALCULATE(

   SUMX(VALUES('Demand'[ProductId]),1))

 

 

View solution in original post

7 REPLIES 7
v-shex-msft
Community Support
Community Support

Hi @Dave_Gugg,

 

Based on my test,  you can use related function to see if it works:

 

Old:

Item Count2 = CALCULATE(DISTINCTCOUNT('SalesRecords'[ProductID]),ProductInfo)

 

New:

Measure = COUNTAX(DISTINCT(SELECTCOLUMNS(FILTER(SalesRecords,SalesRecords[ProductID]=RELATED(ProductInfo[ID])),"ProductID",[ProductID])),[ProductID])

 

Besides, there are a set of DAX functions that can get very expensive to run in the server then cause performance issue. You can see this article: Ability to create measures when using DirectQuery mode. And when we type T-SQL in DirectQuery mode, Power BI will generate it in a complicate logic at backend. So I would suggest you try to change DirectQuery mode to Import mode.

 

Regards,

Xiaoxin Sheng

 

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Thanks for the reply @v-shex-msft.

I was not able to make your new measure work.  I typed it in as 

Item Count 2 = COUNTAX(DISTINCT(SELECTCOLUMNS(FILTER(Demand,Demand[ProductKey]=RELATED('Product'[ProductKey])),"Item",Demand[ProductKey])),Demand[ProductKey])

I got the message "A single value for column 'ProductKey' in table 'Demand' cannot be determined.  This can happen when a measure formula refers to a column that contains many values without specifying an aggregation such as min, max, count, or sum to get a single result."  The last ProductKey in the formula is highlighted with a red squiglly line.

 

I was able to get it to run in about 20 - 30 seconds by setting up a new column as:

Item = RELATED(Product[Item])

Then creating a new measure as:

Item Count = DISTINCTCOUNT(Demand[Item])

 

It would be great if I could get better performance, but I can live with 20 - 30 seconds.

 

Regarding Import mode, we have far too much data to use that.  There is a 10 GB limit for Pro users.

Thank you

@Dave_Gugg

 

Try with this option with the new measure:

 

ItemCount=CountRows(Summarize(Demand,Demand[item]))

 

 

 




Lima - Peru

Thanks for the reply Vvelarde.  That ran about 20% faster than the way I had it set up.  It's a small improvement, but every bit helps.  Thank you!

The VALUES function will give you the unique list of ProductID in your Demand fact table (under the current filter context).  Then, rather than use the CountX function, you use the SUMX function, but just add "1's" for each.  I have found it to be very performant.

 

CALCULATE(

   SUMX(VALUES('Demand'[ProductId]),1))

 

 

Wow, BraneyBI that is outstanding.  The visual now loads in about a second.  Thank you so much.

I clearly have much to learn about DAX.  There are several ways to do things, and some are clearly better than others.

So very glad I could help! 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.