Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
12Bowers12
Helper V
Helper V

Summarize and Distinct Count

Good afternoon, everyone,

In the following code, I created a new Data LossByClaimantCoverage based on data LossFile. In the new Data LossByClaimantCoverage, I summarized the dollar [AMOUNT] and rename as Loss Reserve.

Then I created a Measure 1 for counting based on [Loss Reserve]>0.

Then I created another Measure 2 based on the original data LossFile but got a much higher number.

Appreciate your help in advance.

Dennis

 

LossByClaimantCoverage =

SUMMARIZE ( LossFile,

LossFile[PREFIX],

LossFile[ClaimantCov],

"Loss Reserve", SUM ( LossFile[AMOUNT]) )

 

 

Measure 1 Loss Reserve Claim Count =

CALCULATE ( DISTINCTCOUNT ( LossByClaimantCoverage[ClaimantCov] ), LossByClaimantCoverage[Loss Reserve] > 0 )

 

Measure 2 Loss Reserve Claim Count =

CALCULATE ( DISTINCTCOUNT ( LossFile[ClaimantCov] ), FILTER(LossFile, SUM(LossFile[AMOUNT]) > 0))

1 ACCEPTED SOLUTION
Anonymous
Not applicable

@12Bowers12 , let's look at your measures in detail.  I'll also show you how to optimize the calculated table.

 

SUMMARIZE() is not goot dor adding columns.  It's just....slow.  Refer to this article by www.sqlbi.com for more info:

 

https://www.sqlbi.com/articles/introducing-summarizecolumns/

 

 

LossByClaimantCoverage =
ADDCOLUMNS (
    SUMMARIZE (
        LossFile,
        LossFile[PREFIX],
        LossFile[ClaimantCov]
    ),
    "Loss Reserve", CALCULATE ( SUM ( LossFile[AMOUNT] ) )
)

This will provide a table of unique combinations of Lossfile[PREFIX] and LossFile[ClaimantCov], and then return the TOTAL amount for each combination.

 

[Measure1] is counting the distinct number of ClaimantCov where the TOTAL amount is > 0.  This is NOT taking [PREFIX] into account.  If the same [ClaimantCov] value is used for two different [PREFIX] values and the Total is > 0 for both of them, this measure will still only return 1.

 

[Measure2] is counting the distinct values of [ClaimantCov] for the original table, but adding a new filter:

 

"only keep the rows where SUM(Amount) > 0"

 

Here's the issue. 

FILTER() introduces a row context.  it's iterating over every row in the table, and performing the calculation SUM(LossFile[AMOUNT]).  Since there's no filter context anymore, it's summing ALL of the values in the table, and then checking to see if that amount is > 0.  If it is (and I'm guessing it will be), it keeps the row.

 

Once FILTER() has been applied, then it counts the distinct values.  I'm guessing your FILTER() parameter is just returning the entire table (albeit, after a costly calculation).

 

If you want Measure2 to work, and without having to create a calculated table (which will consume additional memory in your model), try this measure:

 

Measure2_Updated =
VAR TempTable =
ADDCOLUMNS (
	VALUES(LossFile[ClaimantCov]),
    "Loss Reserve", CALCULATE ( SUM ( LossFile[AMOUNT] )
)
VAR Filtered_Table = 
	FILTER(
		TempTable,
		[Loss Reserve] > 0
	)
VAR Result =
	COUNTROWS(Filtered_Table)
RETURN
Result

 

View solution in original post

1 REPLY 1
Anonymous
Not applicable

@12Bowers12 , let's look at your measures in detail.  I'll also show you how to optimize the calculated table.

 

SUMMARIZE() is not goot dor adding columns.  It's just....slow.  Refer to this article by www.sqlbi.com for more info:

 

https://www.sqlbi.com/articles/introducing-summarizecolumns/

 

 

LossByClaimantCoverage =
ADDCOLUMNS (
    SUMMARIZE (
        LossFile,
        LossFile[PREFIX],
        LossFile[ClaimantCov]
    ),
    "Loss Reserve", CALCULATE ( SUM ( LossFile[AMOUNT] ) )
)

This will provide a table of unique combinations of Lossfile[PREFIX] and LossFile[ClaimantCov], and then return the TOTAL amount for each combination.

 

[Measure1] is counting the distinct number of ClaimantCov where the TOTAL amount is > 0.  This is NOT taking [PREFIX] into account.  If the same [ClaimantCov] value is used for two different [PREFIX] values and the Total is > 0 for both of them, this measure will still only return 1.

 

[Measure2] is counting the distinct values of [ClaimantCov] for the original table, but adding a new filter:

 

"only keep the rows where SUM(Amount) > 0"

 

Here's the issue. 

FILTER() introduces a row context.  it's iterating over every row in the table, and performing the calculation SUM(LossFile[AMOUNT]).  Since there's no filter context anymore, it's summing ALL of the values in the table, and then checking to see if that amount is > 0.  If it is (and I'm guessing it will be), it keeps the row.

 

Once FILTER() has been applied, then it counts the distinct values.  I'm guessing your FILTER() parameter is just returning the entire table (albeit, after a costly calculation).

 

If you want Measure2 to work, and without having to create a calculated table (which will consume additional memory in your model), try this measure:

 

Measure2_Updated =
VAR TempTable =
ADDCOLUMNS (
	VALUES(LossFile[ClaimantCov]),
    "Loss Reserve", CALCULATE ( SUM ( LossFile[AMOUNT] )
)
VAR Filtered_Table = 
	FILTER(
		TempTable,
		[Loss Reserve] > 0
	)
VAR Result =
	COUNTROWS(Filtered_Table)
RETURN
Result

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.