Solved: translate SQL code for unique count into Power BI ...

12Bowers12 · ‎05-22-2019

Good morning, everyone,

I have a Power BI table called PolicyData, which is imported from the SQL dataset PolicyData. The IT department gave me the SQL code shown below to get the unique count.
I spent days trying to write a measure, using a calculated column and EARLIER in DAX, to get the same count, but failed.

I appreciate your help to “translate” this SQL code into a Power BI measure.

Dennis

select count(distinct Claimant)
from PolicyData f1
where
    Account_Date >= '2019-01-01' and
    Claimant not in
        (select Claimant from PolicyData where
         Account_Date >= '2019-01-01' and Record_Type = 'P') and
    Claimant in
        (select Claimant
         from PolicyData
         where Account_Date < '2019-05-01'
         group by Claimant having sum(Amount) = 0
        ) ;
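
To make the three conditions concrete, here is a minimal sketch that runs the same query against an in-memory SQLite table. The four claimants A to D and their rows are made up for illustration, and the dates are stored as ISO text so that the string comparisons behave like date comparisons; none of this comes from the real PolicyData.

```python
import sqlite3

# Hypothetical rows: (Claimant, Record_Type, Amount, Account_Date).
rows = [
    ("A", "C", 100.0, "2019-02-01"),   # A: no 'P' rows in 2019, nets to 0 -> counted
    ("A", "C", -100.0, "2019-03-01"),
    ("B", "P", 50.0, "2019-02-10"),    # B: has a 'P' row in 2019 -> excluded
    ("B", "C", -50.0, "2019-02-11"),
    ("C", "C", 75.0, "2019-04-01"),    # C: total is 75, not 0 -> fails the "in" test
    ("D", "C", 20.0, "2018-06-01"),    # D: nets to 0 but has no rows since 2019-01-01
    ("D", "C", -20.0, "2018-07-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PolicyData "
    "(Claimant TEXT, Record_Type TEXT, Amount REAL, Account_Date TEXT)"
)
conn.executemany("INSERT INTO PolicyData VALUES (?, ?, ?, ?)", rows)

# Same structure as the query from the IT department.
query = """
select count(distinct Claimant)
from PolicyData f1
where Account_Date >= '2019-01-01'
  and Claimant not in
      (select Claimant from PolicyData
       where Account_Date >= '2019-01-01' and Record_Type = 'P')
  and Claimant in
      (select Claimant from PolicyData
       where Account_Date < '2019-05-01'
       group by Claimant having sum(Amount) = 0)
"""
unique_count = conn.execute(query).fetchone()[0]
print(unique_count)
```

In this made-up data only claimant A passes all three tests. Note that every condition is about the claimant as a whole, not about a single row, which is what any DAX translation has to reproduce.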

PattemManohar · ‎05-22-2019

@12Bowers12 It's always best to post sample test data and the expected output so that you get an accurate solution. Please try the following, which is untested (I had no sample data).

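DistinctCount =
-- Claimants with a 'P' record on or after 2019-01-01 (the SQL "not in" subquery)
VAR _Exclusion =
    SELECTCOLUMNS (
        FILTER ( PolicyData, PolicyData[Account_Date] >= DATE ( 2019, 1, 1 ) && PolicyData[Record_Type] = "P" ),
        "Claimant", PolicyData[Claimant]
    )
-- Claimants whose amounts before 2019-05-01 sum to zero (the SQL "in" subquery)
VAR _Inclusion =
    SELECTCOLUMNS (
        FILTER (
            SUMMARIZE (
                FILTER ( PolicyData, PolicyData[Account_Date] < DATE ( 2019, 5, 1 ) ),
                PolicyData[Claimant],
                "Total", SUM ( PolicyData[Amount] )
            ),
            [Total] = 0
        ),
        "Claimant", PolicyData[Claimant]
    )
RETURN
    CALCULATE (
        DISTINCTCOUNT ( PolicyData[Claimant] ),
        FILTER ( PolicyData, NOT ( PolicyData[Claimant] IN _Exclusion ) && PolicyData[Claimant] IN _Inclusion )
    )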


12Bowers12 · ‎05-22-2019

Hi, Pattem,

I added one more criterion, [TRANS_TYPE] = “INDEM”, and tested all the combinations by moving the NOT between the filtering columns, but I still got a number higher than the SQL result (the unique count there is 923). The lowest number based on your code is 1,619.

If you have a chance, could you take a look?

Appreciate your help.

Dennis

L16 CWOP DistinctCount =
VAR Exclusion =
    SELECTCOLUMNS (
        FILTER (
            LossFile,
            LossFile[RECORD_TYPE] = "P"
                && LossFile[TRANS_TYPE] = "INDEM"
                && LossFile[ACCOUNT_DATE] >= DATE ( 2019, 1, 1 )
        ),
        "CLAIMANT", LossFile[CLAIMANT]
    )
VAR Inclusion =
    SELECTCOLUMNS (
        FILTER (
            SUMMARIZE (
                FILTER (
                    LossFile,
                    LossFile[TRANS_TYPE] = "INDEM"
                        && LossFile[ACCOUNT_DATE] < DATE ( 2019, 5, 1 )
                ),
                LossFile[CLAIMANT],
                "Total", SUM ( LossFile[AMOUNT] )
            ),
            [Total] = 0
        ),
        "CLAIMANT", LossFile[CLAIMANT]
    )
RETURN
    CALCULATE (
        DISTINCTCOUNT ( LossFile[CLAIMANT] ),
        FILTER (
            LossFile,
            ( LossFile[CLAIMANT] ) IN Exclusion
                && NOT ( LossFile[CLAIMANT] ) IN Inclusion
        )
    )

12Bowers12 · ‎05-22-2019

Hi, Pattem, I just got the right number, matching the SQL result, by adding one more variable. Thank you. I will test Doobie's code later.

Have a good day.

Dennis

L16 CWOP DistinctCount =
-- Claimants with an INDEM record on or after 2019-01-01 (the outer WHERE in the SQL)
VAR Base =
    SELECTCOLUMNS (
        FILTER (
            LossFile,
            LossFile[TRANS_TYPE] = "INDEM"
                && LossFile[ACCOUNT_DATE] >= DATE ( 2019, 1, 1 )
        ),
        "CLAIMANT", LossFile[CLAIMANT]
    )
-- Claimants with a 'P' record on or after 2019-01-01 (the "not in" subquery)
VAR Exclusion =
    SELECTCOLUMNS (
        FILTER (
            LossFile,
            LossFile[RECORD_TYPE] = "P"
                && LossFile[TRANS_TYPE] = "INDEM"
                && LossFile[ACCOUNT_DATE] >= DATE ( 2019, 1, 1 )
        ),
        "CLAIMANT", LossFile[CLAIMANT]
    )
-- Claimants whose amounts before 2019-05-01 sum to zero (the "in" subquery)
VAR Inclusion =
    SELECTCOLUMNS (
        FILTER (
            SUMMARIZE (
                FILTER (
                    LossFile,
                    LossFile[TRANS_TYPE] = "INDEM"
                        && LossFile[ACCOUNT_DATE] < DATE ( 2019, 5, 1 )
                ),
                LossFile[CLAIMANT],
                "Total", SUM ( LossFile[AMOUNT] )
            ),
            [Total] = 0
        ),
        "CLAIMANT", LossFile[CLAIMANT]
    )
RETURN
    CALCULATE (
        DISTINCTCOUNT ( LossFile[CLAIMANT] ),
        FILTER (
            LossFile,
            LossFile[CLAIMANT] IN Base
                && NOT ( LossFile[CLAIMANT] IN Exclusion )
                && LossFile[CLAIMANT] IN Inclusion
        )
    )
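
The extra Base variable plays the role of the outer `Account_Date >= '2019-01-01'` condition in the SQL. A plausible reason the count was too high without it is that claimants with no activity since 2019-01-01 can still net to zero and slip through. The sketch below illustrates that with plain Python sets; the claimants A, B and D and their rows are hypothetical.

```python
# Hypothetical rows: (claimant, record_type, amount, account_date as ISO text).
rows = [
    ("A", "C", 100.0, "2019-02-01"),
    ("A", "C", -100.0, "2019-03-01"),
    ("B", "P", 50.0, "2019-02-10"),
    ("B", "C", -50.0, "2019-02-11"),
    ("D", "C", 20.0, "2018-06-01"),    # all of D's activity is before 2019
    ("D", "C", -20.0, "2018-07-01"),
]

# Base: claimants with any row on or after 2019-01-01.
base = {c for c, _, _, d in rows if d >= "2019-01-01"}

# Exclusion: claimants with a 'P' row on or after 2019-01-01.
exclusion = {c for c, t, _, d in rows if t == "P" and d >= "2019-01-01"}

# Inclusion: claimants whose amounts before 2019-05-01 sum to zero.
totals = {}
for c, _, amount, d in rows:
    if d < "2019-05-01":
        totals[c] = totals.get(c, 0.0) + amount
inclusion = {c for c, total in totals.items() if total == 0}

all_claimants = {c for c, _, _, _ in rows}
without_base = (all_claimants - exclusion) & inclusion   # two variables only
with_base = (base - exclusion) & inclusion               # three variables

print(sorted(without_base), sorted(with_base))
```

Here the two-variable version counts both A and D, while the three-variable version counts only A, because D has no rows in the base period.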

12Bowers12 · ‎05-23-2019

Sorry, Pattem and Doobie, I am back with another question:

I got exactly the same total as the SQL result (923). However, when I create a Power BI report by Accident Year, the total remains 923, but the sum of the counts for the individual Accident Years (I checked in Excel) is always less than 923. I tried filtering the count by other criteria, but each time I got a different, lower number.

Any ideas?

Appreciate your help.

Dennis

Anonymous · ‎05-23-2019

That is likely a data relationship issue. Are you able to take a screenshot of your data model?
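
As an illustration of how a relationship problem could produce that symptom, here is a small sketch with made-up data: if some fact rows have a claim number with no matching row in the table that holds Accident Year, those claimants are in the grand total but under no known year. The table contents below are hypothetical, and this is only one possible cause, not something confirmed for this model.

```python
# Hypothetical fact rows: (claim_number, claimant).
loss_file = [
    (101, "101-001"),
    (102, "102-001"),
    (103, "103-001"),   # claim 103 has no row in the dimension below
]

# Hypothetical dimension: claim_number -> accident year.
loss_claim = {101: 2018, 102: 2019}

# Grand total: distinct claimants regardless of year.
total = len({claimant for _, claimant in loss_file})

# Per-year distinct claimants; None collects rows with no matching claim.
per_year = {}
for claim_number, claimant in loss_file:
    year = loss_claim.get(claim_number)
    per_year.setdefault(year, set()).add(claimant)

sum_of_known_years = sum(
    len(claimants) for year, claimants in per_year.items() if year is not None
)
print(total, sum_of_known_years)
```

In this sketch the total is 3 but the known years add up to 2; the difference is the claimant whose claim number finds no match.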

12Bowers12 · ‎05-23-2019

Sorry, Doobie,

I tried to attach a screenshot here but could not.

The relationship is very simple:

Accident Year is in LossClaim, on the one side; the many side is LossFile, which hosts the measure. LossClaim and LossFile are related through ClaimNumber. I also tried building a report using a column from LossFile itself, but still got a different result.

Sincerely,

Dennis

12Bowers12 · ‎05-23-2019

Good afternoon, Doobie,

I have pasted some sample data here for you.

Appreciate your help.

Dennis

CLAIM	CLAIMANT	COVERAGE	TRANS_TYPE	RECORD_TYPE	AMOUNT	ACCOUNT_DATE
138272	000138272-001	PIP	INDEM	P	-1480	4/29/2019
138272	000138272-001	PIP	INDEM	P	1480	4/29/2019
138272	000138272-001	PIP	INDEM	P	-1480	4/29/2019
138272	000138272-001	PIP	INDEM	C	1480	4/29/2019
182677	000182677-003	PIP	INDEM	P	-10404	3/29/2019
182677	000182677-003	PIP	INDEM	C	10404	3/29/2019
194602	000194602-005	BI	INDEM	P	-1000	2/21/2019
194602	000194602-005	BI	INDEM	C	1000	2/21/2019
199016	000199016-001	PIP	INDEM	P	-698.13	1/26/2019
199016	000199016-001	PIP	INDEM	P	698.13	1/17/2019
200509	000200509-001	PIP	INDEM	P	-443.45	1/11/2019
200509	000200509-001	PIP	INDEM	C	443.45	1/11/2019
200646	000200646-001	PIP	INDEM	P	-4500	4/4/2019
200646	000200646-001	PIP	INDEM	P	-996.34	4/4/2019
200646	000200646-001	PIP	INDEM	C	4500	4/4/2019
200646	000200646-001	PIP	INDEM	C	996.34	4/4/2019
201129	000201129-002	PIP	INDEM	P	840	1/17/2019
202194	000202194-001	BI	INDEM	P	-2500	1/9/2019
202194	000202194-001	BI	INDEM	C	2500	1/9/2019
204366	000204366-001	PIP	INDEM	P	-525	1/24/2019
204366	000204366-001	PIP	INDEM	P	525	1/17/2019
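
As a quick way to check rows like these against the SQL conditions outside Power BI, here is a sketch that applies the three tests with Python sets. It uses only the CLAIMANT, RECORD_TYPE, AMOUNT and ACCOUNT_DATE columns of the 21 rows above (TRANS_TYPE is INDEM on every row), and parses amounts as Decimal so that the zero test is exact; the variable names are illustrative.

```python
from datetime import date, datetime
from decimal import Decimal

# CLAIMANT, RECORD_TYPE, AMOUNT, ACCOUNT_DATE from the sample rows above.
sample = """
000138272-001 P -1480 4/29/2019
000138272-001 P 1480 4/29/2019
000138272-001 P -1480 4/29/2019
000138272-001 C 1480 4/29/2019
000182677-003 P -10404 3/29/2019
000182677-003 C 10404 3/29/2019
000194602-005 P -1000 2/21/2019
000194602-005 C 1000 2/21/2019
000199016-001 P -698.13 1/26/2019
000199016-001 P 698.13 1/17/2019
000200509-001 P -443.45 1/11/2019
000200509-001 C 443.45 1/11/2019
000200646-001 P -4500 4/4/2019
000200646-001 P -996.34 4/4/2019
000200646-001 C 4500 4/4/2019
000200646-001 C 996.34 4/4/2019
000201129-002 P 840 1/17/2019
000202194-001 P -2500 1/9/2019
000202194-001 C 2500 1/9/2019
000204366-001 P -525 1/24/2019
000204366-001 P 525 1/17/2019
"""

rows = []
for line in sample.strip().splitlines():
    claimant, record_type, amount, date_text = line.split()
    account_date = datetime.strptime(date_text, "%m/%d/%Y").date()
    rows.append((claimant, record_type, Decimal(amount), account_date))

start, cutoff = date(2019, 1, 1), date(2019, 5, 1)

# Claimants with any row on or after 2019-01-01.
base = {c for c, _, _, d in rows if d >= start}
# Claimants with a 'P' row on or after 2019-01-01.
exclusion = {c for c, t, _, d in rows if t == "P" and d >= start}
# Claimants whose amounts before 2019-05-01 sum to exactly zero.
totals = {}
for c, _, amount, d in rows:
    if d < cutoff:
        totals[c] = totals.get(c, Decimal(0)) + amount
inclusion = {c for c, total in totals.items() if total == 0}

qualifying = (base - exclusion) & inclusion
print(len(base), len(exclusion), len(inclusion), len(qualifying))
```

On these rows all nine claimants have at least one 'P' record on or after 2019-01-01, so under the SQL conditions as written none of them would be counted, even though eight of the nine net to zero.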

12Bowers12 · ‎05-22-2019

Thank you, Pattem, I am also testing your solution. I will paste data next time. Dennis

Anonymous · ‎05-22-2019

Without looking at your data it's a bit tricky to decipher this.

What is the end goal of the code: to obtain a distinct count of policy items that fall into a specific category? If so, is the category Account_Date >= '2019-01-01' and Record_Type = 'P' (what does 'P' represent, and is it stored in the same table as the other account data?), or is it Account_Date < '2019-05-01'?

12Bowers12 · ‎05-22-2019

Thank you,

1. Yes, the goal is to get a distinct count of claim items that meet all of the following criteria:

First, Account_Date >= '2019-01-01'

Second, Claimant not in (select Claimant from PolicyData where Account_Date >= '2019-01-01' and Record_Type = 'P')

Third, Claimant in (select Claimant from PolicyData where Account_Date < '2019-05-01' group by Claimant having sum(Amount) = 0 ) ;

2. All the categories including Account_Date, Record_Type are stored in the same table PolicyData.

3. Under Record_Type, there are three types: P, C and O. P means Payment.

Anonymous · ‎05-22-2019

Gotcha. So I think the below might work. First, I would create a calculated column that holds your criteria for the distinct count, since putting everything into one measure with multiple filters may be tricky. Below is a calculated column which I think accurately summarizes the SQL code.

Column =
SWITCH (
    TRUE (),
    PolicyData[Account_Date] >= DATE ( 2019, 1, 1 ), "True",
    PolicyData[Account_Date] >= DATE ( 2019, 1, 1 ) && PolicyData[Record_Type] = "P", "False",
    PolicyData[Account_Date] < DATE ( 2019, 5, 1 ) && PolicyData[Amount] = 0, "True"
)

From there you can create a measure to count the number of "True" values.

Measure =
-- Count distinct claimants on rows flagged "True" by the calculated column
CALCULATE (
    DISTINCTCOUNT ( PolicyData[Claimant] ),
    PolicyData[Column] = "True"
)

You might even be able to avoid the measure and throw the calculated column into a 'Card' visual and filter to fit your needs.

Hope this helps!

12Bowers12 · ‎05-22-2019

Thank you, Doobie,

I added one more criterion, [Trans_Type] = “INDEM”, and created a calculated column [C7 CWOP] based on your code. On top of this calculated column, I copied your measure as [L16 CWOP Count].

But the count is much higher. The count based on SQL is 923, but [L16 CWOP Count] shows 12,130.

If you have time, could you take a look?

Appreciate your help.

Dennis

C7 CWOP =
SWITCH (
    TRUE (),
    LossFile[ACCOUNT_DATE] >= DATE ( 2019, 1, 1 ) && LossFile[TRANS_TYPE] = "INDEM", "True",
    LossFile[ACCOUNT_DATE] >= DATE ( 2019, 1, 1 ) && LossFile[TRANS_TYPE] = "INDEM" && LossFile[RECORD_TYPE] = "P", "False",
    LossFile[ACCOUNT_DATE] < DATE ( 2019, 5, 1 ) && LossFile[TRANS_TYPE] = "INDEM" && LossFile[AMOUNT] = 0, "True"
)

L16 CWOP Count =
CALCULATE (
    DISTINCTCOUNT ( LossFile[CLAIMANT] ),
    LossFile[C7 CWOP] = "True"
)
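
A likely reason for the much higher count is that the calculated column is evaluated one row at a time, while the SQL conditions describe each claimant as a whole. In addition, SWITCH ( TRUE (), ... ) returns the result of the first branch that matches, so any row dated on or after 2019-01-01 is flagged "True" before the 'P' test is reached. The sketch below mimics this in Python on a few hypothetical rows and compares it with the claimant-level logic.

```python
# Hypothetical rows: (claimant, record_type, amount, account_date as ISO text).
rows = [
    ("A", "C", 100.0, "2019-02-01"),   # A nets to 0 and has no 'P' rows
    ("A", "C", -100.0, "2019-03-01"),
    ("B", "P", 50.0, "2019-02-10"),    # B has a 'P' row in 2019
    ("B", "C", -50.0, "2019-02-11"),
    ("C", "C", 75.0, "2019-04-01"),    # C's total is 75, not 0
]


def row_flag(record_type, amount, account_date):
    """Mimic SWITCH(TRUE(), ...): the first matching branch wins."""
    if account_date >= "2019-01-01":
        return "True"
    if account_date >= "2019-01-01" and record_type == "P":
        return "False"   # never reached: the branch above already matched
    if account_date < "2019-05-01" and amount == 0:
        return "True"
    return None          # SWITCH without an else branch returns blank


# Row-level approach: distinct claimants on rows flagged "True".
row_level = {c for c, t, a, d in rows if row_flag(t, a, d) == "True"}

# Claimant-level approach, following the SQL.
base = {c for c, _, _, d in rows if d >= "2019-01-01"}
exclusion = {c for c, t, _, d in rows if t == "P" and d >= "2019-01-01"}
totals = {}
for c, _, amount, d in rows:
    if d < "2019-05-01":
        totals[c] = totals.get(c, 0.0) + amount
inclusion = {c for c, total in totals.items() if total == 0}
claimant_level = (base - exclusion) & inclusion

print(len(row_level), len(claimant_level))
```

With these rows the row-level flag counts all three claimants, while the claimant-level logic counts only A.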

12Bowers12 · ‎05-22-2019

Thank you, Doobie,

I am testing now and will let you know later.

Appreciate your help.

Dennis
