Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
hxkresl
Helper V
Helper V

DAX equivalent to complex SQL query

I have SQL query that takes a two column table (ReportName, Attribute) and arranges data to show how much any two reports have in common, in terms of shared attributes, as count and percentage. 

 

Here's what power bi report looks like using SQL dataset.

pivot7.PNGI want to create exactly same using DAX against source table (eliminate SQL query dependency).  SQL query and inputs below.

CREATE TABLE #ReportTable (
    ReportName varchar (20),
    AttributeName varchar(30),
);


INSERT INTO #ReportTable (ReportName, AttributeName)
    VALUES    ('Compliance Details', 'Vendor Name'),
            ('Compliance Details', 'Area'),
            ('Compliance Details', 'Area'),
            ('Compliance Details', 'Area'),
            ('Compliance Details', 'Region'),
            ('Compliance Details', 'SubRegion'),
            ('Delivery and Invoice', 'Customer Name'),
            ('Delivery and Invoice', 'Area'),
            ('Delivery and Invoice', 'Region'),
            ('Delivery and Invoice', 'SubRegion'),
            ('Operations Review', 'Customer Name'),
            ('Operations Review', 'Approver'),
            ('Operations Review', 'Approval Status');

WITH ReportAttributeCount AS (

    SELECT ReportName as ReportA, COUNT(AttributeName) AS ReportA_AttributeCount
    FROM #ReportTable
    GROUP BY ReportName
),
    CommonAttributesCount AS (

        SELECT RT.ReportName AS ReportA, RT2.ReportName AS ReportB, COUNT(DISTINCT RT.AttributeName) AS CommonAttributeCount --convert to DAX 1
        FROM #ReportTable AS RT
            INNER JOIN #ReportTable AS RT2
                ON RT.ReportName < RT2.ReportName
                AND RT.AttributeName = RT2.AttributeName
        GROUP BY RT.ReportName, RT2.ReportName
)
SELECT DISTINCT
    RAC.ReportA,
    CACA.CommonAttributeCount,
    CACA.ReportB, --convert to DAX 2
	(CACA.CommonAttributeCount * 1.00) / RAC.ReportA_AttributeCount AS reportApercentage,--convert to DAX 3
	RAC.ReportA_AttributeCount,     
    (CACA.CommonAttributeCount * 1.00) / RACB.ReportA_AttributeCount AS reportBpercentage,--convert to DAX 4
	RACB.ReportA_AttributeCount AS ReportB_AttributeCount --convert to DAX 5
FROM ReportAttributeCount AS RAC
    INNER JOIN CommonAttributesCount AS CACA
        ON RAC.ReportA = CACA.ReportA
    INNER JOIN ReportAttributeCount AS RACB
        ON CACA.ReportB = RACB.ReportA;


DROP TABLE #ReportTable;





 I want to begin by converting the CommonAttributeCount to DAX, and this requires a self join.  I don't know where to begin.  Pls help walk me through this. 

1 ACCEPTED SOLUTION
Phil_Seamark
Employee
Employee

Hi there,

 

Please try adding the following three calcuated tables:

 

ReportAttributeCount = SELECTCOLUMNS(
                SUMMARIZECOLUMNS(
                    'Table2'[ReportName],
                    "ReportA_AttributeCount" , COUNTROWS('Table2')
                    ),
                        "ReportA",[ReportName],
                        "ReportA_AttributeCount",[ReportA_AttributeCount]
                        )
CommonAttributesCount = 
               SUMMARIZE( FILTER(
                GENERATE(
                             SELECTCOLUMNS(
                                    Table2,
                                    "ReportName",'Table2'[ReportName],
                                    "AttributeName",'Table2'[AttributeName]
                                    ) ,                   
                            SELECTCOLUMNS(
                                Table2,
                                "ReportName2",'Table2'[ReportName],
                                "AttributeName2",'Table2'[AttributeName]
                                )
                              ),[AttributeName]=[AttributeName2] && [ReportName]< [ReportName2]
                              ),
                              [ReportName],
                              [ReportName2],
                              [AttributeName]
                              )



and finally

 

ReportTable = 

VAR CommonAttributesCount2 = 
    SELECTCOLUMNS(
        SUMMARIZE(
        'CommonAttributesCount',
        [ReportName],
        [ReportName2],
        "CommonAttributeCount",DISTINCTCOUNT('CommonAttributesCount'[AttributeName])),"CACA.ReportA",[ReportName],"CACA.ReportB",[ReportName2],"CACA.DC",[CommonAttributeCount])

VAR 
    FINAL = ADDCOLUMNS(
        FILTER(
            CROSSJOIN(
                FILTER(
                    CROSSJOIN(
                        SELECTCOLUMNS(
                            ReportAttributeCount,
                            "RAC.ReportA",[ReportA],
                            "RAC.ReportA_AttributeCount",
                            [ReportA_AttributeCount]
                            ),
                       CommonAttributesCount2),
                      [RAC.ReportA]=[CACA.ReportA]
                   ),
                SELECTCOLUMNS(
                    'ReportAttributeCount',
                    "RACB.ReportA",[ReportA],
                    "RACB.ReportA_AttributeCount",[ReportA_AttributeCount]
                    )
                ),[CACA.ReportB]=[RACB.ReportA])
                ,
                "reportApercentage",DIVIDE([CACA.DC],[RAC.ReportA_AttributeCount]),
                "reportBpercentage",DIVIDE([CACA.DC],[RACB.ReportA_AttributeCount]))
    
RETURN FINAL 

which for me returns this as a result

 

ReportA.png


To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

View solution in original post

7 REPLIES 7
Phil_Seamark
Employee
Employee

Hi there,

 

Please try adding the following three calcuated tables:

 

ReportAttributeCount = SELECTCOLUMNS(
                SUMMARIZECOLUMNS(
                    'Table2'[ReportName],
                    "ReportA_AttributeCount" , COUNTROWS('Table2')
                    ),
                        "ReportA",[ReportName],
                        "ReportA_AttributeCount",[ReportA_AttributeCount]
                        )
CommonAttributesCount = 
               SUMMARIZE( FILTER(
                GENERATE(
                             SELECTCOLUMNS(
                                    Table2,
                                    "ReportName",'Table2'[ReportName],
                                    "AttributeName",'Table2'[AttributeName]
                                    ) ,                   
                            SELECTCOLUMNS(
                                Table2,
                                "ReportName2",'Table2'[ReportName],
                                "AttributeName2",'Table2'[AttributeName]
                                )
                              ),[AttributeName]=[AttributeName2] && [ReportName]< [ReportName2]
                              ),
                              [ReportName],
                              [ReportName2],
                              [AttributeName]
                              )



and finally

 

ReportTable = 

VAR CommonAttributesCount2 = 
    SELECTCOLUMNS(
        SUMMARIZE(
        'CommonAttributesCount',
        [ReportName],
        [ReportName2],
        "CommonAttributeCount",DISTINCTCOUNT('CommonAttributesCount'[AttributeName])),"CACA.ReportA",[ReportName],"CACA.ReportB",[ReportName2],"CACA.DC",[CommonAttributeCount])

VAR 
    FINAL = ADDCOLUMNS(
        FILTER(
            CROSSJOIN(
                FILTER(
                    CROSSJOIN(
                        SELECTCOLUMNS(
                            ReportAttributeCount,
                            "RAC.ReportA",[ReportA],
                            "RAC.ReportA_AttributeCount",
                            [ReportA_AttributeCount]
                            ),
                       CommonAttributesCount2),
                      [RAC.ReportA]=[CACA.ReportA]
                   ),
                SELECTCOLUMNS(
                    'ReportAttributeCount',
                    "RACB.ReportA",[ReportA],
                    "RACB.ReportA_AttributeCount",[ReportA_AttributeCount]
                    )
                ),[CACA.ReportB]=[RACB.ReportA])
                ,
                "reportApercentage",DIVIDE([CACA.DC],[RAC.ReportA_AttributeCount]),
                "reportBpercentage",DIVIDE([CACA.DC],[RACB.ReportA_AttributeCount]))
    
RETURN FINAL 

which for me returns this as a result

 

ReportA.png


To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

Thank you, that's great that it can be done.  

 

Working well with demo data:

Capture3.PNG

 

 

 

 

 

Woohoo, nice work

 

If you break apart the code it kinda reads like SQL.  Although in DAX, to do something similar to an SQL inner join, you can combine the FILTER(CROSSJOIN(....)...) functions.  the rest is just renaming columns so you don't end up with two column names the same.


To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

I've applied DISTINCT to the DAX you gave since there are often duplicate reportname/fieldname pairs in the report table.  In the sample dataset this fixes issues of duplicate count, but in real dataset this DISTINCT isn't doing enough to handle for higher than "manual count" of numbers.  (SQL Query is handling duplicates well)

 

You can see I've added DISTINCT

#ReportAttributeCount = SELECTCOLUMNS(
                           SUMMARIZECOLUMNS(
                           'Report-AttributeMap'[Report Name],
                           "rptA - Total Attributes", COUNTROWS(DISTINCT('Report-AttributeMap'))
                           ),
                             "ReportA", 'Report-AttributeMap'[Report Name],
                             "rptA - Total Attributes", [rptA - Total Attributes]
)

 

 

to get

Capture6.PNG

notice Compliance Details now has 4 totalreportA attributes instead of 6.

 

No issues of inflated RACB.ReportA_AttributeCount, only with RAC.ReportA_AttributeCounts.

 

 Is there any other place in the DAX formulas that could be generating duplicates?

 

 

 

Hi @hxkresl,

 

Just having a look at this now for you 🙂


To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

Hi @hxkresl

 

Just a couple of questions that will help me.  So I understand the code is ok for the small sample dataset, yes?  🙂

 

Any chance you can give me a slightly bigger dataset including some duplication that you want to ignore.  It would be great if you can do this and also provide the expected result for the bigger dataset to help me pick the best place to apply the de-dup.

 

Cheers,

 

Phil


To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

Hi @Phil_Seamark ,

 

I had heavy code writen in the stored procs to get the data which involves lot of temp table joins and where conditions .(Currently this proc is inserting data into sql table from there SSRS repor is rendering)

 

If i want to convert this SSRS  to powerBi report which one would be faster DAX or get data from stored proc as source

Which once would be technically best suits .

 

Could you plese suggest.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.