Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
AmitTiwari
Frequent Visitor

Summing the Data where grouping level is different

Hey All,

 

I have data at the below grouping level and there are around more than a billion rows in the fact table. As you can see in below example, we have a particular prescription and that contains multiple products. In case of prescription #1, this prescription contains product #1, 2 & 7. This prescription with the particular product combinatin was prescribed to 200 patients.  Overall, i have 700 Patients from below dataset. For example, product #1, was presribed to 700 patients & Product #7, was prescribed to 200 patients. 

 

Goal: Create a measure for # of Patients with optimized performance. 

Note:

Technically, I am able to achive this, but it's taking around 5 mins to return the results which is unacceptable. Below implementation works, but takes lot of time and this is on the server with huge configuration. 

 

SUMX(DISTINCT(FactTable[PrescriptionKey]),FIRSTNONBLANK(FactTable[NoOfPatient],0))​

 

 

I am aware that we can break up these tables and create multiple fact tables, but then the Prescription Key records are in around 50-60 million and product is around 1 billion. So, in that case as well, the performance is not good.

 

 

 

PrescriptionKeyProductKeyNoOfPatientsQtyPrescribed
1120050
1220025
17200100
2150075
22500100
23500250
24500525
 Total7001125

 

Any help is appreciated.

 

Thanks

Amit

4 REPLIES 4
amitchandak
Super User
Super User

@AmitTiwari , I think this should be done like

 

sumx(Values(FactTable[PrescriptionKey]), max(FactTable[NoOfPatient])) // or use Min in place of Max

or
sumx(summarize(FactTable,FactTable[PrescriptionKey],"_1", max(FactTable[NoOfPatient])),[_1])

Hi @amitchandak

 

Thanks for the quick response. Below dax does not give correct results. It takes the max/min of all the result sets and sums it up. Gives me a result of 1000 in case of MAX and 400 in case of MIN. 

sumx(Values(FactTable[PrescriptionKey]), max(FactTable[NoOfPatient])) 

 

This one is working correctly, but for my prod data, it's taking around 485 seconds to complete. 

sumx(summarize(FactTable,FactTable[PrescriptionKey],"_1", max(FactTable[NoOfPatient])),[_1])

 

AmitTiwari_0-1599466511448.png

 

 

 

Is there any way we can get faster results? 

Hi, @AmitTiwari , you might try these measures out,

 

SUMX (
    VALUES ( FactTable[PrescriptionKey] ),
    CALCULATE ( MAX ( FactTable[NoOfPatients] ) )
)

 

and

SUMX (
    SUMMARIZECOLUMNS (
        FactTable[PrescriptionKey],
        "col", MAX ( FactTable[NoOfPatients] )
    ),
    [col]
)

 


Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension!

DAX is simple, but NOT EASY!

@AmitTiwari , if there is guaranteed product key 1 always there then it would be the fastest 

 

calculate(sum(FactTable[NoOfPatient]) , FactTable[productkey]=1)

 

Or create a column they use in measure as filter , like above. This will distribute cost
if(minx(filter(FactTable,[PrescriptionKey] = earlier([PrescriptionKey])),[productkey]) = [productkey],1,0)

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors