cancel
Showing results for
Did you mean:
Helper I

## Compare to reference group

hey power bi users,

i've got following issue with my power bi report:

What I'm going to do?

I want to compare one single supplier to a reference group consisting of 1...n different suppliers with same properties.

For Example:

Supplier 1:

column_customers = '50 - 100 customers'

column_region = germany

column_employees = '<100 employees'

So the reference group consists of all suppliers, which have the same properties (except supplier 1) as well:

column_customers = '50 - 100 customers'

column_region = germany

column_employees = '<100 employees'

Lets say, there are 200000 suppliers and 5 of them (No. 7, 100, 450, 1230, 1999) meet the requirements/properties of supplier 1.

Here is my dax formula:

number of suppliers in reference group =
VAR customers= max(supplier[column_customers])
VAR region = max(supplier[column_region])
VAR employees = max(supplier[column_employees])

VAR supplier = max(supplier[column_supplier_id])

RETURN CALCULATE(
distinctcount(supplier[column_supplier_id]); ALL(supplier[column_supplier_id]);supplier[column_customers]=customers;supplier[column_region]=region;supplier[column_employees] = employees; supplier[column_supplier_id] <> supplier)

So the result in a straight table looks like this:

|

 supplier_id number of suppliers in reference group 1 5 2 13 3 25 ... ... 200000 133

This one is already working, but pretty slow. Is there a way to optimize the formula? There are millions of rows in my data model, so i have to increase the performance.

Hopefully someone of you can help me 😄

1 ACCEPTED SOLUTION
Super User

Here is a measure you could try. Using some test data at my end, it performed significantly better than the original measure:

``````number of suppliers in reference group =
VAR FilteredSupplierKey =
VALUES ( supplier[column_supplier_id] )
VAR SimilarSupplierKey =
CALCULATETABLE (
VALUES ( supplier[column_supplier_id] );
ALL ( supplier[column_supplier_id] );
SUMMARIZE (
supplier;
supplier[column_customers];
supplier[column_region];
supplier[column_employees]
)
)
RETURN
COUNTROWS ( EXCEPT ( SimilarSupplierKey; FilteredSupplierKey ) )``````

This should follow the same logic as your original measure when filtered on a single Supplier ID.

This measure also handles cases where you have filtered on multiple Supplier IDs, by counting the union of "similar" Suppliers.

Does this perform better with your data, and still produce correct results?

Regards

Owen

Owen Auger

My Blog
3 REPLIES 3
Super User

Here is a measure you could try. Using some test data at my end, it performed significantly better than the original measure:

``````number of suppliers in reference group =
VAR FilteredSupplierKey =
VALUES ( supplier[column_supplier_id] )
VAR SimilarSupplierKey =
CALCULATETABLE (
VALUES ( supplier[column_supplier_id] );
ALL ( supplier[column_supplier_id] );
SUMMARIZE (
supplier;
supplier[column_customers];
supplier[column_region];
supplier[column_employees]
)
)
RETURN
COUNTROWS ( EXCEPT ( SimilarSupplierKey; FilteredSupplierKey ) )``````

This should follow the same logic as your original measure when filtered on a single Supplier ID.

This measure also handles cases where you have filtered on multiple Supplier IDs, by counting the union of "similar" Suppliers.

Does this perform better with your data, and still produce correct results?

Regards

Owen

Owen Auger

My Blog
Helper I

@OwenAugerThank you so much for your solution. This one is incredible fast compared to my solution. My load time has decreased from 30 seconds to 1 second...amazing 😉

regards

robert

Community Support

Create variables in a measure can lead to extra calculations, try put them into the return expression see if it goes faster:

``````number of suppliers in reference group =
CALCULATE(
distinctcount(supplier[column_supplier_id]); Filter(ALL(supplier[column_supplier_id]);
supplier[column_customers]=max(supplier[column_customers]);
supplier[column_region]=max(supplier[column_region]);
supplier[column_employees] = max(supplier[column_employees]);
supplier[column_supplier_id] <> max(supplier[column_supplier_id]))``````

However, there is no way to increase performance speed dramatically since measure is calculated at the time you interact. The other option is to change the measure to a calculated column, that would increase pbix size, it pre-calculates the formula at the time you open the pbix, so it would be faster when you interact on the visuals.

Paul Zheng _ Community Support Team
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Announcements

#### Launching new user group features

Learn how to create your own user groups today!