Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Market Basket Analysis - finding the revenue from clients who bought the product combo

Hi everyone,

 

Appreciate if anyone could help me on this.

 

I'm currently doing a project on market basket analysis (also called product association analysis).

 

The purpose is to find combinations of products that clients buy and that we can potentially cross-sell to future customers. For example, from our data, we may have seen that there were a number of clients buying milk (product A) and cookies (product B) from us. This would mean that we can cross-sell cookies to clients who buy only milk from us in the future.

 

In usual cases, this analysis would be done by finding the NUMBER of clients who buy both products (reference: here).

E.g. 20 clients have bought milk, 13 clients have bought cookies, intersecting them = 7 clients bought both products, etc.

 

However, in my case, I want to find the revenue received from the clients who have bought both products.

E.g. 20 clients have spent $20 buying milk, 13 clients have spent $13 buying cookies, intersecting them = 7 clients have spent $14 buying both products.

 

My dataset is structured such that it's by client & transactions and only 1 product per transaction.

Please see below for the simplified sample. (Of course there are more than 2 products in my data source.)

ClientTransactionNoProductRevenue
AAA1Milk20
BBB2Cookies10
BBB3Milk16
CCC4Cookies8
DDD5Cookies6
DDD6Milk24

 

I have tried to adapt the formulas and analysis guide from here and here.

I extracted the product column from the data table into 2 queries, 'product dim row' for the matrix row and 'product dim col' for the matrix column.

 

Screenshot 2021-01-21 171616.png

With the measure below, I could replicate this table above:

PurchasedBoth =

VAR productrow = VALUES(data[Client])

VAR productcol = CALCULATETABLE(VALUES(data[Client]),ALL(data), TREATAS(VALUES('Product dim column'[Product]),data[Product]))

RETURN

COUNTROWS(INTERSECT(productrow, productcol)))

 

However, I am so far unable to get the revenue received. The measure I have used is below (a bunch of gibberish text because I kept edting and testing):

RevPurchasedBoth =

VAR productrow = SUMX(SUMMARIZE(data, data[Client], data[Product]),CALCULATE(SUM(data[Revenue])))

VAR productcol = SUMX(SUMMARIZE(data, data[Client], data[Product]),CALCULATE(SUM(data[Revenue])))

// TREATAS(VALUES('Product dim col'[Product]),data[Product]))

VAR tow1 = VALUES('Product dim row'[Product])

VAR tow2 = VALUES('Product dim col'[Product])

RETURN CALCULATE(productrow+productcol, INTERSECT(tow1,tow2))

//CALCULATE(SUM(data[Revenue]),FILTER(ALL(data), CONTAINS(VALUES('Product dim row'[Product]), 'Product dim row'[Product], data[Product])), FILTER(ALL(data), CONTAINS(VALUES('Product dim col'[Product]), 'Product dim col'[Product]], data[Product])))))

 

This is what I have gotten:
Screenshot 2021-01-21 170431.png

 

Appreciate any PBI guru's input on where I've gone wrong or pointing me to the right resource/direction!! Thank you very much.
 

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Update2:

I got the values I needed!

 
RevPurchaseBoth =
VAR productrow= VALUES(data[Client])
VAR productcol = CALCULATETABLE(VALUES(data[Client]), ALL('Product dim row'),
TREATAS(VALUES('Product dim col'[Product]),data[Product]))
RETURN
CALCULATE(data[revrow]+data[revcol], INTERSECT(productrow,productcol))

 

revrow = SUM(data[Revenue])

revcol = CALCULATE(data[revrow],ALL('Product dim row'), TREATAS(VALUES('Product dim col'[Product]),data[Product]))

 

Honestly, I don't know why it works now when I created 2 separate measures (revrow and revcol). It didn't work when I put them as VARs in RevPurchasedBoth measure. Appreciate it if anyone could explain!

View solution in original post

2 REPLIES 2
Anonymous
Not applicable

Update:

 

I'm so close to getting the correct values!!

I have edited my measure to this:

RevPurchaseBoth =
VAR productrow= VALUES(data[Client])
VAR productcol = CALCULATETABLE(VALUES(data[Client]), ALL('Product dim row'),
TREATAS(VALUES('Product dim col'[Product]),data[Product]))

RETURN
CALCULATE(SUM(data[Revenue]), INTERSECT(productrow,productcol))
 
The cells are showing up correctly (blank where it's supposed to be blank, filled where it's supposed to be filled), but the values in the cells are wrong. They are only showing the revenue of the productrow now. What I need is the sum of revenue from both productrow and productcol.
 
Does anyone know how to resolve this error?
Anonymous
Not applicable

Update2:

I got the values I needed!

 
RevPurchaseBoth =
VAR productrow= VALUES(data[Client])
VAR productcol = CALCULATETABLE(VALUES(data[Client]), ALL('Product dim row'),
TREATAS(VALUES('Product dim col'[Product]),data[Product]))
RETURN
CALCULATE(data[revrow]+data[revcol], INTERSECT(productrow,productcol))

 

revrow = SUM(data[Revenue])

revcol = CALCULATE(data[revrow],ALL('Product dim row'), TREATAS(VALUES('Product dim col'[Product]),data[Product]))

 

Honestly, I don't know why it works now when I created 2 separate measures (revrow and revcol). It didn't work when I put them as VARs in RevPurchasedBoth measure. Appreciate it if anyone could explain!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.