Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi,
I need some advice on how to properly model / calculate the product attach rate, for the following use case:
A retailer sells computers and insurances for the computers. Each category of computers (laptops or desktops) have a specific insurance product (i.e. "laptop insurance" and "desktop insurance". The retailer wants to calculate the number of sold insurances versus the number of sold computers in each category.
Sales data for both computers and insurance is stored in a single fact table, where each row is a line item:
The product dimension looks like this:
The category dimension includes the product id of the related insurance product:
The expected output is a table like this:
Category name | Insurance product | Product sales | Insurance sales | Attach rate |
Laptops | Laptop insurance | 4 | 2 | 50% |
Desktops | Desktop insurance | 1 | 0 | 0% |
How would you achieve this, in terms of modeling and measures? I really appreciate any help, as I am completely stuck!
I am open to changing the data structure, if that makes it easier. I.e. moving the InsuranceProductID to the product dimension and somekind of lookup/self-relation.
Here is PBIX file with the sample data: https://drive.google.com/file/d/1t_VD2HyPORUTfWN4HwvRfhycMGmVzTUD/view?usp=sharing
Solved! Go to Solution.
@Anonymous @daxer-almighty
I was able to solve this by creating a new dimension for insurance products:
There are two different (both inactive, since USERELATIONSHIP() won't work unless they are both inactive for some reason) realtionships between the insurance dimension and product dimension:
- CategoryID -> CategoryID, to calculate the number of compatible computers sold per insurance product
- InsuranceProductID -> ProductID, to calculate the number of insurances sold
The measures then become:
Which produce the expected result, while keeping all sales in a single FACT table:
The only problem with this setup is maintenance and understanding. If you take a look at my model, things are straightforward and very flexible (not to mention will be fast no matter how many rows the fact table has). Your model can decay very quickly over time and is overly complex. Later, please don't tell me I did not warn you!
As usual, the old rule stands: The fact that you can do something, does not mean you should.
@Anonymous @daxer-almighty
I was able to solve this by creating a new dimension for insurance products:
There are two different (both inactive, since USERELATIONSHIP() won't work unless they are both inactive for some reason) realtionships between the insurance dimension and product dimension:
- CategoryID -> CategoryID, to calculate the number of compatible computers sold per insurance product
- InsuranceProductID -> ProductID, to calculate the number of insurances sold
The measures then become:
Which produce the expected result, while keeping all sales in a single FACT table:
I suggest you use what I created because it's the correct way to model the data. Yours will not suffice. If you have one line per item, even if there are multiple same items (some of them with insurance, others without), then it's even easier. You just cannot store data the way you do because you won't be able to get what you want. My model gets you what you want practically without any effort.
By the way, you can reshape your data using Power Query to look like mine. This is no problem at all.
Here's a file to get you started. I didn't know all the details so I had to invent some of the rules. You need to understand as well that since one has to know which insurance belongs to which product on each individual invoice, the products that are associated with insurances have to have entries in the Sales table where their quantity is 1. This is because the table keeps associations between products and insurances as well as products and invoices. If, on the other hand, you've got many copies of the same product on the same invoice and they are not insured, then you can aggregate them into one entry in the table. This is why there's a field for the quantity of products but there's no such field for insurance products (each line means exactly one insurance product). Note as well how the measures are coded. The quantity of products uses SUM over the quantitiy field but the quantity of insurance products uses COUNTROWS. It's important to understand this and the reason behind it.
Of course, you can have different rules but then you'll have to change the measures accordingly.
Hi @daxer-almighty !
Thank you so much for the file! It is helpful, but it seems the logic is a bit different than how my actual data is. In my Sales table, both computers (and other products) and insurances are separate line items, like this (I have used names instead of id's here for clarity):
Caluclating the overall ratio of insurances sold to computers is straight forward:
The problem arise when I need to have a measure that can work on the category dimension, i.e. calculating the attach rate per category of computers, because the "Insurances sold" measure will not have the relation between insurance product id and category id.
I might be using the term attach rate wrong here - I am actually after the ratio of sold insurances versus compatible computers sold. Sorry if that was confusing. So it is actually not necessary to associated each insurance sold with the correct computer line item, only to have a relation on the category level.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
109 | |
99 | |
77 | |
66 | |
54 |
User | Count |
---|---|
144 | |
104 | |
102 | |
87 | |
64 |