Hi,
Could someone help solve this problem please?
Example fTable
ID | Store | Category | Date Opening | Date Closing
---|---|---|---|---
1 | 1 | A | 05/08/2019 | 10/08/2019
2 | 2 | A | 05/09/2019 | 11/09/2019
3 | 2 | B | 10/09/2019 | 11/09/2019
4 | 4 | C | 20/09/2019 | 25/09/2019
5 | 5 | D | 04/10/2019 | 05/10/2019
6 | 7 | D | 03/11/2019 | 04/12/2019
7 | 7 | E | 04/12/2019 | 07/12/2019
8 | 8 | E | 01/02/2020 | 08/01/2020
9 | 9 | E | 02/01/2020 | 03/01/2020
Then I tried two approaches to extract the distinct values of fTable[Category] by day (between opening and closing) and by store.
I created a calculated table to visualize it easily:
Extract Table =
ADDCOLUMNS (
    SUMMARIZE (
        fTable;
        dCalendar[Date];
        dStores[StoreID] // contains the store list
    );
    "CategoryCount";
    CALCULATE (
        DISTINCTCOUNT ( fTable[Category] );
        FILTER (
            ALL ( fTable );
            fTable[Date Opening] >= MIN ( dCalendar[Date] )
                && fTable[Date Closing] < MAX ( dCalendar[Date] )
        );
        FILTER ( VALUES ( fTable[Store] ); fTable[Store] = dStores[StoreID] )
    )
)
The problem with Extract Table is that, for each StoreID, it repeats the maximum distinct count found:
StoreID | Date | CategoryCount
---|---|---
1 | 30/12/2019 | 2
1 | 31/12/2019 | 2
1 | 01/01/2020 | 2
1 | 02/01/2020 | 2
1 | 03/01/2020 | 2
2 | 30/12/2019 | 1
2 | 31/12/2019 | 1
2 | 01/01/2020 | 1
2 | 02/01/2020 | 1
2 | 03/01/2020 | 1
4 | 30/12/2019 | 1
4 | 31/12/2019 | 1
4 | 01/01/2020 | 1
4 | 02/01/2020 | 1
4 | 03/01/2020 | 1
After having no success with that, I tried another way to extract the values:
Extract Table =
ADDCOLUMNS (
    SUMMARIZE (
        fTable;
        dCalendar[Date];
        dStores[StoreID] // contains the store list
    );
    "CategoryCount";
    CALCULATE (
        DISTINCTCOUNT ( fTable[Category] );
        FILTER ( VALUES ( fTable[Date Opening] ); fTable[Date Opening] <= MAX ( dCalendar[Date] ) );
        FILTER ( VALUES ( fTable[Date Closing] ); fTable[Date Closing] > MIN ( dCalendar[Date] ) );
        FILTER ( VALUES ( fTable[Store] ); fTable[Store] = dStores[StoreID] ) // tables are related by StoreID
    )
)
Here is the output of that table. The problem is that it only takes the opening date into consideration:
StoreID | Date | CategoryCount
---|---|---
1 | 30/12/2019 | 0
1 | 31/12/2019 | 1
1 | 01/01/2020 | 1
1 | 02/01/2020 | 0
1 | 03/01/2020 | 0
2 | 30/12/2019 | 0
2 | 31/12/2019 | 1
2 | 01/01/2020 | 1
2 | 02/01/2020 | 0
2 | 03/01/2020 | 0
4 | 30/12/2019 | 0
4 | 31/12/2019 | 0
4 | 01/01/2020 | 0
4 | 02/01/2020 | 1
4 | 03/01/2020 | 0
The ideal output should look like this, counting the distinct categories each store already has open on that exact day:
StoreID | Date | CategoryCount
---|---|---
1 | 30/12/2019 | 0
1 | 31/12/2019 | 1
1 | 01/01/2020 | 2
1 | 02/01/2020 | 2
1 | 03/01/2020 | 0
2 | 30/12/2019 | 0
2 | 31/12/2019 | 1
2 | 01/01/2020 | 1
2 | 02/01/2020 | 1
2 | 03/01/2020 | 0
4 | 30/12/2019 | 0
4 | 31/12/2019 | 0
4 | 01/01/2020 | 0
4 | 02/01/2020 | 1
4 | 03/01/2020 | 0
In this case I need an average measure by day and store, which is why I'm trying to create a table. If there is a single measure that does this, it would solve my problem too.
Thanks in advance
Hi @Anonymous,
For your case, I think you have created a relationship between the fTable and dCalendar tables on the Date Opening field, so the calculation only takes the opening date into consideration.
You may try the approach below. For the start/end date problem, you need a separate date table.
Then create a table with this logic:
Extract Table2 =
ADDCOLUMNS (
    GENERATE ( dStores, dCalendar ),
    "CategoryCount", CALCULATE (
        DISTINCTCOUNT ( fTable[Category] ),
        FILTER (
            fTable,
            fTable[Date Opening] <= dCalendar[Date]
                && fTable[Date Closing] > dCalendar[Date]
        ),
        FILTER ( VALUES ( fTable[Store] ), fTable[Store] = dStores[StoreID] )
    )
)
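With a table like this in place, the per-day, per-store average the original question asks for could then be a simple measure. A minimal sketch, assuming the calculated table above is named 'Extract Table2' (adjust the names to your model):

```
Average CategoryCount =
AVERAGEX (
    'Extract Table2',
    'Extract Table2'[CategoryCount]
)
```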
Here is a similar post you could refer to.
If that is not your case, please share a simple sample pbix file and your expected output; I could not work out the logic from the sample data and expected output above.
Regards,
Lin
@v-lili6-msft Thank you! The solution gives the right result with a small dataset.
But now I'm facing another issue: bad performance. dStores has more than 600 rows, dCalendar spans 3 entire years, and fTable has more than 20 thousand records, causing a pretty long delay.
Is there an easy way to avoid this, or is that an intrinsic problem of creating a table?
Hi @Anonymous,
Yes, it iterates over the cross join of the two tables, so for a big dataset performance will be bad.
Maybe we could use a measure instead.
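Such a measure might look like the sketch below. This is an untested assumption based on the column names in this thread: it expects dCalendar[Date] and dStores[StoreID] to be placed on the visual, so MAX and SELECTEDVALUE pick up the current row:

```
CategoryCount Measure =
VAR CurrentDate = MAX ( dCalendar[Date] )
VAR CurrentStore = SELECTEDVALUE ( dStores[StoreID] )
RETURN
    CALCULATE (
        DISTINCTCOUNT ( fTable[Category] ),
        FILTER (
            ALL ( fTable ),
            fTable[Store] = CurrentStore
                && fTable[Date Opening] <= CurrentDate
                && fTable[Date Closing] > CurrentDate
        )
    )
```

This avoids materializing the full stores-by-dates table and only computes the store/date combinations that actually appear in the visual.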
Please share a sample pbix file and your expected output with the main columns you want to display.
Regards,
Lin