Hi,
Could someone help solve this problem please?
Example fTable
ID | Store | Category | Date Opening | Date Closing
---|---|---|---|---
1 | 1 | A | 05/08/2019 | 10/08/2019
2 | 2 | A | 05/09/2019 | 11/09/2019
3 | 2 | B | 10/09/2019 | 11/09/2019
4 | 4 | C | 20/09/2019 | 25/09/2019
5 | 5 | D | 04/10/2019 | 05/10/2019
6 | 7 | D | 03/11/2019 | 04/12/2019
7 | 7 | E | 04/12/2019 | 07/12/2019
8 | 8 | E | 01/02/2020 | 08/01/2020
9 | 9 | E | 02/01/2020 | 03/01/2020
Then I tried two approaches to extract the distinct values of fTable[Category] by day (between opening and closing) and by store.
I created a calculated table to visualize it easily:
Extract Table =
ADDCOLUMNS (
    SUMMARIZE (
        fTable;
        dCalendar[Date];
        dStores[StoreID] // contains the store list
    );
    "CategoryCount";
    CALCULATE (
        DISTINCTCOUNT ( fTable[Category] );
        FILTER (
            ALL ( fTable );
            fTable[Date Opening] >= MIN ( dCalendar[Date] )
                && fTable[Date Closing] < MAX ( dCalendar[Date] )
        );
        FILTER ( VALUES ( fTable[Store] ); fTable[Store] = dStores[StoreID] )
    )
)
The problem with Extract Table is that, for each StoreID, it repeats the maximum distinct count found:
StoreID | Date | CategoryCount
---|---|---
1 | 30/12/2019 | 2
1 | 31/12/2019 | 2
1 | 01/01/2020 | 2
1 | 02/01/2020 | 2
1 | 03/01/2020 | 2
2 | 30/12/2019 | 1
2 | 31/12/2019 | 1
2 | 01/01/2020 | 1
2 | 02/01/2020 | 1
2 | 03/01/2020 | 1
4 | 30/12/2019 | 1
4 | 31/12/2019 | 1
4 | 01/01/2020 | 1
4 | 02/01/2020 | 1
4 | 03/01/2020 | 1
After having no success with that, I tried another way to extract the values:
Extract Table =
ADDCOLUMNS (
    SUMMARIZE (
        fTable;
        dCalendar[Date];
        dStores[StoreID] // contains the store list
    );
    "CategoryCount";
    CALCULATE (
        DISTINCTCOUNT ( fTable[Category] );
        FILTER ( VALUES ( fTable[Date Opening] ); fTable[Date Opening] <= MAX ( dCalendar[Date] ) );
        FILTER ( VALUES ( fTable[Date Closing] ); fTable[Date Closing] > MIN ( dCalendar[Date] ) );
        FILTER ( VALUES ( fTable[Store] ); fTable[Store] = dStores[StoreID] ) // tables are related by StoreID
    )
)
Here is the output of that table. The problem is that it only takes the opening date into consideration:
StoreID | Date | CategoryCount
---|---|---
1 | 30/12/2019 | 0
1 | 31/12/2019 | 1
1 | 01/01/2020 | 1
1 | 02/01/2020 | 0
1 | 03/01/2020 | 0
2 | 30/12/2019 | 0
2 | 31/12/2019 | 1
2 | 01/01/2020 | 1
2 | 02/01/2020 | 0
2 | 03/01/2020 | 0
4 | 30/12/2019 | 0
4 | 31/12/2019 | 0
4 | 01/01/2020 | 0
4 | 02/01/2020 | 1
4 | 03/01/2020 | 0
The ideal output should look like this, counting the distinct categories each store already has open on that exact day:
StoreID | Date | CategoryCount
---|---|---
1 | 30/12/2019 | 0
1 | 31/12/2019 | 1
1 | 01/01/2020 | 2
1 | 02/01/2020 | 2
1 | 03/01/2020 | 0
2 | 30/12/2019 | 0
2 | 31/12/2019 | 1
2 | 01/01/2020 | 1
2 | 02/01/2020 | 1
2 | 03/01/2020 | 0
4 | 30/12/2019 | 0
4 | 31/12/2019 | 0
4 | 01/01/2020 | 0
4 | 02/01/2020 | 1
4 | 03/01/2020 | 0
In this case I need an average measure by day and store, which is why I'm trying to create a table. If there is a single measure that does this, it would solve my problem too.
Thanks in advance
Hi @Anonymous,
For your case, I think you have created a relationship between the fTable and dCalendar tables on the Date Opening field, so the calculation only takes the opening date into consideration.
You may try the approach below. For the start/end date problem, you need a separate date table.
Then create a table with this logic:
Extract Table2 =
ADDCOLUMNS (
    GENERATE ( dStores, dCalendar ),
    "CategoryCount", CALCULATE (
        DISTINCTCOUNT ( fTable[Category] ),
        FILTER (
            fTable,
            fTable[Date Opening] <= dCalendar[Date]
                && fTable[Date Closing] > dCalendar[Date]
        ),
        FILTER ( VALUES ( fTable[Store] ), fTable[Store] = dStores[StoreID] )
    )
)
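With a table like this in place, the per-day, per-store average the original question asks for could then be a simple measure. A minimal sketch, assuming the calculated table above is named 'Extract Table2' (adjust the names to your model):

```
Average CategoryCount =
AVERAGEX (
    'Extract Table2',
    'Extract Table2'[CategoryCount]
)
```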
Here is a similar post you could refer to.
If that is not your case, please share a simple sample pbix file and your expected output; I could not work out the logic from the sample data and expected output above.
Regards,
Lin
@v-lili6-msft Thank you! The solution gives the right result with a small dataset.
But now I'm facing another issue: bad performance. dStores has more than 600 rows, dCalendar spans 3 entire years, and fTable has more than 20 thousand records, causing a pretty long delay.
Is there an easy way to avoid this, or is that an intrinsic problem of creating a table?
Hi @Anonymous,
Yes, it iterates over the cross join of the two tables, so for a big dataset performance will be bad.
Maybe we could use a measure instead.
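Such a measure might look like the sketch below. This is an untested assumption based on the column names in this thread: it expects dCalendar[Date] and dStores[StoreID] to be placed on the visual, so MAX and SELECTEDVALUE pick up the current row:

```
CategoryCount Measure =
VAR CurrentDate = MAX ( dCalendar[Date] )
VAR CurrentStore = SELECTEDVALUE ( dStores[StoreID] )
RETURN
    CALCULATE (
        DISTINCTCOUNT ( fTable[Category] ),
        FILTER (
            ALL ( fTable ),
            fTable[Store] = CurrentStore
                && fTable[Date Opening] <= CurrentDate
                && fTable[Date Closing] > CurrentDate
        )
    )
```

This avoids materializing the full stores-by-dates table and only computes the store/date combinations that actually appear in the visual.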
Please share a sample pbix file and your expected output with the main columns you want to display.
Regards,
Lin