Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hi,
I have a table consisiting of approx 50k rows and 15 columns. The file looks, for example, as followes:
ID | Period ID | Category |
1 | 2020 Q1 | 1 |
2 | 2020 Q2 | 1 |
3 | 2020 Q4 | 1 |
4 | 2020 Q1 | 2 |
5 | 2020 Q2 | 2 |
7 | 2020 Q4 | 2 |
9 | 2020 Q1 | 3 |
10 | 2021 Q1 | 1 |
12 | 2020 Q4 | 1 |
11 | 2021 Q1 | 3 |
6 | 2021 Q1 | 2 |
20 | 2020 Q1 | 2 |
18 | 2020 Q2 | 3 |
16 | 2021 Q1 | 3 |
30 | 2020 Q1 | 3 |
40 | 2020 Q4 | 2 |
35 | 2020 Q4 | 3 |
34 | 2021 Q1 | 3 |
38 | 2021 Q1 | 2 |
29 | 2020 Q2 | 2 |
21 | 2021 Q1 | 1 |
22 | 2020 Q3 | 1 |
33 | 2020 Q3 | 2 |
44 | 2020 Q3 | 3 |
8 | 2020 Q3 | 1 |
41 | 2020 Q2 | 3 |
42 | 2020 Q2 | 1 |
From this table, I would like to create random table. The random table must only contain data from the latest quarter and only one 1 row per category. The new table would look as follows:
ID | Period ID | Category |
12 | 2021 Q1 | 1 |
30 | 2021 Q1 | 3 |
4 | 2021 Q1 | 2 |
I found the following expression posted on this forum:
Random Selection =
VAR SampleSize = 10
VAR TableWithRand = ADDCOLUMNS(Table,"Rand", RAND())
RETURN TOPN(SampleSize, TableWithRand, [Rand])
However, I want the samplesize to be the same size as the number of categories available during the latest quarter. For example:
SampleSize = calculate(distinctcount(Category), allexcept(Table, PeriodID))
How do I add this and that it only selects samples from the latest period to the previous code sample?
Solved! Go to Solution.
Hi @bamba98
Please try summarize function to get "Period ID" and "Category".
Rand LAST Period =
VAR _LastPeriod = MAX('Table'[Period ID])
RETURN
ADDCOLUMNS(SUMMARIZE(FILTER('Table','Table'[Period ID] = _LastPeriod),'Table'[Period ID],'Table'[Category]),"RAND",INT(RAND()*100))
Result is as below.
Best Regards,
Rico Zhou
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
I've added the following for selecting the correct number of samples:
Random Selection =
VAR SampleSize = CALCULATE(DISTINCTCOUNT(Table[Category]), ALLEXCEPT(Table, Table[Period]),FILTER(ALL(Table), Table[Latest Period]=Table[Period]))
VAR TableWithRand = ADDCOLUMNS(Table,"Rand", RAND())
RETURN TOPN(SampleSize, TableWithRand, [Rand])
Where:
Latest Period = MAX(Table[Period])
This only gets me the correct number of samples to select. However, it still selects samples from different periods and same categories.
Hi @bamba98
Please try summarize function to get "Period ID" and "Category".
Rand LAST Period =
VAR _LastPeriod = MAX('Table'[Period ID])
RETURN
ADDCOLUMNS(SUMMARIZE(FILTER('Table','Table'[Period ID] = _LastPeriod),'Table'[Period ID],'Table'[Category]),"RAND",INT(RAND()*100))
Result is as below.
Best Regards,
Rico Zhou
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
User | Count |
---|---|
101 | |
90 | |
78 | |
70 | |
69 |
User | Count |
---|---|
110 | |
96 | |
95 | |
74 | |
71 |