cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
bamba98
Helper I
Helper I

Create table with random samples based on conditions from another table

Hi,

 

I have a table consisiting of approx 50k rows and 15 columns. The file looks, for example, as followes:

IDPeriod IDCategory
12020 Q11
22020 Q21
32020 Q41
42020 Q12
52020 Q22
72020 Q42
92020 Q13
102021 Q11
122020 Q41
112021 Q13
62021 Q12
202020 Q12
182020 Q23
162021 Q13
302020 Q13
402020 Q42
352020 Q43
342021 Q13
382021 Q12
292020 Q22
212021 Q11
222020 Q31
332020 Q32
442020 Q33
82020 Q31
412020 Q23
422020 Q21

From this table, I would like to create random table. The random table must only contain data from the latest quarter and only one 1 row per category. The new table would look as follows:

IDPeriod IDCategory
122021 Q11
302021 Q13
42021 Q12

I found the following expression posted on this forum:

 

 

 

Random Selection = 
VAR SampleSize = 10
VAR TableWithRand = ADDCOLUMNS(Table,"Rand", RAND())
RETURN TOPN(SampleSize, TableWithRand, [Rand])

 

 

 

However, I want the samplesize to be the same size as the number of categories available during the latest quarter. For example:

 

 

 

SampleSize = calculate(distinctcount(Category), allexcept(Table, PeriodID))

 

 

 

How do I add this and that it only selects samples from the latest period to the previous code sample? 

 

 

1 ACCEPTED SOLUTION
RicoZhou
Community Support
Community Support

Hi @bamba98 

Please try summarize function to get "Period ID" and "Category".

Rand LAST Period = 
VAR _LastPeriod = MAX('Table'[Period ID])
RETURN
ADDCOLUMNS(SUMMARIZE(FILTER('Table','Table'[Period ID] = _LastPeriod),'Table'[Period ID],'Table'[Category]),"RAND",INT(RAND()*100))

Result is as below.

1.png

 

Best Regards,

Rico Zhou

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly. 

View solution in original post

2 REPLIES 2
bamba98
Helper I
Helper I

I've added the following for selecting the correct number of samples:

 

 

Random Selection = 
VAR SampleSize = CALCULATE(DISTINCTCOUNT(Table[Category]), ALLEXCEPT(Table, Table[Period]),FILTER(ALL(Table), Table[Latest Period]=Table[Period]))
VAR TableWithRand = ADDCOLUMNS(Table,"Rand", RAND())
RETURN TOPN(SampleSize, TableWithRand, [Rand])

 

Where:

Latest Period = MAX(Table[Period])

This only gets me the correct number of samples to select. However, it still selects samples from different periods and same categories. 

RicoZhou
Community Support
Community Support

Hi @bamba98 

Please try summarize function to get "Period ID" and "Category".

Rand LAST Period = 
VAR _LastPeriod = MAX('Table'[Period ID])
RETURN
ADDCOLUMNS(SUMMARIZE(FILTER('Table','Table'[Period ID] = _LastPeriod),'Table'[Period ID],'Table'[Category]),"RAND",INT(RAND()*100))

Result is as below.

1.png

 

Best Regards,

Rico Zhou

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly. 

View solution in original post

Helpful resources

Announcements
PBI_User Group Leader_768x460.jpg

Manage your user group events

Check out the News & Announcements to learn more.

Get Ready for Power BI Dev Camp

Microsoft named a Leader in The Forrester Wave

Microsoft received the highest score of any vendor in both the strategy and current offering categories.

Get Ready for Power BI Dev Camp

Power BI Dev Camp - September 30th, 2021

Mark your calendars and join us for our next Power BI Dev Camp!

PowerPlatform 768x460.png

Microsoft Learn

Check out our new Discover Your Career Path blog post series and get all the details.

Top Solution Authors