Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
E_Hutch
Frequent Visitor

Pull Random 3% of Data

Good Morning!

 

I have a table of data that includes sales data, including the Customer Number and details of purchase. I would like to create a new table that pulls a random 3% (rounded up) of rows for each customer. So for example, if I have a table like this:

 

Customer IDOrder NumberOrder DateColor
1101/01/2022Red
3201/01/2022Red
3301/05/2022Blue
3401/06/2022Red
2501/08/2022Blue

 

Etc., and the table has 296 orders for Customer 1, 120 orders for customer 2, and 389 orders for customer 3, then I would like the new table to consist of the below number of randomly selected rows:

 

Customer 1: 9

Customer 2: 4

Customer 3: 12

 

Bonus points if I can also get an approximately equal amount of Red orders and Blue orders!

 

I assume this is pretty easy but I have been googling all day and cannot figure it out. 

 

Thank you in advance!!

1 ACCEPTED SOLUTION
E_Hutch
Frequent Visitor

Alright, I remember how I did this previously. Took me a while, but I knew it could be done. Foregoing the half Red and half Blue on my wishlist, the way to create a new table with a random n% of rows from an existing table is to create a measure within the existing table as follows (using 3% as an example)

 

 3% = CALCULATE(DISTINCTCOUNT('Table1'[Order Number])*.03)

 

Then create a new Table as using that measure:

 

3% = SAMPLE('Table1'[3%], 'Table1',0)

 

Then I can create a table visual based off of my new table, and add a slicer to select the Customer I want and it will pull a random 3% of that customer's orders. 

 

Thanks all hope this helps someone else. 

View solution in original post

4 REPLIES 4
E_Hutch
Frequent Visitor

Alright, I remember how I did this previously. Took me a while, but I knew it could be done. Foregoing the half Red and half Blue on my wishlist, the way to create a new table with a random n% of rows from an existing table is to create a measure within the existing table as follows (using 3% as an example)

 

 3% = CALCULATE(DISTINCTCOUNT('Table1'[Order Number])*.03)

 

Then create a new Table as using that measure:

 

3% = SAMPLE('Table1'[3%], 'Table1',0)

 

Then I can create a table visual based off of my new table, and add a slicer to select the Customer I want and it will pull a random 3% of that customer's orders. 

 

Thanks all hope this helps someone else. 

Hi - I am trying to do the same - so i have managed to follow some of this and adapt it to a different dataset with names and ID numbers. Mine is 10% - trouble is the rows retrieved is correct in terms of quantity, but the actual rows themselves don't randomise each time i refresh... is there a way that i can get them to be random? but 10% of the total number of rows of data available? 

Vera_33
Resident Rockstar
Resident Rockstar

Hi @E_Hutch 

 

One way is to add a random number column, sort by the column, then select top rows accordingly...it is done in Power Query, you can use DAX to do it as well, Power Pivot in Excel, the same logic

Vera_33_0-1644627953622.png

 

Table.FirstN( Table.SelectRows(Table, each [Customer ID]=1 and [Color]="Red"),4)
&Table.FirstN( Table.SelectRows(Table, each [Customer ID]=2 and [Color]="Red"),2)
&Table.FirstN( Table.SelectRows(Table, each [Customer ID]=3 and [Color]="Red"),6)
&Table.FirstN( Table.SelectRows(Table, each [Customer ID]=1 and [Color]="Blue"),5)
&Table.FirstN( Table.SelectRows(Table, each [Customer ID]=2 and [Color]="Blue"),2)
&Table.FirstN( Table.SelectRows(Table, each [Customer ID]=3 and [Color]="Blue"),6)

 

 

Hello! Thanks for your help, though I am not sure how feasible this option is. There are thousands of Customer ID's so I would not be able to enter them all in that way. Additionally, I don't think selecting top rows would work because I need to pull as close to 3% as possible which would differ each month due to change in volume. Thank you for your help though!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.