I have a table, "Table", with an ID column and three other columns, "A", "B" and "C".
Will the following produce a table with a new column of random numbers that are evenly distributed within each partition of A, B and C?
I'm looking to produce a random sample for each partition of A, B and C. I use RAND() to return a random number and then select every row whose random number is below a certain value, to get a proportion of each partition. This relies on the random numbers for each partition being evenly distributed, for example with no skew towards the upper or lower end of the range of RAND(), which produces a number between 0 and 1. Assuming each partition contains many data points, can I use WINDOW and RAND() in the described way to get an evenly distributed set of random numbers for each partition?
Thank you for responding.
Yes, as long as your assumption holds true. The usual disclaimer applies: RAND is not truly random.
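To illustrate the thresholding idea described above outside of DAX, here is a minimal Python sketch. It is not the poster's measure and does not use WINDOW or RAND(); it only assumes that every row gets an independent uniform draw between 0 and 1. The table contents, the `sample_by_threshold` and `kept_fraction_per_partition` helpers, and the 10% proportion are all hypothetical.

```python
import random
from collections import defaultdict


def sample_by_threshold(rows, proportion, seed=0):
    """Keep each row whose independent uniform draw falls below `proportion`."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    return [row for row in rows if rng.random() < proportion]


def kept_fraction_per_partition(rows, kept, key):
    """Return, for each partition, the share of its rows that were kept."""
    total = defaultdict(int)
    selected = defaultdict(int)
    for row in rows:
        total[key(row)] += 1
    for row in kept:
        selected[key(row)] += 1
    return {part: selected[part] / total[part] for part in total}


# Hypothetical table: 60,000 rows falling into 6 partitions of (A, B, C).
rows = [{"ID": i, "A": i % 2, "B": i % 3, "C": 0} for i in range(60_000)]

kept = sample_by_threshold(rows, proportion=0.1)
fractions = kept_fraction_per_partition(
    rows, kept, key=lambda r: (r["A"], r["B"], r["C"])
)

for partition, fraction in sorted(fractions.items()):
    print(partition, round(fraction, 4))
```

Because the draws are independent of A, B and C, each partition is sampled at roughly the same rate without drawing the numbers partition by partition. Whether RAND() behaves this way inside a particular DAX expression is something to verify in the model itself.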
Please explain this part
random numbers that are evenly distributed within each partition of A, B and C
You can only get an "evenly" distributed result with large amounts of data.
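A rough way to see why the amount of data matters is to simulate partitions of different sizes and compare the share of rows kept with the target proportion. This is a Python sketch under the assumption of independent uniform draws, not a statement about how RAND() is implemented; the `kept_fraction` helper and the sizes are hypothetical.

```python
import random


def kept_fraction(n, proportion, rng):
    """Share of n independent uniform draws that fall below `proportion`."""
    return sum(rng.random() < proportion for _ in range(n)) / n


rng = random.Random(42)  # seeded only so the sketch is repeatable
target = 0.1

for n in (10, 1_000, 100_000):
    observed = kept_fraction(n, target, rng)
    print(f"partition size {n}: kept {observed:.4f}, target {target}")
```

For independent draws, the standard deviation of the kept share is sqrt(p(1-p)/n), so it shrinks as the partition grows. A partition with only a handful of rows can land far from the target purely by chance.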