Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
micky123
Frequent Visitor

Percentile table like Excel

Hi everyone,

 

I am trying to create a percentile table in Powerbi but having trouble doing that.

 

Below screenshot shows the Excel version:

 

percentile.PNG

 

From this we can see that my percentile jumps from increaing by 0.01 at 0.05 to 0.05 at 0.10.

 

So I created a "Enter Data" table in PowerBi and pasted this column values:

percentile_powerbi.PNG

 

This allowed me to create a table in PowerBi with the percentile values I wanted to use but I encountered a problem after this step.

 

I have 2 tables,

stats table - contains Total Sale & Cost

size table - where I have size of place

These 2 tables have relationship.

 

Now I don't know how to use PERCENTILE.INC to create a table as per first screenshot.

I hope my description is clear, if not please let me know.

 

Thanks for your time.

 

Kind Regards,

 

Rahul

1 ACCEPTED SOLUTION

@McCow Thanks for explination, I have tried to follow the same steps for prod data and there seems to be issue.
Could you point out what I am doing wrong please? Here's the updated pbi file: https://1drv.ms/u/s!Aghc93erq8rBcUhH5N_y1X_MXN8

I have created 3 new tables exactly in the same structure:
size_prod, stats_prod, Persentille_prod

 

Capture.PNG
Thanks again for your time, patience. I am learning alot from you!

 

Kind Regards

View solution in original post

16 REPLIES 16
AlanPhan
Frequent Visitor

@micky123 , @McCow ,
I can understand how to calculate Percentile as solution. But now I want to calculate Percentile and group by [size_id] in the table stats_prod. How could it possible? 
Any your help will be appriciated.

Greg_Deckler
Super User
Super User

Nope, sorry, not clear. What is the problem you encountered? What is the final result you are looking to achieve? Can you post sample data from your second table? How are the tables related?


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Hi @Greg_Deckler,

 

The final result I am looking to achieve is like this:
percentile.PNG

 

 

Table 1

DateIDSalesCost
01/01/201740505114357
02/01/201772878901416
03/01/2017363704633810
04/01/2017410067402805
05/01/2017579049873220
06/01/2017456321572348
07/01/2017338438562135
08/01/2017714005372159
09/01/2017747172891483
10/01/2017777479731922
11/01/2017602398302107
12/01/2017650546333405
13/01/2017449393261584
14/01/2017717896551328
15/01/2017407045583636
16/01/2017688046683165
17/01/20174113799138
18/01/201747474759516
19/01/2017462447481534
20/01/2017267077973943
21/01/2017455124271924
22/01/2017742766392995
23/01/201775824605390
24/01/2017473634953676
25/01/2017724706853021
26/01/201748808942736
27/01/201732450661238
28/01/201748837213921
29/01/201746723857378
30/01/2017565037762414
31/01/2017505555303044

 

Table 2

IDSize
40505980
728787600
36370420
4100619000
579042100
45632610
338432600
71400420
23441150
234715800
234784500
234883100
23554800
2356812000
23576950
235862100
236077300
2366290
23678200
237021200
237155500
23751430
237642900
239413800
24013520
24224620
2422711000
242441800
24318960
24333120000
24334370
24456530
244872500
24498410

 

 

Relationship is on ID columns.

 

Thanks for your time.

 

Kind Regards.

I have done some research and found out that this is the dax I need to use to calculate percentile but not sure how to adapt it for each percentile group: = PERCENTILEX.INC(<table>, <expression>;, k)

 

Does anyone know how I can have a dynamic value for "k"?

 

Thanks

Hi @micky123,

 

to build a column like your example, you can create DAX table (for ex. PersentilleT):

PersentilleT = UNION(GENERATESERIES(0;0,05;0,01);GENERATESERIES(0,1;0,9;0,05);GENERATESERIES(0,91;1;0,01))

Now you have one-column table with "Value" column.


And as second calculated column you can use this formula:

Column = PERCENTILEX.INC(Table1;Table1[Cost];[Value])

 It's all as you need?

P.S. for dynamic k-value you need to build a measure. It' is a little complicated approach.

 

Best Regs

@McCow Thanks this was helpful and I think I am getting close to what I am trying to do.

I am unable to get this work 

Column = PERCENTILEX.INC(Table1;Table1[Cost];[Value])

I have created a sample PowerBI file using your suggestion. URL: https://1drv.ms/u/s!Aghc93erq8rBcLygGX6qE2p7zXY

Please let me know if this simplify what I am after.

Thanks

Hi @micky123

good job, you was a pretty near.

i corrected you PBIX example and send back, LOOK HERE:

 

Two remarks:

1) The percentile calculation must be a column, not measure (see above)

2) The source Data type must be "Decimal" not Whole (see bellow):

2017-12-20_22-16-34.png

 

 

 

 

 

 

 

 

 

 

And i added extra columns for better understanding you troubles. It can be leaved to better understanding, what i did. Enjoy!

If you have a questions ask back.

Best regs

@McCow Thanks alot! This is really helpful and also for explination what I was doing wrong. I am going to put this in production where I have quite a large dataset. Will let you know how it goes!
Thanks again for your help.

Kind Regards

@micky123

with a pleasure!

 

@McCow There are couple of issues I am facing, please can you advise?
- Date selection doesn't affect anything, I would like to filter percentlie table based on different date ranges from stats[Date], do we need to modify the DAX?

- Cost percentile total is wrong cost_percentileC = PERCENTILEX.INC(stats,stats[Cost],[Value]) should be exctly the same as what we have in stats[Cost]?

- Sales percentile using this sales_percentileC = PERCENTILEX.INC(stats,stats[Sales],[Value]) gives me an error "Expressions that yield variant data-type cannot be used to define calculated columns."

Thanks for your time, I really appricate your help.

Kind Regards

Hi @micky123,

see bellow pls


@micky123 wrote:

@McCow There are couple of issues I am facing, please can you advise?
- Date selection doesn't affect anything, I would like to filter percentlie table based on different date ranges from stats[Date], do we need to modify the DAX?

 Yes, but it is more comlex, than our example

- Cost percentile total is wrong cost_percentileC = PERCENTILEX.INC(stats,stats[Cost],[Value]) should be exctly the same as what we have in stats[Cost]?

 I maked extra page for complete table, look, summarize will be explain on picture bellow (yours [Cost] was "Don't summarise"):

2017-12-21_21-14-20.png

- Sales percentile using this sales_percentileC = PERCENTILEX.INC(stats,stats[Sales],[Value]) gives me an error "Expressions that yield variant data-type cannot be used to define calculated columns."

observe on data type for all of yours data source, must be Decimal number, all number types can be summarized (or not):

 

2017-12-21_20-56-28.png

 

Please check this LINK, I uploaded corrected to your request version.
Best regs.

@McCow Thanks for explination, I have tried to follow the same steps for prod data and there seems to be issue.
Could you point out what I am doing wrong please? Here's the updated pbi file: https://1drv.ms/u/s!Aghc93erq8rBcUhH5N_y1X_MXN8

I have created 3 new tables exactly in the same structure:
size_prod, stats_prod, Persentille_prod

 

Capture.PNG
Thanks again for your time, patience. I am learning alot from you!

 

Kind Regards

@micky123 , @McCow ,
I can understand how to calculate Percentile as solution. But now I want to calculate Percentile and group by [size_id] in the table stats_prod. How could it possible? 
Any your help will be appriciated.

And @micky123

I'm not sure if you somehow can use it, but u can try experiment with this formula:

cost_percentileC2 = PERCENTILEX.INC(
	CALCULATETABLE(
		stats_prod;	FILTER(stats_prod;stats_prod[_Size]>0); //from min
				FILTER(stats_prod;stats_prod[_Size]<100000) //to max
		); // filter-table
	stats_prod[cost];[Value]
)//end PERCENTILEX.INC

 

where Filter definition create low and high (min and max) data border from your data set.

 

Enjoy!

Hi @micky123,

as you know percenttille is a statistical measue (and the DAX functions PERCENTILEX.INC and PERCENTILEX.EXC), that shows how you data set will be exactly distributed from min to max unity value (and not the SUM). The most known is the normal distribution (Gauss).

 

In your case, this formula will show how often it will meet this or that "size", "cost" or "sale" value. On your example, you will see that the probability of meeting of any digit more than zero, is too small (k-value from "0" to "0.99"). And you'll meet just several digits with the max value of "5" (see [sale_percentileC]) very seldom.

 

And all of three columns ([size_percentile_prod], [cost_percentileC] and [sale_percentileC]) are independent. They will be calculated separately without any relationships between these values in single record (line of you data set).

 

The second difficulty is your relationship with [size] only. But in your case it's no matter because stat_prod has about 100K records and size_prod has only 5K. I don't understand what kind of statistic you want to calculate based on size_prod only. I created relation column [_Size] in stat_prod and suggest to make all calculations on the base of the biggest table (like stat_prod in you case).

 

And the task of Summarization of your columns is completly another / different task.

 First version of calculation of Sum of the stats_prod table:

SumOfCost = CALCULATE(SUM(stats_prod[cost]))

The picture below shows why it is impossible to count the amount according to the distributions table:

2017-12-23_12-12-16.png

Sorry for my English, and I hope my description provides a little more clarity to your question. If not, please ask back.

 

Best regs

Hi @micky123,

i don't see any big problems (except summarization options for Persentille_prod[size_percentile_prod], but it' no matter).

I'll be analyse and come back later.

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.