Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Adding multiple Percentile columns to each Table in a Table of Tables

Hi @Jimmy801 ,

This is with respect to my last post, on `Table of Tables`. I am trying to find an easier & concise way to do all the Percentile calculations and column additions in each table in the `Table of tables` and then finally Combine all the tables. The way i am able to achieve it is very long and tedious and involves the query editor to add steps like, Combining tables first, then unpivot, then pivot and group by, ExpandColumns etc. 


Here is the sample code i created in Power BI Editor (though my original report is i am working on Excel 2016 though) for your perusal. Please note that each table in `Source` shown here has been transformed through multiple steps to clean each table.

 

let
    Source = {Table1,Table2},
    ListOfTables = Source,
    CombinedTables = Table.Combine(ListOfTables),
    TransformTypes = Table.TransformColumnTypes(
           CombinedTables,
           List.Zip({ 
               List.Buffer(List.Difference( Table.ColumnNames(CombinedTables), {"Code", "Metric"} )), 
               {type number, type number, type number} 
          })
       ),
    UnpivotedOtherColumns = Table.UnpivotOtherColumns(TransformTypes, {"Code", "Metric"}, "Subgroup", "Value"),
    PivotedOnMetric = Table.Pivot(UnpivotedOtherColumns, List.Distinct(UnpivotedOtherColumns[Metric]), "Metric", "Value"),
    UNIQ_P = PercentileInclusive(List.Buffer(PivotedOnMetric[UNIQ]),0.8),
    ND_P = PercentileInclusive(List.Buffer(PivotedOnMetric[ND]),0.8),
    UNIQ_TOP = Table.AddColumn(PivotedOnMetric,"Top UNIQ", each if [UNIQ]>UNIQ_P then "Top" else ""),
    ND_TOP = Table.AddColumn(PivotedOnMetric,"Top ND", each if [ND]>ND_P then "Top" else ""),
    NestedJoin = Table.NestedJoin(UNIQ_TOP,"Code", ND_TOP,"Code","ALL", JoinKind.Inner),
    ExpandedALL = Table.ExpandTableColumn(NestedJoin, "ALL", {"Top ND"}, {"Top ND"})
in
    ExpandedALL

 

 

PercentileInclusive function code:

 

//PercentileInclusive Function
let
    Source = (inputSeries as list, percentile as number) =>
let
    SeriesCount = List.Count(inputSeries),
    PercentileRank = percentile*(SeriesCount-1)+1, //percentile value between 0 and 1
    PercentileRankRoundedUp = Number.RoundUp(PercentileRank),
    PercentileRankRoundedDown = Number.RoundDown(PercentileRank),
    Percentile1 = List.Max(List.MinN(inputSeries,PercentileRankRoundedDown)),
    Percentile2 = List.Max(List.MinN(inputSeries,PercentileRankRoundedUp)),
    Percentile = Percentile1+(Percentile2-Percentile1)*(PercentileRank-PercentileRankRoundedDown)
in
    Percentile
in
    Source

 

 Please let me know how this code can be optimized and made concise for performance.

6 REPLIES 6
Jimmy801
Community Champion
Community Champion

Hello @Anonymous 

 

about question 1) this is a column you added to the query-table manually. Just delete it.

question2) I know that in some excel-versions the table is selected after refresh. The only way you may be able to work around is to use a makro, that does a select of a single after the refresh

 

Hope that helps

 

Jimmy

Anonymous
Not applicable

@Jimmy801 i did not add that column. if you see my query code, only the required columns are being generated. But in this one, an extra column gets added. Try unloading the query and loading it to a new worksheet and see for yourself. Not sure why its happening with this query only. My concerns are that this extra column could be the reason why my excel workbook gets corrupted when vba code is applied.

 

P.S: Strangely, i see 3 dots "..." in the last row of this column.

Hello @Anonymous 

 

I cannot confirm that. When I unload it and load it, then the correct data is shown in a new sheet

Jimmy801_0-1605879440204.png

Then there is something else causing this

 

BR Jimmy

 

Anonymous
Not applicable

Now i got 2 extra columns viz., Column1 & Column2 added when i loaded to a new sheet in Column C and 5 extra columns when i loaded to a new sheet in Column F. I think i know what is happening. When i add a new sheet and select the column & row i.e cell where the data should be loaded, it somehow offsets those many columns from the table start column.

If i just load it by selecting New worksheet in the load dialog, it will default to the 1st column only and so no extra column gets added.

 

See, if you are aware of this same behaviour in your excel and if you know whats causing this issue.

Jimmy801
Community Champion
Community Champion

Hello @Anonymous 

 

sorry, I don't have table1 and table2. I can't reproduce your scenario. I'm also no expert on statistics. If you can tell me how percentile is calculated

 

BR

 

Jimmy

Anonymous
Not applicable

Hi @Jimmy801 ,

 

No need to modify the PercentileInclusive function. I added it as it is used in the calculation. It is similar to the Excel's Percentile.Inc function, but in Power Query. 

 

It only means, given a series of values, find out which value is found at the given percentile - in my case, 80th percentile. Some interpolation is done at the end to arrive at a single percentile e.g. incase a value falls between 2 percentiles. .eg. a value 0.482 is falling between the 80th and 81th percentile, then the value is interpolated between the two and result at 80th Percentile could be 0.5. Here is a link for your information purposes:

https://www.datasciencemadesimple.com/percentile-inc-function-in-excel/ 

 

As for my issue, i tried using your method to optimize the query further a bit. I am attaching a sample workbook and 2 test csv files to load into the workbook.

 

Two behaviours observed that i need your immediate advice :

1] One thing i hope you can tell me is  - why an extra Column (Column1) gets appended to the query when loading it first time.

2] If i refresh the Query "Manually" or "via VBA Code", the table always gets selected. I came to know that this should not happen. I have just now repaired my Office to be sure, but still the same. Is there a way to turn off this table selection behaviour in Excel? I am using Excel 2016 Pro 64 bit.

 

Sample CSV files 

Sample Test workbook 

 

Hope you can help me with this query issue quickly as it is causing my workbook to corrupt when i run this query with VBA code. I have left simple comments in the workbook.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors