Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi @Jimmy801 ,
This is with respect to my last post, on `Table of Tables`. I am trying to find an easier & concise way to do all the Percentile calculations and column additions in each table in the `Table of tables` and then finally Combine all the tables. The way i am able to achieve it is very long and tedious and involves the query editor to add steps like, Combining tables first, then unpivot, then pivot and group by, ExpandColumns etc.
Here is the sample code i created in Power BI Editor (though my original report is i am working on Excel 2016 though) for your perusal. Please note that each table in `Source` shown here has been transformed through multiple steps to clean each table.
let
Source = {Table1,Table2},
ListOfTables = Source,
CombinedTables = Table.Combine(ListOfTables),
TransformTypes = Table.TransformColumnTypes(
CombinedTables,
List.Zip({
List.Buffer(List.Difference( Table.ColumnNames(CombinedTables), {"Code", "Metric"} )),
{type number, type number, type number}
})
),
UnpivotedOtherColumns = Table.UnpivotOtherColumns(TransformTypes, {"Code", "Metric"}, "Subgroup", "Value"),
PivotedOnMetric = Table.Pivot(UnpivotedOtherColumns, List.Distinct(UnpivotedOtherColumns[Metric]), "Metric", "Value"),
UNIQ_P = PercentileInclusive(List.Buffer(PivotedOnMetric[UNIQ]),0.8),
ND_P = PercentileInclusive(List.Buffer(PivotedOnMetric[ND]),0.8),
UNIQ_TOP = Table.AddColumn(PivotedOnMetric,"Top UNIQ", each if [UNIQ]>UNIQ_P then "Top" else ""),
ND_TOP = Table.AddColumn(PivotedOnMetric,"Top ND", each if [ND]>ND_P then "Top" else ""),
NestedJoin = Table.NestedJoin(UNIQ_TOP,"Code", ND_TOP,"Code","ALL", JoinKind.Inner),
ExpandedALL = Table.ExpandTableColumn(NestedJoin, "ALL", {"Top ND"}, {"Top ND"})
in
ExpandedALL
PercentileInclusive function code:
//PercentileInclusive Function
let
Source = (inputSeries as list, percentile as number) =>
let
SeriesCount = List.Count(inputSeries),
PercentileRank = percentile*(SeriesCount-1)+1, //percentile value between 0 and 1
PercentileRankRoundedUp = Number.RoundUp(PercentileRank),
PercentileRankRoundedDown = Number.RoundDown(PercentileRank),
Percentile1 = List.Max(List.MinN(inputSeries,PercentileRankRoundedDown)),
Percentile2 = List.Max(List.MinN(inputSeries,PercentileRankRoundedUp)),
Percentile = Percentile1+(Percentile2-Percentile1)*(PercentileRank-PercentileRankRoundedDown)
in
Percentile
in
Source
Please let me know how this code can be optimized and made concise for performance.
Hello @Anonymous
about question 1) this is a column you added to the query-table manually. Just delete it.
question2) I know that in some excel-versions the table is selected after refresh. The only way you may be able to work around is to use a makro, that does a select of a single after the refresh
Hope that helps
Jimmy
@Jimmy801 i did not add that column. if you see my query code, only the required columns are being generated. But in this one, an extra column gets added. Try unloading the query and loading it to a new worksheet and see for yourself. Not sure why its happening with this query only. My concerns are that this extra column could be the reason why my excel workbook gets corrupted when vba code is applied.
P.S: Strangely, i see 3 dots "..." in the last row of this column.
Hello @Anonymous
I cannot confirm that. When I unload it and load it, then the correct data is shown in a new sheet
Then there is something else causing this
BR Jimmy
Now i got 2 extra columns viz., Column1 & Column2 added when i loaded to a new sheet in Column C and 5 extra columns when i loaded to a new sheet in Column F. I think i know what is happening. When i add a new sheet and select the column & row i.e cell where the data should be loaded, it somehow offsets those many columns from the table start column.
If i just load it by selecting New worksheet in the load dialog, it will default to the 1st column only and so no extra column gets added.
See, if you are aware of this same behaviour in your excel and if you know whats causing this issue.
Hello @Anonymous
sorry, I don't have table1 and table2. I can't reproduce your scenario. I'm also no expert on statistics. If you can tell me how percentile is calculated
BR
Jimmy
Hi @Jimmy801 ,
No need to modify the PercentileInclusive function. I added it as it is used in the calculation. It is similar to the Excel's Percentile.Inc function, but in Power Query.
It only means, given a series of values, find out which value is found at the given percentile - in my case, 80th percentile. Some interpolation is done at the end to arrive at a single percentile e.g. incase a value falls between 2 percentiles. .eg. a value 0.482 is falling between the 80th and 81th percentile, then the value is interpolated between the two and result at 80th Percentile could be 0.5. Here is a link for your information purposes:
https://www.datasciencemadesimple.com/percentile-inc-function-in-excel/
As for my issue, i tried using your method to optimize the query further a bit. I am attaching a sample workbook and 2 test csv files to load into the workbook.
Two behaviours observed that i need your immediate advice :
1] One thing i hope you can tell me is - why an extra Column (Column1) gets appended to the query when loading it first time.
2] If i refresh the Query "Manually" or "via VBA Code", the table always gets selected. I came to know that this should not happen. I have just now repaired my Office to be sure, but still the same. Is there a way to turn off this table selection behaviour in Excel? I am using Excel 2016 Pro 64 bit.
Hope you can help me with this query issue quickly as it is causing my workbook to corrupt when i run this query with VBA code. I have left simple comments in the workbook.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.