Faster Division/Multiplication using Merge same table Query - Power Query

I have a table of columns containing attribute counts. I have added a column that does a row-wise summation of the attribute columns (the columns denoted by numbers). I want to divide the attribute columns by the "Count" column in place (replace the values), i.e. compute a `Percentage of Row-Total`. I don't want to add new columns, do the calculation, and then delete the original columns.

 

"ID"  "CD"     "6"    "5"     "4"    "3"   "1"   "Count"

1RJ.14512453755302
2RJ.2436636579211
3RJ.32910441768258
4RJ.42646233411140
5RJ.53451412213161
6RJ.63363265813193
7RJ.7225640418167
8RJ.83493405411232
9RJ.93469345011198
10RJ.10274633428156
11RJ.113270264215185
12RJ.12225735455164
13RJ.13258339439199
14RJ.14316926508184
15RJ.152462393510170
16RJ.16117543576192
17RJ.17214521238118
18RJ.182051303212145
19RJ.19286034379168
20RJ.20263829414138
21RJ.212164234211161
22RJ.223361353811178
23RJ.23195136368150
24RJ.242265323611166
25RJ.252885354511204
26RJ.262656282711148
27RJ.27294341337153
28RJ.282257154311148
29RJ.29143022359110
30RJ.30193525278114
31RJ.311830282612114
32RJ.321354364410157
33RJ.33205919409147
34RJ.34243438235124
35RJ.35266737398177
36RJ.36274631347145
37RJ.372426273613126
38RJ.38142424211093
39RJ.393158342810161
40RJ.401845253710135

 

In one of my searches, someone suggested that the values for multiple columns could be calculated in a faster way (avoiding row-level context evaluation) by merging the table with itself together with some M code, but they did not share a code example.

Is it possible?

24 REPLIES

 

A more concise and (perhaps) more general way to write "your" code:

let 
    dim=1000000,
    Origine = Table.FromColumns(List.Transform({1..10}, each List.Transform(List.Random(dim, _), each Number.Round(_*dim,0))),{"0".."9"}),


    PercentOfRow = Table.FromRecords( Table.TransformRows(Origine, (r)=>  let sum=List.Sum(Record.FieldValues(r)) in  Record.FromList(List.Transform(Record.FieldValues(r), each _/sum), {"0".."9"})& [Counts=sum])),
    pc = Table.TransformColumnTypes(PercentOfRow,List.Transform(Table.ColumnNames(Origine),each {_, Percentage.Type}))
in
    pc
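
Applied to the sample table at the top of the thread instead of the random Origine table, the same pattern would look roughly like this (a sketch only: it assumes a prior #"Changed Type" step in which the attribute columns and Count are already numeric; the step name is illustrative):

PercentOfRow =
    Table.FromRecords(
        Table.TransformRows(
            #"Changed Type",
            (r) =>
                let vals = Record.RemoveFields(r, {"ID", "CD", "Count"})
                in  Record.SelectFields(r, {"ID", "CD"})
                    // divide every attribute field by the existing Count column, in place
                    & Record.FromList(
                        List.Transform(Record.FieldValues(vals), each _ / r[Count]),
                        Record.FieldNames(vals))
                    & [Count = r[Count]]))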


Interesting @Rocco_sprmnt21 !

 

The main problem with my solution, yours, or @Jimmy801's is that when I start adding custom calculated columns (row-level evaluation) to our final table, the loading time just keeps increasing exponentially. 😞

 

For example, if I add the following calculated columns to our final table:

Step1 = Table.AddColumn(Final, "Total", each [#"6"] + [#"4"], Percentage.Type)

RowAvgOfTotal = Number.Round(List.Average(Step1[Total]), 2)

Step2 = Table.AddColumn(Step1, "Total%", each [Total] / RowAvgOfTotal, Percentage.Type)

RowPercentileOfTotal = Number.Round(PercentileInclusive(Step2[#"Total%"], 0.8), 4)
Step3 = Table.AddColumn(Step2, "Total%Percentile", each [Total] / RowPercentileOfTotal, Percentage.Type)

PercentileInclusive Function:

//PercentileInclusive Function
let
    Source = (inputSeries as list, percentile as number) =>
let
    SeriesCount = List.Count(inputSeries),
    PercentileRank = percentile*(SeriesCount-1)+1, //percentile value between 0 and 1
    PercentileRankRoundedUp = Number.RoundUp(PercentileRank),
    PercentileRankRoundedDown = Number.RoundDown(PercentileRank),
    Percentile1 = List.Max(List.MinN(inputSeries,PercentileRankRoundedDown)),
    Percentile2 = List.Max(List.MinN(inputSeries,PercentileRankRoundedUp)),
    Percentile = Percentile1+(Percentile2-Percentile1)*(PercentileRank-PercentileRankRoundedDown)
in
    Percentile
in
    Source
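
For reference, a quick sanity check of this function on a small list (my own example, not from the thread):

SanityCheck = PercentileInclusive({1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, 0.8)  // rank = 0.8*(10-1)+1 = 8.2, result 8.2, matching Excel's PERCENTILE.INC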

Adding such calculated columns blows up the load: the query goes from reading 4 to 5 MB to reading gigabytes. It is apparent that adding the columns on top of the previous steps increases the load time each time, as I think the previous calculated columns are also getting evaluated again. Is there any alternative approach that you can suggest?
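
A minimal sketch of one possible mitigation for that re-evaluation (my own suggestion, not something raised in this thread): buffer the intermediate tables with Table.Buffer so that the scalar aggregations and the later steps reuse the materialised rows instead of re-reading the whole upstream chain, at the cost of holding the table in memory.

// Sketch only: step names match the example above; Table.Buffer pins the
// table in memory so each downstream reference does not re-evaluate it.
Buffered = Table.Buffer(Final),
Step1 = Table.Buffer(Table.AddColumn(Buffered, "Total", each [#"6"] + [#"4"], Percentage.Type)),
RowAvgOfTotal = Number.Round(List.Average(Step1[Total]), 2),
Step2 = Table.Buffer(Table.AddColumn(Step1, "Total%", each [Total] / RowAvgOfTotal, Percentage.Type)),
RowPercentileOfTotal = Number.Round(PercentileInclusive(Step2[#"Total%"], 0.8), 4),
Step3 = Table.AddColumn(Step2, "Total%Percentile", each [Total] / RowPercentileOfTotal, Percentage.Type)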


I'm not sure I understand what you want to do, but maybe it can be useful to try doing separate calculations on the same source table and then putting it all together.

ac1 = func1 (tab),
ac2 = func2 (tab),
..

 

tfc=Table.FromColumns(
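
A guess at how that sketch could be completed (func1/func2 are placeholders from the lines above; the exact call is my assumption): each function returns one calculated column as a list, and Table.FromColumns glues the original columns and the new ones back together in a single step.

ac1 = func1(tab),
ac2 = func2(tab),
tfc = Table.FromColumns(
          Table.ToColumns(tab) & {ac1, ac2},
          Table.ColumnNames(tab) & {"ac1", "ac2"})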


let
    dim=500000,
    Origine = Table.FromColumns(List.Transform({1..10}, each List.Transform(List.Random(dim, _), each Number.Round(_*dim,0))),{"0".."9"}),


    PercentOfRow = Table.FromRecords( Table.TransformRows(Origine, (r)=>  let sum=List.Sum(Record.FieldValues(r)) in  Record.FromList(List.Transform(Record.FieldValues(r), each _/sum), {"0".."9"})& [Counts=sum])),
    pc = Table.TransformColumnTypes(PercentOfRow,List.Transform(Table.ColumnNames(Origine),each {_, Percentage.Type})),
    #"Aggiunta colonna personalizzata" = Table.AddColumn(pc, "total", each [4]+[6]),
    avg =  List.Average(#"Aggiunta colonna personalizzata"[total]),

    acp1 = Table.AddColumn(#"Aggiunta colonna personalizzata", "pctot", each [total]/avg),
    pcn=pcntl(acp1[pctot],0.8),
    acp = Table.AddColumn(acp1, "pcntl", each [pctot]/pcn)
in
    acp

This, on my laptop, finishes in a few minutes.


I was tempted to generalize the following approach. For the first part it wasn't difficult.
But for the part where the type of the columns is specified, it wasn't easy.

PercentOfRow = Table.FromRecords(
    Table.TransformRows(
        #"Changed Type",
        (r) => 
        Record.TransformFields(
            r,
            {
                {"6", each r[#"6"] / r[Counts] },
                {"5", each r[#"5"] / r[Counts] },
                {"4", each r[#"4"] / r[Counts] },
                {"3", each r[#"3"] / r[Counts] },
                {"1", each r[#"1"] / r[Counts] }
            }
        )
    ),        
    type table [
               CD= Text.Type,
               ID = Int64.Type,
               #"6" = Percentage.Type, 
               #"5" = Percentage.Type,
               #"4" = Percentage.Type,
               #"3" = Percentage.Type,
               #"1" = Percentage.Type
    ]
)
in
    PercentOfRow

But here is the result anyway. A big thanks to Ben Gribaudo: now the solution is general and again a one-liner.

let 
    dim=1000000,
    Origine = Table.FromColumns(List.Transform({1..10}, each List.Transform(List.Random(dim, _), each Number.Round(_*dim,0))),{"0".."9"}),
    
    PercentOfRow = Table.FromRecords( Table.TransformRows(Origine, (r)=>  let sum=List.Sum(Record.FieldValues(r)) in  Record.FromList(List.Transform(Record.FieldValues(r), each _/sum) , {"0".."9"})& [Counts=sum]),   Expression.Evaluate("type table"& "["&Text.Combine(List.Transform({"0".."9"}, each _&"="&"Percentage.Type"),",")&", Counts=number]",#shared))

in
  PercentOfRow
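
For clarity, the text that Expression.Evaluate receives in the step above expands (written out by hand here) to:

type table[0=Percentage.Type,1=Percentage.Type,2=Percentage.Type,3=Percentage.Type,4=Percentage.Type,5=Percentage.Type,6=Percentage.Type,7=Percentage.Type,8=Percentage.Type,9=Percentage.Type, Counts=number]

Expression.Evaluate turns this text back into a table type, and #shared is passed as the environment so that Percentage.Type resolves.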


Wow! This is incredible.

 

@Rocco_sprmnt21, I created a `sample workbook` and the `sample input csv` showing the input and the desired output, just for your experienced eyes to see why the data is loading so slowly. I have added the percentile function that I had googled earlier (it seems there are no built-in functions in PQ for percentile, rank, etc.).

 

Here are the files in this TEST folder. I have removed the code that adds the other columns (as shown in the desired output), since it was taking a long time to finish, and kept the code only up to where the row percentage of total gets calculated. The calculations for the additional columns are shown in the desired output table.

 

I am not sure what you meant by creating the columns separately and then joining them. I hope you can provide some clear logic on how to speed up the loading once the entire desired output report gets created.


Hi @Rocco_sprmnt21 ,

 

Any query in M can be written as one line this way, it is just a matter of time and persistence :).

 

This is a similar but parametrised approach:

let
    Origine = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VVRLdiQxCLtLr/PmGTD+XGGWs+3X979GCgmYZNFWUiVTIMl+v1/y+nr9+/snYPqziM5ndXuWHf/Hz4a+Pl/vl5IbMIOwVryMxfez3OenIqAaqQEaz2VE2Rmf2bHhxAs/4E5yAzTezVg0tlo8k9gkc4Dr5Hq9dKmyGm1J7JLFHha5aBLtWn3BT3OvgbvJ3VXI0cio4uCvDeoh9VQL14rq3a4aFbvk3uKu/stHj3Ypg4y0IlB3CYHWp1YTT2Mgl2+BFm/3aP2gBMw8TnY6B+R8Ud+8XAd5TZLTO6DGmwMvbtkOP+8lOc0DmtSIVHlU0ycrp3tA5GxpVUYvGF02rZb0D4g5EUi0gPbx/FJoSQOB2mlWqSihD0ml00KgjoqRjdISOslM8dJEoMaWNcpFq9zLYmlNE7XdsFNnAElCRCzJaSK7hHiz+qWJ8iPOWsdPO9BSwtlp9s7a6aIy5T1jH1mmiVJrughE/eWlha3/jTB7mjYCIcjxX2niERg0XdNGIHOxah9Snuc7204fgdqRg3SYGTuch1bTR9bqWEvH5Hfp9JH30SzDKaeXj0JFLH0ESgeUI+/OE0e09BGIULHyqYmZp2Knj9YXFu4N6DxnnwLnbWN1k1pn9VZLuHQ4TZLTRyDMZEpPpapueJDTRg4WX1+7Io0jiRF3Vk4XgT8uJ6lv7B8nxtJEoPa9jm2ME8xRxsnSRWa4O+cipUfe05Ym8s6It94XMfSmeHlkZroIhDGzXURzZD8KfD7f", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, CD = _t, #"6" = _t, #"5" = _t, #"4" = _t, #"3" = _t, #"1" = _t, Count = _t]),
    ChangeType = Table.TransformColumnTypes(Origine,{{"6", Int64.Type}, {"5", Int64.Type}, {"4", Int64.Type}, {"3", Int64.Type}, {"1", Int64.Type}}),
    #"Removed Columns" = Table.RemoveColumns(ChangeType,{"Count"}),


    mTable = #"Removed Columns",
    nonValueColumns = {"ID", "CD"},
    
    fTransformRecord = (mRecord as record, mNonValueColumns as list)=>
        let 
            recordsToProcess = Record.RemoveFields(mRecord, mNonValueColumns),
            sumTotal = List.Sum(Record.ToList(recordsToProcess)),
            TransformRecord = List.Accumulate(Record.FieldNames(recordsToProcess), recordsToProcess, (a, n)=> Record.TransformFields(a, {n, (x)=>x / sumTotal}))   & Record.SelectFields(mRecord, nonValueColumns)     
        in TransformRecord,
    
    Output = Table.FromRecords(Table.TransformRows(mTable, (x)=> fTransformRecord(x, nonValueColumns))) 

in Output
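
If the percentage formatting is wanted here as well, the type step from the earlier replies can be reapplied; a minimal sketch (added inside the let, with `in TypedOutput` at the end; deriving the value columns by excluding nonValueColumns is my own choice):

    TypedOutput = Table.TransformColumnTypes(
        Output,
        List.Transform(
            List.Difference(Table.ColumnNames(Output), nonValueColumns),
            each {_, Percentage.Type}))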

 

Kind regards,

JB


This is fast. However, once you start adding the additional calculated columns (see the sample workbook & sample CSV in the TEST folder in my previous reply), the loading time goes for a toss each time a calculated column gets added! That is my concern, and the desired output I am trying to achieve in a fast-loading manner.


Hi @jborro ,

 

this statement of yours

"Any query in M can be written as one line this way, it is just a matter of time and persistence :)." (*)

is an incontrovertible truth of which I am well aware.

 

But it was not from this general aspect that my enthusiasm for the 1-line solution arose.

 

The purpose of my previous post, after seeing that @sifar786's 1-line solution performed well, was to generalize it, for example to handle an arbitrary number of columns.

But in a first attempt I was unable to make the column-type-setting part parametric within the same expression.

After reading Ben Gribaudo's blog I was able, albeit with the use of the Expression.Evaluate function, to include the column-type-setting part in the same expression, as in the 1-line solution of @sifar786.

From these two steps:

    PercentOfRow = Table.FromRecords( Table.TransformRows(Origine, (r)=>  let sum=List.Sum(Record.FieldValues(r)) in  Record.FromList(List.Transform(Record.FieldValues(r), each _/sum), {"0".."9"})& [Counts=sum])),
    pc = Table.TransformColumnTypes(PercentOfRow,List.Transform(Table.ColumnNames(Origine),each {_, Percentage.Type}))

to this single step:

    PercentOfRow = Table.FromRecords( Table.TransformRows(Origine, (r)=>  let sum=List.Sum(Record.FieldValues(r)) in  Record.FromList(List.Transform(Record.FieldValues(r), each _/sum) , {"0".."9"})& [Counts=sum]),   Expression.Evaluate("type table"& "["&Text.Combine(List.Transform({"0".."9"}, each _&"="&"Percentage.Type"),",")&", Counts=number]",#shared))

just this 🙂

(*)

the reduction to a single line was not simply a matter of nesting one expression inside another; so, for example:

 

    pc = Table.TransformColumnTypes( Table.FromRecords( Table.TransformRows(Origine, (r)=>  let sum=List.Sum(Record.FieldValues(r)) in  Record.FromList(List.Transform(Record.FieldValues(r), each _/sum), {"0".."9"})& [Counts=sum])), List.Transform(Table.ColumnNames(Origine),each {_, Percentage.Type}))

 

I would never have been thrilled with a solution put like this.

 

PS

In general I believe that it is not the task of Power Query to do these "aesthetic" operations, by which I mean changing the number type to a percentage type.

Perhaps it is preferable to do only mash-up and not make-up.


@Rocco_sprmnt21 


@Rocco_sprmnt21 wrote:

I'm not sure I understand what you want to do, but maybe it can be useful to try doing separate calculations on the same source table and then putting it all together.

ac1 = func1 (tab),
ac2 = func2 (tab),
..

 

tfc=Table.FromColumns(


These seem useful, but I don't know how to do it without affecting the loading time. How would you do it with my given sample files to create the desired output shown in the workbook, in a speedier way? Right now, in my amateur way, the loading takes a lot of time (going into GBs).
