cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
smpa01
Super User
Super User

PQ to add column from list to an existing table

My base table is this

 

#table({"col1", "col2"}, {{1,2},{3,4}})

 

I have a list as this

 

_list = {5,6},

 

I want to end up with this

 

| col1 | col2 | col3 |
|------|------|------|
| 1    | 3    | 5    |
| 2    | 4    | 6    |

 

I tried this which doesn't work.

 

let
    Source = #table({"col1", "col2"}, {{1,2},{3,4}}),
    _list = {5,6},
    Custom1 = Table.AddColumn(Source,"col3", each _list)
in
Custom1

 

 

 

 





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!






New Animated Dashboard: Sales Calendar


3 ACCEPTED SOLUTIONS
wdx223_Daniel
Super User
Super User

= Table.AddColumn(Source,"col3",each Function.ScalarVector(type function(n as number) as number ,each _list)(_))

i did not really get the trick of Function.ScalarVector, just try and use it.

View solution in original post

@wdx223_Daniel That's a pretty neat trick I haven't used before. Kudos!

 

It can indeed be extended to multiple columns by zipping the lists together and creating a record for each row to then expand. I'm sure there are lots of other possible generalizations and I'd love to see how you'd do it.

 

let
  Source = #table({"col1", "col2"}, {{1, 2}, {3, 4}}), 
  list1 = {5, 6}, 
  list2 = {7, 8}, 
  list3 = {9, 0}, 
  ListsCombined = List.Zip({list1, list2, list3}), 
  NewColNames = {"col3", "col4", "col5"}, 
  Custom1 = Table.AddColumn(
    Source, 
    "NewCols", 
    each Record.FromList(
      Function.ScalarVector(type function (n as list) as list, each ListsCombined)(_), 
      NewColNames
    )
  ), 
  #"Expanded NewCols" = Table.ExpandRecordColumn(Custom1, "NewCols", NewColNames, NewColNames)
in
  #"Expanded NewCols"

 

 

@smpa01 I'm glad you asked this question. I find it relatively common to need to add existing lists as new columns but never found an approach that feels clean and efficient.

 

@ImkeF I'd love to hear what you'd suggest for this topic since, to me, it feels similar in flavor to your Transforming Multiple Columns post.

View solution in original post

Hi all,

@ImkeF , @smpa01 , @AlexisOlson , @wdx223_Daniel @latimeria 

 

The benefits of "scalar" approach escapes for me. In the example that Chris brought in his table, it assumes performing a row-by-row function, which can be optimised this way.

 

In this particular case we are dealing with merging two (or more) sets of data which does not have anything in common (except, potentially, an equal number of rows). There is no functional dependency of one set on another from what I see.

 

Therefore this is unclear which option would be quicker for this particular application. I would say, it is heavily dependant on the implementation. For instance, if the table is mapped to the memory as a set of columns somehow build into a form of a list, breaking it down to columns would not take any time regardless of the number of rows. The new columns would be just added to the list of columns (as references rather than copying blocks of data into a new structure). This can work because since defined variable is immutable, hance the reference is guaranteed to not to change whatever happens. 

 

On the other hand, "scalar" option in this world would have to actually scroll through each item in the list (hence this can be done only once for all new columns at the same time) to add them to the existing table as new columns (i. e. physically copy blocks of data from one place in memory to another). Buffering the table can improve this to some extend, but the mechanics of this I can't explain/envisage.

 

I've run the following test. See two queries below, one is a "scalar" and another Table.ToColumns option of adding lists. Both operate with 1m rows (both on the table and the lists):

let
    data = List.Zip({{1..1000000}, List.Transform({1..1000000}, each Text.From(_) & "." & "0")}),
    Source = #table({"col1", "col2"}, data), 
    list1 = List.Transform({1..1000000}, each Text.From(_) & "." & "1"), 
    list2 = List.Transform({1..1000000}, each Text.From(_) & "." & "2"), 
    list3 = List.Transform({1..1000000}, each Text.From(_) & "." & "3"), 

// Buffer function input for performance reasons
    BufferedTable = List.Buffer(
        Table.ToRecords(#table({"col3", "col4", "col5"}, List.Zip({list1, list2, list3})))
    ), 
    AddColumn = Table.AddColumn(
        Source, 
        "NewCols", 
        each Function.ScalarVector(type function (x as record) as record, each BufferedTable)(_)
    ), 
    #"Expanded NewCols" = Table.ExpandRecordColumn(AddColumn, "NewCols", {"col3", "col4", "col5"})
in
    #"Expanded NewCols"

 

let
    data = List.Zip({{1..1000000}, List.Transform({1..1000000}, each Text.From(_) & "." & "0")}),
    Source = #table({"col1", "col2"}, data), 
    list1 = List.Transform({1..1000000}, each Text.From(_) & "." & "1"), 
    list2 = List.Transform({1..1000000}, each Text.From(_) & "." & "2"), 
    list3 = List.Transform({1..1000000}, each Text.From(_) & "." & "3"),
    Custom1 = Table.FromColumns(Table.ToColumns(Source) & {list1} & {list2} & {list3})
in
    Custom1

 

The "scalar" option on my laptop refreshes for about 3 sec, which is quick, but the Table.ToColumns refreshes in a blink of the eye, I can't even see the running dots to start counting. Also, IMHO, the table.tocolumns looks a bit neater and straightforward.

 

This kind of proofs the point with in-memory organisation of tables which favours the Table.ToColumns version. But I can see that this may be case-dependant.

 

Cheers,

John

   

View solution in original post

7 REPLIES 7
ImkeF
Super User
Super User

Hi @AlexisOlson ,
I must admint I haven't used this function much so far, but re-reading Chris' article about it, I definitely should change that.
As he pointed out the performance benefits only apply to buffered inputs, you might consider re-writing it like so for example:

let
    Source = #table({"col1", "col2"}, {{1, 2}, {3, 4}}), 
    list1 = {5, 6}, 
    list2 = {7, 8}, 
    list3 = {9, 0}, 

// Buffer function input for performance reasons
    BufferedTable = List.Buffer(
        Table.ToRecords(#table({"col3", "col4", "col5"}, List.Zip({list1, list2, list3})))
    ), 
    AddColumn = Table.AddColumn(
        Source, 
        "NewCols", 
        each Function.ScalarVector(type function (x as record) as record, each BufferedTable)(_)
    ), 
    #"Expanded NewCols" = Table.ExpandRecordColumn(AddColumn, "NewCols", {"col3", "col4", "col5"})
in
    #"Expanded NewCols"


(very strange that it doesn't seem to work with a table input, though...)

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

wdx223_Daniel
Super User
Super User

= Table.AddColumn(Source,"col3",each Function.ScalarVector(type function(n as number) as number ,each _list)(_))

i did not really get the trick of Function.ScalarVector, just try and use it.

@wdx223_Daniel  this is simply fascinating, just got greedy.

Can we add multiple lists in one go, i.e. adding list1, list2 in Custom1?

 

let
  Source = #table({"col1", "col2"}, {{1, 2}, {3, 4}}), 
  list1 = {3, 4}, 
  list2 = {5, 6}, 
  Custom1 = Table.AddColumn(
    Source, 
    "col3", 
    each Function.ScalarVector(type function (n as number) as number, each list2)(_)
  )
in
  Custom1

 

What I am asking is this, can we perform something similar to a javascript map?

 

const tbl = [{"col1":1,"col2":2},{"col1":3,"col2":4}];
const list1 = [5,6];
const list2 = [7,8];
tbl.map((a, i) => {
	a.col3 = list1[i];
	a.col4 = list2[i]
});
console.log(tbl);

 

smpa01_2-1663343743734.png

This is the query plan that I see here for @wdx223_Daniel 

smpa01_0-1663343068116.png

But PQ does not generate any query plan for @jbwtp @latimeria  query. I can't really assess the performance of this query. But I guess the query provided by you would have more overheads cause it needs to perform this `Table.ToColumns(Source)` which I would want to avoid cause my dataset (multiple excel from SharePoint) has 700k+ rows. Thank you nevertheless.

 

smpa01_1-1663343194405.png

 

 





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!






New Animated Dashboard: Sales Calendar


@wdx223_Daniel That's a pretty neat trick I haven't used before. Kudos!

 

It can indeed be extended to multiple columns by zipping the lists together and creating a record for each row to then expand. I'm sure there are lots of other possible generalizations and I'd love to see how you'd do it.

 

let
  Source = #table({"col1", "col2"}, {{1, 2}, {3, 4}}), 
  list1 = {5, 6}, 
  list2 = {7, 8}, 
  list3 = {9, 0}, 
  ListsCombined = List.Zip({list1, list2, list3}), 
  NewColNames = {"col3", "col4", "col5"}, 
  Custom1 = Table.AddColumn(
    Source, 
    "NewCols", 
    each Record.FromList(
      Function.ScalarVector(type function (n as list) as list, each ListsCombined)(_), 
      NewColNames
    )
  ), 
  #"Expanded NewCols" = Table.ExpandRecordColumn(Custom1, "NewCols", NewColNames, NewColNames)
in
  #"Expanded NewCols"

 

 

@smpa01 I'm glad you asked this question. I find it relatively common to need to add existing lists as new columns but never found an approach that feels clean and efficient.

 

@ImkeF I'd love to hear what you'd suggest for this topic since, to me, it feels similar in flavor to your Transforming Multiple Columns post.

Hi all,

@ImkeF , @smpa01 , @AlexisOlson , @wdx223_Daniel @latimeria 

 

The benefits of "scalar" approach escapes for me. In the example that Chris brought in his table, it assumes performing a row-by-row function, which can be optimised this way.

 

In this particular case we are dealing with merging two (or more) sets of data which does not have anything in common (except, potentially, an equal number of rows). There is no functional dependency of one set on another from what I see.

 

Therefore this is unclear which option would be quicker for this particular application. I would say, it is heavily dependant on the implementation. For instance, if the table is mapped to the memory as a set of columns somehow build into a form of a list, breaking it down to columns would not take any time regardless of the number of rows. The new columns would be just added to the list of columns (as references rather than copying blocks of data into a new structure). This can work because since defined variable is immutable, hance the reference is guaranteed to not to change whatever happens. 

 

On the other hand, "scalar" option in this world would have to actually scroll through each item in the list (hence this can be done only once for all new columns at the same time) to add them to the existing table as new columns (i. e. physically copy blocks of data from one place in memory to another). Buffering the table can improve this to some extend, but the mechanics of this I can't explain/envisage.

 

I've run the following test. See two queries below, one is a "scalar" and another Table.ToColumns option of adding lists. Both operate with 1m rows (both on the table and the lists):

let
    data = List.Zip({{1..1000000}, List.Transform({1..1000000}, each Text.From(_) & "." & "0")}),
    Source = #table({"col1", "col2"}, data), 
    list1 = List.Transform({1..1000000}, each Text.From(_) & "." & "1"), 
    list2 = List.Transform({1..1000000}, each Text.From(_) & "." & "2"), 
    list3 = List.Transform({1..1000000}, each Text.From(_) & "." & "3"), 

// Buffer function input for performance reasons
    BufferedTable = List.Buffer(
        Table.ToRecords(#table({"col3", "col4", "col5"}, List.Zip({list1, list2, list3})))
    ), 
    AddColumn = Table.AddColumn(
        Source, 
        "NewCols", 
        each Function.ScalarVector(type function (x as record) as record, each BufferedTable)(_)
    ), 
    #"Expanded NewCols" = Table.ExpandRecordColumn(AddColumn, "NewCols", {"col3", "col4", "col5"})
in
    #"Expanded NewCols"

 

let
    data = List.Zip({{1..1000000}, List.Transform({1..1000000}, each Text.From(_) & "." & "0")}),
    Source = #table({"col1", "col2"}, data), 
    list1 = List.Transform({1..1000000}, each Text.From(_) & "." & "1"), 
    list2 = List.Transform({1..1000000}, each Text.From(_) & "." & "2"), 
    list3 = List.Transform({1..1000000}, each Text.From(_) & "." & "3"),
    Custom1 = Table.FromColumns(Table.ToColumns(Source) & {list1} & {list2} & {list3})
in
    Custom1

 

The "scalar" option on my laptop refreshes for about 3 sec, which is quick, but the Table.ToColumns refreshes in a blink of the eye, I can't even see the running dots to start counting. Also, IMHO, the table.tocolumns looks a bit neater and straightforward.

 

This kind of proofs the point with in-memory organisation of tables which favours the Table.ToColumns version. But I can see that this may be case-dependant.

 

Cheers,

John

   

latimeria
Resolver V
Resolver V

Hi @smpa01 ,

Try this

let
    Source = #table({"col1", "col2"}, {{1,2},{3,4}}),
    TableToList = Table.ToColumns(Source),
    _list = {5,6},
    MergeLists = List.Combine( {TableToList, {_list} }),
    Custom2 = Table.FromColumns(MergeLists)
in
    Custom2

Hi @smpa01,

 

I would change the last line to something like this:

Table.FromColumns(MergeLists, Value.Type(Table.AddColumn(Source, "Column3", each null, type number)))

to avoid re-setting all column types again.

 

Cheers,

John

Helpful resources

Announcements
November 2022 Update

Check it Out!

Click here to read more about the November 2022 updates!

Microsoft 365 Conference â__ December 6-8, 2022

Microsoft 365 Conference - 06-08 December

Join us in Las Vegas to experience community, incredible learning opportunities, and connections that will help grow skills, know-how, and more.

Power BI Dev Camp Session 27

Ted's Dev Camp

This session walks through creating a new Azure AD B2C tenant and configuring it with user flows and custom policies.

Top Solution Authors
Top Kudoed Authors