smpa01
Super User

PQ to add column from list to an existing table

My base table is this:

 

#table({"col1", "col2"}, {{1,2},{3,4}})

 

I have a list like this:

 

_list = {5,6}

 

I want to end up with this:

 

| col1 | col2 | col3 |
|------|------|------|
| 1    | 2    | 5    |
| 3    | 4    | 6    |

 

I tried this, but it doesn't work (each row of col3 ends up containing the whole list as a List value rather than a single element):

 

let
    Source = #table({"col1", "col2"}, {{1,2},{3,4}}),
    _list = {5,6},
    Custom1 = Table.AddColumn(Source,"col3", each _list)
in
Custom1

 

 

 

 

3 ACCEPTED SOLUTIONS
wdx223_Daniel
Super User

= Table.AddColumn(Source,"col3",each Function.ScalarVector(type function(n as number) as number ,each _list)(_))

I haven't really worked out the trick behind Function.ScalarVector; just try it and use it.
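
For context, here is that step dropped into the original query (a minimal sketch using the OP's sample data):

let
    Source = #table({"col1", "col2"}, {{1,2},{3,4}}),
    _list = {5,6},
    // the vector implementation (each _list) is invoked for a whole batch of rows
    // and returns a list whose items are matched to the rows positionally
    Custom1 = Table.AddColumn(Source, "col3",
        each Function.ScalarVector(type function (n as number) as number, each _list)(_))
in
    Custom1

As far as I can tell, that positional matching is what makes the trick work here.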


AlexisOlson

@wdx223_Daniel That's a pretty neat trick I haven't used before. Kudos!

 

It can indeed be extended to multiple columns by zipping the lists together and creating a record for each row to then expand. I'm sure there are lots of other possible generalizations and I'd love to see how you'd do it.

 

let
  Source = #table({"col1", "col2"}, {{1, 2}, {3, 4}}), 
  list1 = {5, 6}, 
  list2 = {7, 8}, 
  list3 = {9, 0}, 
  ListsCombined = List.Zip({list1, list2, list3}), 
  NewColNames = {"col3", "col4", "col5"}, 
  Custom1 = Table.AddColumn(
    Source, 
    "NewCols", 
    each Record.FromList(
      Function.ScalarVector(type function (n as list) as list, each ListsCombined)(_), 
      NewColNames
    )
  ), 
  #"Expanded NewCols" = Table.ExpandRecordColumn(Custom1, "NewCols", NewColNames, NewColNames)
in
  #"Expanded NewCols"

 

 

@smpa01 I'm glad you asked this question. I find it relatively common to need to add existing lists as new columns, but I have never found an approach that feels clean and efficient.

 

@ImkeF I'd love to hear what you'd suggest for this topic since, to me, it feels similar in flavor to your Transforming Multiple Columns post.


jbwtp

Hi all,

@ImkeF, @smpa01, @AlexisOlson, @wdx223_Daniel, @latimeria

 

The benefit of the "scalar" approach escapes me. The example Chris gives in his article assumes performing a row-by-row function, which can be optimised this way.

 

In this particular case we are dealing with merging two (or more) sets of data which do not have anything in common (except, potentially, an equal number of rows). There is no functional dependency of one set on the other from what I see.

 

Therefore it is unclear which option would be quicker for this particular application. I would say it is heavily dependent on the implementation. For instance, if the table is mapped into memory as a set of columns built into some form of a list, breaking it down into columns would not take any time regardless of the number of rows. The new columns would just be added to the list of columns (as references rather than copying blocks of data into a new structure). This can work because a defined variable is immutable, hence the reference is guaranteed not to change whatever happens.
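
A minimal sketch of that model, using the Table.ToColumns route tested below (the column names here are just for illustration):

let
    Source = #table({"col1", "col2"}, {{1, 2}, {3, 4}}),
    _list = {5, 6},
    // under this model, appending a column is just appending one more
    // reference to the list of columns, not copying any row data
    Appended = Table.FromColumns(Table.ToColumns(Source) & {_list}, {"col1", "col2", "col3"})
in
    Appended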

 

On the other hand, the "scalar" option in this world would have to actually scroll through each item in the list (hence it is best done only once, for all new columns at the same time) to add them to the existing table as new columns (i.e. physically copy blocks of data from one place in memory to another). Buffering the table can improve this to some extent, but I can't explain/envisage the mechanics of this.

 

I've run the following test. See the two queries below: one uses the "scalar" approach and the other the Table.ToColumns option for adding lists. Both operate on 1M rows (both the table and the lists):

let
    data = List.Zip({{1..1000000}, List.Transform({1..1000000}, each Text.From(_) & "." & "0")}),
    Source = #table({"col1", "col2"}, data), 
    list1 = List.Transform({1..1000000}, each Text.From(_) & "." & "1"), 
    list2 = List.Transform({1..1000000}, each Text.From(_) & "." & "2"), 
    list3 = List.Transform({1..1000000}, each Text.From(_) & "." & "3"), 

// Buffer function input for performance reasons
    BufferedTable = List.Buffer(
        Table.ToRecords(#table({"col3", "col4", "col5"}, List.Zip({list1, list2, list3})))
    ), 
    AddColumn = Table.AddColumn(
        Source, 
        "NewCols", 
        each Function.ScalarVector(type function (x as record) as record, each BufferedTable)(_)
    ), 
    #"Expanded NewCols" = Table.ExpandRecordColumn(AddColumn, "NewCols", {"col3", "col4", "col5"})
in
    #"Expanded NewCols"

 

let
    data = List.Zip({{1..1000000}, List.Transform({1..1000000}, each Text.From(_) & "." & "0")}),
    Source = #table({"col1", "col2"}, data), 
    list1 = List.Transform({1..1000000}, each Text.From(_) & "." & "1"), 
    list2 = List.Transform({1..1000000}, each Text.From(_) & "." & "2"), 
    list3 = List.Transform({1..1000000}, each Text.From(_) & "." & "3"),
    Custom1 = Table.FromColumns(Table.ToColumns(Source) & {list1} & {list2} & {list3})
in
    Custom1

 

The "scalar" option on my laptop refreshes for about 3 sec, which is quick, but the Table.ToColumns refreshes in a blink of the eye, I can't even see the running dots to start counting. Also, IMHO, the table.tocolumns looks a bit neater and straightforward.

 

This sort of proves the point about the in-memory organisation of tables, which favours the Table.ToColumns version. But I can see that this may be case-dependent.

 

Cheers,

John

   


7 REPLIES
ImkeF
Super User

Hi @AlexisOlson ,
I must admit I haven't used this function much so far, but re-reading Chris' article about it, I definitely should change that.
As he pointed out, the performance benefits only apply to buffered inputs, so you might consider re-writing it like so, for example:

let
    Source = #table({"col1", "col2"}, {{1, 2}, {3, 4}}), 
    list1 = {5, 6}, 
    list2 = {7, 8}, 
    list3 = {9, 0}, 

// Buffer function input for performance reasons
    BufferedTable = List.Buffer(
        Table.ToRecords(#table({"col3", "col4", "col5"}, List.Zip({list1, list2, list3})))
    ), 
    AddColumn = Table.AddColumn(
        Source, 
        "NewCols", 
        each Function.ScalarVector(type function (x as record) as record, each BufferedTable)(_)
    ), 
    #"Expanded NewCols" = Table.ExpandRecordColumn(AddColumn, "NewCols", {"col3", "col4", "col5"})
in
    #"Expanded NewCols"


(very strange that it doesn't seem to work with a table input, though...)

Imke Feldmann (The BIccountant)



@wdx223_Daniel this is simply fascinating; I just got greedy.

Can we add multiple lists in one go, i.e. adding list1 and list2 in Custom1?

 

let
  Source = #table({"col1", "col2"}, {{1, 2}, {3, 4}}), 
  list1 = {3, 4}, 
  list2 = {5, 6}, 
  Custom1 = Table.AddColumn(
    Source, 
    "col3", 
    each Function.ScalarVector(type function (n as number) as number, each list2)(_)
  )
in
  Custom1

 

What I am asking is this: can we perform something similar to a JavaScript map?

 

const tbl = [{"col1":1,"col2":2},{"col1":3,"col2":4}];
const list1 = [5,6];
const list2 = [7,8];
tbl.forEach((a, i) => {
	a.col3 = list1[i];
	a.col4 = list2[i];
});
console.log(tbl);
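
For comparison, a rough M analogue of that map might look like this (my own sketch, not from the thread; the row-count loop and Record.Combine construction are assumptions about one way to do it):

let
    tbl = #table({"col1", "col2"}, {{1, 2}, {3, 4}}),
    list1 = {5, 6},
    list2 = {7, 8},
    // build each output row from the row at position i plus the i-th items of the lists
    rows = List.Transform(
        {0 .. Table.RowCount(tbl) - 1},
        (i) => Record.Combine({tbl{i}, [col3 = list1{i}, col4 = list2{i}]})
    ),
    result = Table.FromRecords(rows)
in
    result

Note this indexes per row, so it is likely slower on large tables than the buffered or column-wise approaches discussed elsewhere in this thread.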

 


This is the query plan that I see for @wdx223_Daniel's query:

[screenshot: query plan]

But PQ does not generate any query plan for the @jbwtp / @latimeria query, so I can't really assess its performance. I guess that query would have more overhead, though, because it needs to perform `Table.ToColumns(Source)`, which I would want to avoid since my dataset (multiple Excel files from SharePoint) has 700k+ rows. Thank you nevertheless.

 


 

 


latimeria
Solution Specialist

Hi @smpa01,

Try this:

let
    Source = #table({"col1", "col2"}, {{1,2},{3,4}}),
    TableToList = Table.ToColumns(Source),
    _list = {5,6},
    MergeLists = List.Combine({TableToList, {_list}}),
    Custom2 = Table.FromColumns(MergeLists)
in
    Custom2

Hi @smpa01,

 

I would change the last line to something like this:

Table.FromColumns(MergeLists, Value.Type(Table.AddColumn(Source, "Column3", each null, type number)))

to avoid re-setting all column types again.
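
Put together with latimeria's steps, that might look like this (a sketch; the typed Source literal and the "col3" column name are my adaptation, not from the posts above):

let
    Source = #table(type table [col1 = number, col2 = number], {{1, 2}, {3, 4}}),
    _list = {5, 6},
    MergeLists = List.Combine({Table.ToColumns(Source), {_list}}),
    // derive the target type from Source plus one typed null column,
    // so the existing column types survive the round trip through lists
    NewType = Value.Type(Table.AddColumn(Source, "col3", each null, type number)),
    Custom2 = Table.FromColumns(MergeLists, NewType)
in
    Custom2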

 

Cheers,

John
