Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

v-tangjie-msft

Split a column and store values into corresponding columns with M functions

Label: M function
Written by Nono C.

 

Scenario:
In some cases, we may need to split string fields into columns based on specific conditions. The purpose of this article is to illustrate how this can be accomplished using M functions for a specific case.

 

Sample Data:
This is a table with only one text field column, separated by colons. For the sample data, think of it as key-value pairs, e.g. on the first row it has: key=1, value=788; key=5, value=237 and so on. The key needs to be used as the column names with prefix “Column”, and the value is filled into the column of its corresponding key. In this example, we want to split the column into ten columns as we may have ten keys in total, and the result should be from column 1 to column 10. If a missing key-value pair is encountered, the value should be filled with 0 in the corresponding key column.

 

The original table is like:

vtangjiemsft_0-1715051416741.png

The expected results are as follows:

vtangjiemsft_1-1715051429267.png

 


How:

M function is excellent in data cleaning and conversion. In the following we will try to use M functions and custom functions to achieve the above goals.

Solution 1:
This is the M code of the first solution.

 

 

 

 

let
    Source = …,
    Ad_ToTable = Table.AddColumn(
    	Source, "ToTable", each 
    	[ a = List.Split(Text.Split([Column1], ":"), 2),
    	b = #table(List.Transform({ 1..10 }, (x)=> "Column"
 & Text.From(x)), { List.Repeat({null}, 10) }),
    	c = Table.Skip(b & #table(List.Transform(a, (x)=> "Column" 
& x{0}), 
          	{ List.Transform(a, (x)=> x{1}) }))]
    	[c], type table),
    ToTable = Table.Combine(Ad_ToTable[ToTable]),
    ReplacedValue = Table.ReplaceValue(ToTable,null,"0",
Replacer.ReplaceValue, Table.ColumnNames(ToTable))
In
    ReplacedValue

 

 

 

 

 

Next, I will analyze this M functions:

1.List.Split(Text.Split([Column1], ":"), 2): First, it uses the Text.Split function to split the values in the [Column1] column based on ":". Then split the result list into multiple lists where every item list contains 2 values.

2.#table(List.Transform({ 1..10 }, (x)=> "Column" & Text.From(x)), { List.Repeat({null}, 10) }): Then create a new table containing 10 empty values, the structure of which is defined by the List.Transform function. This function will combine the "column" string and the numbers to act as the column names, for example, "column 1". Then use the List.Repeat function to repeat these strings 10 times in the table.

3.Table.Skip(b & #table(List.Transform(a, (x)=> "Column" & x{0}), { List.Transform(a, (x)=> x{1}) })): Create a new table with two columns. One column is the string "Column n" (n is a number from 1 to 10) and the other is the value of each key-value pair (Key : Value) in the original column. The new table is then concatenated with the table represented by the variable “b”.

4.Table.Combine(Ad_ToTable[ToTable]): Eventually, Ad_ToTable will add a new column named ToTable to the source data table Source with the values of the table generated by the above steps.

vtangjiemsft_2-1715051462445.png

5.Table.ReplaceValue(ToTable,null,"0",Replacer.ReplaceValue, Table.ColumnNames(ToTable)): Replaces all null values in the table represented by the variable ToTable with the string "0".

vtangjiemsft_3-1715051462446.png

 

Solution 2:
In the second approach, we use different M functions.

 

 

 

 

let
    Source = …,
    ToTable = List.TransformMany(
Table.ToRows(Source),
each {List.Split(Text.Split(_{0}, ":"), 2)},
        	(x,y)=>  #table(List.Transform(y, each "Column" & _{0}), 
{List.Transform(y, each _{1})}) ),
Result = [ a = #table(List.Transform({ 1..10 }, 
(x)=> "Column" & Text.From(x)),{ List.Repeat({null}, 10) }),
    	b = Table.Skip(a & Table.Combine(ToTable))
  	][b],
ReplacedValue = Table.ReplaceValue(Result,null,"0",
Replacer.ReplaceValue, Table.ColumnNames(Result))
in
    ReplacedValue

 

 

 

 

 

1.List.Split(Text.Split(_{0}, ":"), 2): Splits the first element of each row in the Source table into two parts according to the colon, and then uses these two parts as the column name and column value to create a new table.

2.(x,y)=>  #table(List.Transform(y, each "Column" & _{0}),{List.Transform(y,each_{1})}):

The purpose of this code is to split each element of a list y into two columns, the first with the column names and the second with the corresponding values, and then combine these column names and values into a new table.

 

vtangjiemsft_0-1715051817548.png

 

3.List.Transform({ 1..10 }, (x)=> "Column" & Text.From(x)), { List.Repeat({null}, 10): The effect of this code is to create a new table containing a column with the name "Column" and a column with 10 null values.

4.Table.Skip(a & Table.Combine(ToTable): The effect of this code is to copy all the rows from the ToTable table into the “a” table, but to keep only the rows in the ToTable table and ignore the rows in the “a” table.

 

vtangjiemsft_1-1715051837929.png

 

5.Table.ReplaceValue(Result,null,"0",Replacer.ReplaceValue,Table.ColumnNames(Result)):  Replaces all null values in the Result table with the string "0".

 

vtangjiemsft_2-1715051837930.png

 

Solution 3:
The third method is as follows.

 

 

 

 

let
    Source = …,
    Ad_ToTable = Table.AddColumn(Source, "ToTable", each 
[ a = List.Split(Text.Split([Column1], ":"), 2),
b = List.Accumulate({1..10}, #table({"Col0"}, {{null}}), (s,c)=> Table.AddColumn(s, "Column" & Text.From(c), (x)=> if Text.From(c) = List.Select(a, (x)=> x{0} = Text.From(c)){0}?{0}? then Number.From(List.Select(a, (x)=> x{0} = Text.From(c)){0}?{1}?) else 0, Int64.Type)),
          	c = Table.RemoveColumns(b, {"Col0"})
       	][c], type table),
    ToTable = Table.Combine(Ad_ToTable[ToTable])
in
    ToTable

 

 

 

 

1.List.Accumulate(…): Extend a table “b” containing a sequence of numbers into a new table, where each number corresponds to a new column with the name "Column n" (n is the number 1-10). If there is a key in the original table “a” that matches the column name, the value of the new column is converted from the value of the corresponding key in table “a”. Otherwise, the value of the new column is 0.

 

vtangjiemsft_0-1715052028643.png

 

2.Table.Combine(Ad_ToTable[ToTable]): Merge the sub-tables named ToTable from the table Ad_ToTable with the ToTable column added into a single table.

 

vtangjiemsft_1-1715052028646.png

 

Summary:
There are many solutions to this problem, but we prefer to recommend simple and easy-to-understand methods. On one hand, it can increase the readability of the code; on the other hand, it improves the feasibility of operation.

 

Author: Nono C.

Reviewer: Ula and Kerry