Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Merge Multiple Sets of Columns dynamically to Stack them one below the other - Power Query

I have a single csv dataset that looks like below. I have multiple csv files like this which i want to combine once i figure out the correct transformation for a single file. Here is the Sample Excel File containing a single csv data : 

sifar786_0-1604507199608.png

into this expected Output :

Capture.JPG

The Input data has broadly 4 sections : 

1] The 1st 3 columns are used to create `Questions`, `Answers` & `Label` columns.

2] a section of Percentage columns 

3] a section of Stat testing columns

4] a section of 1 columns

 

I want to `Merge` each sets of columns `R1...R10` and then `Unpivot` (`Stack`) them one above the other as shown in the output. The `R1...R10` columns count may vary each time, but they will be the same for each section i.e. for `Percentage columns`, `Stat testing columns` & `1 columns`.

While googling, i am came across an Imke Feldmann post on Merging and unpivoting columns dynamically using a custom function. However, i am unable to select and merge the different sections of columns dynamically. Also, i found Bill Szysz's post which is somewhat similar and trying to figure out a way to modify his 2nd solution i.e. Combine Method on a similar post. Here is Bill Szysz's Combine Method solution.

 

let
    Source = Table.PromoteHeaders(Excel.CurrentWorkbook(){[Name="UglyData"]}[Content], [PromoteAllScalars=true]),
    Lists = Table.FromColumns({List.Transform({0..((List.Count(Table.ColumnNames(Source))-6)/5)-1}, each List.Range(Table.ColumnNames(Source), _*5+6, 5)) }),
    AlmostReady = Table.Combine(Table.AddColumn(Lists, "Columns", each Table.FromColumns({Table.Column(Source, Table.ColumnNames(Source){0})} & {List.Repeat({[Column1]{0}}, Table.RowCount(Source))} & List.Transform(List.Skip([Column1], 1), each Table.Column(Source, _)), {Table.ColumnNames(Source){0}, "Name"} & List.Transform(List.Skip([Column1], 1), each Text.BeforeDelimiter(_, "_")) )   )[Columns]),
    Ready = Table.AddColumn(AlmostReady, "Total", each List.Sum(List.Skip(Record.ToList(_), 2)))
in
    Ready

 

The problem is, Bill has used fixed column and row counts ( 5 & 6) whereas my columns and rows count could change. for e.g. today there are 10 sets of columns per section, tomorrow there may be 5 or 20. But the columns count will always be the same across each section i.e. for Percentage section, Stats section and Number section. Also, the sequence of the Columns may change but maintain the same sequence across the sections:

e.g:

Percentage section   |     | Stats section             |     | Numbers section      |

R1 | R3 | R2 | R5 | R4 |     | R1 | R3 | R2 | R5 | R4 |     | R1 | R3 | R2 | R5 | R4 |

This data is tricky and beyond my amateur skills. Is this possible to do?

@ImkeF@AlB

1 ACCEPTED SOLUTION

according to discussion, final code is:

let
    Source = Table.PromoteHeaders(Csv.Document(File.Contents("C:\Users\Clark\Desktop\SampleCSV.csv"),[Delimiter=",", Columns=36, Encoding=1252, QuoteStyle=QuoteStyle.None])),
    title = Table.ColumnNames(Source),
    count = List.Count(List.Select(title,each Text.StartsWith(_,title{5}))),
    num = (List.Count(title)-3)/count,
    Group = Table.SelectRows(
                Table.Combine(
                    Table.Group(
                                Source,
                                "Que",
                                {"n",each
                                        let
                                            header={"Que","Question","Answer","Label","Agg"}&List.Range(title,4,num-1),
                                            a=Table.ToRows(_),
                                            b=List.FirstN(a{0},2)&{a{1}{1}},
                                            c=List.TransformMany(
                                                    List.Skip(a,2),
                                                    each List.Split(List.Skip(_,3),num),
                                                    (m,n)=>b&List.Range(m,1,2)&List.Skip(n)) in #table(header,c)
                                },
                                0,
                                (x,y)=>Byte.From(Text.Length(y)>0)
                                )[n]),
                            each  Text.Length([Label])>0
                            )
in
    Group

View solution in original post

13 REPLIES 13
shaowu459
Resolver II
Resolver II

Morning, please try and check this one.

let
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    Group = Table.SelectRows(Table.Combine(Table.Group(Source,"Que",{"n",each let header={"Que","Question","Answer","Label","Agg"}&List.Range(Table.ColumnNames(Source),4,10),a=Table.ToRows(_), b=List.FirstN(a{0},2)&{a{1}{1}}, c=List.TransformMany(List.Skip(a,2),each List.Split(List.Skip(_,3),11),(m,n)=>b&List.Range(m,1,2)&List.Skip(n)) in #table(header,c)},0,(x,y)=>Byte.From(y<>null))[n]),each [Label]<>null)
in
    Group

1.png

Anonymous
Not applicable

@shaowu459 Thanks for your reply.

 

I am getting this error:

sifar786_0-1604733544640.png

i then tried promoting the headers before the grouping, but it showed all rows empty, filtered on Label.

 

This is some brilliant piece of code! However, as i try to follow along, could you confirm, if you are hardcoding the 1st set of columns? e.g. here :

 

.... & List.Range(Table.ColumnNames(Source),4,10), ....

 

 

The first part of column names is hard coded {"Que","Question","Answer","Label","Agg"}, according to your description, i think this part will not change.

Anonymous
Not applicable

When i import the csv file, the Source looks like this :

sifar786_0-1604740417812.png

 

 

How about change Source in my code to Table.PromoteHeaders(Source)?

Group = Table.SelectRows(Table.Combine(Table.Group(Table.PromoteHeaders(Source),"Que",{"n",each let header={"Que","Question","Answer","Label","Agg"}&List.Range(Table.ColumnNames(Table.PromoteHeaders(Source)),4,10),a=Table.ToRows(_), b=List.FirstN(a{0},2)&{a{1}{1}}, c=List.TransformMany(List.Skip(a,2),each List.Split(List.Skip(_,3),11),(m,n)=>b&List.Range(m,1,2)&List.Skip(n)) in #table(header,c)},0,(x,y)=>Byte.From(y<>null))[n]),each [Label]<>null)
Anonymous
Not applicable

This is how i get it (same as before) after making your suggested change :

sifar786_0-1604743517323.png

 

 

according to discussion, final code is:

let
    Source = Table.PromoteHeaders(Csv.Document(File.Contents("C:\Users\Clark\Desktop\SampleCSV.csv"),[Delimiter=",", Columns=36, Encoding=1252, QuoteStyle=QuoteStyle.None])),
    title = Table.ColumnNames(Source),
    count = List.Count(List.Select(title,each Text.StartsWith(_,title{5}))),
    num = (List.Count(title)-3)/count,
    Group = Table.SelectRows(
                Table.Combine(
                    Table.Group(
                                Source,
                                "Que",
                                {"n",each
                                        let
                                            header={"Que","Question","Answer","Label","Agg"}&List.Range(title,4,num-1),
                                            a=Table.ToRows(_),
                                            b=List.FirstN(a{0},2)&{a{1}{1}},
                                            c=List.TransformMany(
                                                    List.Skip(a,2),
                                                    each List.Split(List.Skip(_,3),num),
                                                    (m,n)=>b&List.Range(m,1,2)&List.Skip(n)) in #table(header,c)
                                },
                                0,
                                (x,y)=>Byte.From(Text.Length(y)>0)
                                )[n]),
                            each  Text.Length([Label])>0
                            )
in
    Group
Anonymous
Not applicable

Genius! 🙂

 

Very clean and elegant code. Thank you very much.

I grouped table by column "Que", the error message you get says there is no column named "Que", please check your database, is the first column named "Que"? 

AlB
Super User
Super User

Hi @Anonymous 

If I understand correctly, you already have a solution that works only partially, with some issues. If so, can you post it here so that we can take it as base and try to fix/improve it?

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

SU18_powerbi_badge

Anonymous
Not applicable

@AlB, i dont have a partially created solution though i was trying to use and modify @ImkeF custom function and Bill Szysz's 2nd solution i.e. `Combine Method` on a similar post. Here is Bill Szysz solution.

 

let
    Source = Table.PromoteHeaders(Excel.CurrentWorkbook(){[Name="UglyData"]}[Content], [PromoteAllScalars=true]),
    Lists = Table.FromColumns({List.Transform({0..((List.Count(Table.ColumnNames(Source))-6)/5)-1}, each List.Range(Table.ColumnNames(Source), _*5+6, 5)) }),
    AlmostReady = Table.Combine(Table.AddColumn(Lists, "Columns", each Table.FromColumns({Table.Column(Source, Table.ColumnNames(Source){0})} & {List.Repeat({[Column1]{0}}, Table.RowCount(Source))} & List.Transform(List.Skip([Column1], 1), each Table.Column(Source, _)), {Table.ColumnNames(Source){0}, "Name"} & List.Transform(List.Skip([Column1], 1), each Text.BeforeDelimiter(_, "_")) )   )[Columns]),
    Ready = Table.AddColumn(AlmostReady, "Total", each List.Sum(List.Skip(Record.ToList(_), 2)))
in
    Ready

 

The problem is, he uses fixed column and row counts (5 & 6) whereas my columns and rows count could change. for e.g. today there are 10 sets of columns per section, tomorrow there may be 5 or 20. But the columns count will always be the same for each Percentage section, Stats section and Number section. Also, the sequence of the Columns may change but will be same across the sections:

e.g:

Percentage section   |     | Stats section             |     | Numbers section      |

R1 | R3 | R2 | R5 | R4 |     | R1 | R3 | R2 | R5 | R4 |     | R1 | R3 | R2 | R5 | R4 |

 

Hi @Anonymous ,

you can make it dynamic with this solution:

 

let

fnStackDynamic = 
    (Partition as table, NoOfColumns as number) =>
let
    #"Unpivoted Other Columns1" = Table.UnpivotOtherColumns(Partition, {"Que", "Que Ans Lbl", "Agg"}, "Attribute", "Value"),
    #"Filtered Rows2" = Table.SelectRows(#"Unpivoted Other Columns1", each not Text.StartsWith([Attribute], "Column")),
    #"Added Index" = Table.AddIndexColumn(#"Filtered Rows2", "Index2", 0, 1, Int64.Type),
    #"Inserted Integer-Division" = Table.AddColumn(#"Added Index", "RowIndex", each Number.IntegerDivide([Index2], (NoOfColumns-6)/3), Int64.Type),
    #"Extracted Text Before Delimiter" = Table.TransformColumns(#"Inserted Integer-Division", {{"Attribute", each Text.BeforeDelimiter(_, "_"), type text}}),
    #"Removed Columns" = Table.RemoveColumns(#"Extracted Text Before Delimiter",{"Index2"}),
    #"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "Value")
in
    #"Pivoted Column",

    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    #"Filled Down" = Table.FillDown(Source,{"Column1"}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Filled Down", [PromoteAllScalars=true]),
    ColumnNames = Table.ColumnNames(#"Promoted Headers"),
    NoOfColumns = List.Count(ColumnNames),
    #"Replaced Value" = Table.ReplaceValue(#"Promoted Headers", null,"",Replacer.ReplaceValue, ColumnNames),
    #"Filtered Rows1" = Table.SelectRows(#"Replaced Value", each Text.StartsWith([Que Ans Lbl], "Label")),
    #"Grouped Rows" = Table.Group(#"Filtered Rows1", {"Que"}, {{"Partition", each _}}, GroupKind.Local),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each fnStackDynamic([Partition], NoOfColumns)),
    ConvertedRows = Table.RemoveColumns(#"Added Custom",{"Partition"}),
    Custom1 = #"Promoted Headers",
    #"Grouped Rows1" = Table.Group(Custom1, {"Que"}, {{"First2Rows", each [Question=[Que Ans Lbl]{0}, Answer=[Que Ans Lbl]{1}]}}),
    #"Expanded First2Rows1" = Table.ExpandRecordColumn(#"Grouped Rows1", "First2Rows", {"Question", "Answer"}, {"Question", "Answer"}),
    #"Merged Queries" = Table.NestedJoin(#"Expanded First2Rows1", {"Que"}, ConvertedRows, {"Que"}, "Custom", JoinKind.LeftOuter),
    #"Expanded Custom1" = Table.ExpandTableColumn(#"Merged Queries", "Custom", {"Custom"}, {"Custom"}),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Expanded Custom1", "Custom", List.Range(ColumnNames, 1,2) & List.Range(List.RemoveFirstN(ColumnNames,3), 1, (NoOfColumns -3) / 3-1 ))
in
    #"Expanded Custom"

Please also see file attached.

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Anonymous
Not applicable

Hi Imke,

Thanks for your reply.

 

I am getting following error:

sifar786_0-1604682679298.png

It seems to come from this step where the custom function gets applied:

sifar786_1-1604682883855.png

 

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors