PhilC
Resolver I

Split varying number of columns into 4 columns each

Hi, 

I need to split a dynamic number of columns, each into 4 new columns with specific names (Surname, Initial, EmployeeID, Other).

 

A link to an Excel file showing the source data and the required output is below.  There is a query that does the initial transformation to produce the dynamic columns.  I just need to know how to then split all those columns, irrespective of how many there are.

 

Sample file with Source data and required output 

 

Thanks in advance.

Phil

 

1 ACCEPTED SOLUTION

Hi @PhilC 

Excellent, that counts as an excuse 😉 (Otherwise: Everything that @Greg_Deckler  said...)

Here we go:

let
    // Sample data pasted via "Enter Data" (compressed JSON); replace with your own source
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VY/NDoIwEIRfxfTMQYq/R3/DAQ0REw+EQ4MrJRRICvp03nwxd5eAcuh0kq+704lj4QpHHKrMgG3RHfG40kPdiMTpqJ9nWYN3yGw2YgG88pYGTwznqFuGEt1N2UaD4QGiC9QBRqmGolQV2gvTJeqeMcVHn/dd1w90V6arIZZoCC3Yia+qBmhBwE/WqLvfAp0bLNWv96Z9OJfD4JqqMXL/0VkVqnxaRSuYytG/fDBmQF5XN/kC", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Surname = _t, Initial = _t, EmployeeID = _t, Other = _t]),
    // Nest all author rows of each publication into "IDTable" and count them
    #"Grouped Rows" = Table.Group(Source, {"ID"}, {{"IDTable", each Table.RemoveColumns(_, {"ID"})}, {"Count", each Table.RowCount(_), type number}}),
    // Generate numbered column names per author: Surname1, Initial1, ..., Surname2, ...
    AddColumnNames = Table.AddColumn(#"Grouped Rows", "NewColumnNames", each List.Combine( List.Transform({1..[Count]}, (x) => List.Transform(Table.ColumnNames([IDTable]), (t) => t & Text.From(x))))),
    // Flatten each nested table into a single wide row using those names
    AddNewTable = Table.AddColumn(AddColumnNames, "NewTable", each Table.FromRows({List.Combine(Table.ToRows([IDTable]))}, [NewColumnNames])),
    // Expand with the union of all generated names (covers the row with the most authors)
    #"Expanded NewTable" = Table.ExpandTableColumn(AddNewTable, "NewTable", List.Distinct(List.Combine(AddNewTable[NewColumnNames]))),
    // Drop the helper columns
    CleanUp = Table.RemoveColumns(#"Expanded NewTable",{"IDTable", "Count", "NewColumnNames"})
in
    CleanUp

You were on the right path with the grouping.
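For anyone who wants to follow the reshaping logic outside of Power Query: the query groups the source rows by ID, then flattens each group's author fields into numbered columns (Surname1, Initial1, ..., Surname2, ...). A rough Python sketch of the same idea, with made-up sample rows rather than the actual file:

```python
from itertools import groupby

# One row per (publication, author) pair -- same shape as the sample data.
rows = [
    {"ID": "P1", "Surname": "Smith", "Initial": "J", "EmployeeID": "100", "Other": ""},
    {"ID": "P1", "Surname": "Jones", "Initial": "A", "EmployeeID": "101", "Other": ""},
    {"ID": "P2", "Surname": "Brown", "Initial": "K", "EmployeeID": "102", "Other": ""},
]

FIELDS = ["Surname", "Initial", "EmployeeID", "Other"]

def widen(rows):
    """Group by ID, then emit one record per ID with numbered author columns."""
    out = []
    # Like the M query's Table.Group; groupby assumes rows are sorted by ID.
    for pub_id, group in groupby(rows, key=lambda r: r["ID"]):
        record = {"ID": pub_id}
        for i, author in enumerate(group, start=1):
            for f in FIELDS:
                record[f"{f}{i}"] = author[f]  # e.g. Surname1, Initial1, ...
        out.append(record)
    return out

wide = widen(rows)
```

A publication with fewer authors simply ends up with fewer numbered columns; in the M query the expand step pads the missing ones with nulls.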

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

8 REPLIES
Greg_Deckler
Super User

In general, it is preferable to work with the data as you have it in your source data, so I would be curious why you want it in your desired output format. You will definitely need to do a Group By operation. After that you will have to get some help from someone like @ImkeF , the Power Query guru extraordinaire. 



Thanks Greg,

 

It is indeed an odd layout.  The data is for upload into an academic software package. 

 

Each row represents a publication.

There may be any number of authors (hence the dynamic number of columns).

For each author, there are 4 columns to be uploaded: Surname, Initial, ID, Other (not sure what this last one is yet, just know it is needed).

 

Yes, I was hoping @ImkeF or @MarcelBeug might take an interest 🙂

 


Thanks very much for the incredibly quick response @ImkeF.

 

It is working as hoped with the small dataset; I will test it in a couple of days with the full one.

 

Cheers

Phil

You're welcome @PhilC ,

if your data is already sorted by ID, you can speed up the processing considerably by using GroupKind.Local, as described here: https://blog.crossjoin.co.uk/2014/01/03/aggregating-by-local-groups-in-power-query/ 
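The intuition behind the speedup: with the default global grouping, Table.Group must scan the whole table to collect every group, while GroupKind.Local closes a group as soon as the key changes, which becomes a single streaming pass when the data is already sorted by ID. As an analogy (not M), Python's itertools.groupby behaves like the local variant:

```python
from itertools import groupby

# Already sorted by ID: each group is emitted as soon as the key changes,
# so the input is consumed in one pass without buffering other groups.
rows = [("P1", "Smith"), ("P1", "Jones"), ("P2", "Brown")]
local_groups = [(key, [name for _, name in grp])
                for key, grp in groupby(rows, key=lambda r: r[0])]

# The caveat that also applies to GroupKind.Local: if the data is NOT
# sorted, equal keys that are not adjacent land in separate groups.
unsorted = [("P1", "Smith"), ("P2", "Brown"), ("P1", "Jones")]
split_keys = [key for key, _ in groupby(unsorted, key=lambda r: r[0])]
```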

 

Imke Feldmann (The BIccountant)


Hi @ImkeF ,

 

I have been using the above solution for a while, and while it is slow, it has been manageable.  I push this step out to a table so that downstream queries are quicker.

 

I now have to add a lot more columns per author (a total of 15), for around 2500 publications and up to 41 authors (this is the maximum; most have under 5 authors).

 

It is now extremely slow (not 100% sure how long, as I ran it and went to bed, but over 30 mins and counting at the moment), so I was wondering if your suggestion re Chris's approach might help.

 

Trouble is, your code is above my understanding, and I cannot work out how to incorporate the GroupKind.Local approach into your solution.

 

Any assistance would be very much appreciated.

 

Cheers, Phil

Hi @PhilC ,
please try this:

 

 

let
    Source = Table.FromRows(
        Json.Document(
            Binary.Decompress(
                Binary.FromText(
                    "VY/NDoIwEIRfxfTMQYq/R3/DAQ0REw+EQ4MrJRRICvp03nwxd5eAcuh0kq+704lj4QpHHKrMgG3RHfG40kPdiMTpqJ9nWYN3yGw2YgG88pYGTwznqFuGEt1N2UaD4QGiC9QBRqmGolQV2gvTJeqeMcVHn/dd1w90V6arIZZoCC3Yia+qBmhBwE/WqLvfAp0bLNWv96Z9OJfD4JqqMXL/0VkVqnxaRSuYytG/fDBmQF5XN/kC",
                    BinaryEncoding.Base64
                ),
                Compression.Deflate
            )
        ),
        let
            _t = ((type text) meta [Serialized.Text = true])
        in
            type table [ID = _t, Surname = _t, Initial = _t, EmployeeID = _t, Other = _t]
    ),
    // Build the complete wide row (ID + numbered author columns) directly
    // inside the aggregation, avoiding the separate expand step
    Custom2 = Table.Group(
        Source,
        {"ID"},
        {
            {
                "IDTable",
                each Table.FromRows(
                    {{[ID]{0}} & List.Combine(Table.ToRows(Table.RemoveColumns(_, {"ID"})))},
                    {"ID"} & List.Combine(
                        List.Transform(
                            {1 .. Table.RowCount(_)},
                            (x) => {
                                "Surname" & Text.From(x),
                                "Initial" & Text.From(x),
                                "EmployeeID" & Text.From(x),
                                "Other" & Text.From(x)
                            }
                        )
                    )
                )
            }
        }
    ),
    Custom3 = Table.Combine(Custom2[IDTable])
in
    Custom3

 

 

or, if the sorting is correct already, this might be even faster:

 

let
    Source = Table.FromRows(
        Json.Document(
            Binary.Decompress(
                Binary.FromText(
                    "VY/NDoIwEIRfxfTMQYq/R3/DAQ0REw+EQ4MrJRRICvp03nwxd5eAcuh0kq+704lj4QpHHKrMgG3RHfG40kPdiMTpqJ9nWYN3yGw2YgG88pYGTwznqFuGEt1N2UaD4QGiC9QBRqmGolQV2gvTJeqeMcVHn/dd1w90V6arIZZoCC3Yia+qBmhBwE/WqLvfAp0bLNWv96Z9OJfD4JqqMXL/0VkVqnxaRSuYytG/fDBmQF5XN/kC",
                    BinaryEncoding.Base64
                ),
                Compression.Deflate
            )
        ),
        let
            _t = ((type text) meta [Serialized.Text = true])
        in
            type table [ID = _t, Surname = _t, Initial = _t, EmployeeID = _t, Other = _t]
    ),
    Custom2 = Table.Group(
        Source,
        {"ID"},
        {
            {
                "IDTable",
                each Table.FromRows(
                    {{[ID]{0}} & List.Combine(Table.ToRows(Table.RemoveColumns(_, {"ID"})))},
                    {"ID"} & List.Combine(
                        List.Transform(
                            {1 .. Table.RowCount(_)},
                            (x) => {
                                "Surname" & Text.From(x),
                                "Initial" & Text.From(x),
                                "EmployeeID" & Text.From(x),
                                "Other" & Text.From(x)
                            }
                        )
                    )
                )
            }
        }
        // GroupKind.Local: a group closes as soon as the ID changes, so the data must already be sorted by ID
        , GroupKind.Local
    ),
    Custom3 = Table.Combine(Custom2[IDTable])
in
    Custom3

 

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi @ImkeF , thanks for having a look at that.  I tried both options and the speed has not noticeably changed.  I have 12 columns now.

 

Out of interest, is the only real difference from the original solution effectively hard-coding the column names?

 

Please do not spend any more time on this, your initial solution is operational and I can manage the timing so I do not have to run it urgently.

 

Appreciate all your time and assistance.

 

Cheers, Phil
