Solved: Re: Split by delimiter and autofill to next row (w...

sofia · ‎11-11-2022

Hi,

I have an issue where my source data looks like this, i.e data is split into rows, but the references are not.

I would want the reference numbers to be split by delimiter and moved into the next row. Expected result:

Trans. Ref.	Book Ref.	Index
20546	13491	20
20547	13492	21

I I split by delimiter I get duplicate values and I can't figure out how to get rid of them to keep only the relevant rows.

Any help would be much appreciated.

Poohkrd · ‎11-11-2022

Hi! you want something like this?

let
    Source = #table(
        {"Trans. Ref", "Book. Ref", "Index"}, 
        {   {"20546, 20547", "13491, 13492", 20},
            {"20546, 20547", "13491, 13492", 21},
            {"20548, 20549, 20550", "13493, 13494, 13495", 22},
            {"20548, 20549, 20550", "13493, 13494, 13495", 23},
            {"20548, 20549, 20550", "13493, 13494, 13495", 24}
        }),
    Headers = List.Buffer( Table.ColumnNames(Source) ),
    Grouped = Table.Group(Source, {"Trans. Ref", "Book. Ref"}, {{"l", each Table.ToColumns( Table.RemoveColumns(_,{"Trans. Ref", "Book. Ref"}) ), type list}}),
    Split = Table.TransformColumns(Grouped,{{"Trans. Ref", Splitter.SplitTextByDelimiter(", "), type text}, {"Book. Ref", Splitter.SplitTextByDelimiter(", "), type text}}),
    MergedColumns = Table.CombineColumns( Split,{"Trans. Ref", "Book. Ref","l"}, (x)=>Table.FromColumns( {x{0}} & {x{1}} & x{2}, Headers ), "Out" ),
    Expanded = Table.ExpandTableColumn( MergedColumns, "Out", Headers )
in
    Expanded

View solution in original post

sofia · ‎11-11-2022

Thanks but not exactly. I basically just want unique values in book ref and trans ref. by transferring the values to the nex row. The source date can look like this

Pay. Date	Order Ref.	Trans. Ref.	Book. Ref	Pay. Ref.	Payments Value Total	Index
2022-08-06 14:33:32	3929	20530	13704	4050	1600	11
2022-08-06 14:29:21	3926	20525, 20526, 20527, 20528	13700, 13701, 13702, 13703	4046	1600	13
2022-08-06 14:29:22	3926	20525, 20526, 20527, 20528	13700, 13701, 13702, 13703	4047	800	14
2022-08-06 14:29:22	3926	20525, 20526, 20527, 20528	13700, 13701, 13702, 13703	4048	800	15
2022-08-06 14:29:22	3926	20525, 20526, 20527, 20528	13700, 13701, 13702, 13703	4049	800	16

And I would want it to look like this

Pay. Date	Order Ref.	Trans. Ref.	Book. Ref	Pay. Ref.	Payments Value Total	Index
2022-08-06 14:33:32	3929	20530	13704	4050	1600	11
2022-08-06 14:29:21	3926	20525	13700	4046	1600	13
2022-08-06 14:29:22	3926	20526	13701	4047	800	14
2022-08-06 14:29:22	3926	20527	13702	4048	800	15
2022-08-06 14:29:22	3926	20528	13703	4049	800	16

Poohkrd · ‎11-11-2022

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("tc1LCsMwDIThqxitHZBGsvy4SvD9r1HHKqQUumtWH7OZ/zwJDBzcDvYkNlSHgjJpR1+Ai/JStLItjcuezhuhmb8f0AckHjweUHK68KAG7X3LOV1IgEB3y/yjpT9a+FOrLlqk7OFUu1Pl4VS/U05zvgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Pay. Date" = _t, #"Order Ref." = _t, #"Trans. Ref." = _t, #"Book. Ref" = _t, #"Pay. Ref." = _t, #"Payments Value Total" = _t, Index = _t]),
    TR_BR = List.Buffer( {"Trans. Ref.", "Book. Ref"} ),
    Headers = List.Buffer( TR_BR & List.Select( Table.ColumnNames(Source), each not List.Contains(TR_BR, _) ) ),
    Grouped = Table.Group(Source, TR_BR, {{"l", each Table.ToColumns( Table.RemoveColumns(_,TR_BR) ), type table}}),
    Split = Table.TransformColumns(Grouped,{{"Trans. Ref.", Splitter.SplitTextByDelimiter(", "), type list}, {"Book. Ref", Splitter.SplitTextByDelimiter(", "), type list}}),
    MergedColumns = Table.CombineColumns( Split, TR_BR & {"l"}, (x)=>Table.FromColumns( {x{0}} & {x{1}} & x{2}, Headers ), "Out" ),
    Expanded = Table.ExpandTableColumn( MergedColumns, "Out", Headers )
in
    Expanded

ToddChitt · ‎11-11-2022

But 20546 is 'linked' to Index of 20, AND to Index of 21. So is 20547. And so are both 13491 and 13492.

In your new data, you will lose that.

How do you determine that only the FIRST instance (top row) of 20546 gets associated with Index of 20?

Did I answer your question? If so, mark my post as a solution. Also consider helping someone else in the forums!

Proud to be a Super User!

sofia · ‎11-11-2022

The index represent the row number from the source data (I added it) and I can see that the source data has splitted the payments into rows that can be related to products in another table (by book ref). The logic seem to be that the 1st row relates to the first book ref, the 2nd to the second row etc. The Book refs have not been splitted between the rows though.

Poohkrd · ‎11-11-2022

Hi! you want something like this?

let
    Source = #table(
        {"Trans. Ref", "Book. Ref", "Index"}, 
        {   {"20546, 20547", "13491, 13492", 20},
            {"20546, 20547", "13491, 13492", 21},
            {"20548, 20549, 20550", "13493, 13494, 13495", 22},
            {"20548, 20549, 20550", "13493, 13494, 13495", 23},
            {"20548, 20549, 20550", "13493, 13494, 13495", 24}
        }),
    Headers = List.Buffer( Table.ColumnNames(Source) ),
    Grouped = Table.Group(Source, {"Trans. Ref", "Book. Ref"}, {{"l", each Table.ToColumns( Table.RemoveColumns(_,{"Trans. Ref", "Book. Ref"}) ), type list}}),
    Split = Table.TransformColumns(Grouped,{{"Trans. Ref", Splitter.SplitTextByDelimiter(", "), type text}, {"Book. Ref", Splitter.SplitTextByDelimiter(", "), type text}}),
    MergedColumns = Table.CombineColumns( Split,{"Trans. Ref", "Book. Ref","l"}, (x)=>Table.FromColumns( {x{0}} & {x{1}} & x{2}, Headers ), "Out" ),
    Expanded = Table.ExpandTableColumn( MergedColumns, "Out", Headers )
in
    Expanded

AlexisOlson · ‎11-11-2022

@Poohkrd This is really nice! Here's my variation of your code in case you're interested:

let
    Source = #table(
        {"Trans. Ref.", "Book. Ref", "Index"}, 
        {   {"20546, 20547", "13491, 13492", 20},
            {"20546, 20547", "13491, 13492", 21},
            {"20548, 20549, 20550", "13493, 13494, 13495", 22},
            {"20548, 20549, 20550", "13493, 13494, 13495", 23},
            {"20548, 20549, 20550", "13493, 13494, 13495", 24}
        }),
    Headers = List.Buffer(Table.ColumnNames(Source)),
    TrBr = {"Trans. Ref.", "Book. Ref"},
    OtherCols = List.RemoveItems(Headers, TrBr),
    Grouped = Table.Group(Source, TrBr, {{"OtherCols", each Table.RemoveColumns(_, TrBr), type table}}),
    TransformList = List.Transform(TrBr, each {_, Splitter.SplitTextByDelimiter(", "), type text}),
    Split = Table.TransformColumns(Grouped, TransformList),
    MergedColumns = Table.TransformRows(Split, each Table.FromColumns({[#"Trans. Ref."], [#"Book. Ref"]} & Table.ToColumns([OtherCols]), TrBr & OtherCols)),
    Combined = Table.Combine(MergedColumns),
    #"Reordered Columns" = Table.ReorderColumns(Combined, Headers)
in
    #"Reordered Columns"

@sofia Please give full credit to Pookrd for this solution. I came up with a different solution on my own but it isn't as robust or generalizable as this one but I'm including it as an alternative approach that might be easier to follow:

let
    Source = #table(
        {"Trans. Ref", "Book. Ref", "Index"}, 
        {
            {"20546, 20547", "13491, 13492", 20},
            {"20546, 20547", "13491, 13492", 21},
            {"20548, 20549, 20550", "13493, 13494, 13495", 22},
            {"20548, 20549, 20550", "13493, 13494, 13495", 23},
            {"20548, 20549, 20550", "13493, 13494, 13495", 24}
        }),
    Headers = List.Buffer(Table.ColumnNames(Source)),
    NonIndexCols = List.RemoveItems(Headers, {"Index"}),
    Grouped = Table.Group(Source, NonIndexCols, {}),
    TransRef = List.Union(List.Transform(Grouped[Trans. Ref], each Text.Split(_, ", "))),
    BookRef = List.Union(List.Transform(Grouped[Book. Ref], each Text.Split(_, ", "))),
    ConstructTable = Table.FromColumns({TransRef, BookRef, Source[Index]}, Headers)
in
    ConstructTable

Poohkrd · ‎11-11-2022

@AlexisOlson Thanks for the compliment. I have already provided the correct code for the new source from @sofia. I didn't take into consideration the order of column names there.

Split by delimiter and autofill to next row (without duplicating)

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024