Solved: Unpivot (Append?) Repeating column formats

pistachio · ‎03-08-2020

I have a bulky (55mb) excel sheet of repeating attribute readings. There are 800k rows & about 25 columns with the repeating format of ID / Datetime / Value. For example:

| ID   | Datetime  |   Value | (blank) | ID_1 | Datetime_1 | Value_1 | (blank) ....
| 47   |   17/3    |   17.45 | (blank) |  455 |  18/4      |   2     | (blank) ....
| 47   |   21/3    |   12    | (blank) |  455 |  12/4      |   21    | (blank) ....

I would like to combine these into a single power query table of 3 columns and possibly a couple million rows

| ID   | Datetime  |   Value |

ImkeF · ‎03-09-2020

Hi @pistachio

sorry, just read your second comment previously. you can try the following technique, less "intelligent" action required:

let
    Source = Web.Page(Web.Contents("https://community.powerbi.com/t5/Desktop/Unpivot-Append-Repeating-column-formats/m-p/964070/highlight/false#M462025")),
    Data0 = Source{0}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Data0, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", Int64.Type}, {"Datetime", type datetime}, {"Value", Int64.Type}, {"", type text}, {"ID_1", Int64.Type}, {"Datetime_2", type datetime}, {"Value_3", type number}, {"_4", type text}, {"ID_5", Int64.Type}, {"Datetime_6", type datetime}, {"Value_7", type number}, {"_8", type text}, {"ID_9", Int64.Type}, {"Datetime_10", type datetime}, {"Value_11", type number}}),
    Custom1 = Table.ToColumns(#"Changed Type"),
    #"Converted to Table" = Table.FromList(Custom1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Added Index" = Table.AddIndexColumn(#"Converted to Table", "Index", 0, 1),
    #"Integer-Divided Column" = Table.TransformColumns(#"Added Index", {{"Index", each Number.IntegerDivide(_, 4), Int64.Type}}),
    #"Grouped Rows" = Table.Group(#"Integer-Divided Column", {"Index"}, {{"Partition", each Table.FromColumns(_[Column1]), type table [Column1=list, Index=number]}}, GroupKind.Local),
    Custom2 = Table.Combine(#"Grouped Rows"[Partition])
in
    Custom2

For performance it is crucial to use the "GroupKind.Local" in step "Grouped Rows"

Please let me know about the performance difference to the first Pivot-solution, thanks.

Please not that for this solution it is crucial that you always have the same number of columns per repetition!!

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

amitchandak · ‎03-08-2020

Refer, if these couple of link can help

https://www.powerquery.training/grouping-or-summarizing-data/

https://www.poweredsolutions.co/2019/07/30/grouping-rows-with-power-bi-power-query/

!! Power BI 101 Interview questions !! !! Master Microsoft Fabric- 36 Videos !!
Microsoft Power BI Learning Resources, 2023 !!
Learn Power BI - Full Course with Dec-2022, with Window, Index, Offset, 100+ Topics !!
Did I answer your question? Mark my post as a solution! Appreciate your Kudos !! Proud to be a Super User! !!

ImkeF · ‎03-09-2020

Hi @pistachio

the following steps should do the work:

1) Add an Index column to the table

2) Check ID-column & new Index-column -> Transform (or righ-mouse-click) -> Unpivot other columns

3) Split "Attribute"-column by underscore "_"

4) Pivot "back" on the first splitted column ("Attribute.1") with "Values.1" as the "Value"-field and select "Don't aggregate" in the advanced options

Paste the following code into the advanced editor and you can follow the steps:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjFX0lEyNNc3MAbTOiamQBrEtNA3AVJGSrE6UEVGhvpgNUZQBUb6RmBRpdhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Datetime = _t, Value = _t, Column1 = _t, Datetime_1 = _t, Value_1 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Datetime", type date}, {"Value", Int64.Type}, {"Column1", type text}, {"Datetime_1", type date}, {"Value_1", Int64.Type}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"ID", "Index"}, "Attribute", "Value.1"),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByDelimiter("_", QuoteStyle.Csv), {"Attribute.1", "Attribute.2"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Attribute.1", type text}, {"Attribute.2", Int64.Type}}),
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Changed Type1", {{"Attribute.1", type text}}, "en-GB"), List.Distinct(Table.TransformColumnTypes(#"Changed Type1", {{"Attribute.1", type text}}, "en-GB")[Attribute.1]), "Attribute.1", "Value.1"),
    #"Removed Other Columns" = Table.SelectColumns(#"Pivoted Column",{"ID", "Value", "Datetime"})
in
    #"Removed Other Columns"

Next time if you paste sample data, please use HTML-table like described here: https://community.powerbi.com/t5/Community-Blog/How-to-provide-sample-data-in-the-Power-BI-Forum/ba-...

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

pistachio · ‎03-09-2020

Hi Imke,

Thanks very much for your detailed response. I get confused at the unpivoting stage but I think this is the right track. When I pasted your code steps into power query they show the base data as only having one ID column. I've taken your advice and pasted more example data as HTML.

This is straight from the Excel, I assume Power Query puts the _2, _3 suffixes etc when I use first row as headers. In reality there are many more column groups

ID	Datetime	Value	ID	Datetime	Value	ID	Datetime	Value	ID	Datetime	Value
407	28/01/2020 14:36:27	0	7	28/01/2020 14:36:40	2.83	68	28/01/2020 14:36:46	0.26	57	28/01/2020 14:36:38	0.72
407	28/01/2020 14:37:27	0	7	28/01/2020 14:37:40	2.82	68	28/01/2020 14:37:46	0.25	57	28/01/2020 14:37:38	0.72
407	28/01/2020 14:38:27	0	7	28/01/2020 14:38:40	2.83	68	28/01/2020 14:38:46	0.26	57	28/01/2020 14:37:56	0.66
407	28/01/2020 14:39:27	0	7	28/01/2020 14:39:40	2.77	68	28/01/2020 14:39:46	0.25	57	28/01/2020 14:38:58	0.68
407	28/01/2020 14:40:27	0	7	28/01/2020 14:40:18	2.81	68	28/01/2020 14:40:46	0.26	57	28/01/2020 14:39:58	0.7

Can I convert this to a single 3 column table?

Greg_Deckler · ‎03-08-2020

So, there are probably easier ways to do this and @ImkeF probably has the solution. But, worst case, create a query from the Excel file and remove all but the first 3 columns. Create a second query, choose the next 3 columns (ID_1, Datetime_1, Value_1) and remove all the other columns. Rename the columns to ID, Datetime, Value. Rinse and repeat the same steps for the second query for a third, fourth, fifth query, etc. Then, use an Append query to append them all together.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

pistachio · ‎03-09-2020

Yep cheers I've done this but each query seems like it has to reload the 55mb workbook so it takes a long time to get all together. Wondering if there was a better way

ImkeF · ‎03-09-2020

Hi @pistachio

sorry, just read your second comment previously. you can try the following technique, less "intelligent" action required:

let
    Source = Web.Page(Web.Contents("https://community.powerbi.com/t5/Desktop/Unpivot-Append-Repeating-column-formats/m-p/964070/highlight/false#M462025")),
    Data0 = Source{0}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Data0, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", Int64.Type}, {"Datetime", type datetime}, {"Value", Int64.Type}, {"", type text}, {"ID_1", Int64.Type}, {"Datetime_2", type datetime}, {"Value_3", type number}, {"_4", type text}, {"ID_5", Int64.Type}, {"Datetime_6", type datetime}, {"Value_7", type number}, {"_8", type text}, {"ID_9", Int64.Type}, {"Datetime_10", type datetime}, {"Value_11", type number}}),
    Custom1 = Table.ToColumns(#"Changed Type"),
    #"Converted to Table" = Table.FromList(Custom1, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Added Index" = Table.AddIndexColumn(#"Converted to Table", "Index", 0, 1),
    #"Integer-Divided Column" = Table.TransformColumns(#"Added Index", {{"Index", each Number.IntegerDivide(_, 4), Int64.Type}}),
    #"Grouped Rows" = Table.Group(#"Integer-Divided Column", {"Index"}, {{"Partition", each Table.FromColumns(_[Column1]), type table [Column1=list, Index=number]}}, GroupKind.Local),
    Custom2 = Table.Combine(#"Grouped Rows"[Partition])
in
    Custom2

For performance it is crucial to use the "GroupKind.Local" in step "Grouped Rows"

Please let me know about the performance difference to the first Pivot-solution, thanks.

Please not that for this solution it is crucial that you always have the same number of columns per repetition!!

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

pistachio · ‎03-09-2020

That works great! I struggle to understand the steps but was able to replicate it across all of my data thank you.

The performance was about twice as fast as the original method (make individual query for each row grouping, append into one query). Interestingly, the size of data pulled in is still much larger than the Excel sheet i.e. the excel sheet is 55mb, but when refreshing the .pbix file the progress bar loads upwards of 200mb from that query.

Still, very pleased thank you

Anonymous · ‎03-09-2020

This is really great solution!

Unpivot (Append?) Repeating column formats

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024

How to Get Your Question Answered Quickly