cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Steveyd78
Frequent Visitor

Pivot columns that have repeated row values - other solutions don't seem to work with my data

This might end up being a duplicate post, I previously hit submit but it the system threw me out (apoloiges if it is)

 

So in a nutshell I have data from a Drupal webform dump which I am having to split into different tables and then pivot so that I can format the columns properly (age as a number, address as text etc). The (simplified) raw data looks something like this:

 

User IDFieldValuePage ID
1Address123 ABC StUser_info
1Phone555-5555555User_info
1Age55User_info
2Address456 DEF StUser_info
2Phone123-4567890User_info
2Age33User_info

 

I need it to look like this:

 

User IDAddressPhoneAgePage ID
1123 ABC St555-555555555User_info
2456 DEF St123-456789033User_info

 

I have read quite a few solutions that need to add indexes, group, add sub-indexes etc. I can follow a fair amount and the closest I get (with no cigar) is something like this:

 

User IDAddressPhoneAgePage ID
1123 ABC St   
  555-5555555  
   55 
    User_info
2you get it!   

 

I fundamentally don't quite understand what I need to do, can somebody point me in the right direction!

 

Thanks in advance

 

Steve

1 ACCEPTED SOLUTION
AlB
Super User III
Super User III

@Steveyd78 

The problem is repeated fields in your data. For instance, for sid 6 there are several repeated values for name (see below). That causes the error when pivoting. In the data you've shared, it seems like all those repetitions have the same value. For instance, below for sid 6, all rows with name = contact_registered_before have value = Yes

all rows with name = contact_registered_before_with have value = Submitted to the OFHA

If that is always the case, simply eliminating duplicates before the pivoting will solve it. Otherwise, we will first need to determine how to treat those cases. For example, if for sid 6 you had two rows with

name = contact_registered_before but one with value = Yes and one with value = No, which one will you choose.  The M code for the previous example adding the step to remove the duplicates:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXJMSSlKLS4GsgyNjBUcnZwVgkuAnNDi1KL4zLy0fKVYHYjCgIz8vFQgbWpqqmsKAVjVOaZDVGFIGqHYZmJqpuDi6obNNiMk24CO0gUqNbewNMBuINg2Y2NUyVgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"User ID" = _t, Field = _t, Value = _t, #"Page ID" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"User ID", Int64.Type}, {"Field", type text}, {"Value", type text}, {"Page ID", type text}}),

    #"Removed Duplicates" = Table.Distinct(#"Changed Type"),
    #"Pivoted Column" = Table.Pivot(#"Removed Duplicates", List.Distinct(#"Removed Duplicates"[Field]), "Field", "Value"),
    #"Reordered Columns" = Table.ReorderColumns(#"Pivoted Column",{"User ID", "Address", "Phone", "Age", "Page ID"})
in
    #"Reordered Columns"

 

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

SU18_powerbi_badge

 

sid name value
6 contact_registered_before Yes
6 contact_registered_before Yes
6 contact_registered_before Yes
6 contact_registered_before_with Submitted to the OFHA
6 contact_registered_before_with Submitted to the OFHA
6 contact_registered_before_with Submitted to the OFHA
6 contact_registered_other_groups Yes
6 contact_registered_other_groups Yes
6 contact_registered_other_groups Yes
6 country Canada
6 provide_details Fugiat voluptatem ve
6 provide_details Fugiat voluptatem ve
6 provide_details_other_groups Deserunt qui nisi no
6 provide_details_other_groups Deserunt qui nisi no
6 province_canada Northwest Territories
6 submitting_behalf Yes

 

 

View solution in original post

10 REPLIES 10
Rocco_sprmnt21
Super User II
Super User II

let
    Origine = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXJMSSlKLS4GsgyNjBUcnZwVgkuAnNDi1KL4zLy0fKVYHYjCgIz8vFQgbWpqqmsKAVjVOaZDVGFIGqHYZmJqpuDi6obNNiMk24CO0gUqNbewNMBuINg2Y2NUyVgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"User ID" = _t, Field = _t, Value = _t, #"Page ID" = _t]),
    #"Modificato tipo" = Table.TransformColumnTypes(Origine,{{"User ID", Int64.Type}, {"Field", type text}, {"Value", type text}, {"Page ID", type text}}),
    #"Raggruppate righe" = Table.Group(#"Modificato tipo", {"User ID", "Page ID"}, {{"all", each Table.PromoteHeaders(Table.Transpose(_[[Field], [Value]]))}}),
    #"Tabella all espansa" = Table.ExpandTableColumn(#"Raggruppate righe", "all", {"Address", "Phone", "Age"}, {"Address", "Phone", "Age"})
in
    #"Tabella all espansa"
Rocco_sprmnt21
Super User II
Super User II

it is not clear to me if you have already had a complete solution or not ... in any case try this

let
    Origine = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXJMSSlKLS4GsgyNjBUcnZwVgkuAnNDi1KL4zLy0fKVYHYjCgIz8vFQgbWpqqmsKAVjVOaZDVGFIGqHYZmJqpuDi6obNNiMk24CO0gUqNbewNMBuINg2Y2NUyVgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"User ID" = _t, Field = _t, Value = _t, #"Page ID" = _t]),
    #"Modificato tipo" = Table.TransformColumnTypes(Origine,{{"User ID", Int64.Type}, {"Field", type text}, {"Value", type text}, {"Page ID", type text}}),
    #"Colonna trasformata tramite Pivot" = Table.Pivot(#"Modificato tipo", List.Distinct(#"Modificato tipo"[Field]), "Field", "Value", List.First)
    in #"Colonna trasformata tramite Pivot"

 

AlB
Super User III
Super User III

@Steveyd78 

The problem is repeated fields in your data. For instance, for sid 6 there are several repeated values for name (see below). That causes the error when pivoting. In the data you've shared, it seems like all those repetitions have the same value. For instance, below for sid 6, all rows with name = contact_registered_before have value = Yes

all rows with name = contact_registered_before_with have value = Submitted to the OFHA

If that is always the case, simply eliminating duplicates before the pivoting will solve it. Otherwise, we will first need to determine how to treat those cases. For example, if for sid 6 you had two rows with

name = contact_registered_before but one with value = Yes and one with value = No, which one will you choose.  The M code for the previous example adding the step to remove the duplicates:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXJMSSlKLS4GsgyNjBUcnZwVgkuAnNDi1KL4zLy0fKVYHYjCgIz8vFQgbWpqqmsKAVjVOaZDVGFIGqHYZmJqpuDi6obNNiMk24CO0gUqNbewNMBuINg2Y2NUyVgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"User ID" = _t, Field = _t, Value = _t, #"Page ID" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"User ID", Int64.Type}, {"Field", type text}, {"Value", type text}, {"Page ID", type text}}),

    #"Removed Duplicates" = Table.Distinct(#"Changed Type"),
    #"Pivoted Column" = Table.Pivot(#"Removed Duplicates", List.Distinct(#"Removed Duplicates"[Field]), "Field", "Value"),
    #"Reordered Columns" = Table.ReorderColumns(#"Pivoted Column",{"User ID", "Address", "Phone", "Age", "Page ID"})
in
    #"Reordered Columns"

 

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

SU18_powerbi_badge

 

sid name value
6 contact_registered_before Yes
6 contact_registered_before Yes
6 contact_registered_before Yes
6 contact_registered_before_with Submitted to the OFHA
6 contact_registered_before_with Submitted to the OFHA
6 contact_registered_before_with Submitted to the OFHA
6 contact_registered_other_groups Yes
6 contact_registered_other_groups Yes
6 contact_registered_other_groups Yes
6 country Canada
6 provide_details Fugiat voluptatem ve
6 provide_details Fugiat voluptatem ve
6 provide_details_other_groups Deserunt qui nisi no
6 provide_details_other_groups Deserunt qui nisi no
6 province_canada Northwest Territories
6 submitting_behalf Yes

 

 

View solution in original post

Steveyd78
Frequent Visitor

Ok so that is fantastic and I think I have good and bad news. I think those duplicates came from a previous merge query (I know where I went wrong) so that can be fixed

 

There are other that will be duplicated. For example there are questions that might have multiple responses, e.g. in "self_status_issues" the respondent would be able to check up to 10 (or so) checkboxes - the dump only tags them to "self_status_issues" and not to the ID of the checkbox.

 

I am going to have to see if we can add additional fields in the database extract, although there maybe ways it can be handled in Power BI.

 

I am going to accept as solution as now I know exactly what the issue is (THANKS!!), this would have taken me a LONG time to work out!!

AlB
Super User III
Super User III

@Steveyd78 

It can most likely be done in the query editor too. M is quite powerful and versatile. But it might be more efficient to do it directly at the source through SQL. I would need some example data to look into it.

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

 

SU18_powerbi_badge

AlB
Super User III
Super User III

@Steveyd78 

Post it here. Please post the data you're getting the error on, plus the M code of the transformation

Best if you can share the pbix

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

 

SU18_powerbi_badge

AlB
Super User III
Super User III

Hi @Steveyd78 

Place the following M code in a blank query to see the steps

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXJMSSlKLS4GsgyNjBUcnZwVgkuAnNDi1KL4zLy0fKVYHYjCgIz8vFQgbWpqqmsKAVjVOaZDVGFIGqHYZmJqpuDi6obNNiMk24CO0gUqNbewNMBuINg2Y2NUyVgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"User ID" = _t, Field = _t, Value = _t, #"Page ID" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"User ID", Int64.Type}, {"Field", type text}, {"Value", type text}, {"Page ID", type text}}),

    #"Pivoted Column" = Table.Pivot(#"Changed Type", List.Distinct(#"Changed Type"[Field]), "Field", "Value"),
    #"Reordered Columns" = Table.ReorderColumns(#"Pivoted Column",{"User ID", "Address", "Phone", "Age", "Page ID"})
in
    #"Reordered Columns"

 

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

 

SU18_powerbi_badge

Steveyd78
Frequent Visitor

Let me slightly modify my last comment. I just re-did it and I DO get the pivot operation working BUT for some values I see and error and when I click on that I get:

 

"Expression.Error: There were too many elements in the enumeration to complete the operation.
Details:
[List]"

 

This is now slightly outside the scope of this post (the pivoting seems to work), you think I should post this as another thread?

 

Thanks again!!

 

 

Steveyd78
Frequent Visitor

Ok so that seems to be straightforward, when I followed the steps it looked like just like the standard pivot operations out of the user guide

 

However when I do it on my dataset (and I select don't aggregate) I get this error "There were too many elements in the enumeration to complete the operation". Now my dataset has quite a few more rows but apart from the columns are named differently that is about it!  - was there anything that you did in your M that was different from just doing a standard pivot operation?

Steveyd78
Frequent Visitor

Thanks, I will work that through and see how well I understand the steps. Really apprecaited!

Helpful resources

Announcements
MBAS on Demand

2021 Release Wave 2 Plan

Power Platform release plan for the 2021 release wave 2 describes all new features releasing from October 2021 through March 2022.

July 2021 Update 768x460.png

Check it out!

Click here to read more about the July 2021 Updates

Power Query PA Forum 768x460.png

Check it out!

Did you know that you can visit the Power Query Forum in Power BI and now Power Apps

Urdu Hindi D365 Bootcamp 768x460.png

Urdu Hindi D365 Bootcamp

Dont miss our very own April Dunnam’s The Developer Guide to the Galaxy! Find out what the Power Platform has to offer for the traditional developer.

Top Solution Authors
Top Kudoed Authors