cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Theycallmesnack
Frequent Visitor

Convert three columns with repeating rows into a primary key, field columns, and associated rows?

Hi PowerBI folks, 

 

I'm at the last stretch of a fairly lengthy process where I ingested and parsed a series of XML files into constiuent tables. Each table contains a primary key (survey ID), a field column, and a value column. 

 

The issue is that each of the fields is on its own row with its associated value. In order to get this to work, I needed to change the field name to columns, but allow the value column to stay as rows (it's difficult to explain, but hopefully the example helps). Thus having the survey ID be unique, the fields being individual columns, and the values populating for each field under the associated column. 

 

Ultimately, I was able to format the tables containing numeric values by pivoting the field column. Unfortunately, I'm running into issues with the demographics table where both the field and value columns are strings. I've tried simply transposing the data (first trying it with one of the columns, then two, then all three) but I keep getting a complexity error. 

 

I've also tried grouping the data into lists, then using =table.columns() with a delimiter, but that doesn't preserve the value column. 

 

Here's an example of the format I need the tables in (the successful one with numerical values)

Theycallmesnack_0-1656601586271.png

Here's an example of the one I'm struggling with (It's an Excel table because I needed to anonymize everything, the demographic field is meaningless, pretend they're all different values):

SURVEY_ID    Demographic    Demographic Value

2996734628ADJSAMPSpoon
2996734628AGELlama
2996734628AGERG35 - 49 Yrs
2996734628CONTFEEDYes
2996734628DISTRIBInternet
2996734628DVC_TP_EMobile Phone
2996734628DVC_TP_SMobile Phone
2996734628ITADD2Pistacio
2996734628ITAGE47
2996734628ITCITYNew york
2996734628ITCLAIMC7825
2996734628ITCLAIMD20220125
2996734628ITCLAIMRXXXXXXXXXX
2996734628ITCLAIMTM
2996734628ITCLINICDepartment of Truth
2996734628ITCONTRAH0174
2996734628ITDIAG_2Chronic Silliness
2996734628ITDISDAT1/18/2022
2996734628ITDOB6/7/1974
2996734628ITETHNIC-2 N/A
2996734628ITE_FLAGN
2996734628ITLOCATIDOT
2996734628ITLOCCDLC19
2996734628ITLOC_ADXYZ
2996734628ITLOC_CINY
2996734628ITLOC_STNY
2996734628ITLOC_ZI782082127
2996734628ITMDSPECFAMILY PRACTICE
2996734628ITMD_FBob
2996734628ITMD_LRoss
2996734628ITMD_NAMBob Ross
2996734628ITMD_TYPPhysician
2996734628ITMEM_ID28888744
2996734628ITMOBILE2106092957
2996734628ITNPI1417453903
2996734628ITPHONE2016092957
2996734628ITPLANkazoo
2996734628ITPLANDEKazoo
2996734628ITPLANNAMedicare
2996734628ITPRIDIAI10
2996734628ITPROC_C99441
2996734628ITPROVFGPrimary Care
2996734628ITRACE-2 N/A
2996734628ITSERVTYMT0102E
2996734628ITSEXNew york
2996734628ITSITE_I7825
2996734628ITSPECCD20220125
2996734628ITSPECIAXXXXXXXXXX
2996734628ITSTATEM
2996734628ITSTATECDepartment of Truth
2996734628ITSVCDATH0174
2996734628ITTAX_IDChronic Silliness
2996734628ITZIP1/18/2022
2996734628LANGUAGE6/7/1974
2996734628PAGELINK-2 N/A
2996734628REVISIONN
2996734628SEXDOT
2996734628SITEIDLC19
2996734628TIMECOMPXYZ
2996734628VST_TYPENY
2996737032ADJSAMPNY
2996737032AGE782082127
2996737032AGERGFAMILY PRACTICE
2996737032CONTFEEDBob
2996737032DISTRIBRoss
2996737032DVC_TP_EBob Ross
2996737032DVC_TP_SPhysician
2996737032ITAGE28888744
2996737032ITCITY2106092957
2996737032ITCLAIMC1417453903
2996737032ITCLAIMD2016092957
2996737032ITCLAIMRNew york
2996737032ITCLAIMT40432425
2996737032ITCLINIC60644725
2996737032ITCONTRAXXXXXXXXXX
2996737032ITDIAG_2M
2996737032ITDISDATDepartment of Truth
2996737032ITDOBH0175
2996737032ITETHNICChronic Silliness
2996737032ITE_FLAG10/25/1926
2996737032ITLOCATI6/7/1974
2996737032ITLOCCD-2 N/A
2996737032ITLOC_ADN
2996737032ITLOC_CIDOT
2996737032ITLOC_STLC20
2996737032ITLOC_ZIXYZ
2996737032ITMDSPECNY
2996737032ITMD_FNY
2996737032ITMD_L782082128
2996737032ITMD_NAMFAMILY PRACTICE
2996737032ITMD_TYPBob
2996737032ITMEM_IDRoss
2996737032ITMOBILEBob Ross
2996737032ITNPIPhysician
2996737032ITPHONE2710375537
2996737032ITPLAN3237672895
2996737032ITPLANDE3764970254
2996737032ITPLANNA4292267612
2996737032ITPRIDIANew york
2996737032ITPROC_C80857025
2996737032ITPROVFG101069325
2996737032ITRACEXXXXXXXXXX
2996737032ITSERVTYM
2996737032ITSEXDepartment of Truth
2996737032ITSITE_IH0176
2996737032ITSPECCDChronic Silliness
2996737032ITSPECIA1/18/2022
2996737032ITSTATE6/7/1974
2996737032ITSTATEC-2 N/A
2996737032ITSVCDATN
2996737032ITTAX_IDDOT
2996737032ITZIPLC21
2996737032LANGUAGEXYZ
2996737032PAGELINKNY
2996737032PROVIDNY
2996737032REVISION782082129
2996737032SEXFAMILY PRACTICE
2996737032SITEIDBob
2996737032SPRoss
2996737032TIMECOMPBob Ross
2996737032VST_TYPEPhysician

 

Any assistance would be greatly appreciated! 

 

Thanks, 

 

-Snacks

1 ACCEPTED SOLUTION
HotChilli
Super User
Super User

Pivot the demographic column (demographic values in the Values section) and "Don't aggregate" under Advanced.

I think that's what your after.

Let me know.

View solution in original post

4 REPLIES 4
HotChilli
Super User
Super User

Apologies for the late reply.  I try not to worry too much about performance until it becomes an issue.  If you follow good practice for designing the model (bearing in mind the user requirements) i.e. get the granularity of the fact tables right, create a star schema with dimension tables to slice and dice the data, you can't go far wrong.

I think the main issue you face at this stage is this part "want to keep their options open".  The ideal of presenting a dataset which is a utopia for users is a bit of a pipedream. So try to get a few examples of what they want to do (even if they want to replicate existing reports).

Good luck.

wdx223_Daniel
Super User
Super User

NewStep= Table.Pivot(PreviousStepName,List.Distinct(PreviousStepName[Demographic]),"Demographic","Demographic Value",each _{0}?)

HotChilli
Super User
Super User

Pivot the demographic column (demographic values in the Values section) and "Don't aggregate" under Advanced.

I think that's what your after.

Let me know.

Thanks HotChilli! (and Daniel as well. Ultimately the command is doing the same thing),

 

That's exactly what I needed. Can't show you the entirety of the end result, but here's a snippet: 

Theycallmesnack_1-1656687055020.png

 

That leaves me with a quick f/u question that's much lower stakes :-): 

Now, I've got a few options for the last step: I'd like to be able to alias the fields using the lookup table included in the XML (but loaded separately...the folks who built these XML files had some interesting ideas...). So here's what I've got planned: 

  1. Merge the lookup table with the Question/Answer table, then delete the original question field leaving only the aliased headings (once I pivot out that field). 
  2. Leave the three final tables as is (Question/Answer, Demographics, Comments), relate them by the newly unique survey ID, and then pull them into Visualizations/tables as needed Or...
  3. Merge all three tables leaving one particularly wide table with all the questions, answers, and comments in a single table. 

Using method 2 or 3 will yield the same final result, but thinking of the end users, which of these two is likely to yield better performance? I'm unfortunately in a situation where the end users "want to keep their options open" for final visualizations, so I don't have any idea what sort of dashboards they're going to have us cook up, and which fields are or aren't particularly important. 

 

Thanks again, this was incredibly key! 

 

-Snacks

Helpful resources

Announcements
September Update

Check it Out!

Click here to learn more about the September 2022 updates!

Power BI Dev Camp Session 26

New Date - Check it Out!

Mark your calendars and join us on Thursday, October 6 at 11a PDT for a great session with Ted Pattison!

Top Kudoed Authors