Only 6% of Analytics Reporting in organizations source their data from standard databases. Oh no, I made up that number. I have no idea what the numbers are. But I am certain that the majority of self-service BI & data analytics users must deal with a lot of crappy data.
One of the big issues Power Query users must also deal with is Table Structuring. When should you use “Transpose” and when should you use “Unpivot”?
Order ID labelled “b” is well structured in a column, but Segment and Ship Mode are on Rows instead of columns as well.
So, we started by Transposing the table.
Then we did a little extra cleaning, removing the Total lines, Filling down the Segment Column. And the very important one, Promoted the first row to headings. This step was necessary to do a partial transpose (Unpivot) of all other columns apart from Segment and Ship Mode (which are now in order).
Then we could get a clean table (after naming the columns properly).
What we have cleverly done can be summarized as:
Because we have rows that are meant to be columns, we transposed the table.
Although we had a good column before that transpose operation, it went bad (into a row).
We were fine with that because, by making that new row become a heading row, we would apply an Unpivot to that section to bring it back as a column.
Apparently, our steps worked because we had only one good column before transpose. And after transposing, we could make that row a header row and unpivot the section of headings to bring it back. If we had two good columns before transpose, it means we would have those two columns go into rows after transpose. Only one can be made header, and only that same one can be Unpivoted back to column. This is the bone of contention in this article.
Have a look at the table below:
If we transpose this table, Ship Mode and Segment (a) will be well aligned as columns while Order ID and Order Date (b) will now be wrongly aligned as rows. That is back to square 1. We would be running in circles.
So, the approach must be different to solve this, and the following steps summarizes that:
Import the data and delete all steps from Promoted Headers. (This, you now know, is a standard approach if you need to transpose a table without losing any of its contents.)
Merge the two columns that are good into one. (This is the key to solving this. By merging all columns that are in perfect order, when the table is transposed, and things go south for them, we can unpivot them all together).
Select Column 1 and Column 2, right click on the column header and select merge.
Make sure to use a delimiter that is not part of the data content.
Transpose the table. Go to the Transform tab and hit the Transpose button. (for tables that require transpose operations and have columns that are in order, the columns should be combined into a single one, with a unique delimiter.
Promote Headers. Make the first row (which now contains the combined Order ID’s and Order Dates) the headers. This is necessary for us to bring them back to column with an Unpivot Step. (for the sake of this data, a little cleaning is also required to fill down rows with null values prior to promoting the headers)
Select the good columns (Ship Mode and Segment) and unpivot the others. This will bring back the combined Order ID and Order Date as a column, with the ensuing values as another column right in front. What do you think the next step would be from here?
My guess is you guessed right. Now we need to split the new column named Attribute, to get it separated into the Order ID component and the Order Date Component, using our clever delimiter.
The final thing would be to tidy things up, as necessary. Such as renaming the headers properly, filtering out Grand Totals (they are somewhere in this data) and renaming columns appropriately.
I will conclude by stressing the fact that, when you require a Transpose step in your cleaning steps, you should:
Ensure there are no promoted header steps before you transpose.
If you have more than one column in order, merge them into one before you transpose.
To carry out a partial transpose of selected columns, use the Unpivot option.
Make sure the items you need to be partially transposed are sitting on the headers of your data.