cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
JohnJairoV Regular Visitor
Regular Visitor

Transform Data with Power Query

Transform Data with Power Query

Hi to all!

 

Thanks a lot for receive me in this forum.  Is my first message here!  I hope you can help me about the issue.

 

I need to transform with Power Query from this:

 

*ABC
3CompanyOriginDestiny
4AABarcelonaValencia
5AAValenciaBarcelona
6AAMadridRoma
7AARomaMadrid
8AABarcelonaMadrid
9AAMadridBarcelona
10ABBarcelonaValencia
11ABValenciaBarcelona
12ACMadridRoma
13ACRomaMadrid
14ACBarcelonaRoma
15ACRomaBarcelona

 

Into This:

 

 

*EFGHI
3CitiesNo. Of DestiniesDestiny 1Destiny 2Destiny 3
4Barcelona3ValenciaMadridRoma
5Valencia1Barcelona  
6Madrid2RomaBarcelona 
7Roma2MadridBarcelona

 

I already know how to do it with formulas and vba. But I wanna know how can I get this result with Power Query. Thx in advanced for your answers. Blessings!

1 ACCEPTED SOLUTION

Accepted Solutions
Super User
Super User

Re: Transform Data with Power Query

then we take this code instead (it will ignore all columns that are not relevant to the results):

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlHSUXJ0BBJOiUXJqTn5eYlAdlhiTmpecmaiUqxOtJIpTAVcFFkxSIUZTIVvYkpRZgqQEZSfC5Eyh0mBReAqQFIW2GxGkrfEMBXVVkMDkAInfE43NIQpwel2QyOQEmesjjc0hslhut7QBCaHbDtCqymaViRbYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Company = _t, Origin = _t, Destiny = _t]),
    Group = Table.Group(Source, {"Origin"}, {{"Count", each List.Count(List.Distinct(_[Destiny])), type number}, {"Destinations", each Table.FromRows({List.Distinct(_[Destiny])}), type table}}),
    AllColumnNames = Table.ColumnNames(Table.Combine(Group[Destinations])),
    #"Expanded Destinations" = Table.ExpandTableColumn(Group, "Destinations", AllColumnNames, AllColumnNames)
in
    #"Expanded Destinations"

 

The step "Group" will chunk the table up per "Origin" and return 2 columns: "Count" with the count of all distinct destinations and "Destinations" with a table per row containing a column with all distinct destinations.

 

These nested functions need to be read "inside-out" like in Excel: So for the "Count" we start with _[Destiny] which selects the column "Destiny" from the table and returns it it List-format. So we take "List.Distinct" in order to reduce it to distinct values only. Then a List.Count on it.

 

In order to return the list of distinct destinations in a table-format (quasi-transpose), we use Table.FromRows, as this is just made for this job. Just be aware that it could also create a table from multiple rows, so the format of the input-parameter it expects is a list of lists (therefore the additional curly brackets).

 

Now we have the content we need, but we need to expand the table that has been returned: If you would do that manually, a command would be created that let's you open all columns of the table in the first row and these column names would be hardcoded. So in order to avoid that, we create a list of column names that is dynamic, retrieving all column names from all tables. Therefore we combine all tables into one big one (Table.Combine) and read out the column names from it (Table.ColumnNames) in step "AllColumnNames". We use this as a variable for our last step "#"Expanded Destinations".

 

The key to write and read such a compact code lies in understanding the specific value types (https://msdn.microsoft.com/en-us/library/mt186367.aspx) of the necessary input- & output-parameters of the functions that you can find here: https://msdn.microsoft.com/en-us/library/mt253322.aspx?f=255&MSPPError=-2147217396

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




View solution in original post

12 REPLIES 12
Highlighted
JohnJairoV Regular Visitor
Regular Visitor

Transform Data with Power Query

Hi to all!

 

Thanks a lot for receive me in this forum.  Is my first message here!  I hope you can help me about the issue.

 

I need to transform with Power Query from this:

 

*ABC
3CompanyOriginDestiny
4AABarcelonaValencia
5AAValenciaBarcelona
6AAMadridRoma
7AARomaMadrid
8AABarcelonaMadrid
9AAMadridBarcelona
10ABBarcelonaValencia
11ABValenciaBarcelona
12ACMadridRoma
13ACRomaMadrid
14ACBarcelonaRoma
15ACRomaBarcelona

 

Into This:

 

 

*EFGHI
3CitiesNo. Of DestiniesDestiny 1Destiny 2Destiny 3
4Barcelona3ValenciaMadridRoma
5Valencia1Barcelona  
6Madrid2RomaBarcelona 
7Roma2MadridBarcelona

 

I already know how to do it with formulas and vba. But I wanna know how can I get this result with Power Query. Thx in advanced for your answers. Blessings!

Super User
Super User

Re: Transform Data with Power Query

Hmm, this seems to be headed somewhat down the right path but more work to do. @ImkeF is the Power Query guru, any thoughts on this one?

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlHSUXJ0BBJOiUXJqTn5eYlAdlhiTmpecmaiUqxOtJIpTAVcFFkxSIUZTIVvYkpRZgqQEZSfC5Eyh0mBReAqQFIW2GxGkrfEMBXVVkMDkAInfE43NIQpwel2QyOQEmesjjc0hslhut7QBCaHbDtCqymaViRbYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Company = _t, Origin = _t, Destiny = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}}),
    #"Pivoted Column" = Table.Pivot(#"Changed Type", List.Distinct(#"Changed Type"[Destiny]), "Destiny", "ID", List.Sum)
in
    #"Pivoted Column"

I have book! Learn Power BI from Packt


Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

JohnJairoV Regular Visitor
Regular Visitor

Re: Transform Data with Power Query

Hi @Greg_Deckler and Thanks for your response.

 

The numbers in the first column and letters in the first row not neccesary.  Just is the location of the data in the Excel Sheet.  The Company is not neccesary too (you can delete if neccesary).  Just need the uniques Cities, the number of uniques destinies and the uniques destinies like the last table.  Blessings!

Super User
Super User

Re: Transform Data with Power Query

Closer:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlHSUXJ0BBJOiUXJqTn5eYlAdlhiTmpecmaiUqxOtJIpTAVcFFkxSIUZTIVvYkpRZgqQEZSfC5Eyh0mBReAqQFIW2GxGkrfEMBXVVkMDkAInfE43NIQpwel2QyOQEmesjjc0hslhut7QBCaHbDtCqymaViRbYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Company = _t, Origin = _t, Destiny = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Origin", "Origin - Copy"),
    #"Renamed Columns" = Table.RenameColumns(#"Duplicated Column",{{"Origin - Copy", "City"}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns",{"ID", "Company", "City", "Origin", "Destiny"}),
    #"Pivoted Column" = Table.Pivot(#"Reordered Columns", List.Distinct(#"Reordered Columns"[Destiny]), "Destiny", "Origin"),
    #"Removed Duplicates" = Table.Distinct(#"Pivoted Column", {"Valencia", "Barcelona", "Roma", "Madrid"}),
    #"Grouped Rows" = Table.Group(#"Removed Duplicates", {"City"}, {{"Count", each Table.RowCount(_), type number}})
in
    #"Grouped Rows"

I have book! Learn Power BI from Packt


Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

KGrice Established Member
Established Member

Re: Transform Data with Power Query

This worked for me, starting from the Origin and Destiny columns only. Sorry I don't have much time to explain at the moment, but here's the end result (didn't rename/reorder columns, but it's all the same values you're looking for):

 

EDIT: came back and added comments to the M code, and ended with the same column names and order as expected

 

Updated.PNG

 

And here's the M code. I used this for figuring out one of the steps, identified below in the comments. There's some room for improvement, particularly in making the last part dynamic if necessary.

 

let
    // Starting with the Origin and Destiny columns only, with headers already promoted
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    
    // Created a column that makes Origin-Destiny pairs, as we only need unique combinations of the two
    #"Inserted Merged Column" = Table.AddColumn(Source, "Origin-Destiny", each Text.Combine({Text.From([Origin], "en-US"), Text.From([Destiny], "en-US")}, ":"), type text),
    
    // Removed the duplicates, leaving a unique combination of Origin-Destiny combinations
    #"Removed Duplicates" = Table.Distinct(#"Inserted Merged Column", {"Origin-Destiny"}),

    // Added an index field to restore original table order at the end. Might not be necessary, but didn't 
    // require editing the formula at the link provided as a resource Smiley Happy
    #"Added Index" = Table.AddIndexColumn(#"Removed Duplicates", "Index", 1, 1),

    // Grouped on the Origin, and included all rows in the grouping. This will allow making an index that starts over for each origin
    #"Grouped Rows" = Table.Group(#"Added Index", {"Origin"}, {{"Values", each _, type table}}),
    
    // This step requires using the Advanced Editor to enter it, and is provided from the linked resource
    // Allows you to sort the grouped rows to create an internal index on each Origin
    Sorted = Table.TransformColumns(#"Grouped Rows",{{"Values", each Table.AddIndexColumn(Table.Sort(_, "Index"),"GroupIndex", 1, 1)}}),
    
    // Expand back out to the original table, now including your index per origin
    #"Expanded Values" = Table.ExpandTableColumn(Sorted, "Values", {"Destiny", "Origin-Destiny", "Index", "GroupIndex"}, {"Destiny", "Origin-Destiny", "Index", "GroupIndex"}),
    
    // Sort back to your original order. Again, maybe not necessary
    #"Sorted Rows" = Table.Sort(#"Expanded Values",{{"Index", Order.Ascending}}),

    // Get rid of the Origin-Destiny combo column and the original Index
    #"Removed Columns1" = Table.RemoveColumns(#"Sorted Rows",{"Origin-Destiny", "Index"}),

    // Pivot so that the destination numbers are now column headings, with the destination under the correct column
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns1", {{"GroupIndex", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Removed Columns1", {{"GroupIndex", type text}}, "en-US")[GroupIndex]), "GroupIndex", "Destiny"),
    
    // My least favorite part of this, as it's not dynamic if you have more than 3 destinations
    #"Added Custom" = Table.AddColumn(#"Pivoted Column", "Destinies", each (if [1]<>null then 1 else 0)
+ (if [2]<>null then 1 else 0)
+ (if [3]<>null then 1 else 0)),
    
    // This is just another option to do the step above a different way, by selecting columns 1-3...
    // click the Statistics dropdown in Add Column, and select Count Values
    // Could potentially be modified to make it dynamic, i.e., do this for all columns that are not the Origin column
    #"Inserted Count" = Table.AddColumn(#"Added Custom", "Count", each List.NonNullCount({[1], [2], [3]}), Int64.Type),

    // Rename and reorder columns
    #"Renamed Columns" = Table.RenameColumns(#"Inserted Count",{{"1", "Destiny 1"}, {"2", "Destiny 2"}, {"3", "Destiny 3"}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns",{"Origin", "Destinies", "Destiny 1", "Destiny 2", "Destiny 3", "Count"})
in
    #"Reordered Columns"

 

 

 

Super User
Super User

Re: Transform Data with Power Query

For a more dynamic version that works for any number of destinations, you might prefer this code:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WckosSk7Nyc9LVNJRCkvMSc1LzkxUitWJRnB0kNSAJHwTU4oyU4DCQfm5EBEwQwcmARJBNhVJGK4V1UTauwFdGZJpsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Origin = _t, Destiny = _t]),
    Group = Table.Group(Source, {"Origin"}, {{"Count", each Table.RowCount(_), type number}, {"Destinations", each Table.FromRows({List.Distinct(_[Destiny])}), type table}}),
    AllColumnNames = Table.ColumnNames(Table.Combine(Group[Destinations])),
    #"Expanded Destinations" = Table.ExpandTableColumn(Group, "Destinations", AllColumnNames, AllColumnNames)
in
    #"Expanded Destinations"

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




JohnJairoV Regular Visitor
Regular Visitor

Re: Transform Data with Power Query

Hi KGrice!

 

Great! Is similar that I want to obtain.  My PowerQuery is in Spanish (I'm Colombian).  I did again all the steps, and put two querys.  My basis in columns are: Company, Origin and Destiny.

 

First Query: TblDestinos

 

 

let
    Origen = Excel.CurrentWorkbook(){[Name="Tabla1"]}[Content],
    #"Tipo cambiado" = Table.TransformColumnTypes(Origen,{{"Company", type text}, {"Origin", type text}, {"Destiny", type text}}),
    #"Columnas quitadas" = Table.RemoveColumns(#"Tipo cambiado",{"Company"}),
    #"Texto recortado" = Table.TransformColumns(#"Columnas quitadas",{},Text.Trim),
    #"Duplicados quitados" = Table.Distinct(#"Texto recortado"),
    #"Personalizada agregada" = Table.AddColumn(#"Duplicados quitados", "Origin-Destiny", each [Origin]&":"&[Destiny]),
    #"Índice agregado" = Table.AddIndexColumn(#"Personalizada agregada", "Índice", 1, 1),
    #"Filas agrupadas" = Table.Group(#"Índice agregado", {"Origin"}, {{"Valores", each _, type table}}),
    Ordenado = Table.TransformColumns(#"Filas agrupadas",{{"Valores", each Table.AddIndexColumn(Table.Sort(_, "Índice"),"ÍndiceGrupo", 1, 1)}}),
    #"Se expandió Valores" = Table.ExpandTableColumn(Ordenado, "Valores", {"Origin", "Destiny", "Origin-Destiny", "Índice", "ÍndiceGrupo"}, {"Valores.Origin", "Valores.Destiny", "Valores.Origin-Destiny", "Valores.Índice", "Valores.ÍndiceGrupo"}),
    #"Filas ordenadas" = Table.Sort(#"Se expandió Valores",{{"Valores.Índice", Order.Ascending}}),
    #"Columnas quitadas1" = Table.RemoveColumns(#"Filas ordenadas",{"Valores.Origin", "Valores.Origin-Destiny", "Valores.Índice"}),
    #"Columna dinamizada" = Table.Pivot(Table.TransformColumnTypes(#"Columnas quitadas1", {{"Valores.ÍndiceGrupo", type text}}, "es-CO"), List.Distinct(Table.TransformColumnTypes(#"Columnas quitadas1", {{"Valores.ÍndiceGrupo", type text}}, "es-CO")[Valores.ÍndiceGrupo]), "Valores.ÍndiceGrupo", "Valores.Destiny")
in
    #"Columna dinamizada"

The another query: TblCantidadDestinos

 

let
    Origen = Excel.CurrentWorkbook(){[Name="Tabla1"]}[Content],
    #"Tipo cambiado" = Table.TransformColumnTypes(Origen,{{"Company", type text}, {"Origin", type text}, {"Destiny", type text}}),
    #"Columnas quitadas" = Table.RemoveColumns(#"Tipo cambiado",{"Company"}),
    #"Texto recortado" = Table.TransformColumns(#"Columnas quitadas",{},Text.Trim),
    #"Duplicados quitados" = Table.Distinct(#"Texto recortado"),
    #"Filas agrupadas" = Table.Group(#"Duplicados quitados", {"Origin"}, {{"No. Destinos", each Table.RowCount(_), type number}}),
    #"Consultas combinadas" = Table.NestedJoin(#"Filas agrupadas",{"Origin"},TblDestinos,{"Origin"},"NewColumn",JoinKind.LeftOuter),
    #"Se expandió NewColumn" = Table.ExpandTableColumn(#"Consultas combinadas", "NewColumn", {"1", "2", "3", "4"}, {"1", "2", "3", "4"})
in
    #"Se expandió NewColumn"

 

This works, but just if I have a maximum of 4 Destinies.  There are a way to do it dinamically?

Really Thanks for your help.

 

JohnJairoV Regular Visitor
Regular Visitor

Re: Transform Data with Power Query

Hi @ImkeF

 

Awesome! This is a tiny code... But I have troubles with your code.  I get this result:

 

OriginCountColumn1Column2Column3
Barcelona4ValenciaMadridRoma
Valencia2Barcelona  
Madrid3RomaBarcelona 
Roma3MadridBarcelona 

 

If I put this table as Basis (note the new register with Company AD)

 

CompanyOriginDestiny
AABarcelonaValencia
AAValenciaBarcelona
AAMadridRoma
AARoma Madrid
AABarcelonaMadrid
AAMadridBarcelona
ABBarcelonaValencia
ABValenciaBarcelona
ACMadridRoma
ACRoma Madrid
ACBarcelonaRoma
ACRoma Barcelona
ADCaliBarcelona

 

Not show in the results.  Can you explain the logic of your code?  Thank you for all your responses.  Blessings!

Super User
Super User

Re: Transform Data with Power Query

then we take this code instead (it will ignore all columns that are not relevant to the results):

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlHSUXJ0BBJOiUXJqTn5eYlAdlhiTmpecmaiUqxOtJIpTAVcFFkxSIUZTIVvYkpRZgqQEZSfC5Eyh0mBReAqQFIW2GxGkrfEMBXVVkMDkAInfE43NIQpwel2QyOQEmesjjc0hslhut7QBCaHbDtCqymaViRbYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Company = _t, Origin = _t, Destiny = _t]),
    Group = Table.Group(Source, {"Origin"}, {{"Count", each List.Count(List.Distinct(_[Destiny])), type number}, {"Destinations", each Table.FromRows({List.Distinct(_[Destiny])}), type table}}),
    AllColumnNames = Table.ColumnNames(Table.Combine(Group[Destinations])),
    #"Expanded Destinations" = Table.ExpandTableColumn(Group, "Destinations", AllColumnNames, AllColumnNames)
in
    #"Expanded Destinations"

 

The step "Group" will chunk the table up per "Origin" and return 2 columns: "Count" with the count of all distinct destinations and "Destinations" with a table per row containing a column with all distinct destinations.

 

These nested functions need to be read "inside-out" like in Excel: So for the "Count" we start with _[Destiny] which selects the column "Destiny" from the table and returns it it List-format. So we take "List.Distinct" in order to reduce it to distinct values only. Then a List.Count on it.

 

In order to return the list of distinct destinations in a table-format (quasi-transpose), we use Table.FromRows, as this is just made for this job. Just be aware that it could also create a table from multiple rows, so the format of the input-parameter it expects is a list of lists (therefore the additional curly brackets).

 

Now we have the content we need, but we need to expand the table that has been returned: If you would do that manually, a command would be created that let's you open all columns of the table in the first row and these column names would be hardcoded. So in order to avoid that, we create a list of column names that is dynamic, retrieving all column names from all tables. Therefore we combine all tables into one big one (Table.Combine) and read out the column names from it (Table.ColumnNames) in step "AllColumnNames". We use this as a variable for our last step "#"Expanded Destinations".

 

The key to write and read such a compact code lies in understanding the specific value types (https://msdn.microsoft.com/en-us/library/mt186367.aspx) of the necessary input- & output-parameters of the functions that you can find here: https://msdn.microsoft.com/en-us/library/mt253322.aspx?f=255&MSPPError=-2147217396

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




View solution in original post

Helpful resources

Announcements
Can You Solve These Challenge

Challenge: Can You Solve These?

Find out how to participate in the first Power BI 'Can You Solve These?' challenge.

Community News & Announcements

Community News & Announcements

Get your latest community news and announcements.

Virtual Launch Event

Microsoft Business Applications October Virtual Launch Event

Join us for an in-depth look at the new innovations across Dynamics 365 and the Microsoft Power Platform.

Community Kudopalooza

Win Power BI Swag with Community Kudopalooza!

Each week, complete activities and be qualified in the drawing for cool Power BI Swag.

Users Online
Currently online: 331 members 3,260 guests
Please welcome our newest community members: