Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
JohnJairoV
Helper I
Helper I

Transform Data with Power Query

Transform Data with Power Query

Hi to all!

 

Thanks a lot for receive me in this forum.  Is my first message here!  I hope you can help me about the issue.

 

I need to transform with Power Query from this:

 

*ABC
3CompanyOriginDestiny
4AABarcelonaValencia
5AAValenciaBarcelona
6AAMadridRoma
7AARomaMadrid
8AABarcelonaMadrid
9AAMadridBarcelona
10ABBarcelonaValencia
11ABValenciaBarcelona
12ACMadridRoma
13ACRomaMadrid
14ACBarcelonaRoma
15ACRomaBarcelona

 

Into This:

 

 

*EFGHI
3CitiesNo. Of DestiniesDestiny 1Destiny 2Destiny 3
4Barcelona3ValenciaMadridRoma
5Valencia1Barcelona  
6Madrid2RomaBarcelona 
7Roma2MadridBarcelona

 

I already know how to do it with formulas and vba. But I wanna know how can I get this result with Power Query. Thx in advanced for your answers. Blessings!

1 ACCEPTED SOLUTION

then we take this code instead (it will ignore all columns that are not relevant to the results):

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlHSUXJ0BBJOiUXJqTn5eYlAdlhiTmpecmaiUqxOtJIpTAVcFFkxSIUZTIVvYkpRZgqQEZSfC5Eyh0mBReAqQFIW2GxGkrfEMBXVVkMDkAInfE43NIQpwel2QyOQEmesjjc0hslhut7QBCaHbDtCqymaViRbYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Company = _t, Origin = _t, Destiny = _t]),
    Group = Table.Group(Source, {"Origin"}, {{"Count", each List.Count(List.Distinct(_[Destiny])), type number}, {"Destinations", each Table.FromRows({List.Distinct(_[Destiny])}), type table}}),
    AllColumnNames = Table.ColumnNames(Table.Combine(Group[Destinations])),
    #"Expanded Destinations" = Table.ExpandTableColumn(Group, "Destinations", AllColumnNames, AllColumnNames)
in
    #"Expanded Destinations"

 

The step "Group" will chunk the table up per "Origin" and return 2 columns: "Count" with the count of all distinct destinations and "Destinations" with a table per row containing a column with all distinct destinations.

 

These nested functions need to be read "inside-out" like in Excel: So for the "Count" we start with _[Destiny] which selects the column "Destiny" from the table and returns it it List-format. So we take "List.Distinct" in order to reduce it to distinct values only. Then a List.Count on it.

 

In order to return the list of distinct destinations in a table-format (quasi-transpose), we use Table.FromRows, as this is just made for this job. Just be aware that it could also create a table from multiple rows, so the format of the input-parameter it expects is a list of lists (therefore the additional curly brackets).

 

Now we have the content we need, but we need to expand the table that has been returned: If you would do that manually, a command would be created that let's you open all columns of the table in the first row and these column names would be hardcoded. So in order to avoid that, we create a list of column names that is dynamic, retrieving all column names from all tables. Therefore we combine all tables into one big one (Table.Combine) and read out the column names from it (Table.ColumnNames) in step "AllColumnNames". We use this as a variable for our last step "#"Expanded Destinations".

 

The key to write and read such a compact code lies in understanding the specific value types (https://msdn.microsoft.com/en-us/library/mt186367.aspx) of the necessary input- & output-parameters of the functions that you can find here: https://msdn.microsoft.com/en-us/library/mt253322.aspx?f=255&MSPPError=-2147217396

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

13 REPLIES 13
JohnJairoV
Helper I
Helper I

Hi to all!

 

Thanks a lot for receive me in this forum.  Is my first message here!  I hope you can help me about the issue.

 

I need to transform with Power Query from this:

 

*ABC
3CompanyOriginDestiny
4AABarcelonaValencia
5AAValenciaBarcelona
6AAMadridRoma
7AARomaMadrid
8AABarcelonaMadrid
9AAMadridBarcelona
10ABBarcelonaValencia
11ABValenciaBarcelona
12ACMadridRoma
13ACRomaMadrid
14ACBarcelonaRoma
15ACRomaBarcelona

 

Into This:

 

 

*EFGHI
3CitiesNo. Of DestiniesDestiny 1Destiny 2Destiny 3
4Barcelona3ValenciaMadridRoma
5Valencia1Barcelona  
6Madrid2RomaBarcelona 
7Roma2MadridBarcelona

 

I already know how to do it with formulas and vba. But I wanna know how can I get this result with Power Query. Thx in advanced for your answers. Blessings!

Hmm, this seems to be headed somewhat down the right path but more work to do. @ImkeF is the Power Query guru, any thoughts on this one?

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlHSUXJ0BBJOiUXJqTn5eYlAdlhiTmpecmaiUqxOtJIpTAVcFFkxSIUZTIVvYkpRZgqQEZSfC5Eyh0mBReAqQFIW2GxGkrfEMBXVVkMDkAInfE43NIQpwel2QyOQEmesjjc0hslhut7QBCaHbDtCqymaViRbYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Company = _t, Origin = _t, Destiny = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}}),
    #"Pivoted Column" = Table.Pivot(#"Changed Type", List.Distinct(#"Changed Type"[Destiny]), "Destiny", "ID", List.Sum)
in
    #"Pivoted Column"

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

Hi @Greg_Deckler and Thanks for your response.

 

The numbers in the first column and letters in the first row not neccesary.  Just is the location of the data in the Excel Sheet.  The Company is not neccesary too (you can delete if neccesary).  Just need the uniques Cities, the number of uniques destinies and the uniques destinies like the last table.  Blessings!

Closer:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlHSUXJ0BBJOiUXJqTn5eYlAdlhiTmpecmaiUqxOtJIpTAVcFFkxSIUZTIVvYkpRZgqQEZSfC5Eyh0mBReAqQFIW2GxGkrfEMBXVVkMDkAInfE43NIQpwel2QyOQEmesjjc0hslhut7QBCaHbDtCqymaViRbYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Company = _t, Origin = _t, Destiny = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Origin", "Origin - Copy"),
    #"Renamed Columns" = Table.RenameColumns(#"Duplicated Column",{{"Origin - Copy", "City"}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns",{"ID", "Company", "City", "Origin", "Destiny"}),
    #"Pivoted Column" = Table.Pivot(#"Reordered Columns", List.Distinct(#"Reordered Columns"[Destiny]), "Destiny", "Origin"),
    #"Removed Duplicates" = Table.Distinct(#"Pivoted Column", {"Valencia", "Barcelona", "Roma", "Madrid"}),
    #"Grouped Rows" = Table.Group(#"Removed Duplicates", {"City"}, {{"Count", each Table.RowCount(_), type number}})
in
    #"Grouped Rows"

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

This worked for me, starting from the Origin and Destiny columns only. Sorry I don't have much time to explain at the moment, but here's the end result (didn't rename/reorder columns, but it's all the same values you're looking for):

 

EDIT: came back and added comments to the M code, and ended with the same column names and order as expected

 

Updated.PNG

 

And here's the M code. I used this for figuring out one of the steps, identified below in the comments. There's some room for improvement, particularly in making the last part dynamic if necessary.

 

let
    // Starting with the Origin and Destiny columns only, with headers already promoted
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    
    // Created a column that makes Origin-Destiny pairs, as we only need unique combinations of the two
    #"Inserted Merged Column" = Table.AddColumn(Source, "Origin-Destiny", each Text.Combine({Text.From([Origin], "en-US"), Text.From([Destiny], "en-US")}, ":"), type text),
    
    // Removed the duplicates, leaving a unique combination of Origin-Destiny combinations
    #"Removed Duplicates" = Table.Distinct(#"Inserted Merged Column", {"Origin-Destiny"}),

    // Added an index field to restore original table order at the end. Might not be necessary, but didn't 
    // require editing the formula at the link provided as a resource :)
    #"Added Index" = Table.AddIndexColumn(#"Removed Duplicates", "Index", 1, 1),

    // Grouped on the Origin, and included all rows in the grouping. This will allow making an index that starts over for each origin
    #"Grouped Rows" = Table.Group(#"Added Index", {"Origin"}, {{"Values", each _, type table}}),
    
    // This step requires using the Advanced Editor to enter it, and is provided from the linked resource
    // Allows you to sort the grouped rows to create an internal index on each Origin
    Sorted = Table.TransformColumns(#"Grouped Rows",{{"Values", each Table.AddIndexColumn(Table.Sort(_, "Index"),"GroupIndex", 1, 1)}}),
    
    // Expand back out to the original table, now including your index per origin
    #"Expanded Values" = Table.ExpandTableColumn(Sorted, "Values", {"Destiny", "Origin-Destiny", "Index", "GroupIndex"}, {"Destiny", "Origin-Destiny", "Index", "GroupIndex"}),
    
    // Sort back to your original order. Again, maybe not necessary
    #"Sorted Rows" = Table.Sort(#"Expanded Values",{{"Index", Order.Ascending}}),

    // Get rid of the Origin-Destiny combo column and the original Index
    #"Removed Columns1" = Table.RemoveColumns(#"Sorted Rows",{"Origin-Destiny", "Index"}),

    // Pivot so that the destination numbers are now column headings, with the destination under the correct column
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns1", {{"GroupIndex", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Removed Columns1", {{"GroupIndex", type text}}, "en-US")[GroupIndex]), "GroupIndex", "Destiny"),
    
    // My least favorite part of this, as it's not dynamic if you have more than 3 destinations
    #"Added Custom" = Table.AddColumn(#"Pivoted Column", "Destinies", each (if [1]<>null then 1 else 0)
+ (if [2]<>null then 1 else 0)
+ (if [3]<>null then 1 else 0)),
    
    // This is just another option to do the step above a different way, by selecting columns 1-3...
    // click the Statistics dropdown in Add Column, and select Count Values
    // Could potentially be modified to make it dynamic, i.e., do this for all columns that are not the Origin column
    #"Inserted Count" = Table.AddColumn(#"Added Custom", "Count", each List.NonNullCount({[1], [2], [3]}), Int64.Type),

    // Rename and reorder columns
    #"Renamed Columns" = Table.RenameColumns(#"Inserted Count",{{"1", "Destiny 1"}, {"2", "Destiny 2"}, {"3", "Destiny 3"}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns",{"Origin", "Destinies", "Destiny 1", "Destiny 2", "Destiny 3", "Count"})
in
    #"Reordered Columns"

 

 

 

Hi KGrice!

 

Great! Is similar that I want to obtain.  My PowerQuery is in Spanish (I'm Colombian).  I did again all the steps, and put two querys.  My basis in columns are: Company, Origin and Destiny.

 

First Query: TblDestinos

 

 

let
    Origen = Excel.CurrentWorkbook(){[Name="Tabla1"]}[Content],
    #"Tipo cambiado" = Table.TransformColumnTypes(Origen,{{"Company", type text}, {"Origin", type text}, {"Destiny", type text}}),
    #"Columnas quitadas" = Table.RemoveColumns(#"Tipo cambiado",{"Company"}),
    #"Texto recortado" = Table.TransformColumns(#"Columnas quitadas",{},Text.Trim),
    #"Duplicados quitados" = Table.Distinct(#"Texto recortado"),
    #"Personalizada agregada" = Table.AddColumn(#"Duplicados quitados", "Origin-Destiny", each [Origin]&":"&[Destiny]),
    #"Índice agregado" = Table.AddIndexColumn(#"Personalizada agregada", "Índice", 1, 1),
    #"Filas agrupadas" = Table.Group(#"Índice agregado", {"Origin"}, {{"Valores", each _, type table}}),
    Ordenado = Table.TransformColumns(#"Filas agrupadas",{{"Valores", each Table.AddIndexColumn(Table.Sort(_, "Índice"),"ÍndiceGrupo", 1, 1)}}),
    #"Se expandió Valores" = Table.ExpandTableColumn(Ordenado, "Valores", {"Origin", "Destiny", "Origin-Destiny", "Índice", "ÍndiceGrupo"}, {"Valores.Origin", "Valores.Destiny", "Valores.Origin-Destiny", "Valores.Índice", "Valores.ÍndiceGrupo"}),
    #"Filas ordenadas" = Table.Sort(#"Se expandió Valores",{{"Valores.Índice", Order.Ascending}}),
    #"Columnas quitadas1" = Table.RemoveColumns(#"Filas ordenadas",{"Valores.Origin", "Valores.Origin-Destiny", "Valores.Índice"}),
    #"Columna dinamizada" = Table.Pivot(Table.TransformColumnTypes(#"Columnas quitadas1", {{"Valores.ÍndiceGrupo", type text}}, "es-CO"), List.Distinct(Table.TransformColumnTypes(#"Columnas quitadas1", {{"Valores.ÍndiceGrupo", type text}}, "es-CO")[Valores.ÍndiceGrupo]), "Valores.ÍndiceGrupo", "Valores.Destiny")
in
    #"Columna dinamizada"

The another query: TblCantidadDestinos

 

let
    Origen = Excel.CurrentWorkbook(){[Name="Tabla1"]}[Content],
    #"Tipo cambiado" = Table.TransformColumnTypes(Origen,{{"Company", type text}, {"Origin", type text}, {"Destiny", type text}}),
    #"Columnas quitadas" = Table.RemoveColumns(#"Tipo cambiado",{"Company"}),
    #"Texto recortado" = Table.TransformColumns(#"Columnas quitadas",{},Text.Trim),
    #"Duplicados quitados" = Table.Distinct(#"Texto recortado"),
    #"Filas agrupadas" = Table.Group(#"Duplicados quitados", {"Origin"}, {{"No. Destinos", each Table.RowCount(_), type number}}),
    #"Consultas combinadas" = Table.NestedJoin(#"Filas agrupadas",{"Origin"},TblDestinos,{"Origin"},"NewColumn",JoinKind.LeftOuter),
    #"Se expandió NewColumn" = Table.ExpandTableColumn(#"Consultas combinadas", "NewColumn", {"1", "2", "3", "4"}, {"1", "2", "3", "4"})
in
    #"Se expandió NewColumn"

 

This works, but just if I have a maximum of 4 Destinies.  There are a way to do it dinamically?

Really Thanks for your help.

 

For a more dynamic version that works for any number of destinations, you might prefer this code:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WckosSk7Nyc9LVNJRCkvMSc1LzkxUitWJRnB0kNSAJHwTU4oyU4DCQfm5EBEwQwcmARJBNhVJGK4V1UTauwFdGZJpsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Origin = _t, Destiny = _t]),
    Group = Table.Group(Source, {"Origin"}, {{"Count", each Table.RowCount(_), type number}, {"Destinations", each Table.FromRows({List.Distinct(_[Destiny])}), type table}}),
    AllColumnNames = Table.ColumnNames(Table.Combine(Group[Destinations])),
    #"Expanded Destinations" = Table.ExpandTableColumn(Group, "Destinations", AllColumnNames, AllColumnNames)
in
    #"Expanded Destinations"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

syelify
Frequent Visitor

Hi, 

I would like to understand how to get this: 

Json.Document(Binary.Decompress(Binary.FromText("i45WckosSk7Nyc9LVNJRCkvMSc1LzkxUitWJRnB0kNSAJHwTU4oyU4DCQfm5

"

from the excel file? 
Thanks in advance

 

Hi @ImkeF

 

Awesome! This is a tiny code... But I have troubles with your code.  I get this result:

 

OriginCountColumn1Column2Column3
Barcelona4ValenciaMadridRoma
Valencia2Barcelona  
Madrid3RomaBarcelona 
Roma3MadridBarcelona 

 

If I put this table as Basis (note the new register with Company AD)

 

CompanyOriginDestiny
AABarcelonaValencia
AAValenciaBarcelona
AAMadridRoma
AARoma Madrid
AABarcelonaMadrid
AAMadridBarcelona
ABBarcelonaValencia
ABValenciaBarcelona
ACMadridRoma
ACRoma Madrid
ACBarcelonaRoma
ACRoma Barcelona
ADCaliBarcelona

 

Not show in the results.  Can you explain the logic of your code?  Thank you for all your responses.  Blessings!

then we take this code instead (it will ignore all columns that are not relevant to the results):

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlHSUXJ0BBJOiUXJqTn5eYlAdlhiTmpecmaiUqxOtJIpTAVcFFkxSIUZTIVvYkpRZgqQEZSfC5Eyh0mBReAqQFIW2GxGkrfEMBXVVkMDkAInfE43NIQpwel2QyOQEmesjjc0hslhut7QBCaHbDtCqymaViRbYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Company = _t, Origin = _t, Destiny = _t]),
    Group = Table.Group(Source, {"Origin"}, {{"Count", each List.Count(List.Distinct(_[Destiny])), type number}, {"Destinations", each Table.FromRows({List.Distinct(_[Destiny])}), type table}}),
    AllColumnNames = Table.ColumnNames(Table.Combine(Group[Destinations])),
    #"Expanded Destinations" = Table.ExpandTableColumn(Group, "Destinations", AllColumnNames, AllColumnNames)
in
    #"Expanded Destinations"

 

The step "Group" will chunk the table up per "Origin" and return 2 columns: "Count" with the count of all distinct destinations and "Destinations" with a table per row containing a column with all distinct destinations.

 

These nested functions need to be read "inside-out" like in Excel: So for the "Count" we start with _[Destiny] which selects the column "Destiny" from the table and returns it it List-format. So we take "List.Distinct" in order to reduce it to distinct values only. Then a List.Count on it.

 

In order to return the list of distinct destinations in a table-format (quasi-transpose), we use Table.FromRows, as this is just made for this job. Just be aware that it could also create a table from multiple rows, so the format of the input-parameter it expects is a list of lists (therefore the additional curly brackets).

 

Now we have the content we need, but we need to expand the table that has been returned: If you would do that manually, a command would be created that let's you open all columns of the table in the first row and these column names would be hardcoded. So in order to avoid that, we create a list of column names that is dynamic, retrieving all column names from all tables. Therefore we combine all tables into one big one (Table.Combine) and read out the column names from it (Table.ColumnNames) in step "AllColumnNames". We use this as a variable for our last step "#"Expanded Destinations".

 

The key to write and read such a compact code lies in understanding the specific value types (https://msdn.microsoft.com/en-us/library/mt186367.aspx) of the necessary input- & output-parameters of the functions that you can find here: https://msdn.microsoft.com/en-us/library/mt253322.aspx?f=255&MSPPError=-2147217396

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi @ImkeF! Awesome! Thx for your Example and explanations from you.

 

I have a last question about it.  You use Json.Document instead of Excel.CurrentWorkbook, result of making the query from table into Excel.  What is the advantage of use Json?

Sorry, should have said this: The Source-expression using JSON should be replaced by the reference to your table.

(I just used it to "ship" sample data with so that you can just copy/paste the code into the advanced editor and see the results.)

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Great @ImkeF! all clear now.  Thx for your support.  Blessings!

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.