Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
sebfl
Frequent Visitor

Grouping by and conditional merging

Hello,

I posted this some days ago:

https://community.powerbi.com/t5/Desktop/Grouping-by-and-separating-by-year/m-p/2252436#M819467

 

The answer given in the post is correct, but the fact that it is a virtual table created with measures gives me alot of problems when I try to create dashboards with new columns. In my understanding it would be better to have a table created in Power Query for my specifics needs.

 

The sample data is the same as in the post, with the difference that I no longer need exact matches in the grouping of both years, this means that the desired output is the following:

CityStateCustomer TypeRevenue agg 2021Quantity agg 2021Revenue agg 2020Quantity agg 2020
MelbourneVICHome Office49.54821.8729
SydneyNSWCorporate126.381752.6922
SydneyNSWConsumer0.59331.217
SydneyNSWSmall Business36.1982.7470
MelbourneVICCorporate3.61106nullnull
SydneyNSWHome Office4.9216nullnull

 

I think the way to go is:

  • Grouping by City, State,Customer Type and Order Year. Aggregating by the sum of profit margin and order quantity.
  • Doing a Full Outer Join (in the merge queries option) of the the grouped table with itself, with two conditions:
    • Table 1 is the grouped table with the condition that only uses values of Order Year = 2021
    • Table 2 is the grouped table with the condition that only uses values of Order Year = 2020

 

Now my questions are:

  • How do I group by without overwriting the original table?
    • For example, when I use the merge queries it gives you the option of creating a new table.
  • When doing the Full Outer Join:
    • How do I specify that I need in one table only the data from the current year (and previous year for the other table)?
    • This means that Order Year cannot be exactly a number, it has to represent the current year and previous year.

 

Maybe there is an easier way of doing this but I'm not sure.

 

Any suggestion would be deeply appreciated,

Thanks.

1 ACCEPTED SOLUTION

(I will delete my previous answer, as this one is much improved especially with large database)

Try with

  • Filter for current and previous year
    • Place the Filter step as early as your connector will allow.
    • Consider Table.Buffer after filtering.
  • Table.Group: "Order Year","City", "State", "Customer Type"
    • Aggregate with Sum of the Revenue (Profit Margin) and Quantity  columns.
  • Unpivot the resultant Attribute and Value columns
  • Merge the Year and Attribute columns
  • Pivot on the Merged Columns

Also, omitting the re-sorting operation at the end may also speed things up as that has to re-read the entire table.

 

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table9"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{
        {"Order No", type text}, {"Order Year", Int64.Type}, {"City", type text}, 
        {"State", type text}, {"Customer Type", type text}, {"Order Priority", type text}, 
        {"Product Category", type text}, {"Profit Margin", type number}, {"Order Quantity", Int64.Type}}),

//filter to include only current and previous years
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each 
        [Order Year] = Date.Year(Date.From(DateTime.LocalNow()))
        or [Order Year] = -1 + Date.Year(Date.From(DateTime.LocalNow()))
        ),

//Group and sum
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"Order Year","City", "State", "Customer Type"}, {
        {"Revenue", each List.Sum([Profit Margin]),Currency.Type},
        {"Quantity", each List.Sum([Order Quantity]), Int64.Type}

        }),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Grouped Rows", {"Order Year", "City", "State", "Customer Type"}, "Attribute", "Value"),
    #"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Unpivoted Columns", {{"Order Year", type text}}, "en-US"),{"Order Year", "Attribute"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"Merged"),
    #"Pivoted Column" = Table.Pivot(#"Merged Columns", List.Distinct(#"Merged Columns"[Merged]), "Merged", "Value")
in
    #"Pivoted Column"

 

 

 

 

ronrsnfld_0-1640694666754.png

 

 

View solution in original post

7 REPLIES 7
ronrsnfld
Super User
Super User

Message deleted by author

Hi @ronrsnfld ,

The answer that you gave me is exactly what I was looking for, the problem is when I try to replicate the code in my original data (with only a million rows). It uses 90% of my RAM and it hasn't finished yet after an hour and a half. Given that the whole data that I will be using (more than 10 million rows) I'm not sure how to proceed.

(I will delete my previous answer, as this one is much improved especially with large database)

Try with

  • Filter for current and previous year
    • Place the Filter step as early as your connector will allow.
    • Consider Table.Buffer after filtering.
  • Table.Group: "Order Year","City", "State", "Customer Type"
    • Aggregate with Sum of the Revenue (Profit Margin) and Quantity  columns.
  • Unpivot the resultant Attribute and Value columns
  • Merge the Year and Attribute columns
  • Pivot on the Merged Columns

Also, omitting the re-sorting operation at the end may also speed things up as that has to re-read the entire table.

 

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table9"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{
        {"Order No", type text}, {"Order Year", Int64.Type}, {"City", type text}, 
        {"State", type text}, {"Customer Type", type text}, {"Order Priority", type text}, 
        {"Product Category", type text}, {"Profit Margin", type number}, {"Order Quantity", Int64.Type}}),

//filter to include only current and previous years
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each 
        [Order Year] = Date.Year(Date.From(DateTime.LocalNow()))
        or [Order Year] = -1 + Date.Year(Date.From(DateTime.LocalNow()))
        ),

//Group and sum
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"Order Year","City", "State", "Customer Type"}, {
        {"Revenue", each List.Sum([Profit Margin]),Currency.Type},
        {"Quantity", each List.Sum([Order Quantity]), Int64.Type}

        }),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Grouped Rows", {"Order Year", "City", "State", "Customer Type"}, "Attribute", "Value"),
    #"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Unpivoted Columns", {{"Order Year", type text}}, "en-US"),{"Order Year", "Attribute"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"Merged"),
    #"Pivoted Column" = Table.Pivot(#"Merged Columns", List.Distinct(#"Merged Columns"[Merged]), "Merged", "Value")
in
    #"Pivoted Column"

 

 

 

 

ronrsnfld_0-1640694666754.png

 

 

Hi @ronrsnfld ,

The solution works perfectly with larger data, thanks!!

Vera_33
Resident Rockstar
Resident Rockstar

Hi @sebfl 

 

The desired output can be a matrix? Or you have specific request of the column names, order, etc?

Vera_33_0-1640658160124.png

 

sebfl
Frequent Visitor

Hi @Vera_33,

The desired output should be like the following image:

sebfl_0-1640658676410.png

  • It is a table calculated directly in Power Query.

 

wdx223_Daniel_0-1640672995490.png

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Custom1 = Table.Group(
                          Source,
                          {"City","State","Customer Type"},
                          List.TransformMany(
                                             List.Sort(List.Distinct(Source[Order Year]),1),
                                             each {"Revenue","Qantity"},
                                             (x,y)=>{
                                                     y&" agg "&Text.From(x),
                                                     each List.Sum(
                                                                   Table.Column(
                                                                                Table.SelectRows(
                                                                                                 _,
                                                                                                 (s)=>s[Order Year]=x
                                                                                                ),
                                                                                if y="Revenue"
                                                                                then "Profit Margin"
                                                                                else "Order Quantity"
                                                                               )
                                                                  )
                                                    }
                                            )
                         )
in
    Custom1

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors