Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
FrankZappasMama
Frequent Visitor

Calculations in Custom Column with LOOKUP and IF

Folks,

 

I'm struggling to create a custom column to the Project List dataset (explained below) that needs to lookup two values in the Project Cost dataset and return it. There are three relevant datasets all loaded into the PowerBI project that I need to work with:

  • Project List (one row per project, hundreds of rows)
  • Project Hours (multiple rows per project, thousands of rows)
  • Project Cost (few dozen rows based on one cost per quarter per category)

I'm able to easily link the Project List and Project Hours and then create reports as:

  • Project List contains each project in a single row with a unique project id
  • Project Hours contains multiple rows for each project all referencing the project id

However, I'm struggling to deal with Project Cost. Each projects cost should by obtained by referencing Project Category and Project Start Date in the Project Cost dataset which contains four columns:

  • Dvision (irrelevant for this)
  • Start Date (one cost per category per quarter)
  • Category (3 types)
  • Cost ($/hr)

Just for the sake of clarity, each project will only have a single cost ($/hr) which is determined by when it started and which category it is. In the Project Cost dataset there are three Costs listed for each quarter, ie. one for each category. I just need to pull the right one across to the Project List so I can then work with it in the Reports and Visualizations.

 

Can someone help me write the Custom Column formula that will Lookup the correct Project $/hr based on Project Category and Project Start Date in Project List please?

 

Cheers,

FZM

2 ACCEPTED SOLUTIONS

Another way, using the same sample data, is to use the TOPN function:

ProjectCost2 = 
CALCULATE(MAX(Costs[Cost]),
          TOPN(1,
               FILTER(Costs,Costs[Category] = Projects[Category] && 
                            Costs[Date] <= Projects[Start]),
               Costs[Date]))

View solution in original post

Of course. Bare with me though as this may appear a little more complex. I could post just the "M" code (you can see it below) but I assume it to be more helpful to lead you through the steps in the GUI to get there. Apply these steps:

1. Use Get Data to load your Projects and Costs tables and edit in Power Query. Select the Projects table.

2. Select Merge Queries. Select the Category table from Projects, select the Costs table in the dropdown, select the Category column in the Costs table and select an Inner Join in the drop down. Click OK.

3. Expand the resulting tables in the Costs column in the result by clicking the icon to the right in the Costs column header.

4. Now we need to only keep rows where Costs.Date is equal or earlier than Start date. Create a new custom column named Keep defined as [Costs.Date] <= [Start]. Click the filter button in the Keep header and unselect FALSE.

5. Click Group By under transform. Click Advanced. Select ID as your Group By column. Enter CostDate as your New column name, select Max as you operation and select Costs.Date as Column. Click Add aggregation to get a new row. Enter Start, Max, Start for the fields. Click Add aggregation again. Enter Category, Max, Category. Click Add aggregation again. Enter ProjectCost, MAX, Costs.Cost. Repeat this as necessary if you have other columns you need to have in your result. Click OK. 

 

Done. You can now remove/reorder/rename columns as appropriate. If you click Advanced Editor you can see the resulting "M" code, it should look something like this:

let
    Source = Excel.Workbook(File.Contents("D:\OneDrive\file.xlsx"), null, true),
    Projects_Sheet = Source{[Item="Projects",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Projects_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", type text}, {"Start", type date}, {"Category", type text}, {"Sum", Int64.Type}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type",{"Category"},Costs,{"Category"},"Costs",JoinKind.Inner),
    #"Expanded Costs" = Table.ExpandTableColumn(#"Merged Queries", "Costs", {"Category", "Date", "Cost"}, {"Costs.Category", "Costs.Date", "Costs.Cost"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Costs", "Keep", each [Costs.Date] <= [Start]),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Keep] = true)),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"ID"}, {{"CostDate", each List.Max([Costs.Date]), type date}, {"Start", each List.Max([Start]), type date}, {"Category", each List.Max([Category]), type text}, {"ProjectCost", each List.Max([Costs.Cost]), type number}})
in
    #"Grouped Rows"

Good luck, let me know if you run into issues.

 

View solution in original post

15 REPLIES 15
v-ljerr-msft
Employee
Employee

Hi @FrankZappasMama,

 

Have you tried the solution provided by @erik_tarnvik above? Does it work in your scenario? If it works, could you accept it as solution to close this thread?

 

If you still have any question on this issue, feel free to post here. Smiley Happy

 

Regards

PowerBI legends,

 

My apologies for not replying, my week has been hijacked by some other items so I have not yet tried these solutions.

 

I'm hoping to try this out tomorrow. Will revent then.

 

Thanks

erik_tarnvik
Solution Specialist
Solution Specialist

Assuming the following tables:

Projects:
ID	Start	        Category
A	1/10/2017	Z
B	4/10/2017	Z
C	1/10/2017	X
D	4/10/2017	X

Costs:
Category	Date	        Cost
Z	        1/1/2017	100
Z	        4/1/2017	150
X	        1/1/2017	200
X	        4/1/2017	250

I further assume the logic to be that the cost of Project A should be the Cost in the Costs table that has a date equal or earlier than the Start date where the categories match. This can be achieved with a calculated column in table Projects defined as follows:

ProjectCost = 
VAR CostDate = 
    CALCULATE(MAX(Costs[Date]),
              FILTER(Costs,
                     Costs[Category] = Projects[Category] && 
                     Costs[Date] <= Projects[Start]))
RETURN
    CALCULATE(MAX(Costs[Cost]), 
              FILTER(Costs, Costs[Date] = CostDate && 
                            Costs[Category] = Projects[Category]))

 This results in the following table Projects:

image.png

 

Hope this helps!

 

 

@erik_tarnvik

 

Just found a few minutes to try this out BUT I'm getting a syntax error after adapting the formula you provided for the actual column names in my dataset. Seems it doesn't like the CostDate name....

 

PowerBI 1.PNG

Hi @FrankZappasMama,

 

In addition, the formulas provided by @erik_tarnvik are DAX, not M query. So you should use them in Modeling Table -> New Column, instead of Query Editor -> Custom Column. Smiley Happy

 

c2.PNG

 

Regards

Dare I ask what this same function would be applied as New Column in Query (therefore M)?

Of course. Bare with me though as this may appear a little more complex. I could post just the "M" code (you can see it below) but I assume it to be more helpful to lead you through the steps in the GUI to get there. Apply these steps:

1. Use Get Data to load your Projects and Costs tables and edit in Power Query. Select the Projects table.

2. Select Merge Queries. Select the Category table from Projects, select the Costs table in the dropdown, select the Category column in the Costs table and select an Inner Join in the drop down. Click OK.

3. Expand the resulting tables in the Costs column in the result by clicking the icon to the right in the Costs column header.

4. Now we need to only keep rows where Costs.Date is equal or earlier than Start date. Create a new custom column named Keep defined as [Costs.Date] <= [Start]. Click the filter button in the Keep header and unselect FALSE.

5. Click Group By under transform. Click Advanced. Select ID as your Group By column. Enter CostDate as your New column name, select Max as you operation and select Costs.Date as Column. Click Add aggregation to get a new row. Enter Start, Max, Start for the fields. Click Add aggregation again. Enter Category, Max, Category. Click Add aggregation again. Enter ProjectCost, MAX, Costs.Cost. Repeat this as necessary if you have other columns you need to have in your result. Click OK. 

 

Done. You can now remove/reorder/rename columns as appropriate. If you click Advanced Editor you can see the resulting "M" code, it should look something like this:

let
    Source = Excel.Workbook(File.Contents("D:\OneDrive\file.xlsx"), null, true),
    Projects_Sheet = Source{[Item="Projects",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Projects_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", type text}, {"Start", type date}, {"Category", type text}, {"Sum", Int64.Type}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type",{"Category"},Costs,{"Category"},"Costs",JoinKind.Inner),
    #"Expanded Costs" = Table.ExpandTableColumn(#"Merged Queries", "Costs", {"Category", "Date", "Cost"}, {"Costs.Category", "Costs.Date", "Costs.Cost"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Costs", "Keep", each [Costs.Date] <= [Start]),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Keep] = true)),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"ID"}, {{"CostDate", each List.Max([Costs.Date]), type date}, {"Start", each List.Max([Start]), type date}, {"Category", each List.Max([Category]), type text}, {"ProjectCost", each List.Max([Costs.Cost]), type number}})
in
    #"Grouped Rows"

Good luck, let me know if you run into issues.

 

@erik_tarnvik, expertise is a beautiful thing. I'm more or less with you until Step 5. Are there other ways to achieve the grouping step? I have 85 columns that contain a lot of project data that I didn't want to lose in the process. Or would it be easier to load the Projects data file in twice and relate the unformatted table to this newly created table by Project ID?

Yes that would be easier. You don't even have to relate, you should be able to just merge (on project ID) what you have after step 5 with a new load of the file and thereby avoid repeating all the columns. Should have thought of that. If you try that and run into issues, let me know and I'll take a stab at it.

@erik_tarnvik, thanks again.

 

I've now been able to create all the measures that build on this which was always the ultimate goal. I went back to use the original solution with TOPN in a new column so I avoided importing the same data twice. That dataset is on our an internal server which is already streched so fewer queries is better.

 

Power to the Power BI people.

@v-ljerr-msft

 

....and there's the confirmation I'm punching above my weight with this question Smiley Embarassed

Looks like you are missing an equal sign after VAR CostDate?

Another way, using the same sample data, is to use the TOPN function:

ProjectCost2 = 
CALCULATE(MAX(Costs[Cost]),
          TOPN(1,
               FILTER(Costs,Costs[Category] = Projects[Category] && 
                            Costs[Date] <= Projects[Start]),
               Costs[Date]))

@erik_tarnvik,

 

Thank you kindly sir. This works just as I was hoping it would. Much oblidged.

vanessafvg
Super User
Super User

@FrankZappasMama is it possible to see the relationship diagram if its available in order to see how you would link this?

 

are there any relationships set up?





If I took the time to answer your question and I came up with a solution, please mark my post as a solution and /or give kudos freely for the effort 🙂 Thank you!

Proud to be a Super User!




Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.