Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
EDIT: I figured out a solution, but I would like feedback about my calculated column. I use a lot of ALL() functions and I worry this could cause performance issues. Is there a more streamlined way to write this, or even a way to write it as a measure?
My most recent answer:
Right after I submitted my topic, I came up with something to help determine if a row is blank.
TEST TASKNUMBER Days =
// Find current task number on current row
VAR CurrentrowTASKNUMBER = 'Table'[Task Number]
// Find the earliest start date for this task number
VAR CurrentTASKNUMBERMinStartDate =
MINX(
FILTER(
ALL('Table')
, 'Table'[Task Number] = CurrentRowTASKNUMBER
)
, 'Table'[Task Start]
)
// I created a formula that returns the task number if the task is incomplete; otherwise it returns a blank on completed tasks
VAR ReturnTaskNumberIfTaskIncomplete =
MINX(
FILTER(
ALL('Table')
, 'Table'[Task Number] = CurrentRowTASKNUMBER && ISBLANK('Table'[Finish Date])
)
, CurrentRowTASKNUMBER
)
// Find the highest finish date for this task number.
VAR CurrentTASKNUMBERMaxFinishDate =
MAXX(
FILTER(
ALL('Table')
, 'Table'[Task Number] = CurrentRowTASKNUMBER
)
, 'Table'[Task Finish]
)
RETURN
SWITCH(
TRUE()
, ISBLANK(ReturnTaskNumberIfTaskIncomplete) //If the task is complete, this is blank
, DATEDIFF(CurrentTASKNUMBERMinStartDate, CurrentTASKNUMBERMaxFinishDate, DAY) // Subtract the most recent finish date from the earliest start date
, DATEDIFF(CurrentTASKNUMBERMinStartDate, TODAY(), DAY) // If the task is incopmlete, subtract the earliest start date from today's date
Original post below:
I have a table with the following:
- A column with task numbers. Each task number can have multiple rows as the task flows between different stages. Therefore, there are multiple rows with the same task number.
- A column "row number." This is basically an index row. The more recent the I have found that there are a couple of duplicate index numbers on our table unfortunately, but the this column generally goes in order (ie the later a stage starts, the higher the row number is)
- Each row has a "Start Date" and "End Date"
My user wants me to create the following for each Task Number:
- a column that returns the number of days between the task number's earliest start date and the task number's highest end date. For example, for task 30006591, the earliest Task Start date is 9/30/2011 and the highest Task Finish date is 4/1/2012, so this number would be 184 days. This would only be for columns on completed tasks.
- a column that returns the number of days between today's date and the task number's earliest start date. This would only be for columns that are missing a "Task Finish" date. For example, task number 6989569's earliest start date is 3/1/2012. However task 6989569 is missing an End Date on it's highest row number 39371, so we know this task is incomplete. In this case, we would want to provide the number of dates between today's date and the earliest Start date 3/1/2012.
Currently we have columns that subtract the current row's end date from the current row's start date. However, since each task has multiple rows, that is not the full picture.
Task Number | Row Number | Task Start | Task Finish | Current Row Aging | Current Row Cycle Time | Goal Column 1: Incomplete Task Aging: Days between today's date and the start date for the earliest row number for this task number | Goal Column 2: Completed Task Cycle Time: Days between Earliest Row Number Task Start Date and Highest Row Number Task End Date |
30006591 | 5228 | 9/30/2011 | 2/1/2012 | 124.00 | 184 | ||
16770363 | 8945 | 9/30/2011 | 4/1/2021 | 3471.00 | 4231 | ||
30006591 | 29871 | 2/1/2012 | 4/1/2012 | 60.00 | 184 | ||
6989569 | 15217 | 3/1/2012 | 3/29/2018 | 2219.00 | 4122 | ||
6989569 | 20333 | 3/30/2018 | 1/14/2019 | 290.00 | 4122 | ||
6989569 | 25436 | 1/16/2019 | 2/28/2021 | 774.00 | 4122 | ||
6989569 | 28846 | 3/1/2021 | 12/31/2021 | 305.00 | 4122 | ||
16770363 | 34936 | 4/1/2021 | 5/1/2023 | 760.00 | 4231 | ||
6989569 | 39371 | 1/1/2022 | 529.00 | 4122 | |||
34364789 | 45272 | 2/4/2022 | 495.00 | 495 |
Previously, Felix provided an excellent Power Query solution, but I learned this morning that I will not be able to use power query to solve my problem. Inspired by Felix, I tried to create a version of his solution for a DAX calculated column:
TEST TASKNUMBER Days =
// Create a variable to store the current row's Task Number
VAR CurrentrowTASKNUMBER = 'Table'[Task Number]
// Create a variable to store the earliest start date for the current row's Task Number
VAR CurrentTASKNUMBERMinStartDate =
MINX(
FILTER(
ALL('Table')
, 'Table'[Task Number] = CurrentRowTASKNUMBER
)
, 'Table'[Task Start]
)
// Create a variable to store the most recent Finish Date. Later, this variable will be used later to find the cycle time for completed tasks.
VAR CurrentTASKNUMBERMaxFinishDate =
MAXX(
FILTER(
ALL('Table')
, 'Table'[Task Number] = CurrentRowTASKNUMBER
)
, 'Table'[Task Finish]
)
// Create a variable to store the earliest finish date for the current row's Task Number. This was intended to find incomplete tasks that have a blank for "Final Complete Date" on one of their rows. This is where I ran into a problem: in DAX, this returns the lowest populated date and ignores rows with blank dates.
VAR CurrentTASKNUMBERMinFinishDate =
MINX(
FILTER(
ALL('Table')
, 'Table'[Task Number] = CurrentRowTASKNUMBER
)
, 'Table'[Task Finish]
)
RETURN
SWITCH(
TRUE()
, ISBLANK(CurrentTASKNUMBERMinFinishDate)
, DATEDIFF(CurrentTASKNUMBERMinStartDate, TODAY(), DAY)
, DATEDIFF(CurrentTASKNUMBERMinStartDate, CurrentTASKNUMBERMaxFinishDate, DAY)
)
Does anyone have advice for how I can update my column to find blanks, or rewrite this as a measure?
Right after I submitted my topic, I came up with something to help determine if a row is blank.
TEST TASKNUMBER Days =
VAR CurrentrowTASKNUMBER = 'Table'[Task Number]
// Returns correct answer
VAR CurrentTASKNUMBERMinStartDate =
MINX(
FILTER(
ALL('Table')
, 'Table'[Task Number] = CurrentRowTASKNUMBER
)
, 'Table'[Task Start]
)
// Returns correct answer
// I created a formula that returns the task number if the task is incomplete; otherwise it returns a blank on completed tasks
VAR ReturnTaskNumberIfTaskIncomplete =
MINX(
FILTER(
ALL('Table')
, 'Table'[Task Number] = CurrentRowTASKNUMBER && ISBLANK('Table'[Finish Date])
)
, CurrentRowTASKNUMBER
)
VAR CurrentTASKNUMBERMaxFinishDate =
MAXX(
FILTER(
ALL('Table')
, 'Table'[Task Number] = CurrentRowTASKNUMBER
)
, 'Table'[Task Finish]
)
RETURN
SWITCH(
TRUE()
, ISBLANK(ReturnTaskNumberIfTaskIncomplete) //If the task is complete, this is blank
, DATEDIFF(CurrentTASKNUMBERMinStartDate, CurrentTASKNUMBERMaxFinishDate, DAY) // Subtract the most recent finish date from the earliest start date
, DATEDIFF(CurrentTASKNUMBERMinStartDate, TODAY(), DAY) // If the task is incopmlete, subtract the earliest start date from today's date
My current question is if there's a more efficient way I could write this calculated column, or if I could write it as a measure. I'm concerned about the number of ALL()'s.
)
User | Count |
---|---|
93 | |
85 | |
78 | |
68 | |
63 |
User | Count |
---|---|
113 | |
99 | |
97 | |
64 | |
59 |