Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I have a table with transactions tied to a project ID.
When counting project duration, I need to count the total days without counting the same days twice if there are multiple transactions occurring on the same dates.
In the data example below, Project ID 1 entire duration should be 8 and Project ID 2 should also be 8.
I do have a calendar table.
ID | Start | End | Duration |
1 | 1/1/2020 | 1/2/2020 | 2 |
1 | 1/10/2020 | 1/15/2020 | 6 |
1 | 1/10/2020 | 1/15/2020 | 6 |
2 | 1/7/2020 | 1/10/2020 | 4 |
2 | 1/12/2020 | 1/15/2020 | 4 |
2 | 1/14/2020 | 1/15/2020 | 2 |
Solved! Go to Solution.
Hi @WorkHard
The DAX to follow the M code logic would be something like this:
_projDuration =
COUNTROWS(
SUMMARIZE(
ADDCOLUMNS(
GENERATEALL(
zTable2,
FILTER(
DISTINCT(cal[date]),
cal[date] >= zTable2[Start]
&& cal[date] <= zTable2[End]
)
),
"idDate", [ID] & "-" & [date]
),
[idDate]
)
)
Or this:
_projDuration2 =
VAR listTable =
GENERATEALL(
zTable2,
FILTER(
DISTINCT(cal[date]),
cal[date] >= zTable2[Start]
&& cal[date] <= zTable2[End]
)
)
RETURN
COUNTROWS(
DISTINCT(
SELECTCOLUMNS(
listTable,
"idDate", [ID] & "-" & [date]
)
)
)
With the following outputs:
I'm unsure if either of these would be very performant over very large datasets, but there's a couple of options there for you to test/compare. There's probably some scope for memory optimisation within the GENERATEALL function by not bringing in the whole of zTable2 in the first argument, but it's the combination of functions that I think might cause the most performance issues.
Pete
Proud to be a Datanaut!
Hi @WorkHard ,
In Power Query, go to New Source>Blank Query then in Advanced Editor paste my code over the default code. You can then follow the steps I took to complete this:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTLUN9Q3MjAyADONIMxYHbicAULS0JRYWSOwiDmSpAG6pKERPq2GJlhkYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Start = _t, End = _t]),
chgDateTypes = Table.TransformColumnTypes(Source, {{"Start", type date}, {"End", type date}}, "en-US"),
addDtList = Table.AddColumn(chgDateTypes, "dtList", each List.Transform({Number.From([Start])..Number.From([End])}, each Date.From(_))),
expandDtList = Table.ExpandListColumn(addDtList, "dtList"),
addIdDate = Table.AddColumn(expandDtList, "idDate", each Text.Combine({[ID], Text.From([dtList], "en-GB")}, "-"), type text),
remDupeIdDate = Table.Distinct(addIdDate, {"idDate"}),
groupRows = Table.Group(remDupeIdDate, {"ID"}, {{"firstStart", each List.Min([Start]), type nullable date}, {"lastEnd", each List.Max([End]), type nullable date}, {"duration", each Table.RowCount(_), Int64.Type}})
in
groupRows
Pete
Proud to be a Datanaut!
Hi Pete,
Trying to follow your logic in DAX as I'm unable to change the code in Power Query due to my setup.
Sorry for not mentioning that.
I would generate a table series with start and end dates first.
DateSeries =
GENERATE(
var _DateStart = 'projdata'[Start date]
var _DateEnd = 'projdata'[End date]
return
GENERATESERIES(_DateStart , _DateEnd , 1)
)
And then from here, I can just count the number of days from this generated series grouped by project ID.
How would I group it from here by Project ID? WIth a measure or a calculated column?
Hi @WorkHard
The DAX to follow the M code logic would be something like this:
_projDuration =
COUNTROWS(
SUMMARIZE(
ADDCOLUMNS(
GENERATEALL(
zTable2,
FILTER(
DISTINCT(cal[date]),
cal[date] >= zTable2[Start]
&& cal[date] <= zTable2[End]
)
),
"idDate", [ID] & "-" & [date]
),
[idDate]
)
)
Or this:
_projDuration2 =
VAR listTable =
GENERATEALL(
zTable2,
FILTER(
DISTINCT(cal[date]),
cal[date] >= zTable2[Start]
&& cal[date] <= zTable2[End]
)
)
RETURN
COUNTROWS(
DISTINCT(
SELECTCOLUMNS(
listTable,
"idDate", [ID] & "-" & [date]
)
)
)
With the following outputs:
I'm unsure if either of these would be very performant over very large datasets, but there's a couple of options there for you to test/compare. There's probably some scope for memory optimisation within the GENERATEALL function by not bringing in the whole of zTable2 in the first argument, but it's the combination of functions that I think might cause the most performance issues.
Pete
Proud to be a Datanaut!
Hi @WorkHard ,
Not sure if following the M code logic is the most efficient course to take with DAX.
I'll have a think about it now and see if I can come up with a DAX solution that fits. If not, I'll hand you over to one of the DAX specialists on the team.
Pete
Proud to be a Datanaut!
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
108 | |
97 | |
79 | |
67 | |
60 |
User | Count |
---|---|
148 | |
113 | |
97 | |
84 | |
67 |