Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Sample dataset:
Date | Key | Value |
04.08.2022 | A | 3 |
07.08.2022 | A | 7 |
05.08.2022 | B | 11 |
07.08.2022 | B | 13 |
The dates are NOT sequental; this is important as this assumption is freely taken in many similar cases. Dates could be anything, just random.
The goal is to be able to produce a measure showing average value as last known at any given date. Example result set with this measure would be:
Date | Key | "Latest known value" |
04.08.2022 | A | 3 |
04.08.2022 | B | (null) |
05.08.2022 | A | 3 |
05.08.2022 | B | 11 |
06.08.2022 | A | 3 |
06.08.2022 | B | 11 |
07.08.2022 | A | 7 |
07.08.2022 | B | 13 |
At the same time, it should produce correct averages when grouped without the key
Date (from calendar table) | "Latest known value" (average) | Comment |
04.08.2022 | 3 | (3+null)/1 |
05.08.2022 | 7 | (3+11)/2 |
06.08.2022 | 7 | (3+11)/2 |
07.08.2022 | 10 | (7+13)/2 |
If it wasn't for the Key, I could have used this solution (CT is my calendar table and Msr is my AVERAGE(Value) measure):
// VAR CurrentContextKeyValue = ???
Why would a random third party come and mark an INCORRECT reply here as an "accepted solution"? I've explained in details why proposed solution doesn't actually solve anything as it cannot be applied anywhere outside of a sandbox containing super small dataset.
Forum mods, you really need to change your policy and disallow anyone but OP to mark as solution
Hi @DmitryKo
Attached sample file with the solution
Latest known value =
VAR Result =
AVERAGEX (
CALCULATETABLE ( VALUES ( Tbl[Key] ), ALL ( CT ) ),
CALCULATE (
VAR CurrentDate =
MAX ( CT[Date] )
VAR SelectedDates =
FILTER ( ALLSELECTED ( CT[Date] ), CT[Date] <= CurrentDate )
VAR MsrTable =
FILTER (
ADDCOLUMNS (
SelectedDates,
"@MSR",
CALCULATE ( AVERAGE ( Tbl[Value] ) )
),
[@MSR] <> BLANK ()
)
VAR LastNonBlankRecord =
TOPN ( 1, MsrTable, [Date] )
RETURN
MAXX ( LastNonBlankRecord, [@MSR] )
)
)
RETURN
IFERROR ( VALUE ( Result & "" ), "" )
Hello.
This one seems to be working correctly, although I'd expect for it to be working out of the box with the simple ALLEXCEPT() function.
I would have to check how it works in case of more than one dimension, but I guess adding another VALUES(Dimension2[Key]) should solve it.
UPDATE: over a very small dataset (single dimension, ~1000 unique keys, less than 500 rows in the fact table spanning around 2 months total and around 100 keys of that single dimension) this measure's performance turned out to be extremely slow. I had a +12GB jump in memory usage and around 5 minutes of 100% CPU utilization after trying to open the report in the browser.
The only difference was that metric added.
Surely this is not a solution for any real data set (dozens of millions of rows of facts, 10-20 dimensions, several years timeframe span).
Hi @DmitryKo
Is the following good enough? Only problem when slicing by Key the record 04.08.2022 - B - (null) does not appear. I guess it can be solved but with extra DAX thus trying to avoid unnecessary extra code.
Hi @DmitryKo
please try the folliwing for both visuals
Latest known value =
AVERAGEX (
VALUES ( TableName[Key] ),
CALCULATE (
MAXX (
FILTER (
CALCULATETABLE ( TableName, ALLEXCEPT ( TableName, TableName[Key] ) ),
TableName[Date] <= MAX ( TableName[Date] )
),
TableName[Value]
)
)
)
As I mentioned in the initial message, the dates in the fact table are not sequential. Thus, this code only results in non-null values for dates representing non-null dates in the fact table; so it doesn't return latest known value at any given date.
Hi @DmitryKo
if you tried the code please update me with the results that you are getting compared with your expectation.
I don't actually understand the meaning of sequential dates as the power bi columnar data engine applies no index on the data hence, the order or the rows is absolutely irrelevant unless you want to calculate something based on the order of the data as it appears. If this the case then an index column must be added using power query to help the engine understand the order of the data.
Please check my sample dataset. For key "A", there are only 2 dates in the fact table: 04.08 and 07.08. There are no values in the fact table for any dates inbetween (specifically, 05.08, 06.08).
When calculating "latest known value" measure for any of these dates inbetween, it should return latest known value. For example, when calculating for 06.08, there is no value for this date, latest known value is related to 04.08 with the value of 3, so this value should be returned by the measure.
Your code returns empty row (null value) for the said date, because it assumes that fact table has sequential dates, e.g. there is a fact value for each consecutive date.
Not sure to be honest, but you may try
Latest known value =
AVERAGEX (
VALUES ( TableName[Key] ),
CALCULATE (
VAR CurrentDate =
MAX ( CT[Date] )
VAR SelectedDates =
FILTER ( ALLSELECTED ( CT[Date] ), CT[Date] <= CurrentDate )
VAR MsrTable =
FILTER ( ADDCOLUMNS ( SelectedDates, "@MSR", [Msr] ), [@MSR] <> BLANK () )
VAR LastNonBlankRecord =
TOPN ( 1, MsrTable, [Date] )
RETURN
MAXX ( LastNonBlankRecord, [@MSR] )
)
)
Anyways, this code again only produces non-null values for dates that are non-null in the fact table. For any other dates, its null.
I guess if we change ALLSELECTED to ALL it should work. I will give it a try later on today
Thank you for the patience, but here goes the same question I've asked in a similar reply. LASTNONBLANKRECORD, from it's name, assumes certain order of rows (you can't tell first from last unless there's an ordered list). What exactly does it use to order table rows? How can this be managed? Documentation says nothing about it.
Also from the code alone it looks like it would be at least O(n^2) computational complexity, which seems to be a sure overkill for the task. Say there's a source table with at least 100k rows, and we're dead performance wise.
Let me actually narrow down the question. This code:
VAR LastTimeHadValue = MAXX(
FILTER(
ALLEXCEPT(Tbl, Tbl[Key]),
Tbl[Date] <= MAX(CT[Date])
),
Tbl[Date]
)
returns incorrect value over the sample above. It just returns the last date when any key had a value, thus ignoring the supposed logic of ALLEXCEPT(). I thought that ALLEXCEPT() over a table would keep the context filters related to the columns specified in 2nd argument - why isn't it doing so?
Hi,
I am not sure if I understood your question correctly, but please check the below picture and the attached pbix file.
If this post helps, then please consider accepting it as the solution to help other members find it faster, and give a big thumbs up.
Hello and thanks for the quick reply.
I'm afraid that the following logic in your metric uses assumption that dataset is ordered - LASTNONBLANK. The documentation says nothing on how exacly "last" value is taken. First/last logic assumes sequence in the dataset; how does LASTNONBLANK determine order of search?
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
46 | |
28 | |
22 | |
12 | |
8 |
User | Count |
---|---|
75 | |
52 | |
46 | |
15 | |
12 |