Anonymous
Not applicable

Calculated column to find the most recent value that meets certain criteria

Hello community, 

 

I would like to create a calculated column that evaluates the values from an imported data column (called "data_value").

 

What I want: if the value in "data_value" is > 0, give me that value; if not, give me the most recent value that is > 0.

 

Here is an example of what I would like:

 

Date      Data_Value  Calculated Column
01/01/20  20          20
02/01/20  21          21
03/01/20  22          22
04/01/20  23          23
05/01/20  -10         23
06/01/20  -10         23
07/01/20  -10         23
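The thread is about DAX and Power Query, but the underlying requirement is a simple "carry forward the last positive value" scan over date-sorted rows. As a language-neutral sketch of that logic (Python here, purely for illustration; it assumes the values are already sorted by date):

```python
def fill_last_positive(values):
    """Replace each non-positive value with the most recent positive one seen so far."""
    result, last_positive = [], None
    for v in values:
        if v > 0:
            last_positive = v
        result.append(last_positive)
    return result

# The sample column above:
print(fill_last_positive([20, 21, 22, 23, -10, -10, -10]))
# [20, 21, 22, 23, 23, 23, 23]
```

This single forward pass is exactly what the fill-down approaches later in the thread implement.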

 

Any suggestions? 

 

Thank you so much in advance for your help 🙂 

 

Cheers! 

14 REPLIES
v-xuding-msft
Community Support

Hi @Anonymous ,

Please try this with DAX.

Column = 
VAR a =
    MAXX (
        FILTER (
            ALL ( 'Table' ),
            'Table'[Date] < EARLIER ( 'Table'[Date] )
                && 'Table'[Data_Value] > 0
        ),
        'Table'[Date]
    )
VAR b =
    CALCULATE (
        MAX ( 'Table'[Data_Value] ),
        FILTER ( 'Table', 'Table'[Date] = a )
    )
RETURN
    IF ( 'Table'[Data_Value] < 0, b, 'Table'[Data_Value] )


Best Regards,

Xue Ding

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Anonymous
Not applicable

Hello @v-xuding-msft , thanks again for your answer. By reducing the query size I was able to create the calculated column, but I cannot use the same value for all the negative rows, since I have different kinds of objects that record data for the same date. Here is an example:

 

Date      Object  Data_Value  Desired Column
01/01/20  A       20          20
02/01/20  A       -1          20
03/01/20  A       -999        20
01/01/20  B       23          23
02/01/20  B       -10         23
03/01/20  B       -10         23
01/01/20  C       6           6
02/01/20  C       -952        6
03/01/20  C       -945        6
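With an Object column in play, the carry-forward state has to be kept per object rather than globally. A minimal sketch of that per-group logic (Python, for illustration only; it assumes rows are sorted by date within each object):

```python
def fill_last_positive_per_object(rows):
    """rows: (date, obj, value) tuples, sorted by date within each object.
    Carries the last positive value forward separately for each object."""
    last_positive = {}  # obj -> last positive value seen
    filled = []
    for date, obj, value in rows:
        if value > 0:
            last_positive[obj] = value
        filled.append((date, obj, value, last_positive.get(obj)))
    return filled

rows = [("01/01/20", "A", 20), ("02/01/20", "A", -1),
        ("01/01/20", "B", 23), ("02/01/20", "B", -10)]
print(fill_last_positive_per_object(rows))
# last element of each tuple (the desired column): 20, 20, 23, 23
```

In Power Query terms this corresponds to grouping by Object before filling down; in DAX, to an extra filter on the Object column, as shown later in the thread.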

 

Also, the calculated column you suggested returns the maximum value, not the latest positive value. Any suggestions for this?

 

Thank you so much in advance. 

 

 

 

Hi @Anonymous  

Any feedback on my solution? Just to say that it works on your new sample data as well:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUByIjAyMDJR0lRyAGM4BErA5Q0ghNUtcQSdIYXdLS0hJJGtlgJ5C4MYTAMNgJbLABkqwxXllkg52B2AyMMYx1BjvJ1AghawyWhEuZmEKkYgE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"Date " = _t, Object = _t, Data_Value = _t, #"Desired Column" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Data_Value", Int64.Type}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "CalculatedColumn", each if [Data_Value] > 0 then [Data_Value] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"CalculatedColumn"})
in
    #"Filled Down"

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Anonymous
Not applicable

Hello @ImkeF , I have previous transformation steps and a query modification, as follows:

 

let
    Source = Oracle.Database(XXX, [HierarchicalNavigation=true, Query="select#(lf) *#(lf)from#(lf) icone.mesures#(lf)where#(lf) date_val > to_date('01/01/2019','DD/MM/YYYY') #(lf)"]),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(Source, {"CODE_MESURE", "DATE_VAL"}, "Attribute", "Value"),
    #"Split Column by Position" = Table.SplitColumn(#"Unpivoted Columns", "Attribute", Splitter.SplitTextByPositions({0, 1}, false), {"Attribute.1", "Attribute.2"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Position",{{"Attribute.1", type text}, {"Attribute.2", Int64.Type}, {"DATE_VAL", type datetime}}),
    #"Extracted Date" = Table.TransformColumns(#"Changed Type",{{"DATE_VAL", DateTime.Date, type date}}),
    #"Extracted Date1" = Table.TransformColumns(#"Extracted Date",{})
in
    #"Extracted Date1"

 

 

I forgot to say that this transformation also lets me create a date-time column, since the original data had only a date, and the measures taken over time were represented as columns showing the values taken every half hour. After this query I created a calculated column as follows:

 

DATE = FORMAT(MESURES[DATE_VAL]+TIME(0,'MESURES'[Attribute.2]*30,0),"DD-MM-YYYY HH:MM:SS")

 

This is the Date I would like to use in the query so it has to be included in the advanced editor as well. 

Can you help me to integrate your solution in my already existing query, to form one query with all the required steps? 

 

Thanks again, 

Hi @Anonymous 

Looks doable, however to be on the safe side here, please give sample data that matches the column names you have used in the DAX-code.

Also, please indent your code to make it readable, thanks.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Anonymous
Not applicable

@ImkeF , here is the full description of my issue:

 

1) The raw data file comes like this : 

 

CODE_MESURE  DATE_VAL  D00  D01  D02    ...  D47
A            01/01/20  20   21   -9999  ...  -9999
A            02/01/20  22   20   -9999  ...  -9999
B            01/01/20  30   30   -9999  ...  -9999
B            02/01/20  32   32   -9999  ...  -9999
C            01/01/20  1    1    -9999  ...  -9999
C            02/01/20  2    1    -9999  ...  -9999

 

Each DXX column represents a value taken at a fixed time (every half hour) of the day given by DATE_VAL. For example, the first row has the values taken every half hour on January 1st 2020 for measure A. The database creates a row for each measure at 12:00 am of the day the measure period belongs to, but until the time of a given measure has been reached during the day, that cell holds -9999 (the negative values I am avoiding). For example: on January 1st at 12:45 am, the first row will show the values of measure A at 12:00 am (D00) and 12:30 am (D01), but the value at 1:00 am (D02) will be -9999 since that measure has not been taken yet.

 

Since I need to track the last value measured for each of the measures, I transform my DXX columns into rows, using this:

 

 

let
    Source = Oracle.Database(XXX, [HierarchicalNavigation=true, Query="select#(lf) *#(lf)from#(lf) icone.mesures#(lf)where#(lf) date_val > to_date('01/01/2019','DD/MM/YYYY') #(lf)"]),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(Source, {"CODE_MESURE", "DATE_VAL"}, "Attribute", "Value"),
    #"Split Column by Position" = Table.SplitColumn(#"Unpivoted Columns", "Attribute", Splitter.SplitTextByPositions({0, 1}, false), {"Attribute.1", "Attribute.2"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Position",{{"Attribute.1", type text}, {"Attribute.2", Int64.Type}, {"DATE_VAL", type datetime}}),
    #"Extracted Date" = Table.TransformColumns(#"Changed Type",{{"DATE_VAL", DateTime.Date, type date}}),
    #"Extracted Date1" = Table.TransformColumns(#"Extracted Date",{})
in
    #"Extracted Date1"

 

 

Then, I created a calculated column to assign a time to each of the values in the DXX rows, so this becomes a date-time column:

 

DATE = FORMAT(MESURES[DATE_VAL]+TIME(0,'MESURES'[Attribute.2]*30,0),"DD-MM-YYYY HH:MM:SS")
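The formula above adds `Attribute.2 * 30` minutes to DATE_VAL, i.e. slot D00 is midnight, D01 is 00:30, and so on up to D47 at 23:30. The slot-to-timestamp mapping can be sketched like this (Python, for illustration only):

```python
from datetime import datetime, timedelta

def slot_timestamp(date_val, slot_index):
    """Map a half-hour slot index to a timestamp: D00 -> 00:00, D01 -> 00:30, ..., D47 -> 23:30."""
    return date_val + timedelta(minutes=30 * slot_index)

print(slot_timestamp(datetime(2020, 1, 1), 1))   # 2020-01-01 00:30:00
print(slot_timestamp(datetime(2020, 1, 1), 47))  # 2020-01-01 23:30:00
```

One caveat on the DAX itself: `FORMAT` returns a text value, so if this column needs to be sorted or compared chronologically, keeping it as a datetime (without `FORMAT`) may be preferable.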
 
Up to this point, my data looks like this, assuming that it is 01/01/2020 at 12:45 am: 
 
CODE_MESURE  DATE_VAL  Attribute.2  DATE               Value
A            01/01/20  0            01/01/20 12:00 am  20
A            01/01/20  1            01/01/20 12:30 am  21
A            01/01/20  2            01/01/20 1:00 am   -9999
B            01/01/20  0            01/01/20 12:00 am  30
B            01/01/20  1            01/01/20 12:30 am  30
B            01/01/20  2            01/01/20 1:00 am   -9999
C            01/01/20  0            01/01/20 12:00 am  1
C            01/01/20  1            01/01/20 12:30 am  1
C            01/01/20  2            01/01/20 1:00 am   -9999

 

Now, since each row in the initial data is created at 12:00 am, my transformed data has rows for every time of that day even if the measure has not been taken yet, containing -9999 values for the "future" measures. In order to track the last value taken, I want to show in my report the last real measure taken as of the time of day at which the report is consulted. And here is where your solution would come in.

 

I hope the issue is clearer now 🙂

 

Again, many thanks! 

 

For the DAX solution to work you have to include filters on Object level, like so:

 

Column = 
VAR a =
    MAXX (
        FILTER (
            ALL ( 'Table' ),
            'Table'[Date] < EARLIER ( 'Table'[Date] )
                && 'Table'[Data_Value] > 0
                && 'Table'[Object] = EARLIER ( 'Table'[Object] )
        ),
        'Table'[Date]
    )
VAR b =
    CALCULATE (
        MAX ( 'Table'[Data_Value] ),
        FILTER ( 'Table', 'Table'[Date] = a 
            && 'Table'[Object] = EARLIER('Table'[Object] ) 
            )
    )
RETURN
    IF ( 'Table'[Data_Value] < 0, b, 'Table'[Data_Value] )

 

But this might consume more RAM than my PQ solution (due to the EARLIERs).

 

Imke Feldmann (The BIccountant)


Anonymous
Not applicable

Hello, and thank you for your response. Unfortunately my PC crashed after trying your suggestion. My data set has more than 20 million rows, so apparently the task is too heavy for my PC.

 

Thank you for your help!

ImkeF
Super User

Hi @Anonymous 

this can be done in the query editor like so:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUByIjAyMDJR0lIBGrAxQzQhYzhIgZI4sZQcRMkMWMIWKmSGK6hlADzbAJmqMLxgIA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"Date " = _t, Data_Value = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date ", type date}, {"Data_Value", Int64.Type}}),
    ListOfNumbers = List.Buffer( #"Changed Type"[Data_Value] ),
    NewColumn = List.Accumulate(
        ListOfNumbers, 
        {}, 
        (state, current) => 
            if current < 0 
                then state & {List.Last(state)} 
                else state & { current } 
                ),
    Custom1 = Table.FromColumns( Table.ToColumns( #"Changed Type" ) & { NewColumn } , Table.ColumnNames( #"Changed Type" ) & {"Calculated Column"} )
in
    Custom1

Paste the code into the advanced editor and follow the steps.
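For what it's worth, the `List.Accumulate` step above is a classic fold over the value list. The same logic in Python's `functools.reduce` (illustration only; like the M version, it misbehaves if the very first value is negative, since there is nothing to carry forward yet):

```python
from functools import reduce

values = [20, 21, 22, 23, -10, -10, -10]

# state is the output list built so far; a negative value repeats the last entry,
# mirroring `state & {List.Last(state)}` in the M code
filled = reduce(
    lambda state, current: state + [state[-1] if current < 0 else current],
    values,
    [],
)
print(filled)  # [20, 21, 22, 23, 23, 23, 23]
```

Note that, as in the M version, the accumulated list grows by copy on every step, which is part of why this approach struggles on very large inputs.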

Imke Feldmann (The BIccountant)


Anonymous
Not applicable

Hello, and thank you for your response. Unfortunately my PC crashed after trying your suggestion. My data set has more than 20 million rows, so apparently the task is too heavy for my PC. I have tried both solutions proposed in this post, but I had to restart Power BI in both cases.

 

Thank you for your help!

Hi @Anonymous ,

You could use Performance Analyzer to examine report element performance.

 

https://docs.microsoft.com/en-us/power-bi/desktop-performance-analyzer 

https://www.sqlbi.com/articles/introducing-the-power-bi-performance-analyzer/

 

Best Regards,

Xue Ding

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.


Hi @Anonymous 

It is possible that List.Generate performs better. But please be prepared that this might also freeze your machine:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUByIjAyMDJR0lIBGrAxQzQhYzhIgZI4sZQcRMkMWMIWKmSGK6hlADzbAJmqMLxgIA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"Date " = _t, Data_Value = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date ", type date}, {"Data_Value", Int64.Type}}),
    ListOfNumbers = List.Buffer( #"Changed Type"[Data_Value] ),
    NewColumn = List.Skip(List.Generate( ()=>
        [Result = {}, Counter = 0],
        each [Counter] <= List.Count(ListOfNumbers),
        each [
            CurrentElement = ListOfNumbers{[Counter]},
            Result = if CurrentElement < 0
                        then [Result]
                        else CurrentElement,
            Counter = [Counter] + 1
        ],
        each [Result] 
       )),
    Custom1 = Table.FromColumns( Table.ToColumns( #"Changed Type" ) & { NewColumn } , Table.ColumnNames( #"Changed Type" ) & {"Calculated Column"} )
in
    Custom1

How much RAM do you have & how many rows?

Imke Feldmann (The BIccountant)


Hi @Anonymous 

It looks like I've overcomplicated your requirement.

Replacing negative values with null and then filling down the (remaining) positive values should give what you're after, and should also be fairly fast:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUByIjAyMDJR0lIBGrAxQzQhYzhIgZI4sZQcRMkMWMIWKmSGK6hlADzbAJmmMTtEASBNocCwA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"Date " = _t, Data_Value = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date ", type date}, {"Data_Value", Int64.Type}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "CalculatedColumn", each if [Data_Value] > 0 then [Data_Value] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"CalculatedColumn"})
in
    #"Filled Down"

 

Imke Feldmann (The BIccountant)


Anonymous
Not applicable

Hello @ImkeF ! And thanks for taking the time to answer the topic. I have been discussing with the IT team and we should reduce the query size, since the original had 21 million rows, which is apparently quite a lot for 8 GB of RAM. With a reduced query I was able to create a calculated column as the other user suggested, but the issue is that there are different types of objects, so I cannot just take one last value to replace all the others. Here is the complete example:

 

Date      Object  Data_Value  Desired Column
01/01/20  A       20          20
02/01/20  A       -1          20
03/01/20  A       -999        20
01/01/20  B       23          23
02/01/20  B       -10         23
03/01/20  B       -10         23
01/01/20  C       6           6
02/01/20  C       -952        6
03/01/20  C       -945        6

 

Do you have any suggestions on this?

 

Again, thank you so much in advance!
