Reply
avatar user
Anonymous
Not applicable

Calculating new column using row values from same table: circular dependency; avoiding EARLIER?

Hello,

 

I have a LogTable listing all mutation events on some objects.

The row format is:

 

Object_ID: the object on which mutations are carried on

Mutation_Date: Timestamp of the mutation

Mutation_Parameter: the object parameter on which the mutation is done, e.g. Status, Urgency, Other

Old_Value: the value of the object parameter prior to the mutation

New_Value: the value of the object parameter after the mutation.

 

 

2018-09-06_233216.png

 

I would like to create a new column "Status_before_Mutation" showing the value of the status parameter prior to each mutation. In case of a status mutation, this is logically equal to the existing value in column Old_Value, but in case of other mutations, I need to consider the latest change of status prior to the time of the mutation being considered.

 

For each mutation, the logic I am pursuing is to select all rows with the same Object_ID, then filter rows with MutationParameter = "Status", then filter rows with Mutation_Date prior to Mutation_Date of the mutation being considered, sort the remaining rows in descending order of Date, pick the first one and extract the column vale "Old_Value" from this row.

 

Unfortunately, I get a circular dependency error. I guess I need to exclude the new column "Status_before_Mutation" from the CALCULATETABLE row context, which I thought I did with the ALLEXCEPT... 

 

Here is the code leading to this circular dependency: 

 

Status_before_Mutation = 
    CALCULATE(
        VALUES(LogTable[Old_Value]);
        TOPN(
            1;
            CALCULATETABLE(
                LogTable;
                ALLEXCEPT(LogTable; LogTable[Object_ID]);
LogTable[Mutation_Parameter] = "Status"; LogTable[Mutation_Date] <= EARLIER(LogTable[Mutation_Date]) ); LogTable[Mutation_Date]; DESC ) )

 

Another potential issue is maybe the use of EARLIER, which I have read is not recommended for large tables.

The LogTable is quite big, up to several hundred thousands rows. I have several thousands different objects each with a number of mutations in the range [1...50].

 

Any idea on solving this, possible avoiding the use of EARLIER?

 

Thanks for any suggestion.

 

Karim

1 ACCEPTED SOLUTION
avatar user
Anonymous
Not applicable

Hi Jimmy,

 

Thanks for your proposition. I have never used RANKX before....

In the meantime, I could get rid of the circular dependency problem I had by using DISTINCT instead of VALUES in my CALCULATE new column:

 

Status_after_Mutation = 
    CALCULATE(
        DISTINCT(LogTable[New_Value]);
        TOPN(
            1;
            CALCULATETABLE(
                LogTable;
                ALLEXCEPT(LogTable; LogTable[Object_ID]);
                LogTable[Mutation_Parameter] = "Status";
                LogTable[Mutation_Date] <= EARLIER(LogTable[Mutation_Date])
            );
            LogTable[Mutation_Date];
            DESC
        )
    )               

Also, I corrected the column to be "Status_after_Mutation" using the LogTable[New_Value].

 

Now I am getting what I want (see marked column below):

 

2018-09-07_234847.png

 

In the last column, I have also included what I got with your proposition:

 

Status_before_Mutation = 
VAR Ranking = RANKX(FILTER(LogTable, LogTable[Object_ID] = EARLIER(LogTable[Object_ID]) && LogTable[Mutation_Date] <= EARLIER(LogTable[Mutation_Date])), RANKX(ALL(LogTable), LogTable[Mutation_Date]), , DESC, Dense)
RETURN
IF(Ranking = 1, LogTable[New_Value])

 

I have two questions now:

- Your solution only provides the status for a status mutation. I tried to modify it without success, as I don't quite understand the use of RANKX within RANKX. How should I modify it to get the current status for all mutations on the same object? 

 

- With my solution using TOPN on a big mutation table (> 200,000 rows), I get the following error from PowerBI
"There is not enough memory to complete this operation. Please try again later when there may be more memory available".
Is RANKX or another approach more memory efficient on big tables? Of course I could first split the table in smaller tables before computing the new columns but if I could avoid it with a better method, that would be interesting.

 

Thanks

Karim

 

View solution in original post

3 REPLIES 3
avatar user
Anonymous
Not applicable

Addendum:

Based on the LogTable I gave as an example, here is the new column "Status_before_Mutation" I would like to get:

 

Object_IDMutation_DateMutation_ParameterOld_ValueNew_ValueStatus_before_Mutation
Object_A25.01.2018 17:38Status open 
Object_A25.01.2018 17:40Urgency 1open
Object_A26.01.2018 04:15Other yyopen
Object_A26.01.2018 07:30Urgency12open
Object_A29.01.2018 08:47Statusopenworkopen
Object_A29.01.2018 17:54Urgency21work
Object_A30.01.2018 06:08Otheryyzzwork
Object_A30.01.2018 06:53Statusworkreviewwork
Object_A31.01.2018 13:30Statusreviewclosedreview
Object_B24.01.2018 08:47Status open 
Object_B24.01.2018 08:47Urgency 3open
Object_B26.01.2018 14:43Urgency32open
Object_B26.01.2018 15:02Other xxopen
Object_B02.02.2018 15:38Statusopenworkopen
Object_B07.02.2018 10:06Urgency21work
Object_B07.02.2018 10:06Statusworkclosedwork

Hi Karim11,

 

Modify your calculate column using DAX below and check if it can meet your requirement:

Status_before_Mutation = 
VAR Ranking = RANKX(FILTER(LogTable, LogTable[Object_ID] = EARLIER(LogTable[Object_ID]) && LogTable[Mutation_Date] <= EARLIER(LogTable[Mutation_Date])), RANKX(ALL(LogTable), LogTable[Mutation_Date]), , DESC, Dense)
RETURN
IF(Ranking = 1, LogTable[New_Value])

Capture.PNG 

 

Regards,

Jimmy Tao

avatar user
Anonymous
Not applicable

Hi Jimmy,

 

Thanks for your proposition. I have never used RANKX before....

In the meantime, I could get rid of the circular dependency problem I had by using DISTINCT instead of VALUES in my CALCULATE new column:

 

Status_after_Mutation = 
    CALCULATE(
        DISTINCT(LogTable[New_Value]);
        TOPN(
            1;
            CALCULATETABLE(
                LogTable;
                ALLEXCEPT(LogTable; LogTable[Object_ID]);
                LogTable[Mutation_Parameter] = "Status";
                LogTable[Mutation_Date] <= EARLIER(LogTable[Mutation_Date])
            );
            LogTable[Mutation_Date];
            DESC
        )
    )               

Also, I corrected the column to be "Status_after_Mutation" using the LogTable[New_Value].

 

Now I am getting what I want (see marked column below):

 

2018-09-07_234847.png

 

In the last column, I have also included what I got with your proposition:

 

Status_before_Mutation = 
VAR Ranking = RANKX(FILTER(LogTable, LogTable[Object_ID] = EARLIER(LogTable[Object_ID]) && LogTable[Mutation_Date] <= EARLIER(LogTable[Mutation_Date])), RANKX(ALL(LogTable), LogTable[Mutation_Date]), , DESC, Dense)
RETURN
IF(Ranking = 1, LogTable[New_Value])

 

I have two questions now:

- Your solution only provides the status for a status mutation. I tried to modify it without success, as I don't quite understand the use of RANKX within RANKX. How should I modify it to get the current status for all mutations on the same object? 

 

- With my solution using TOPN on a big mutation table (> 200,000 rows), I get the following error from PowerBI
"There is not enough memory to complete this operation. Please try again later when there may be more memory available".
Is RANKX or another approach more memory efficient on big tables? Of course I could first split the table in smaller tables before computing the new columns but if I could avoid it with a better method, that would be interesting.

 

Thanks

Karim

 

avatar user

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

ArunFabCon

Microsoft Fabric Community Conference 2025

Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.

December 2024

A Year in Review - December 2024

Find out what content was popular in the Fabric community during 2024.

Top Solution Authors (Last Month)
Top Kudoed Authors (Last Month)