Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply

Custom column index

Hi Guys,

 

First time poster but I have been working with Powerbi for couple of months so not completely new to it. I have been working on this problem for couple of days and have looked through multiple post including following solution by @ImkeF in the forms but couldn't get it to work for my case. 

Solved: Custom column Index or Ranking by other column - Microsoft Power BI Community

 

See sample tables below. I have 100s of different site and 100s of operations and each operation can have multiple steps. I need to keep track of number of times (Attempts) someone tried to complete an operation at specific site. Every time some start an operation, we increase the Attempt number (i.e. every time step is xxx.001). 

 

People can put operation on hold and jump to different opeartion or they can abort an operation and restart it or start different operation. Even if people abort on the first step (xxx.001) it will count as an attempt. 

 

Lastly, sometime people quickly click through operations so multiple rows (operations/steps) can have same time stamp.

 

Any help will be appreciated. Also please keep in mind that my data has 20,000 to couple 100,000 rows.

 

Thank you very much.

 

Sample Table

Site

Operation

Step

Status

Time Stamp

Attempt

A1

100

100.001

Complete

2020-10-15 2:55

1

A1

100

100.002

Complete

2020-10-15 2:56

1

A1

100

100.002

Complete

2020-10-15 2:59

1

A1

101

101.001

Complete

2020-10-15 3:02

1

A1

101

101.002

Complete

2020-10-15 3:03

1

A1

101

101.003

Hold

2020-10-15 3:04

1

A1

100

100.001

Complete

2020-10-15 6:18

2

A1

100

100.002

Abort

2020-10-15 6:21

2

A1

101

101.003

Complete

2020-10-15 6:22

1

A1

101

101.004

Complete

2020-10-15 6:22

1

A1

100

100.001

Complete

2020-10-15 6:22

3

A1

100

100.002

Complete

2020-10-15 6:22

3

A1

100

100.003

Complete

2020-10-15 6:23

3

A2

101

101.001

Complete

2020-10-18 17:04

1

A2

101

101.002

Complete

2020-10-18 17:05

1

A2

101

101.003

Complete

2020-10-18 17:23

1

A2

100

100.001

Complete

2020-10-18 18:06

1

A2

100

100.002

Complete

2020-10-18 18:10

1

A2

101

101.001

Complete

2020-10-18 18:13

2

A2

101

101.002

Hold

2020-10-18 18:15

2

A2

100

100.001

Complete

2020-11-03 13:29

2

A2

100

100.002

Complete

2020-11-03 13:30

2

A2

101

101.001

Abort

2020-11-03 13:35

3

A2

101

101.001

Complete

2020-11-03 13:35

4

 

I have colour coded different groups in the image below.

Capture1.PNG

1 ACCEPTED SOLUTION
AlB
Super User
Super User

@rashidmuhammed 

Try this, with a minor change on how the ties are broken. It uses the Index column, which we didn't have at the beginning but is very useful. Seems to match the expected results on your uploaded  sample:

 

Attempt V2 =
VAR breakTies_ =
    RANKX (
        CALCULATETABLE (
            Table1,
            ALLEXCEPT ( Table1, Table1[Site], Table1[Operation] ),
            Table1[Time Stamp] = EARLIER ( Table1[Time Stamp] ),
            RIGHT ( Table1[Step], 3 ) = "001"
        ),
        Table1[Index]
    ) - 1
RETURN
    CALCULATE (
        COUNT ( Table1[Step] ),
        ALLEXCEPT ( Table1, Table1[Site], Table1[Operation] ),
        RIGHT ( Table1[Step], 3 ) = "001",
        Table1[Time Stamp] <= EARLIER ( Table1[Time Stamp] )
    ) - breakTies_

 

 

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

SU18_powerbi_badge

 

View solution in original post

15 REPLIES 15
AlB
Super User
Super User

@rashidmuhammed 

Try this, with a minor change on how the ties are broken. It uses the Index column, which we didn't have at the beginning but is very useful. Seems to match the expected results on your uploaded  sample:

 

Attempt V2 =
VAR breakTies_ =
    RANKX (
        CALCULATETABLE (
            Table1,
            ALLEXCEPT ( Table1, Table1[Site], Table1[Operation] ),
            Table1[Time Stamp] = EARLIER ( Table1[Time Stamp] ),
            RIGHT ( Table1[Step], 3 ) = "001"
        ),
        Table1[Index]
    ) - 1
RETURN
    CALCULATE (
        COUNT ( Table1[Step] ),
        ALLEXCEPT ( Table1, Table1[Site], Table1[Operation] ),
        RIGHT ( Table1[Step], 3 ) = "001",
        Table1[Time Stamp] <= EARLIER ( Table1[Time Stamp] )
    ) - breakTies_

 

 

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

SU18_powerbi_badge

 

Thank yoouuu AIB this worked perfectly and it didn't take too long to run either(only couple of seconds).

mahoneypat
Employee
Employee

Here is another thing to try.  Add a new Index column to the data in the query editor and then use this updated column expression.

 

Attempts 3 =
VAR vThisDateTime = Attempts[Time Stamp]
VAR vThisIndex = Attempts[Index]
VAR vPrev1s =
    CALCULATE (
        COUNTROWS ( Attempts ),
        ALLEXCEPT (
            Attempts,
            Attempts[Site],
            Attempts[Operation]
        ),
        Attempts[Index] <= vThisIndex,
        Attempts[Time Stamp] <= vThisDateTime,
        FILTER (
            ALL ( Attempts[Step] ),
            RIGHT (
                Attempts[Step],
                1
            ) = "1"
        )
    )
RETURN
    vPrev1s

 

Regards,

Pat





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


Thanks Pat, this worked. But when I took it to my main dataset with 20,000+ rows computer took long time and then I got a message saying not enough memory. My laptop has 32GB memory and even when i upload my program to powerbi services and tried to refresh it with full dataset it give similar error on memory after trying for 30min.

 

Is there a way to make this run faster? In next couple of months I will be getting more data and my data set increase to couple hundred thousand rows.

 

Capture5.PNG

v-alq-msft
Community Support
Community Support

Hi, @rashidmuhammed 

 

Based on your description, I created data to reproduce your scenario. The pbix file is attached in the end.

a1.png

 

You may group rows and then expand the table column as is suggested by @ImkeF. Here are the m codes is 'Advanced Editor'.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ndNBDoIwEAXQq5iuicxMKUJ3xI13IGyM7DAYw/1jEcSCM7W6aYDw6P9tqWtVoUoUAkzjHmC8P/bXW9cO7fNhiiYlINiRNUY1CWMoZPI/TOmbeQxm09Z9kDXyPM5o3mh3deq7y8f7Gd9FzpVbLOT+1bm/D1tAKIeSJiGhfBZt4spIRl5k2QT7zBtD3w9AMRk8LLuzRUw4DxkeMeneaB0vuHQvVFjIeRSIV1iEnxfCIWH1aH2sPWDi+mCqZ6MtlXF1fKMDbZZ/YQWE3QkFG03zAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Site = _t, Operation = _t, Step = _t, Status = _t, #"Time Stamp" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Site", type text}, {"Operation", Int64.Type}, {"Step", type number}, {"Status", type text}, {"Time Stamp", type datetime}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Site", "Operation", "Step"}, {{"Test", each Table.AddIndexColumn(_,"Attempt",1,1), type table }}),
    #"Expanded Test" = Table.ExpandTableColumn(#"Grouped Rows", "Test", {"Status", "Time Stamp", "Attempt"}, {"Test.Status", "Test.Time Stamp", "Test.Attempt"})
in
    #"Expanded Test"

 

Result:

a2.png

 

Best Regards

Allan

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

AlB
Super User
Super User

@rashidmuhammed 

Without the tie break part, @mahoneypat's code and mine are essentially the same, if I'm not mistaken. 

If you copy a table (or fragment) with the Copy table option from PBI or from Excel and paste it here it should be shown nicely as a table (unless perhaps it is too large).

 

1) Can you please share a file (Excel best)  with the data you show in the screen caps above so that I can copy and run some tests on it? Please highlight the rows where the code fails so that I can review it. 

2) Ok, so if there can be 021 we'll check for "001" at the rightmost part of the string instead  of only for "1". Would that be enough? Any other things we need to take into account?

3) If you've already improved the code please share your latest version

4) Have you tried the M solution? Does it have the same issues as the DAX one?

5) How did the code do in terms of speed? Too slow?

 

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

SU18_powerbi_badge

 

Hi AIB,

 

1) You can find sample data at this link. https://utoronto-my.sharepoint.com/:x:/g/personal/rashid_muhammad_mail_utoronto_ca/EfWZPER-dLxPuCMyz...

 

2)Yes this is smart, after looking at your reply I was thinking that I am very stupid for overlooking the obvious solution 😛

 

3)I did not really improve the code I just created another column that tells us if it its first step or not, but your second point has made this change irrelevant. @mahoneypat code is giving me correct answer for most part except for when there is repeat attempt within same time frame.

 

4)M Code ran smoothly for most part, except when there is repeat attempt within same time frame, then it created duplicated row of first step and labelled one 1 and other 2. See image below.

 

Also, I made slight change in the M Query. In the sorted rows step, I also added sort by index, since index is present in the raw data and it helps make sure that steps are in the proper order when time stamp is same. But I do not know how to take it further and make sure that they have correct attempt number.

#"Sorted Rows" = Table.Sort(#"Changed Type",{{"Site", Order.Ascending}, {"Operation", Order.Ascending}, {"Time Stamp", Order.Ascending},{"Index", Order.Ascending}})

 

Capture4.PNG

 

5)Both code ran surprisingly fast speed, so no issues there.

Do you guys also know why i can't add a table in my post? I tried to add table in reply to both AIB and Pat's along with the image but, i get error saying delete highlighted part and try again but nothing was highlighted.

AlB
Super User
Super User

@rashidmuhammed 

A solution in M. Place the following M code in a blank query to see the steps

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ndNBDoIwEAXQqzSsNDEyM6UI3Rk33oGw0NiFEYPB3j9CRENrp6IbAoRH/59CVSVbTFYJAjyPa4Dhetdeb42xZripUoSUgECQViqpVwFDMZP/YcqpGY/RbFL3Lwwafp3eyLCR/dm+bU4fz2fhLnyuXGPB998e2876gJAPxS1CTPlstplXhjP8kHkT7TNuDH35AIq3wc17d3zkh3ORCiM/nYPcePzoJqjQkIdRJF6hEX4eRI+Y6dHks3aBmtUHZIo4IqmpnNXHRTLS5/U3eILZoGi2GBomIBa2FdbcrbBnI46dOVyWwZfUDw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Site = _t, Operation = _t, Step = _t, Status = _t, #"Time Stamp" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Site", type text}, {"Operation", Int64.Type}, {"Step", type text}, {"Status", type text}, {"Time Stamp", type datetime}}),


    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Site", Order.Ascending}, {"Operation", Order.Ascending}, {"Time Stamp", Order.Ascending}}),
    #"Filtered Rows" = Table.SelectRows(#"Sorted Rows", each (Text.End([Step], 1) = "1")),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"Site", "Operation"}, {{"Count", each _, type table [Site=nullable text, Operation=nullable number, Step=nullable text, Status=nullable text, Time Stamp=nullable datetime, Index=number]}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([Count], "SubIndex",1)),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Count"}),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns", "Custom", {"Step", "Status", "Time Stamp", "SubIndex"}, {"Step", "Status", "Time Stamp", "SubIndex"}),
    #"Merged Queries" = Table.NestedJoin(#"Sorted Rows", {"Site", "Operation", "Step", "Status", "Time Stamp"}, #"Expanded Custom", {"Site", "Operation", "Step", "Status", "Time Stamp"}, "Expanded Custom", JoinKind.LeftOuter),
    #"Expanded Expanded Custom" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Custom", {"SubIndex"}, {"SubIndex"}),
    #"Filled Down" = Table.FillDown(#"Expanded Expanded Custom",{"SubIndex"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Filled Down",{{"SubIndex", Int64.Type}})
in
    #"Changed Type1"

 

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

SU18_powerbi_badge

mahoneypat
Employee
Employee

I was working on this at the same time, so will go ahead and share it too.  Similar to previous, but doesn't address the tie break on A2, 101.

 

Attempts 2 =
VAR vThisDateTime = Attempts[Time Stamp]
VAR vPrev1s =
    CALCULATE (
        COUNTROWS ( Attempts ),
        ALLEXCEPT (
            Attempts,
            Attempts[Site],
            Attempts[Operation]
        ),
        Attempts[Time Stamp] <= vThisDateTime,
        FILTER (
            ALL ( Attempts[Step] ),
            RIGHT (
                Attempts[Step],
                1
            ) = "1"
        )
    )
RETURN
    vPrev1s

 

Regards,

Pat





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


@mahoneypat thank pat, this code worked for most part, except for the tie break when timestamp is same for two different attempts. If possible can you provide an updated code that can handle tie break? I ran it for 20,000 rows and only found couple of instance where there was a tie break but in next couple of month my data will be in couple hundred thousand so i just wanted this covered if possible. 

 

Also, instead of step I used step2 column. Since I had more than 10 steps and 011,or 021 where all being treated as first step by RIGHT formula. so i created step2 which basically has 1 for the first step and 0 for rest of the steps.

 

Capture3.PNG

AlB
Super User
Super User

@rashidmuhammed 

It can be done in PQ, but I think it's more straightforward in DAX. It might be slow on your dataset:

Attempt =
VAR breakTies_ =
    RANKX (
        CALCULATETABLE (
            Table1,
            ALLEXCEPT ( Table1, Table1[Site], Table1[Operation] ),
            Table1[Time Stamp] = EARLIER ( Table1[Time Stamp] )
        ),
        Table1[Status]
    ) - 1
RETURN
    CALCULATE (
        COUNT ( Table1[Step] ),
        ALLEXCEPT ( Table1, Table1[Site], Table1[Operation] ),
        RIGHT ( Table1[Step], 1 ) = "1",
        Table1[Time Stamp] <= EARLIER ( Table1[Time Stamp] )
    ) - breakTies_

Note the breakTies_ part is necessary only to deal with cases like the last 2 rows in your example, where everything's equal other than the status. If it's slow we'll try to fix it

 

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

SU18_powerbi_badge

 

 

Thank you for this code. But when i tried it, in my master data its giving me 1 or 2 step behind the actual attempt number, which sometime turn into 0 and -1. I believe it has something to do with time stamp, as I mentioned earlier some times people go through these steps very quickly so time stamp stays the same. 

 

Also i had to create step2 column to identify first step since I had more than 10 steps and 011,or 021 where all being treated as 1 step by RIGHT formule.

 

I have provided sample snippets from my data below. 

Capture2.PNG

 

AlB
Super User
Super User

Hi @rashidmuhammed 

I don't understand. In your sample data, what is the input and what the expected result you need?

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

SU18_powerbi_badge

 

I need the attempt column to be generated automatically. The attempt column is basically an index which is telling me if the operation was being performed at the site for the 1 time, 2 time, 3 time, so on. 

 

Each operation has multiple steps but everytime they start from the first step (e.g 100.001) that means that its a new attempt. So it should be marked as such.

 

Thanks,

Rashid

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.