
Kiwizqt
Frequent Visitor

Reset count of IDs once they've all been seen, count how many days it took, then reset until next time

Hi,

 

I have 1500 unique IDs; every day many of them are seen, some multiple times a day.

 

I would like to do a cumulative count and know how many days it takes to see all 1500. Once the count reaches that amount, the cumulative sum needs to reset and start another count until the next time the goal is met.

 

I've done a running total after a grouped count by date, but my issue, aside from the needed reset, is that duplicates need to be removed based on the time period it takes to attain the goal. If I group my IDs by date, I lose the ability to sort out duplicates...

 

So I'm guessing I need to first find how much time is needed, then remove the duplicates from a custom column spanning its own run toward the goal, and then calculate the running total...
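For what it's worth, the reset-when-complete logic itself is simple to state. Here is a minimal Python sketch with made-up dates and only 3 IDs instead of 1500 (everything below is illustrative, not the real data):

```python
from datetime import date

# Hypothetical miniature of the real data: 3 target IDs instead of 1500.
TARGET_IDS = {1, 2, 3}
sightings = [  # (date seen, ID), sorted by date
    (date(2020, 9, 1), 1),
    (date(2020, 9, 1), 1),  # duplicate within a cycle: ignored by the set
    (date(2020, 9, 2), 2),
    (date(2020, 9, 4), 3),  # all 3 IDs seen -> cycle 1 closes (4 days)
    (date(2020, 9, 5), 2),
    (date(2020, 9, 6), 1),
    (date(2020, 9, 6), 3),  # all 3 seen again -> cycle 2 closes (2 days)
]

cycles = []  # (start_date, completion_date, span_in_days)
seen, start = set(), None
for day, id_ in sightings:
    if start is None:
        start = day
    seen.add(id_)
    if seen == TARGET_IDS:  # goal met: record the cycle, then reset
        cycles.append((start, day, (day - start).days + 1))
        seen, start = set(), None

print(cycles)
```

Tracking a set of seen IDs per cycle is what removes duplicates "per run" rather than per day, which is the part a grouped count by date loses.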

 

Here's an ugly Paint mockup of what I'd like my final result to look like (I realize my date format needs to be datetime, otherwise I'll never get a correct goal and it'll be tricky):

Kiwizqt_0-1600449194030.png

 

 

I've found this solution from Lind25, but the syntax seems wrong, maybe from a previous Power Query version:

https://community.powerbi.com/t5/Desktop/Running-total-that-resets-when-equal-to-or-greater-than-a-s...

 

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    AddedIndex = Table.AddIndexColumn(Source, "Index", 1, 1),
    #"Added Custom" = Table.AddColumn(
        AddedIndex,
        "Running total reset",
        each List.Accumulate(
            List.FirstN(AddedIndex[Difference], [Index]),
            0,
            (state, current) => if state + current > 60 then 0 else state + current
        )
    )
in
    #"Added Custom"
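In Python terms, that List.Accumulate pattern is a fold whose accumulator resets at a threshold. A small sketch with made-up values and the same threshold of 60:

```python
from functools import reduce

def running_total_with_reset(values, threshold=60):
    """Running total that drops back to 0 whenever it would exceed
    the threshold, mirroring the lambda in the linked post."""
    totals = []
    def step(state, current):
        new = 0 if state + current > threshold else state + current
        totals.append(new)
        return new
    reduce(step, values, 0)
    return totals

print(running_total_with_reset([20, 25, 30, 10, 45, 15]))  # [20, 45, 0, 10, 55, 0]
```

Note that the M version above recomputes List.Accumulate over List.FirstN for every row, which is O(n²) over the whole table; a single fold that emits each intermediate state, as here, does the same job in one pass.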

 

 

 

Any takers? I'm completely lost on this one...

 

Thanks

 

4 ACCEPTED SOLUTIONS



Anonymous
Not applicable

Here is the code that implements the algorithm I described in the previous message:

 

 

 

 

 

 

let
    completion = (tab) =>
        let
            grpID = Table.Group(tab, {"ID"}, {"grp", each _}),
            nids = Table.RowCount(grpID),
            currLastDate = List.Min(List.Last(grpID[grp])[Date]),
            rest = Table.SelectRows(tab, each _[Date] > currLastDate),
            result = if Table.RowCount(Table.Distinct(rest, "ID")) < nids
                     then {currLastDate}
                     else {currLastDate} & @completion(rest)
        in
            result
in
    completion

 

 

 

 

 

This function receives as input a table with Date and ID columns and returns a list of dates on which a saturation cycle of all the distinct IDs in the table is completed.

 

 

It seems very fast compared to the previous solution based on List.Generate, List.Difference, and so on...

 

 

To obtain a more pleasant output, I grafted a copy of the vector of IDs into the original table at three different random points of the [ID] column 😊

 

 

image.png

 

 

 

 

 

 

 


48 REPLIES
ziying35
Impactful Individual

Hi, @Kiwizqt 

In the two worksheets you provided, the number of distinct IDs in the Seen worksheet is 1698, but the number of IDs in the Total ID worksheet is 1374.

Could you provide a corrected file?

Hi @ziying35, sorry for the inconvenience. I cleared out the wrong data by applying an inner merge. The 324 excess IDs came from wrong user inputs, and I thought it wouldn't matter for the solution, since I would have done that merge before applying any solution given here, or I could edit the M code manually.

 

Below is the final data:

https://drive.google.com/file/d/1gWHYl3FxX3t2Dt6jQ8MCBWV1WF4d80V0/view?usp=sharing

ziying35
Impactful Individual

@Kiwizqt 

Here is a solution, but it assumes a file where all IDs are valid; my example works with 10 distinct IDs.

The result of running the code is shown below:

333.png

 

// output
let
    Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("xdixbtwwDAbgVyk8dMoBpEja8s1ZOjdbkSHDDRnaqVvRd6/OjhSdVET6ZVw7B18oyzzql7/9mh5ffl4+P71+v3z6ern8mM6TI0cn0pObn4jOjs5E08P05TH8SRzxOv1+uJ9SgZVxXkuI7C5quSpW8LmOKeOoeCHyXWrOao2pnt3wV0VLoea7KlNYiRupBat1W6EHd/6YEoNXyDbyXKgS2la4Yh11UAnDisEZNaLsRLytsJw20qmMYcUyUisoYNpE5RTp3qDc9lzQjMqUE1jdvq9Wb2SKF1i5cm60lNTzsKOj/ovyI8p55H3ppso9bHXvMaVjasVVPkV5ZW71hlXdiyhWXOUrdNR+y1GZwUoUrDVXZ+VVuU7FM6wkn6LsybX2MCpRUHEIvUUtbqilOlMQlc35jhXu+bA8l1s9H5UoqMLyFKwVMlvYdgLPlKhshpUN1TLK+rBTKV5rz4foqReVSK4+THpJlXnjX6pGbzAdUg5XWFrOlCy4Km9tLbWnSjCzJaWKKxuqZchdLyl0N/bM5rFElFSWD3vV7b1yTHX0YVQmiJLqTOk4l5PKkl47b7wpxbJNUopkgF3JjCUH1mqFiDJcOfBOlJQgt7Zd2Qo+15bZnAwqfr+1tVPKVfGZPZZtklIFldx+CezZw7lOX4AS5E70phZw52vVkfSSCv8eUEt1/+r5LUfF0LRZqm9E46q1h1ExpHw1N3rm/EEFzfmo3vuwX3nwJKqzaM/7isqQLzA8lCqTYoJrcbkbjVru71k01Hr+Aw==", BinaryEncoding.Base64),Compression.Deflate))),
    chtype = Table.TransformColumnTypes(Source,{{"Date&Time Seen", type datetime}}),
    n = Table.RowCount(chtype),
    rows = List.Buffer(Table.ToRows(chtype)),
    gen = List.Generate(
              ()=>{[], null, 0, 0, 1},//{dictionary, counter, id_counter, index, cycle_index}
              each (_{1}? ??0)<n,
              each let 
                       fx1 = (i)=>if Record.HasFields(_{0}, Text.From(rows{i}?{1}?)) then _{2}
                                  else _{2}+1,
                       fx2 = (initial_value, variable)=>
                              if id_ctr=10 then 
                                  if fx1(ci+1)=10 and Date.From(rows{ci}?{0}?)=Date.From(rows{ci+1}?{0}?) then variable
                                  else initial_value
                              else variable,
                       ci = _{1}?+1 ??0,
                       dic =if Record.HasFields(_{0}, Text.From(rows{ci}?{1}?)) then _{0}
                             else _{0}&Record.AddField([], Text.From(rows{ci}?{1}?), id_ctr),
                       id_ctr = fx1(ci),
                       idx = Record.Field(dic, rows{ci}?{1}?),
                       dic_var = fx2([], dic),
                       id_ctr_var = fx2(0, id_ctr),
                       cycle_index = if _{3} = 10 then if idx = 10 then _{4} else _{4}+1 else _{4}
                       
                   in  { dic_var, ci, id_ctr_var, idx, cycle_index },
              each {rows{_{1}}{0}, rows{_{1}}{1}, Text.Format("#{1}-#{0}", {_{3}, _{4}}) }
          ),
    result = Table.FromRows(List.Skip(gen), type table [#"Date&Time Seen"= datetime, ID = any, Index = any])
in
    result

 

ziying35
Impactful Individual

@Kiwizqt 

I've simulated a scenario with only 10 IDs using some of the data you provided, see if I'm on the right track? If I'm right, I'll think of the code to implement the end result.

Sample PQ 

@ziying35 Thank you, this is almost perfect and exactly what I need. However, do you know if there'd be a way to easily distinguish the completed cycles from one another? Or to modify the indexes to link them to their dates, such as 1a-10a, 1b-10b, etc., in chronological order of occurrence? Or to return them in another table?

 

@AlB @Anonymous I've yet to open that Pandora's box, but I surely will one day; I'm currently in the process of learning some Python. I haven't forgotten either of your solutions and will learn from them; I'll try to implement them and see what you've designed. I've got to teach myself List.Accumulate; it seems very useful but quite dense to learn.

ziying35
Impactful Individual

@Kiwizqt 

I think I see what you mean. You want to put an extra marker on each loop's result to distinguish the results of the different loops.

@ziying35 That would be amazing, yes, so that I could sort by min & max later on, and we'd be done here!

Anonymous
Not applicable

Hi @Kiwizqt 

 

You could try to implement this strategy:

 

1) Group the Seen table (which is sorted on the date/time column) by ID.
2) You then have the first block/completion of the Seen table by selecting all the rows preceding the first element of the last group from the previous step.
3) The new Seen table becomes the old one minus this first block; apply the same sequence to it, and so on...
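Those steps can be sketched in Python as an iterative version of the same idea (assuming rows of (date, id) already sorted ascending by date, and taking the set of IDs to cover from the data itself):

```python
def completion_dates(rows):
    """rows: list of (date, id) tuples sorted ascending by date.
    Returns the dates on which every distinct ID has been seen
    at least once since the previous completion."""
    all_ids = {i for _, i in rows}
    result, remaining = [], rows
    while remaining:
        seen, cut = set(), None
        for d, i in remaining:
            seen.add(i)
            if seen == all_ids:  # every ID covered: this date closes the block
                cut = d
                break
        if cut is None:  # leftover rows never cover all IDs
            break
        result.append(cut)
        # the new Seen table is the old one minus the completed block
        remaining = [(d, i) for d, i in remaining if d > cut]
    return result

print(completion_dates([(1, "a"), (2, "b"), (3, "a"), (3, "b"), (4, "a"), (5, "b")]))  # [2, 3, 5]
```

Each pass only scans rows up to the cut date, so the work shrinks with every completed block, which is the size reduction the strategy is after.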

 

PS

This approach reduces the size of the problem (and hopefully of the runtime!) from the size of the Seen table (100k) to the size of the Total IDs table (1k).

 

 

Here are the first two steps of the task...

 

 

To download them, click on the following link and follow the instructions.

https://giga.allegati.tim.it/r?d=alice.it&wr=rocco.lupoi@telecomitalia.it&ws=sprmnt21&e=alice.it&c=y...

 

I think with some work it is possible to automate this sequence of operations and have Power Query do everything...

 

 

 

 

 

 

@Kiwizqt 

I did a test performing the core operation in Python and it seems to finish in less than 20 secs for the data you posted. See it in the attached file. You'll have to update the path to the excel file you shared. Here is the M code for the main query:

Note it should be cleaned up a bit, as it is doing a lot of stuff that might not be necessary

 

 

let
    Source = Excel.Workbook(File.Contents("d:\Downloads\Sample PQ Help.xlsx"), null, true),
    Maintenance_Itinérante___CSV_Table = Source{[Item="Maintenance_Itinérante___CSV",Kind="Table"]}[Data],
    #"Changed Type" = Table.TransformColumnTypes(Maintenance_Itinérante___CSV_Table,{{"Date&Time Seen", type datetime}, {"ID", Int64.Type}}),
    #"Removed Duplicates" = Table.Distinct(#"Changed Type", {"Date&Time Seen", "ID"}),
    #"Sorted Rows" = Table.Sort(#"Removed Duplicates",{{"Date&Time Seen", Order.Ascending}}),
    #"Added Custom" = Table.AddColumn(#"Sorted Rows", "Date", each Date.From([#"Date&Time Seen"])),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Date&Time Seen"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Date", "ID"}),
    #"Removed Duplicates1" = Table.Distinct(#"Reordered Columns", {"Date", "ID"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Removed Duplicates1",{{"Date", Int64.Type}}),
    #"Run Python script" = Python.Execute("# 'dataset' holds the input data for this script#(lf)groupeddataset = dataset.groupby(['Date'])['ID'].apply(lambda x: list(x)).to_frame().reset_index()#(lf)#test.groupby(['Pos'])['Pos2'].apply(lambda x: list(x)).to_frame().reset_index()#(lf)a = list(groupeddataset['ID']) #(lf)acc=list(initial['ID']); res=[]#(lf)for i in range(len(a)):#(lf)    acc=set(acc)-set(a[i])#(lf)    #acc=set(acc)-set([a[i]])#(lf)    if acc == set(): #(lf)        acc=initial#(lf)        res=res+[i]#(lf)#(lf)output=pandas.DataFrame(res,columns=['Positions'])",[dataset=#"Changed Type1", initial=Table.SelectRows(All_IDsT,each [ID]<> 15133)]),
    groupeddataset = #"Run Python script"{[Name="groupeddataset"]}[Value],
    groupeddataset2 = Table.TransformColumnTypes(groupeddataset,{{"Date", Int64.Type}}),
    #"Changed Type3" = Table.TransformColumnTypes(groupeddataset2,{{"Date", type date}}),
    #"Sorted Rows1" = Table.Sort(#"Changed Type3",{{"Date", Order.Ascending}}),
    CompletionPositionsT = #"Run Python script"{[Name="output"]}[Value],
    CompletionPositionsT2 = Table.TransformColumnTypes(CompletionPositionsT,{{"Positions", Int64.Type}}),
    result = List.Select(groupeddataset2[Date], each List.Contains(CompletionPositionsT2[Positions],_ - List.Min(groupeddataset2[Date]))),
    #"Converted to Table" = Table.FromList(result, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Changed Type2" = Table.TransformColumnTypes(#"Converted to Table",{{"Column1", type date}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type2",{{"Column1", "Completion dates"}})
in
    #"Renamed Columns"

 

 

The main step is #"Run Python script", with the following Python code:

 

 

# 'dataset' has Date and ID columns; 'initial' holds the full list of IDs to cover
groupeddataset = dataset.groupby(['Date'])['ID'].apply(lambda x: list(x)).to_frame().reset_index()
a = list(groupeddataset['ID'])          # one list of IDs per day
acc = list(initial['ID']); res = []
for i in range(len(a)):
    acc = set(acc) - set(a[i])          # remove the IDs seen on day i
    if acc == set():                    # all IDs covered: record the position...
        acc = list(initial['ID'])       # ...and reset (to the ID list, not the DataFrame)
        res = res + [i]

output = pandas.DataFrame(res, columns=['Positions'])

 

 

It groups the IDs by Dates (day level) and then operates on that to extract a list of the positions where each section with all IDs seen ends. Note I have filtered out ID 15133 from the list of IDs so that there is at least one section that has all IDs

Please mark the question solved when done and consider giving kudos if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

 

SU18_powerbi_badge

 

 

 

As much as I love M, I will put in another plug for the DAX approach for this one. I don't know whether your IDs have category columns associated with them and whether you will also want slicers (which would require a DAX approach). Attached is a pbix with your example data. It uses the DAX expression below to generate the table shown (I also added a small table with the "Cycle" values of 1, 2, 3). You could do them separately, but it is calculation-intensive (so, since I already had the virtual table in the measure, I generated both outputs and concatenated them together).

 

mahoneypat_0-1600737218749.png

Time and Span for Completion =
VAR thiscycle =
    SELECTEDVALUE ( Cycle[Completion Cycle] )
VAR ids =
    ALLSELECTED ( Seen[ID] )
VAR idcount =
    COUNTROWS ( ids )
VAR summarylastcycle =
    ADDCOLUMNS (
        VALUES ( Seen[Date&Time Seen] ),
        "IDsSoFar",
            VAR thistime = Seen[Date&Time Seen]
            RETURN
                COUNTROWS (
                    FILTER (
                        ids,
                        CALCULATE (
                            COUNT ( Seen[ID] ),
                            Seen[Date&Time Seen] <= thistime
                        ) >= thiscycle - 1
                    )
                )
    )
VAR completiontimelastcycle =
    IF (
        thiscycle = 1,
        MIN ( Seen[Date&Time Seen] ),
        MINX (
            FILTER (
                summarylastcycle,
                [IDsSoFar] >= idcount
            ),
            Seen[Date&Time Seen]
        )
    )
VAR summarythiscycle =
    ADDCOLUMNS (
        FILTER (
            VALUES ( Seen[Date&Time Seen] ),
            Seen[Date&Time Seen] >= completiontimelastcycle
        ),
        "IDsSoFar",
            VAR thistime = Seen[Date&Time Seen]
            RETURN
                COUNTROWS (
                    FILTER (
                        ids,
                        CALCULATE (
                            COUNT ( Seen[ID] ),
                            Seen[Date&Time Seen] <= thistime
                        ) >= thiscycle
                    )
                )
    )
VAR completiontimethiscycle =
    MINX (
        FILTER (
            summarythiscycle,
            [IDsSoFar] >= idcount
        ),
        Seen[Date&Time Seen]
    )
VAR span =
    DATEDIFF (
        completiontimelastcycle,
        completiontimethiscycle,
        DAY
    )
VAR range = completiontimelastcycle & " - " & completiontimethiscycle
RETURN
    span & " days" & "; " & range

 

Regards,

Pat

 

 





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


@AlB @mahoneypat @Anonymous @ziying35 All of you are wizards, and honestly I feel a bit embarrassed to have put you all to this much work, especially with how poorly the question was initially phrased.

 

This last solution from Pat gives me all that I want and then some. I'll apologize once again because I only have so many thanks to give; being new to this forum, I hope I can mark all of these solutions as verified simultaneously, as they each gave a different take on my problem.

 

Again, thank you, and I will be sure to learn, to the best of my abilities, all of the different methods you've shown me so far. I'll do better on the problem description next time I have an issue, and I'll lurk around here in the meantime to learn new things.

 

Cheers

ziying35
Impactful Individual

@Kiwizqt 

I'm sorry, there's a piece of judgement logic in my code that I haven't written yet; I just can't figure it out. I'll post it when the code is complete.

@Anonymous , @mahoneypat , @Kiwizqt 

If we remove ID 15133 (which is what @mahoneypat is effectively doing by considering  ALLSELECTED ( Seen[ID] )),

I only see one period with all the IDs in the data provided: the one finishing on the 25th of August. From the 25th of August to the 3rd of September, not all IDs come up; nor do they if we extend the period up to the latest date in the data. I would actually consider that the second period starts after the 25th of August rather than on that very same day, but that's just a matter of fine-tuning.

 

@mahoneypat, I don't think your code complies with the requirements (at least as I've interpreted them). For the Nth period, we need to know the completion date of the (N-1)th, and for the completion of the (N-1)th we need the (N-2)th, etc. That means for the Nth period we first need the completion dates of all the previous ones. It is this recursive aspect that makes it very difficult to implement in DAX, as it would imply a circular reference. I haven't been able to figure it out in DAX. I would certainly be interested in seeing a solution if you manage to get it. I have a feeling it cannot be done (unless you write a separate fragment of code for each period, which would not be very flexible or elegant). That's why I resorted to M, and ultimately to Python, given how slow the M solution was. A loop is all you need.

But getting back to why I think @mahoneypat's code doesn't provide the expected result: your

VAR summarylastcycle

looks at how many times each ID has appeared from the "beginning of time" up to the current date and gives the green light if each ID has been seen at least (thiscycle - 1) times. But what we need to do is to look at how many IDs have been seen in the  previous completion period at least once, irrespective of the cycle. These are two different things.

It can always be, of course, that I've misunderstood the code. I'd be happy to hear back if it were so.

 

PS: I fully agree with the numbers provided by @Anonymous regarding the number of IDs seen per period. Additionally, the number of IDs from the 25th of August up to the latest date is the same as up to the 19th of September. The number of IDs in @mahoneypat's second period is 1202.

 


   

   

 

 

@AlB @Kiwizqt Although I agree DAX is not the language for recursive calculation, the approach I've suggested is not recursive. The calculation for N-1 does not require knowing N-2. For example, the completion times for cycles 1, 2, and 3 are the minimum times where the count of every item is >=1, >=2, and >=3, respectively. To get the span, one only needs the current and previous calculations (not all the values). While it may need some fine-tuning, I believe the approach is sound.

 

Also, I used ALLSELECTED() in case the ID table had other columns to be used as slicers, so that the calculation would be dynamic.

 

Regards,

Pat








@mahoneypat wrote:

@AlB  @Kiwizqt

...the completion times for cycles 1, 2, and 3 is the minimum time where the count of all items is >=1, >=2, and >=3, respectively. 

@mahoneypat Not really. In the example below, your reasoning (if I understand it correctly) would yield the highlighted completion times, while there is only one actual completion time, ending on date 6.

 

Date  ID
1     100
2     100
3     100
4     200
5     200
6     300   <-- 1st period
7     300   <-- 2nd period
8     200
9     300   <-- 3rd period
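A quick way to check the disagreement is to run both readings over the toy data above (an illustrative script, not either of the posted solutions verbatim):

```python
data = [(1, 100), (2, 100), (3, 100), (4, 200), (5, 200),
        (6, 300), (7, 300), (8, 200), (9, 300)]
ids = {100, 200, 300}

def completion_nonrecursive(k):
    """Cycle k completes at the first date where every ID has been
    counted at least k times since the beginning (non-recursive reading)."""
    counts = {i: 0 for i in ids}
    for d, i in data:
        counts[i] += 1
        if all(c >= k for c in counts.values()):
            return d
    return None

def completions_with_reset():
    """A cycle completes only when every ID has been seen since the
    end of the previous cycle (reset-based reading)."""
    out, seen = [], set()
    for d, i in data:
        seen.add(i)
        if seen == ids:
            out.append(d)
            seen = set()
    return out

print([completion_nonrecursive(k) for k in (1, 2, 3)])  # [6, 7, 9] -- the highlighted dates
print(completions_with_reset())                          # [6] -- only one true completion
```

The non-recursive reading yields dates 6, 7, and 9, but after date 6 the ID 100 never reappears, so only one cycle actually completes.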

 


@AlB  You are right.  I will think more on a potential non-recursive DAX approach, but it may not be possible.  

Regards,

Pat

 







ziying35
Impactful Individual

@Anonymous 

 

PS: I have a question. Why is there no ID 15133 in the sheet named Seen in the file I downloaded, but there are multiple occurrences in the one you downloaded?

Anonymous
Not applicable

Hi @ziying35 ,

 

I'm not sure I understand your question. But, in case you have not completely read my previous message, I'll repeat here the part about the presence of ID 15133, in the hope that it answers your question.

 

<<

As you can verify, the Seen table used contains the ID 15133 three times, although it is not present at all in the original table.

Those are the three positions where I grafted the entire vector of IDs (Total ID) to get at least three full completions and be able to test the function, to see if, as well as being pretty, it really worked...

>>

 

I take this opportunity to make a request.
Since I don't have time right now to look at the code and analyze it, could you explain in detail the situation you were referring to with the following sentence:

 

"If the last ID happens to occur consecutively at the end of the loop, your code has a problem! For example, in the following case, which I've marked yellow." ?

 

 

 

 

ziying35
Impactful Individual

@Anonymous 

I see. Because I'm using a machine translator and there was a misspelled word, I didn't understand the translation.

In my simulation table, I think the result should look like this:

1.png

but your code looks like this:

2.png

my simulation table:

= Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("xdixbtwwDAbgVyk8dMoBpEja8s1ZOjdbkSHDDRnaqVvRd6/OjhSdVET6ZVw7B18oyzzql7/9mh5ffl4+P71+v3z6ern8mM6TI0cn0pObn4jOjs5E08P05TH8SRzxOv1+uJ9SgZVxXkuI7C5quSpW8LmOKeOoeCHyXWrOao2pnt3wV0VLoea7KlNYiRupBat1W6EHd/6YEoNXyDbyXKgS2la4Yh11UAnDisEZNaLsRLytsJw20qmMYcUyUisoYNpE5RTp3qDc9lzQjMqUE1jdvq9Wb2SKF1i5cm60lNTzsKOj/ovyI8p55H3ppso9bHXvMaVjasVVPkV5ZW71hlXdiyhWXOUrdNR+y1GZwUoUrDVXZ+VVuU7FM6wkn6LsybX2MCpRUHEIvUUtbqilOlMQlc35jhXu+bA8l1s9H5UoqMLyFKwVMlvYdgLPlKhshpUN1TLK+rBTKV5rz4foqReVSK4+THpJlXnjX6pGbzAdUg5XWFrOlCy4Km9tLbWnSjCzJaWKKxuqZchdLyl0N/bM5rFElFSWD3vV7b1yTHX0YVQmiJLqTOk4l5PKkl47b7wpxbJNUopkgF3JjCUH1mqFiDJcOfBOlJQgt7Zd2Qo+15bZnAwqfr+1tVPKVfGZPZZtklIFldx+CezZw7lOX4AS5E70phZw52vVkfSSCv8eUEt1/+r5LUfF0LRZqm9E46q1h1ExpHw1N3rm/EEFzfmo3vuwX3nwJKqzaM/7isqQLzA8lCqTYoJrcbkbjVru71k01Hr+Aw==", BinaryEncoding.Base64),Compression.Deflate)))
Anonymous
Not applicable

Hi @ziying35 ,

 

If the dates are those (and there isn't a typo in the picture you provided), they both seem correct to me 😁, so there isn't a contradiction.

 

 

image.png

 

The logic the function implements is that the start of the next completion period is greater than (or at least not less than) the end of the previous completion period, and this is what the request seems to be.

Is that so?

 
