Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
CroWorC
Frequent Visitor

Dataflow Convert from line to column with conditions

Hello, 

As a relative new user of PowerBI dataflow i'm stuck with this problem:

As source there is a table with id's, orderID's, Events, date of event and if needed a plandate.

Now I want to convert this table to a table in witch only the OnHold, OffHold and Plandates are shown by order.

I have made an example in Excel for the source and the wanted result.

 I tried to make a copy of the source with only the needed events and then merge it with the source but that will not do. Filtering results in a loop. 

Is there anyone who can help? 

 

main   table  wanted         result
IDOrderEventdateplandate  OrderPlan1IDOnHoldIDPlanOldIDPlanNewIDPdtPlan1PdPlanOldPdPlanNewdtOnholddtOffhold
1O1Start01/01/2020   O12731603/07/202003/07/202015/07/202015/01/202005/02/2020
2O1Plan105/01/202003/07/2020  O1221182703/07/202020/07/202006/08/202007/03/202014/07/2020
3O1Plan05/01/202003/07/2020  O3null11null14nullnullnull20/01/202022/01/2020
4O2Start05/01/2020   O31317142308/08/202008/08/202016/08/202028/02/202004/04/2020
5O2Plan106/01/202004/02/2020            
6O2Plan06/01/202004/02/2020            
7O1OnHold15/01/2020             
8O2Finish16/01/2020             
9O3Start17/01/2020             
10O4Start19/01/2020             
11O3OnHold20/01/2020             
12O3OffHold22/01/2020             
13O3Plan123/01/202008/08/2020            
14O3Plan23/01/202008/08/2020            
15O1OffHold05/02/2020             
16O1Plan05/02/202015/07/2020            
17O3OnHold28/02/2020             
18O1Plan01/03/202020/07/2020            
19O4Plan105/03/202005/05/2020            
20O4Plan05/03/202005/05/2020            
21O1OnHold07/03/2020             
22O3OffHold04/04/2020             
23O3Plan04/04/202016/08/2020            
24O4Finish06/05/2020             
25O3Finish13/07/2020             
26O1OffHold14/07/2020             
27O1Plan14/07/202006/08/2020            
28O1Finish06/08/2020             
6 REPLIES 6
Anonymous
Not applicable

@CroWorC ,

could you explain in more detail the logic with which you pass from the main table to the wanted one (keep in mind that, I don't know the meaning and the relations between the various acronyms used)?

 

As usual, @Smauro 's solutions are elegant and instructive, but it seems that his truly admirable effort of interpretation has produced a slightly different result from what you described as wanted. To hope to have an effective solution also on the size of your case, you should also say how many rows, columns (and possibly sub-groups) it is.

@Anonymous  The data I have is a table from an application for following orders.

Every step an order makes in the proces is logged in this table. 

The orders are projects build by diffrent contractors and in the building process there are two roughly two steps where something can get stucked. 

The first part is between orderintake (Start) and communicating first promissed date (Plan1). The second part is between communicating first promissed date and order ready.

In both parts there can be an onhold situation because of legitime reason or because of unlegitime reason. The onhold time because of legitime reason will be compensated. An Onhold situation can occure more than one time per order. The time lost is in both parts diffrent. Before communicating plandate the total time between Start and communicating first promissed date minus the time OnHold is the total runtime and that is one of the KPI's

The second part is more difficult. An onhold situation frustrates the planning of the contractor so the time between the plandate by Onhold and the plandate directly given after the Offhold moment is the time to compensate. The application asks for a new plandate bij the OffHold action. 

KPI: realize project on 1st promised date (including compensation)

KPI: number of replans <=2

Possible events:

wanted:

1 start

2 plan1 given

3 ready

 

not wanted

1 Start

2 OnHold legitime (date = 03/22/2021)

3 OffHold               (date = 01/14/2021) 

4 Plan given (this plandate is not relevat but after OffHold there must be a plandate)

5 Plan1 given         (plandate 03/19/2021)

6 OnHold (legitime)

7 OffHold

8 Pladate                (plandate 03/26/2021)

9 OnHold (not legitime)

10 OffHold 

11 Plandate           (plandate 04/02/2021)

12 OnHold (Legitime)

13 OffHold

14 Plandate           (plandate 04/09/2021)

15 ready                (date = 04/07/2021)

 

In the second example the compensation before giving the first plandate is 1 day,

The compensation after giving the first pandate is 7 days (step 5 and 😎 + 7 days (step 11 and 14) = 14 days.

The runtime of the order is total runtime - 15 days

The compensated plandate is the original plandate + 14 days (and in this case no score on 1th plandate)

 

In between the Events mentioned are diffrent other events for communication about the order.

In total there are about 150K orders with 3M events

 

The remoddeling of the table makes it possible for me to determine in witch part the event felt and how to compensate per order. 

 

Thanks @Anonymous , didn't catch the error there with missing HoldDates.

 

Issue was that they want to use the same value on different lines. For example, a plan's ID could be relevant in more than 1 holds. More, there could be no plan1 after a hold, so the previous should be used etc etc. That would all be solved if we were to look at the whole table everytime as I did in my first solution, but since we now know that there are 3 millions of records, I tried to look at every order's record only once. Of course, some values will not be there (like PdtPlan1)

 

Anyways, here the table for anyone that wants to play around:

Spoiler
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("hZNBisMwDEWvMnhdiCzbiXOBobsMdBm6KJQyhZKBtvenFlPJilUImMSGx9e39D3Pzrudm+hzeJ7uz/IH35WFgFAOX+64mx0y83M7LfSHpBgIHQz/B4KDhjfYSCzq4skUT8xI8V4Lxg6wCvYa3mAHNjot+7/buWy8rZ5Z8Pu6XB+/BPUGGgkK6hp+MIyn7RQ1NFrIs5J4QrAUCnW5MIYWC4xx5zDobuSuLOmGj5regpP0ThzQ6LBx0H/IgjDUbZUFP9i7ZyuZW8kS18AMNUtLjtxzHVyh6ZAqjaDpLVjejbgtlSv+fjgfJkUpjA22mlTLUOBU7zGyTYkkpTw1kokla3Dr2xNKBlT9+WgxeSpvfysGGn8yoZW/rCSPLw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Order = _t, Event = _t, date = _t, plandate = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Order", type text}, {"Event", type text}, {"date", type date}, {"plandate", type date}})
in
#"Changed Type"

 

 

And here's a solution that uses fill down to fix those issues:

 

let
    Source = Table.TransformColumnTypes(YourTable, {{"ID", Int64.Type}, {"date", Date.Type}, {"plandate", Date.Type}}),
    
    EventSort = [
        OnHold = 1,
        OffHold = 2,
        Plan = 3,
        Plan1 = 4
        Start = 5,
        Finish = 6
    ],
    rec = [
        Plan1ID = null,
        OnHoldID = null,
        PlanOldID = null,
        PlanNewID =null,
        PdtPlan1 = null,
        PdPlanOld = null,
        PdPlanNew = null,
        dtOnHold = null,
        dtOffHold = null
    ],
    tableType = type table [
        Plan1ID = Int64.Type,
        OnHoldID = Int64.Type,
        PlanOldID = Int64.Type,
        PlanNewID =Int64.Type,
        PdtPlan1 = Date.Type,
        PdPlanOld = Date.Type,
        PdPlanNew = Date.Type,
        dtOnHold = Date.Type,
        dtOffHold = Date.Type
    ],


    #"Fix Table Order" = Table.AddIndexColumn( Table.Sort(
        Table.AddColumn(Source, "EventSort"each Record.Field(EventSort, [Event]), Int64.Type),
        {"Order""date""EventSort"}), "temp"),
    #"Merge Data" = Table.CombineColumns(#"Fix Table Order",{"ID""date""plandate""Event"},(x) => {x{3}} & {[ID = x{0}, date = x{1}, plandate=x{2} ]},"Events"),
    GroupPerOrder = Table.Group(#"Merge Data", {"Order"}, {{"Holds"each try Table.FillDown( Table.FromRecords(List.Select(List.Accumulate([Events], {rec}, (s,c) =>
        if c{0} = "Plan1" then 
            let new = [PdtPlan1 = c{1}[plandate], Plan1ID = c{1}[ID]]
            in if List.Last(s)[OnHoldID] = null then List.RemoveLastN(s, 1) & {List.Last(s) & new}
            else s & {rec & new}
        else if c{0} = "Plan" then
            let new = [PlanOldID = c{1}[ID], PdPlanOld = c{1}[plandate]]
            in if List.Last(s)[OnHoldID] = null then List.RemoveLastN(s, 1) & {List.Last(s) & new}
                else if List.Last(s)[PlanNewID] = null then List.RemoveLastN(s, 1) & {List.Last(s) & Record.RenameFields(new, {{"PlanOldID""PlanNewID"}, {"PdPlanOld""PdPlanNew"}})} & {rec & new}
                else s & {rec & new}
        else if c{0} = "OnHold" then
            let new = [dtOnHold = c{1}[date], OnHoldID = c{1}[ID]]
            in if List.Last(s)[OnHoldID] = null then List.RemoveLastN(s, 1) & {List.Last(s) & new}
            else s & {rec & new}
        else if c{0} = "OffHold" then
            let new = [dtOffHold = c{1}[date]]
            in List.RemoveLastN(s, 1) & {List.Last(s) & new}
        else s
    ), each _[OnHoldID] <> null )), {"Plan1ID""PdtPlan1"}) otherwise null, tableType}}, GroupKind.Local),
    #"Filtered Holds" = Table.SelectRows(#"GroupPerOrder", each [Holds] <> null),
    #"Expanded Holds" = Table.ExpandTableColumn(#"Filtered Holds", "Holds", Record.FieldNames(rec))
in
    #"Expanded Holds"

 

 

Cheers,




Feel free to connect with me:
LinkedIn

CroWorC
Frequent Visitor

Hi Smauro,

Sorry for the late reaction and thanks for the possible solution. The data in my table seems to be too big for this rebuild. I have + 3 milion records in the table and the solution You gave runs out of time. So I tried to select a part of the records but it seems that Powerflow is using the original table still. It keeps on running out of time. Then I tried to get fewer records from the database using SQL statement in stead of chosing a table, but that won't work in Powerflow (same SQL works in PowerBI Dataset) So for now I'm running out of options. I wil come back if my dataproblem is solved. 

Hi there, yes, this solution would be very heavy in your dataset.

 

Best I could hastly do is this:

let
    Source = Table.TransformColumnTypes(YourTable, {{"ID", Int64.Type}, {"date", Date.Type}, {"plandate", Date.Type}}),
    
    EventSort = [
        Start = 5,
        Plan = 3,
        Plan1 = 4
        OnHold = 1,
        OffHold = 2,
        Finish = 6
    ],
    rec = [
        Plan1ID = null,
        OnHoldID = null,
        PlanOldID = null,
        PlanNewID =null,
        PdtPlan1 = null,
        PdPlanOld = null,
        PdPlanNew = null,
        dtOnHold = null,
        dtOffHold = null
    ],
    recType = type {[
        Plan1ID = Int64.Type,
        OnHoldID = Int64.Type,
        PlanOldID = Int64.Type,
        PlanNewID =Int64.Type,
        PdtPlan1 = Date.Type,
        PdPlanOld = Date.Type,
        PdPlanNew = Date.Type,
        dtOnHold = Date.Type,
        dtOffHold = Date.Type
    ]},


    #"Fix Table Order" = Table.AddIndexColumn( Table.Sort(
        Table.AddColumn(Source, "EventSort"each Record.Field(EventSort, [Event]), Int64.Type),
        {"Order""date""EventSort"}), "temp"),
    #"Merge Data" = Table.CombineColumns(#"Fix Table Order",
        {"ID""date""plandate""Event"},
        (x) => {x{3}} & {[ID = x{0}, date = x{1}, plandate=x{2} ]},"Events"),
    GroupPerOrder = Table.Group(#"Merge Data", {"Order"}, {{"Holds"each
        List.Select(List.Accumulate([Events], {rec}, (s,c) =>
        if c{0} = "Plan1" then 
            let new = [PdtPlan1 = c{1}[plandate], Plan1ID = c{1}[ID]]
            in if List.Last(s)[OnHoldID] = null then List.RemoveLastN(s, 1) & {List.Last(s) & new}
            else s & {rec & new}
        else if c{0} = "Plan" then
            let new = [PlanOldID = c{1}[ID], PdPlanOld = c{1}[plandate]]
            in if List.Last(s)[OnHoldID] = null then List.RemoveLastN(s, 1) & {List.Last(s) & new}
                else if List.Last(s)[PlanNewID] = null then
                    List.RemoveLastN(s, 1) & {List.Last(s) &
                    Record.RenameFields(new, {{"PlanOldID""PlanNewID"}, {"PdPlanOld""PdPlanNew"}})}
                    & {rec & new}
                else s & {rec & new}
        else if c{0} = "OnHold" then
            let new = [dtOnHold = c{1}[date], OnHoldID = c{1}[ID]]
            in if List.Last(s)[OnHoldID] = null then List.RemoveLastN(s, 1) & {List.Last(s) & new}
            else s & {rec & new}
        else if c{0} = "OffHold" then
            let new = [dtOffHold = c{1}[date]]
            in List.RemoveLastN(s, 1) & {List.Last(s) & new}
        else s
    ), each _[OnHoldID] <> null ), recType}}, GroupKind.Local),
    #"Expanded Holds' list" = Table.SelectRows(Table.ExpandListColumn(GroupPerOrder, "Holds"), each [Holds] <> null),
    #"Expanded Holds' records" = Table.ExpandRecordColumn(#"Expanded Holds' list", "Holds", Record.FieldNames(rec))
    
in
    #"Expanded Holds' records"

 

 

Instead of buffering and filtering everytime, it sorts the table once and then groups locally. Give it a try and let me know 🙂

(YourTable is your table's query name)

 

Cheers,

Spyros




Feel free to connect with me:
LinkedIn

Smauro
Solution Sage
Solution Sage

Hi @CroWorC ,

 

Well, this is a possible solution:

 

...
    FixTable = Table.Buffer(Table.TransformColumnTypes(PreviousStep, {{"ID", Int64.Type}, {"date", Date.Type}, {"plandate", Date.Type}})),
    
    AddNewColumns = Table.AddColumn(Table.SelectRows(FixTable, each [Event] = "OnHold"), "NewColumns", (rec) =>
        let
            #"OrderData" = Table.SelectRows(FixTable, each [Order] = rec[Order])
        in
            (try Record.RenameFields(
                Table.LastN(Table.SelectRows(#"OrderData", each [ID] < rec[ID] and [Event] = "Plan1"), 1)[[ID], [plandate]]{0},
                {{"ID""Plan1ID"}, {"plandate""PdtPlan1"}}) otherwise [])
            &
            (try Record.RenameFields(
                Table.LastN(Table.SelectRows(#"OrderData", each [ID] < rec[ID] and [Event] = "Plan"), 1)[[ID], [plandate]]{0},
                {{"ID""PlanOldID"}, {"plandate""PdPlanOld"}}) otherwise [])
            &
            (try Record.RenameFields(
                Table.FirstN(Table.SelectRows(#"OrderData", each [ID] > rec[ID] and [Event] = "Plan"), 1)[[ID], [plandate]]{0},
                {{"ID""PlanNewID"}, {"plandate""PdPlanNew"}}) otherwise [])
            &
            (try Record.RenameFields(
                Table.FirstN(Table.SelectRows(#"OrderData", each [ID] > rec[ID] and [Event] = "OffHold"), 1)[[date]]{0},
                {{"date""dtOffhold"}}) otherwise [])
            &
            Record.RenameFields(rec[[ID], [date]], {{"ID""OnHoldID"}, {"date""dtOnhold"}}),
        type record
        ),

    ColumnOrder = {"Plan1ID""OnHoldID""PlanOldID""PlanNewID""PdtPlan1""PdPlanOld""PdPlanNew""dtOnhold""dtOffhold"},
    #"Expand New Columns" = Table.ExpandRecordColumn(Table.SelectColumns(AddNewColumns,{"Order""NewColumns"}), "NewColumns", ColumnOrder),
    TableType = type table [
        Order = Text.Type,
        PlanID = Int64.Type,
        OnHoldID = Int64.Type,
        PlanOldID = Int64.Type,
        PlanNewID =Int64.Type,
        PdtPlan1 = Date.Type,
        PdPlanOld = Date.Type,
        PdPlanNew = Date.Type,
        dtOnhold = Date.Type,
        dtOffhold = Date.Type
    ],
    #"Fixed Type" = Value.ReplaceType(#"Expand New Columns",TableType)
in
    #"Fixed Type"

 

Try it by replacing PreviousStep with your previous step's name.

 

Cheers




Feel free to connect with me:
LinkedIn

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors