Solved: Re: Generate Table of Contiguous Segments from Tab...

ericOnline · ‎07-31-2020

Wow! This one is a baffler for me. Hoping some Power BI experts here can weigh in.

Given:

- A composite unique key (first 4 columns of the sample data)

- A table of individual segments

Generate:

- A new table of contiguous segments based on the MIN(CUT_SEG_START) and MAX(CUT_SEG_END) for each unique key

Sample Data: See below

Desired Output: (Based on Sample Data below)

CUT_NUMBER	CUT_SEGMENT	CUT_SUBSEGMENT	CUT_TYPE	CUT_SEG_START	CUT_SEG_END
15	7000	1	M	120.896	131.985
15	7000	2	M	120.896	131.6

PowerBI Code tried so far: (not working)

CONTIGUOUS_TABLE = 
    SUMMARIZE(
        'SEGMENT_TABLE', 
        'SEGMENT_TABLE'[CUT_NUMBER], 
        'SEGMENT_TABLE'[CUT_SEGMENT], 
        'SEGMENT_TABLE'[CUT_SUBSEGMENT], 
        'SEGMENT_TABLE'[CUT_TYPE],
        "CUT_SEG_START", 
            IF(
                'SEGMENT_TABLE'[CUT_SEG_START] = EARLIER('SEGMENT_TABLE'[CUT_SEG_END])),
                MINX('SEGMENT_TABLE'[CUT_SEG_START]),
                'SEGMENT_TABLE'[CUT_SEG_START]
            ),
        "CUT_SEG_END", 
            IF(
                EARLIER('SEGMENT_TABLE'[CUT_SEG_END]) = 'SEGMENT_TABLE'[CUT_SEG_START],
                MAXX('SEGMENT_TABLE'[CUT_SEG_END]),
                'SEGMENT_TABLE'[CUT_SEG_START]
            )
    )

@Greg_Deckler ever solved something like this with PBI?

Thank you!

---

Sample Data:

CUT_NUMBERCUT_SEGMENTCUT_SUBSEGMENTCUT_TYPECUT_SEG_STARTCUT_SEG_END

15	7000	1	M	120.896	128.342
15	7000	1	M	128.342	129.016
15	7000	1	M	129.016	129.191
15	7000	1	M	129.191	131.6
15	7000	1	M	131.6	131.806
15	7000	1	M	131.806	131.985
15	7000	2	M	120.896	128.344
15	7000	2	M	128.344	129.016
15	7000	2	M	129.016	129.192
15	7000	2	M	129.192	131.6

v-zhenbw-msft · ‎08-04-2020

Hi @ericOnline ,

Please try the Power BI Store Version to open our pbix.

You can try M Query, not use the DAX.

1. Add an index column.

2. Add another column that index column add 1.

3. Merge the table based on two columns.

4. Expand the Start column.

Then compare the new Start column and End column.

5. Then add a column, if the Custom is False, then NULL.

6. And we need to use the Fill up function.

7. At last we use the GROUP BY function to get the result.

If you have any question, please kindly ask here and we will try to resolve it.

Best regards,

Community Support Team _ zhenbw

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

BTW, pbix as attached.

View solution in original post

mahoneypat · ‎07-31-2020

If you prefer or need to do it with DAX, here is the expression for the calculated table.

CONTIGUOUS_TABLE =
ADDCOLUMNS (
    SUMMARIZE (
        'SEGMENT_TABLE',
        'SEGMENT_TABLE'[CUT_NUMBER],
        'SEGMENT_TABLE'[CUT_SEGMENT],
        'SEGMENT_TABLE'[CUT_SUBSEGMENT],
        'SEGMENT_TABLE'[CUT_TYPE]
    ),
    "CUT_SEG_START", CALCULATE ( MIN ( Segment_Table[Cut_Seg_Start] ) ),
    "CUT_SEG_END", CALCULATE ( MAX ( Segment_Table[Cut_Seg_end] ) )
)

If this works for you, please mark it as the solution. Kudos are appreciated too. Please let me know if not.

Regards,

Pat

Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.

@mahoneypa HoosierBI on YouTube

ericOnline · ‎07-31-2020

Hi @mahoneypat ,

Thank you for the insights! Do you know how I would accommodate gaps between segments? Thats really the crux of the issue.

Your suggested DAX results in a single record...

CUT_NUMBER	CUT_SEGMENT	CUT_SUBSEGMENT	CUT_TYPE	CUT_SEG_START	CUT_SEG_END
13547	0003	M	1	154.25	291.6

...where I'm looking for 3 records due to the gaps between the _END and _START of the segments...

CUT_NUMBER	CUT_SEGMENT	CUT_SUBSEGMENT	CUT_TYPE	CUT_SEG_START	CUT_SEG_END
13547	0003	M	1	154.25	171.513
13547	0003	M	2	172.2	235.632
13547	0003	M	3	237.131	291.6

Visualization:

Sample data for this example:

CUT_NUMBER	CUT_SEGMENT	CUT_SUBSEGMENT	CUT_TYPE	CUT_SEG_START	CUT_SEG_END
13547	3	M	1	154.25	155.039
13547	3	M	1	155.039	156.944
13547	3	M	1	156.944	157.053
13547	3	M	1	157.053	166
13547	3	M	1	166	171.4
13547	3	M	1	171.4	171.433
13547	3	M	1	171.433	171.513
13547	3	M	1	172.2	172.446
13547	3	M	1	172.446	175.5
13547	3	M	1	175.5	175.8
13547	3	M	1	175.8	180.109
13547	3	M	1	180.109	180.305
13547	3	M	1	180.305	181
13547	3	M	1	181	183.14
13547	3	M	1	183.14	183.229
13547	3	M	1	183.229	183.337
13547	3	M	1	183.337	184.2
13547	3	M	1	184.2	184.639
13547	3	M	1	184.639	184.9
13547	3	M	1	184.9	184.91
13547	3	M	1	184.91	185.079
13547	3	M	1	185.079	185.36
13547	3	M	1	185.36	185.381
13547	3	M	1	185.381	185.5
13547	3	M	1	185.5	186.125
13547	3	M	1	186.125	189.75
13547	3	M	1	189.75	197.669
13547	3	M	1	197.669	197.7
13547	3	M	1	197.7	197.9
13547	3	M	1	197.9	210.254
13547	3	M	1	210.254	210.479
13547	3	M	1	210.479	222.39
13547	3	M	1	222.39	222.396
13547	3	M	1	222.396	222.624
13547	3	M	1	222.624	235.517
13547	3	M	1	235.517	235.523
13547	3	M	1	235.523	235.63155
13547	3	M	1	235.63155	235.632
13547	3	M	1	237.131	237.238
13547	3	M	1	237.238	237.4
13547	3	M	1	237.4	239.788
13547	3	M	1	239.788	240
13547	3	M	1	240	241.9
13547	3	M	1	241.9	244.319
13547	3	M	1	244.319	244.555
13547	3	M	1	244.555	249.88
13547	3	M	1	249.88	254.514
13547	3	M	1	254.514	254.752
13547	3	M	1	254.752	269.901
13547	3	M	1	269.901	270.006
13547	3	M	1	270.006	270.155
13547	3	M	1	270.155	278.9
13547	3	M	1	278.9	279.125
13547	3	M	1	279.125	279.357
13547	3	M	1	279.357	291.397
13547	3	M	1	291.397	291.6

v-zhenbw-msft · ‎08-03-2020

Hi @ericOnline ,

We need to create three columns to group the data, then we can create a table to meet your requirement.

1. Create three columns in SEGMENT_TABLE.

Column = 
[CUT_SEG_END]
    = CALCULATE (
        SUM ( 'SEGMENT_TABLE'[CUT_SEG_START]),
        'SEGMENT_TABLE',
        'SEGMENT_TABLE'[Index]
            = EARLIER ( 'SEGMENT_TABLE'[Index] ) + 1,
        'SEGMENT_TABLE'[CUT_NUMBER] = EARLIER ( 'SEGMENT_TABLE'[CUT_NUMBER] ),
        'SEGMENT_TABLE'[CUT_SEGMENT] = EARLIER ( 'SEGMENT_TABLE'[CUT_SEGMENT] ),
        'SEGMENT_TABLE'[CUT_SUBSEGMENT] = EARLIER ( 'SEGMENT_TABLE'[CUT_SUBSEGMENT] )
)

Group = 
VAR temp =
    CALCULATE (
        MAX ( 'SEGMENT_TABLE'[Index] ),
        'SEGMENT_TABLE',
        NOT ( 'SEGMENT_TABLE'[Column] ),
        'SEGMENT_TABLE',
        'SEGMENT_TABLE'[Index]
            < EARLIER ( 'SEGMENT_TABLE'[Index] ),
        'SEGMENT_TABLE'[CUT_NUMBER] = EARLIER ( 'SEGMENT_TABLE'[CUT_NUMBER] ),
        'SEGMENT_TABLE'[CUT_SEGMENT] = EARLIER ( 'SEGMENT_TABLE'[CUT_SEGMENT] ),
        'SEGMENT_TABLE'[CUT_SUBSEGMENT] = EARLIER ( 'SEGMENT_TABLE'[CUT_SUBSEGMENT] )
    )
RETURN
   IF (
        CALCULATE (
            COUNTROWS ( 'SEGMENT_TABLE' ),
            'SEGMENT_TABLE'[Column],
            'SEGMENT_TABLE',
            'SEGMENT_TABLE'[Index]
                = EARLIER ( 'SEGMENT_TABLE'[Index] ) - 1,
            'SEGMENT_TABLE'[CUT_NUMBER] = EARLIER ( 'SEGMENT_TABLE'[CUT_NUMBER] ),
            'SEGMENT_TABLE'[CUT_SEGMENT] = EARLIER ( 'SEGMENT_TABLE'[CUT_SEGMENT] ),
            'SEGMENT_TABLE'[CUT_SUBSEGMENT] = EARLIER ( 'SEGMENT_TABLE'[CUT_SUBSEGMENT] )
        ) > 0,
        CALCULATE (
            MIN ( 'SEGMENT_TABLE'[Index] ),
            'SEGMENT_TABLE',
            'SEGMENT_TABLE'[Index] < EARLIER ( 'SEGMENT_TABLE'[Index] ),
            'SEGMENT_TABLE'[Index] > temp,
            'SEGMENT_TABLE'[CUT_NUMBER] = EARLIER('SEGMENT_TABLE'[CUT_NUMBER]),
            'SEGMENT_TABLE'[CUT_SEGMENT] = EARLIER ( 'SEGMENT_TABLE'[CUT_SEGMENT] ),
            'SEGMENT_TABLE'[CUT_SUBSEGMENT] = EARLIER ( 'SEGMENT_TABLE'[CUT_SUBSEGMENT] )
        ),
        'SEGMENT_TABLE'[Index]
)

Rank = 
RANKX(
    FILTER('SEGMENT_TABLE','SEGMENT_TABLE'[CUT_NUMBER]=EARLIER('SEGMENT_TABLE'[CUT_NUMBER]) && 'SEGMENT_TABLE'[CUT_SEGMENT]=EARLIER('SEGMENT_TABLE'[CUT_SEGMENT]) && 'SEGMENT_TABLE'[CUT_SUBSEGMENT]='SEGMENT_TABLE'[CUT_SUBSEGMENT]),
    'SEGMENT_TABLE'[Group],,ASC,Dense)

2. Then we can create a table using the following formula,

CONTIGUOUS_TABLE = 
ADDCOLUMNS (
    SUMMARIZE (
        'SEGMENT_TABLE',
        'SEGMENT_TABLE'[CUT_NUMBER],
        'SEGMENT_TABLE'[CUT_SEGMENT],
        'SEGMENT_TABLE'[CUT_SUBSEGMENT],
        'SEGMENT_TABLE'[Rank]
    ),
    "CUT_SEG_START", CALCULATE ( MIN ( 'SEGMENT_TABLE'[Cut_Seg_Start] ),FILTER('SEGMENT_TABLE','SEGMENT_TABLE'[Rank]=EARLIER('SEGMENT_TABLE'[Rank]))),
    "CUT_SEG_END", CALCULATE ( MAX ( 'SEGMENT_TABLE'[Cut_Seg_end] ),FILTER('SEGMENT_TABLE','SEGMENT_TABLE'[Rank]=EARLIER('SEGMENT_TABLE'[Rank] )))
)

If it doesn’t meet your requirement, could you please show the exact expected result based on the table that you have shared?

Best regards,

Community Support Team _ zhenbw

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

BTW, pbix as attached.

ericOnline · ‎08-03-2020

Thank you for the insights @v-zhenbw-msft . I will test the code you provided now.

I'm unable to open the attached .pbix for some reason.

It says "please install the latest version"...

...but I'm on the July 2020 PBI desktop version already...

No big deal 🙂 . I'll translate the code from your great post, just wanted you to be aware.

ericOnline · ‎08-03-2020

Hello @v-zhenbw-msft ,

I was able to translate the code to my PBI file. I'm able to execute the code to create the "column" and "rank" columns, as well as the output table, but the code to create the "group" column will not execute.

Error:

I've increased the Data Cache to 8192MB and closed all programs except for PBI. Still will not execute. Any ideas for getting this query optimized to work on an 8GB RAM laptop?

Thank you

v-zhenbw-msft · ‎08-06-2020

Hi @ericOnline ,

How about the result after you follow the suggestions mentioned in my original post?

If you've fixed the issue on your own please kindly share your solution.

If the above posts help, please kindly mark it as a solution to help others find it more quickly. Thanks!

Best regards,

Community Support Team _ zhenbw

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

ericOnline · ‎08-07-2020

Hi @v-zhenbw-msft ,

Thank you for the continued follow up. Sorry I'm late; I was unable to run the PowerQuery, again due to RAM(?). After a couple days, I tried again and was able to! Must have been background processes causing issue.

After running the steps you outlined, and troubleshooting a bit, I was able to get the desired results.

One caveat: In order for the method to work, I had to first sort the table by:

- CUT_NUM Ascending, then CUT_SEGMENT Ascending, then CUT_SUBSEGMENT Ascending

You're a great help @v-zhenbw-msft . Thank you for the expertise! Wow!

v-zhenbw-msft · ‎08-04-2020

Hi @ericOnline ,

Please try the Power BI Store Version to open our pbix.

You can try M Query, not use the DAX.

1. Add an index column.

2. Add another column that index column add 1.

3. Merge the table based on two columns.

4. Expand the Start column.

Then compare the new Start column and End column.

5. Then add a column, if the Custom is False, then NULL.

6. And we need to use the Fill up function.

7. At last we use the GROUP BY function to get the result.

If you have any question, please kindly ask here and we will try to resolve it.

Best regards,

Community Support Team _ zhenbw

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

BTW, pbix as attached.

mahoneypat · ‎07-31-2020

This can be done with a Group By step, grouping on the first 4 columns, taking the Min of the 5th, and the Max of the 6th. To see how it works, just create a blank query, go to Advanced Editor, and replace the text there with the M code below.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("fZBLCsAwCETv4jqImk/1ED1Byf2v0SQuSglxEWbgPQLj8wBXSHAR0Qge754phGptNcVcBHo6mc5nMyRugencGxuH5uSjZcbgx0U9lWJvcm+mdTPlsLsEpvPzbjns3m8pv93y7e4v", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Cut_Number = _t, Cut_Segment = _t, Cut_Subsegment = _t, Cut_Type = _t, Cut_Seg_Start = _t, Cut_Seg_end = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Cut_Number", Int64.Type}, {"Cut_Segment", Int64.Type}, {"Cut_Subsegment", Int64.Type}, {"Cut_Type", type text}, {"Cut_Seg_Start", type number}, {"Cut_Seg_end", type number}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Cut_Number", "Cut_Segment", "Cut_Subsegment", "Cut_Type"}, {{"Cut_Seg_Start", each List.Min([Cut_Seg_Start]), type nullable number}, {"Cut_Seg_End", each List.Max([Cut_Seg_end]), type nullable number}})
in
    #"Grouped Rows"

If this works for you, please mark it as the solution. Kudos are appreciated too. Please let me know if not.

Regards,

Pat

Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.

@mahoneypa HoosierBI on YouTube