Solved: Re: Merge multiple rows into one with multiple con...

rupertbrown · ‎01-23-2018

Hi there

I'm trying to "wrap up" multiple rows of data into a single row based on multiple conditions. I have limited experience using SQL (in the program TOAD) and have been able to achieve something similar using SORT, INDEX, PARTITION, RANK and report COUNT, SUM, MIN and MAX values for certain fields, but I can't work out a solution for Power BI Desktop. Any assistance would be greatly appreciated.

I have the following as an example. It shows for an individual, the date they were admitted to hospital, the date they were discharged (allowing for calculation of length of stay in hospital) and where they were discharged to. The problem is that when the DISCHARGE_TO field is populated with statistical separation, this isn't a real discharge (ie the person is still in hospital, there was just a change in their care). So I am trying to merge some rows to have a single row for each hospital admission.

This is what the data looks like.

Person	Admit Date	Discharge Date	Discharge To	Length of Stay
AB123	1/01/2014	4/01/2014	Home	3
AB123	21/04/2014	23/04/2014	Home	2
AB123	9/07/2014	10/07/2014	Home	1
AB123	15/06/2015	16/06/2015	Statistical Separation	1
AB123	16/06/2015	18/06/2015	Statistical Separation	2
AB123	18/06/2015	30/06/2015	Home	12
AB123	8/10/2017	16/10/2017	Statistical Separation	8
AB123	16/10/2017	1/12/2017	Home	46
AB123	11/12/2017	14/12/2017	Home	3
AB123	28/12/2017	30/12/2017	Statistical Separation	2
AB123	30/12/2017	10/01/2018	Statistical Separation	11
AB123	10/01/2018	13/01/2018	Home	3
AB123	19/01/2018	21/01/2018	Statistical Separation	2
AB123	21/01/2018	24/01/2018	Home	3
XY789	4/07/2016	8/07/2016	Statistical Separation	4
XY789	8/07/2016	12/07/2016	Home	4
XY789	1/09/2016	4/09/2016	Home	3
XY789	5/04/2017	6/05/2017	Home	31
XY789	9/05/2017	10/05/2017	Home	1
XY789	1/10/2017	4/10/2017	Home	3
XY789	30/11/2017	1/12/2017	Statistical Separation	1
XY789	1/12/2017	2/12/2017	Statistical Separation	1
XY789	2/12/2017	7/12/2017	Home	5
XY789	1/01/2018	2/01/2018	Statistical Separation	1
XY789	2/01/2018	6/01/2018	Home	4

And this is what I want the result to look like:

Person	Admit Date	Discharge Date	Length of Stay	Count_Statistical Separations
AB123	1/01/2014	4/01/2014	3	0
AB123	21/04/2014	23/04/2014	2	0
AB123	9/07/2014	10/07/2014	1	0
AB123	15/06/2015	30/06/2015	15	2
AB123	8/10/2017	1/12/2017	54	1
AB123	11/12/2017	14/12/2017	3	0
AB123	28/12/2017	13/01/2018	16	2
AB123	19/01/2018	24/01/2018	5	1
XY789	4/07/2016	12/07/2016	8	1
XY789	1/09/2016	4/09/2016	3	0
XY789	5/04/2017	6/05/2017	31	0
XY789	9/05/2017	10/05/2017	1	0
XY789	1/10/2017	4/10/2017	3	0
XY789	30/11/2017	7/12/2017	7	2
XY789	1/01/2018	6/01/2018	5	1

Rows should only be merged if the preceeding row has the discharge to field populated with statistcal separation, and the preceeding row discharge date is equal to the current row admit date.

The data I have contains more than 1 million rows so trying to achieve this in Excel isn't possible.

Any advice or suggestions would be fantastic. Thanks!!

MarkS · ‎01-23-2018

Hi @rupertbrown

I think that the basic idea of what you need to do is add an index, then add a calculated column that enters null where the value is "Statistical Separation" or the index value, then Fill Up on that column.

Then group on that column and the Person, with the aggregates of Min of Admit Date, Max of Discharge Date, and Sum of Length of stay.

Here is the M code:

let
    Source = Excel.Workbook(File.Contents("C:\testing1.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Person", type text}, {"Admit Date", type any}, {"Discharge Date", type any}, {"Discharge To", type text}, {"Length of Stay", Int64.Type}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
    #"Added Conditional Column" = Table.AddColumn(#"Added Index", "HomeIndex", each if [Discharge To] = "Statistical Separation" then null else [Index]),
    #"Grouped Rows" = Table.Group(#"Added Conditional Column", {"Person"}, {{"SecondTable", each _, type table}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.FillUp ([SecondTable],{"HomeIndex"})),
    #"Expanded Custom1" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Person", "Admit Date", "Discharge Date", "Discharge To", "Length of Stay", "Index", "HomeIndex"}, {"Person.1", "Admit Date", "Discharge Date", "Discharge To", "Length of Stay", "Index", "HomeIndex"}),
    #"Added Conditional Column1" = Table.AddColumn(#"Expanded Custom1", "Custom", each if [HomeIndex] = null then [Index] else [HomeIndex]),
    #"Removed Columns" = Table.RemoveColumns(#"Added Conditional Column1",{"SecondTable", "Person.1", "Index", "HomeIndex"}),
    #"Grouped Rows1" = Table.Group(#"Removed Columns", {"Custom", "Person"}, {{"Admit date", each List.Min([Admit Date]), type anynonnull}, {"Discharge Date", each List.Max([Discharge Date]), type anynonnull}, {"LOS", each List.Sum([Length of Stay]), type number}, {"S", each _, type table}}),
    #"Added Custom2" = Table.AddColumn(#"Grouped Rows1", "Count of Statistical Separation", each Table.RowCount( Table.SelectRows([S], each [Discharge To]="Statistical Separation"))),
    #"Removed Columns2" = Table.RemoveColumns(#"Added Custom2",{"S", "Custom"})
in
    #"Removed Columns2"

View solution in original post

Phil_Seamark · ‎01-23-2018

Hi @rupertbrown

This DAX calculated table might be close

Table = 

Var y = ADDCOLUMNS(
            'Table1',
            "Batch",1+
            CALCULATE(
                COUNTROWS('Table1'),
                FILTER(
                    'Table1',
                    'Table1'[Admit Date]<EARLIER('Table1'[Admit Date])
                    && 'Table1'[Person]=EARLIER('Table1'[Person])
                    && 'Table1'[Discharge To] = "Home"
                    )))
RETURN 
    SUMMARIZE(
            y,
            [Person],
            [Batch],
            "Admit Date" , MIN('Table1'[Admit Date]),
            "Discharge Date" , MAX('Table1'[Discharge Date]),
            "Length of Stay" ,INT(MAX('Table1'[Discharge Date]) - MIN('Table1'[Admit Date])),
            "Count Statistical Separations" , COUNT('Table1'[Admit Date])-1
            )

To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

View solution in original post

Phil_Seamark · ‎01-23-2018

Hi @rupertbrown

This DAX calculated table might be close

Table = 

Var y = ADDCOLUMNS(
            'Table1',
            "Batch",1+
            CALCULATE(
                COUNTROWS('Table1'),
                FILTER(
                    'Table1',
                    'Table1'[Admit Date]<EARLIER('Table1'[Admit Date])
                    && 'Table1'[Person]=EARLIER('Table1'[Person])
                    && 'Table1'[Discharge To] = "Home"
                    )))
RETURN 
    SUMMARIZE(
            y,
            [Person],
            [Batch],
            "Admit Date" , MIN('Table1'[Admit Date]),
            "Discharge Date" , MAX('Table1'[Discharge Date]),
            "Length of Stay" ,INT(MAX('Table1'[Discharge Date]) - MIN('Table1'[Admit Date])),
            "Count Statistical Separations" , COUNT('Table1'[Admit Date])-1
            )

To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

rupertbrown · ‎01-23-2018

Hi @Phil_Seamark

This is amazing. I find DAX easier to understand so this is a brilliant solution.

Thanks so much for your help.

Cheers

MarkS · ‎01-23-2018

Hi @rupertbrown

I think that the basic idea of what you need to do is add an index, then add a calculated column that enters null where the value is "Statistical Separation" or the index value, then Fill Up on that column.

Then group on that column and the Person, with the aggregates of Min of Admit Date, Max of Discharge Date, and Sum of Length of stay.

Here is the M code:

let
    Source = Excel.Workbook(File.Contents("C:\testing1.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Person", type text}, {"Admit Date", type any}, {"Discharge Date", type any}, {"Discharge To", type text}, {"Length of Stay", Int64.Type}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
    #"Added Conditional Column" = Table.AddColumn(#"Added Index", "HomeIndex", each if [Discharge To] = "Statistical Separation" then null else [Index]),
    #"Grouped Rows" = Table.Group(#"Added Conditional Column", {"Person"}, {{"SecondTable", each _, type table}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.FillUp ([SecondTable],{"HomeIndex"})),
    #"Expanded Custom1" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Person", "Admit Date", "Discharge Date", "Discharge To", "Length of Stay", "Index", "HomeIndex"}, {"Person.1", "Admit Date", "Discharge Date", "Discharge To", "Length of Stay", "Index", "HomeIndex"}),
    #"Added Conditional Column1" = Table.AddColumn(#"Expanded Custom1", "Custom", each if [HomeIndex] = null then [Index] else [HomeIndex]),
    #"Removed Columns" = Table.RemoveColumns(#"Added Conditional Column1",{"SecondTable", "Person.1", "Index", "HomeIndex"}),
    #"Grouped Rows1" = Table.Group(#"Removed Columns", {"Custom", "Person"}, {{"Admit date", each List.Min([Admit Date]), type anynonnull}, {"Discharge Date", each List.Max([Discharge Date]), type anynonnull}, {"LOS", each List.Sum([Length of Stay]), type number}, {"S", each _, type table}}),
    #"Added Custom2" = Table.AddColumn(#"Grouped Rows1", "Count of Statistical Separation", each Table.RowCount( Table.SelectRows([S], each [Discharge To]="Statistical Separation"))),
    #"Removed Columns2" = Table.RemoveColumns(#"Added Custom2",{"S", "Custom"})
in
    #"Removed Columns2"

rupertbrown · ‎01-23-2018

Hi @MarkS

This worked perfectly! Thank you so very much for the super quick reply and solution. Really appreciate the help.

Cheers

Merge multiple rows into one with multiple conditions

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024

How to Get Your Question Answered Quickly