Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Flona
Regular Visitor

Extract distinct data from one table (Source) into an new table

Hi

I have some problem extracting distinct data from on table into another. I have one large source table with 20k+ rows, but only around 3000 distinct IDs (se example under). I would like to extract the row with last start date under each ID. Due to historical data the description may also change on the different rows, and it is always the description on the row with the last start date that are interesting.

I’ve tried remove duplicates (ID Colum) in the source table in Power Query, but are not able to ensure that this does not remove the row with the last start date.

Here is an example of the table data I have in the source file, and how I would like to have it in a new table.

Table 1 (Source)

  

ID

Description

Start date

End Date

ID1

ID1 Text 1

01.01.2019

01.02.2019

ID1

ID1 Text 2

01.01.2019

01.02.2019

ID1

ID1 Text 3

20.01.2019

01.03.2019

ID2

ID2 Text 1

10.01.2019

20.01.2019

ID2

ID2 Text 1

15.01.2019

25.01.2019

ID3

ID3 Text 1

02.01.2019

03.01.2019

ID3

ID3 Text 1

05.01.2019

06.01.2019

ID3

ID3 Text 1

07.01.2019

08.01.2019

ID3

ID3 Text 2

09.01.2019

10.01.2019

ID4

ID4 Text1

26.01.2019

01.03.2019

 

Table 2 (New)

  

ID1

ID1 Text 3

20.01.2019

01.03.2019

ID2

ID2 Text 1

15.01.2019

25.01.2019

ID3

ID3 Text 2

09.01.2019

10.01.2019

ID4

ID4 Text1

26.01.2019

01.03.2019

 

Anyone have a solution for my problem, or knows how to solve this?

 

Regards 

Flona

3 ACCEPTED SOLUTIONS
Zubair_Muhammad
Community Champion
Community Champion

@Flona 

 

You can use Table.Max function in Power Query as follows

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rdJBC4MgFAfwryKdPeQz27qPYINOBTtEB7eEdmngEta3X3nIqekuuwiPpz/fX2zb5HwiCV5X1Ij3hNaiUeLV8xmjCx8VlzMiGEFKiqVVyofulOImrVaHPQr+R9GlqNVoSZB6VMXlfXAc0A58pRuUdGYyUvCSEOblY95Um8Usi2qLGusq+tHRYMP8qWlcq/mkpHWABVPmcap62tsPZio3/jEswX7EIhKR2K+faS7T3H5EyH/9ie4D", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Description = _t, #"Start date" = _t, #"End Date" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", type text}, {"Description", type text}, {"Start date", type date}, {"End Date", type date}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"All", each Table.Max(_,"Start date"), type record}}),
    #"Expanded All" = Table.ExpandRecordColumn(#"Grouped Rows", "All", {"ID", "Description", "Start date", "End Date"}, {"ID.1", "Description", "Start date", "End Date"})
in
    #"Expanded All"

 


Regards
Zubair

Please try my custom visuals

View solution in original post

AlB
Super User
Super User

Hi @Flona 

If you want it in DAX you can create a new calculated table as follows:

 

 

NewTable =
FILTER (
    Table1,
    Table1[ID]
        = CALCULATE ( MAX ( Table1[Start date] ), ALLEXCEPT ( Table1, Table1[ID] ) )
)

 

View solution in original post

@Flona 

 

You're quite right. There was an error in the code. Try this slightly modified version:

 

NewTable = 
FILTER (
    Table1;
    Table1[Start date]
        = CALCULATE ( MAX ( Table1[Start date] ); ALLEXCEPT ( Table1; Table1[ID] ) )
)

View solution in original post

4 REPLIES 4
AlB
Super User
Super User

Hi @Flona 

If you want it in DAX you can create a new calculated table as follows:

 

 

NewTable =
FILTER (
    Table1,
    Table1[ID]
        = CALCULATE ( MAX ( Table1[Start date] ), ALLEXCEPT ( Table1, Table1[ID] ) )
)

 

Flona
Regular Visitor

Hi

Thank you for your input, I have tried this solution before and cannot get it to work due to different formats, date and text. Error message fon DAX expresion "DAX comparison operatisn do not support comapring values of type Text with values of type Date. Consider using the Value or Format function to convert one of the values

NewTable =
FILTER (Table1;
Table1[ID]
= CALCULATE(MAX(Table1[Start date]);ALLEXCEPT(Table1;Table1[ID]))
)

@Flona 

 

You're quite right. There was an error in the code. Try this slightly modified version:

 

NewTable = 
FILTER (
    Table1;
    Table1[Start date]
        = CALCULATE ( MAX ( Table1[Start date] ); ALLEXCEPT ( Table1; Table1[ID] ) )
)
Zubair_Muhammad
Community Champion
Community Champion

@Flona 

 

You can use Table.Max function in Power Query as follows

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rdJBC4MgFAfwryKdPeQz27qPYINOBTtEB7eEdmngEta3X3nIqekuuwiPpz/fX2zb5HwiCV5X1Ij3hNaiUeLV8xmjCx8VlzMiGEFKiqVVyofulOImrVaHPQr+R9GlqNVoSZB6VMXlfXAc0A58pRuUdGYyUvCSEOblY95Um8Usi2qLGusq+tHRYMP8qWlcq/mkpHWABVPmcap62tsPZio3/jEswX7EIhKR2K+faS7T3H5EyH/9ie4D", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Description = _t, #"Start date" = _t, #"End Date" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", type text}, {"Description", type text}, {"Start date", type date}, {"End Date", type date}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"All", each Table.Max(_,"Start date"), type record}}),
    #"Expanded All" = Table.ExpandRecordColumn(#"Grouped Rows", "All", {"ID", "Description", "Start date", "End Date"}, {"ID.1", "Description", "Start date", "End Date"})
in
    #"Expanded All"

 


Regards
Zubair

Please try my custom visuals

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.