Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
hmenco
Frequent Visitor

Remove duplicates, keep the latest entry (Sharepoint as source)

Hi! I have here a FCONS table that has multiple Reservation No.

 

I only need the LATEST entry of the Reservation No. (highlighted in yellow)

 

hmenco_0-1627352612599.png

I have resolved this before with Table.Buffer on my local folder as source. But since I have migrated the source to Sharepoint, I need to redevelop this again and noticed that Table.Buffer may not be applicable anymore since it slows down the performance.

 

Hoping someone could help me with a workaround, Or walk me through the Table.Buffer with Sharepoint as source.

 

EDIT: As much as possible, the most EFFICIENT way applicable to BIG DATA.

Thanks!

1 ACCEPTED SOLUTION
hmenco
Frequent Visitor

Hi! Thanks for taking time. I have resolved the issue by sorting descendingly the RunDate then partition by Reservation No. - add index column, expand the partition table and remove its duplicates.

View solution in original post

7 REPLIES 7
hmenco
Frequent Visitor

Hi! Thanks for taking time. I have resolved the issue by sorting descendingly the RunDate then partition by Reservation No. - add index column, expand the partition table and remove its duplicates.

lbendlin
Super User
Super User

@ImkeF I see.  So yes, the Group By approach makes more sense as long as that bug exists.  Performance will be horrible in both cases, I assume.

ImkeF
Super User
Super User

Hi @hmenco ,

not sure if this would be faster, but it might be worth a try:

Group by reservation and cust no and add 2 fields:

max Rundate and All
Expand All and filter Rundate = MaxRundate

 

Although the grouping operation will cost performance, no need to sort and buffer.

 

@lbendlin ,
the sort order you've described is not guaranteed to stick:

Bug warning for Table.Sort and removing duplicates... - Microsoft Power BI Community

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

@ImkeF thank you for the warning. Since this is a SharePoint source I assume no query folding is happening and the sort is safe-ish?

Hi @lbendlin ,

no, that's not how I understood the thread:
.." or because when run locally it may constraint the operators in some ways that would make the operation inefficient or impossible to enforce.."
To me this means that even locally evaluated queries won't have a guaranteed sort order.

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

lbendlin
Super User
Super User

There's a simple trick. Sort your data as needed (which is unfortunate as it is very costly) - in your case you need to sort by Rundate descending - and then select the {Reservation No.]  column and choose "Remove Duplicates".  That will only keep the first row for each Reservation No.  which is actually the last row due to the sorting etc.

I've done this but it doesn't keep the LATEST entry. Any other workaround? Thanks!

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors