Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
laiksayyed
New Member

Filter out Duplicate records and retain the latest record

Working on a power BI report, with SQL datasource in DirectQuery Mode.

I have a location_docs Table with unique DocID, many LocationID and many DocumentTypes.

I am struggling to filter out duplicate records and generate report with these unique records.

Unique records means – there should be one DocType for each LocationID and the unique records should include the last uploaded Document, with the highest/Maximum DocID No.

Following is a snapshot from my current data and expected resultant data. Please remember it is in DirectQuery mode.

DocIDCountLocIDDocTypeExpiryDate
1828130231/05/24
1072130220/07/22
684130216/06/22
39130203/09/24
40130311/12/24
1790130420/08/24
38130527/11/23
17130501/06/22
18130612/07/24
44130831/05/24
42130828/11/22
41131213/05/22
1827131209/05/24
46131311/12/24
1791131420/08/24
1792131513/06/23
1793131610/10/23
45131831/12/21
47132225/03/24
1073132311/12/24
1081132422/12/24
1794132521/07/23
48132631/08/23
49132831/10/22
50133208/11/22
2488133216/10/24
1074133311/12/24
1075133412/12/24
1795133529/11/22
1796133629/11/22
1797133829/12/22
51134201/11/24
1076134311/12/24
1077134411/12/24
1798134530/12/21
1799134628/03/24
22081348 
1801135228/12/22
1078135411/12/24
1803135523/11/21
2576135512/11/23
1800135614/03/21
1804135820/07/22
20136227/11/22
1805136420/12/23
19136527/11/22
1808136516/11/22
2040136527/11/22
2575136512/11/23
1807136631/01/25
43136830/11/22
1809136831/12/22
2577136828/11/23

 

Expected Result:

DocIDCountLocIDDocTypeExpiryDate
1828130231/05/24
40130311/12/24
1790130420/08/24
38130527/11/23
18130612/07/24
44130831/05/24
41131213/05/22
46131311/12/24
1791131420/08/24
1792131513/06/23
1793131610/10/23
45131831/12/21
47132225/03/24
1073132311/12/24
1081132422/12/24
1794132521/07/23
48132631/08/23
49132831/10/22
2488133216/10/24
1074133311/12/24
1075133412/12/24
1795133529/11/22
1796133629/11/22
1797133829/12/22
51134201/11/24
1076134311/12/24
1077134411/12/24
1798134530/12/21
1799134628/03/24
22081348 
1801135228/12/22
1078135411/12/24
2576135512/11/23
1800135614/03/21
1804135820/07/22
20136227/11/22
1805136420/12/23
2575136512/11/23
1807136631/01/25
2577136828/11/23

Thank you for your support.

Laik

1 ACCEPTED SOLUTION
v-yueyunzh-msft
Community Support
Community Support

Hi , @laiksayyed 

According to your desciption, you want to filter the duplicate data in your table . Right?

Here are the steps you can refer to :
(1)My test data is the same as yours.

(2)We need to add an index column in Power Query Editor:

vyueyunzhmsft_0-1670206755308.png

(3)Then we apply the data to Desktop and we need to click "New Column" to create three calculated columns:

Flag1 = var _curDoc = [DocType]
var _index = [Index]
var _pre_Doc =MAXX( FILTER( 'Table', 'Table'[Index] = _index -1) , [DocType])
var _flag = IF(_curDoc = _pre_Doc , 1, 0)
return
_flag
Flag2 = var _flag1 = [Flag1]
var _index= [Index]
var _after =MAXX(FILTER('Table','Table'[Index]=_index+1) , [Flag1])
var _flag2 = IF( _after =1 ,1,_flag1)
return
_flag2
Max_DocID = var _flag2 = [Flag2]
var _index = [Index]
var _min_index = MAXX( FILTER('Table','Table'[Index]<= _index && 'Table'[Flag2] = 0 ) , [Index])
var _min_value = IF( _min_index =BLANK() ,1,_min_index+1)
var _max_index =MINX( FILTER( 'Table','Table'[Index]>= _index && 'Table'[Flag2] = 0) , [Index])
var _max_value = IF(_max_index=BLANK() , MAX('Table'[Index]) , _max_index-1)
var _max_docID =MAXX( FILTER( 'Table' , 'Table'[Index] >=_min_value && 'Table'[Index] <= _max_value) , [DocID])
return
IF(_flag2=0, [DocID] ,IF(_max_docID=[DocID],[DocID],BLANK()))

Then we can get this:

vyueyunzhmsft_1-1670206821608.png

(4)Then we can put the fields on the visual and we can meet your need:

vyueyunzhmsft_2-1670206853700.png

 

Thank you for your time and sharing, and thank you for your support and understanding of PowerBI! 

 

Best Regards,

Aniya Zhang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

View solution in original post

4 REPLIES 4
v-yueyunzh-msft
Community Support
Community Support

Hi , @laiksayyed 

According to your desciption, you want to filter the duplicate data in your table . Right?

Here are the steps you can refer to :
(1)My test data is the same as yours.

(2)We need to add an index column in Power Query Editor:

vyueyunzhmsft_0-1670206755308.png

(3)Then we apply the data to Desktop and we need to click "New Column" to create three calculated columns:

Flag1 = var _curDoc = [DocType]
var _index = [Index]
var _pre_Doc =MAXX( FILTER( 'Table', 'Table'[Index] = _index -1) , [DocType])
var _flag = IF(_curDoc = _pre_Doc , 1, 0)
return
_flag
Flag2 = var _flag1 = [Flag1]
var _index= [Index]
var _after =MAXX(FILTER('Table','Table'[Index]=_index+1) , [Flag1])
var _flag2 = IF( _after =1 ,1,_flag1)
return
_flag2
Max_DocID = var _flag2 = [Flag2]
var _index = [Index]
var _min_index = MAXX( FILTER('Table','Table'[Index]<= _index && 'Table'[Flag2] = 0 ) , [Index])
var _min_value = IF( _min_index =BLANK() ,1,_min_index+1)
var _max_index =MINX( FILTER( 'Table','Table'[Index]>= _index && 'Table'[Flag2] = 0) , [Index])
var _max_value = IF(_max_index=BLANK() , MAX('Table'[Index]) , _max_index-1)
var _max_docID =MAXX( FILTER( 'Table' , 'Table'[Index] >=_min_value && 'Table'[Index] <= _max_value) , [DocID])
return
IF(_flag2=0, [DocID] ,IF(_max_docID=[DocID],[DocID],BLANK()))

Then we can get this:

vyueyunzhmsft_1-1670206821608.png

(4)Then we can put the fields on the visual and we can meet your need:

vyueyunzhmsft_2-1670206853700.png

 

Thank you for your time and sharing, and thank you for your support and understanding of PowerBI! 

 

Best Regards,

Aniya Zhang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

Anonymous
Not applicable

Thanks Ibendlin for your reply... Measure or calculated column either will do. 

As mentioned, my concern is this report has to be in DirectQuery mode, not mixed.

That means a measure is your only option.  

Can you confirm that you only want to show the max DocID for each LocID and DocType combination?

 

lbendlin_0-1670184018932.png

So for the first group that would be DocID 1828 ?

 

Measure Show = 
var d = selectedvalue(Table_1[DocID])
var m = CALCULATE(max(Table_1[DocID]),REMOVEFILTERS(Table_1[DocID]))
return if(d=m,1,0)

lbendlin_1-1670184441901.png

 

lbendlin
Super User
Super User

Needs more details. Are you looking for a measure. or are you looking for a calculated column after you converted the direct query to mixed mode?

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.