Bug: Remove Duplicates arbitrarily removing non duplicate values

Hello, we recently came across an issue when removing duplicates based on a column.

 

The column contained 12,000 unique values. However, after the "Remove Duplicates" step was applied, some of the values were eliminated, so the table returned only 10.5k, 10.8k, or some other seemingly random lower number of rows. With every refresh, different values were dropped. This was on the July 2017 version of Power BI Desktop and, interestingly, only some people encountered the issue with an identical file; others didn't.

We tried Group By with a row count for each value and found no duplicates.
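
For anyone who wants to repeat the Group By check outside Power BI, here is a minimal Python sketch of the same idea: count rows per value and list anything that occurs more than once. The function name `find_duplicates` and the sample keys are hypothetical and are not taken from the attached files.

```python
from collections import Counter


def find_duplicates(values):
    """Return {value: count} for every value that occurs more than once."""
    counts = Counter(values)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical sample keys, not data from the report.
sample = ["AB001", "AB002", "AB003", "AB002"]
print(find_duplicates(sample))
```

An empty result means every value is unique under exact, case-sensitive comparison, which matches what the Group By step showed here.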

The column in question was an nvarchar(10) field in SQL Server. We couldn't reproduce the issue with Excel, but can with a SQL Server dataset; happy to provide a PBIX file.

 

Lastly, inserting an "UPPERCASE" step and applying it to the column in question prevented the issue, even though all values already appear to be uppercase.
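
Since the uppercase step changes the outcome, one thing that may be worth checking is whether any values differ only by letter case or surrounding whitespace. The Python sketch below compares distinct counts under three comparison rules. It is only a diagnostic under assumptions: nothing in this thread establishes that case or whitespace is the cause, and the function name `distinct_counts` and the sample keys are hypothetical.

```python
def distinct_counts(values):
    """Count distinct values under three comparison rules."""
    return {
        # Exact, case-sensitive comparison.
        "exact": len(set(values)),
        # Case-insensitive comparison (values uppercased first).
        "upper": len({v.upper() for v in values}),
        # Case-insensitive comparison that also ignores surrounding whitespace.
        "upper_trimmed": len({v.strip().upper() for v in values}),
    }


# Hypothetical sample keys, not data from the report.
sample = ["AB001", "ab001", "AB001 ", "AB002"]
print(distinct_counts(sample))
```

If all three counts come back equal for the real column, case and whitespace differences can be ruled out as an explanation.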

Status: Delivered
Comments
v-haibl-msft
Employee

@Anonymous

 

I cannot repro the same issue on my side. If you are using Import mode to get data from SQL Server, please share the PBIX file through an online file service like OneDrive. You can also try the latest Aug 2017 version of Power BI Desktop to see if the issue persists.

 

Best Regards,
Herbert

Vicky_Song
Impactful Individual
Status changed to: Needs Info
 
Anonymous
Not applicable

Hello,

 

Thanks for trying to look into this. I have put together a description with a screenshot, a sample SQL database, and a PBIX file to help you reproduce it:

 

Files are here: https://1drv.ms/u/s!AoldpcXjgJNxfJYUAZi5fCXJ84Q

 

So far it seems to happen only under certain circumstances, including the size of the tables, referencing a VIEW in the database, and particular instance-specific and client-machine-specific settings.

 

Please let me know if you need anything else.

Thanks

 

v-haibl-msft
Employee

@Anonymous

 

I still cannot repro the same issue on my side with your database and PBIX file. Please refer to my recorded video below. The number of returned rows is always 12994.

 

 

Best Regards,
Herbert

Anonymous
Not applicable

Hi Herbert,

 

The value shown in the refresh modal window is irrelevant, as it only tells you how many rows of the source table it iterates through. What you need to look at is the row count in the status bar of the page.

As I said though, for me to reproduce it, I have to use the same database, view, and client computer. If any of these three variables changes, the issue goes away, so I'm skeptical this will get reproduced at all.

Feel free to close this, I will just put it down as "mysterious".

Thanks

 

Vicky_Song
Impactful Individual
Status changed to: Delivered