Bug: Remove Duplicates arbitrarily removing non duplicate values

Anonymous · ‎08-13-2017

Hello, we recently came across an issue when removing duplicates based on a column.

The column contained 12,000 unique values; however, after the "Remove Duplicates" step was applied, some of the values were eliminated, leaving the table with only 10.5k, 10.8k or some other seemingly random lower row count. With every refresh, different values were dropped. This was on the July 2017 version of Power BI Desktop and, interestingly, only some people encountered the issue with the identical file; others didn't.

We tried Group By and counted the rows for each value; there were no duplicates.
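For anyone who wants to repeat that check outside Power BI, it can be sketched in Python roughly as follows. This is only an illustration: the `values` list is made-up sample data standing in for the column, and `find_duplicates` is a hypothetical helper name, not part of any Power BI or Power Query API.

```python
from collections import Counter


def find_duplicates(values):
    """Return {value: count} for every value that occurs more than once."""
    counts = Counter(values)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical sample data standing in for the nvarchar(10) column.
values = ["AB001", "AB002", "AB003", "AB004"]

duplicates = find_duplicates(values)
print(len(values), "rows,", len(set(values)), "distinct values,",
      len(duplicates), "values occurring more than once")
```

If `duplicates` comes back empty, a correct Remove Duplicates step should leave the row count unchanged.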

The column in question is an nvarchar(10) field in SQL Server. We couldn't reproduce the issue with Excel, but we can with the SQL Server dataset. Happy to provide the PBIX file.

Lastly, inserting an "UPPERCASE" step and applying it to the column in question prevented the issue, despite the fact that the values all appear to be uppercase already.
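Because the workaround involves case, one check that may be worth running is whether every value really equals its uppercase form, and whether the distinct count changes under different comparison rules. Below is a rough Python sketch, assuming the column has been exported to a plain list. The helper name `compare_distinct_counts` and the sample values are hypothetical, and this is a diagnostic aid only, not an explanation of what Power BI does internally.

```python
def compare_distinct_counts(values):
    """Count distinct values under three comparison rules and list
    every value that differs from its own uppercase form."""
    return {
        # Exact, case-sensitive comparison.
        "exact": len(set(values)),
        # Case-insensitive comparison (values folded to uppercase).
        "case_insensitive": len({v.upper() for v in values}),
        # Case-insensitive, also ignoring leading/trailing whitespace.
        "trimmed": len({v.strip().upper() for v in values}),
        # Values that are not already uppercase.
        "not_upper": [v for v in values if v != v.upper()],
    }


# Hypothetical sample: one lowercase variant and one trailing space.
sample = ["AB001", "ab001", "AB002 ", "AB002"]
print(compare_distinct_counts(sample))
```

If all three counts match the row count and `not_upper` is empty, the values are genuinely distinct under each of these rules, so none of them would justify dropping rows.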

v-haibl-msft · ‎08-14-2017

@Anonymous

I cannot repro the same issue on my side. If you are using Import mode to get data from SQL Server, please share the PBIX file through an online file service like OneDrive. You can also try the latest Aug 2017 version of Power BI Desktop to see if the issue persists.

Best Regards,
Herbert

Vicky_Song · ‎08-14-2017

Anonymous · ‎08-15-2017

Hello,

thanks for trying to look into this. I have put together a description with a screenshot, a sample SQL database and a PBIX file to help you reproduce it:

Files are here: https://1drv.ms/u/s!AoldpcXjgJNxfJYUAZi5fCXJ84Q

So far it seems that it only happens under certain circumstances, including the size of the tables, referencing a VIEW in the database, and particular instance-specific and client-machine-specific settings.

Please let me know if you need anything else.

Thanks

v-haibl-msft · ‎08-17-2017

@Anonymous

I still cannot repro the same issue on my side with your database and PBIX file. Please refer to my recorded video below. The number of returned rows is always 12994.

Best Regards,
Herbert

Anonymous · ‎08-17-2017

Hi Herbert,

the value shown in the refresh modal window is irrelevant, as it only tells you how many rows of the source table were iterated through. What you need to look at is the row count in the status bar of the page.

As I said though, for me to reproduce it, I have to use the same database, view and client computer. If any of these three variables changes, the issue goes away, so I'm skeptical this will be reproduced at all.

Feel free to close this, I will just put it down as "mysterious".

Thanks

Vicky_Song · ‎08-18-2017
