Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hi,
Issue with Power Query on Power BI Desktop (February 2022) potentially creating a few duplicates when importing a large Table from a PostgreSQL database.
- Imported a Table called "s_tache" from a PostgreSQL database :
- Distribution indicates duplicates in the Primary Key ("tae_id") field. Unique values does not match Number of values :
- Checking the source database with a SQL query. "tae_id" is indeed the primary key and does not countain duplicates (Yes, my name is Captain Obvious. How could you tell?)
- Deleting rows works. No more duplicates in the Primary Key field :
- Keep Rows > Keep duplicates DOES NOT return any row (???):
How does that happen? Is it a bug? Not making any sense to me but I could be missing something.
Edit : Reported the issue in the appropriate section
Solved! Go to Solution.
I was struggling with the same issue: I can confirm it is not simply a visual bug, I had a filter applied after that gave a lot of false positives because of the duplicates. Native query seems to have fixed it.
Thank you so much for your help!!
Fixed the issue by using a NativeQuery :
Hi @BastienC
Glad it works in the report.
It looks weird. Not sure if there is issue in the column profile view. From the first screenshot, the id 4457 is considered to appear in two rows by Power Query. You could try filtering the tae_id column to show only rows that tae_id equals 4457 and check whether the filtered result have one or two rows.
Best Regards,
Community Support Team _ Jing
Hi Jing,
Thanks for considering my issue. After filtering [tae_id] = 4457 only a single row comes up.
This really looks like a bug with the profile view. But then, what could Power Query be deleting when targeting duplicates? Really odd.
On top of that, my DAX measures are slightly off the expected results given by SQL queries. However, the data structure was heavily manipulated and the DAX queries are relatively complex. Mistakes could be on my part but with this little bug it's hard to be sure.
This seems to be a visual bug. Duplicates are nowhere to be seen after applying Power Query (without deleting duplicates ofc) and relations "1 to many" related to the field work just fine. Distribution is probably borking a little when the table is too large. Had issues further down the line with my PK not being unique for relations but it could be due to something else.
Will report back if anything new or interesting happens about the subject.