Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
I’ve loaded a .txt file into PBI and am getting duplicates in what should be a column of unique values. These don’t appear in the source and when I load the same data into Tableau this doesn’t happen.
The data has 7022 rows but PBI only recognises 6989 unique values in the id column. I’ve copied one such duplication below. As you can see, the id with the suffix 96 is duplicated and the second takes the place of the one which would have had the suffix 100.
id | Canvas_id |
70360000000000093 | 93 |
70360000000000094 | 94 |
70360000000000095 | 95 |
70360000000000096 | 96 |
70360000000000096 | 100 |
70360000000000097 | 97 |
70360000000000098 | 98 |
70360000000000099 | 99 |
70360000000000101 | 101 |
70360000000000102 | 102 |
Does anyone know what might be happening here?
Hi, @DHB
Would you please upload your .txt file for furthur anaysis so that we can check the data source and find the root cause.
Thanks,
Yuliana Gu
I don't think I'll be permitted top share any of the files. I've been told that when I have access to data from the data lake PBI will detect the data type from Oracle and problem should not happen. This should be happening soon I hope. Thank you for your offer though.
Best,
DHB
These seems strange behaviour. I have never encountered such a problem. If you can share your file, I would test it in both the (Tableau and PowerBI) environments you have mentioned and see what is wrong with the ETL process.
Meanwhile, Go to the Query Editor, and Select Transform and Go to Statistics option.
Please select the distinct count values first and you will see how many no. of distinct rows you have in your text file.
Hi Bhavesh,
Thank you for the advice.
When I put the data into the Query Editor and changed the id column from whole number to text it resolved the issue. The same thing worked for other .txt files that had the same problem. This is good news but it's going to be awkward doing that every time. Can I automate this somehow?
I'll have to get permission to share a file so will let you know asap on that.
Hi DHB,
Unfortunately not, PowerBI selects the best data type suitable for your column automatically but sometimes it may not be the correct data type and you have to manually override that in the query stage.
Hope this has clarified your understanding.
@DHB I doubt PBI would be duplicating something in the process of import (never experienced it before). Can you double-check the source. Also via query editor in power bi desktop you can remove rows that are duplicates.
Thank you for your reply. I'm using the same source in both PBI and Tableau but Tableau doesn't have the same problem. Unfortunately I can't delete the dupicate as I would lose the data for the row which should end in 100 (id). What I need is for PBI to give that row the correct id value (....100) instead of ....96.
User | Count |
---|---|
139 | |
113 | |
103 | |
77 | |
63 |
User | Count |
---|---|
135 | |
126 | |
110 | |
70 | |
61 |