Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

COMMA in Excel text field interpreted during DATA IMPORT

Hi,

 

I have found an issue with comma's in Excel text fields. I have imported a column from an  Excel sheet and transformed it to be distinct (Remove Rows/ Remove Duplicates). However, when I tried to make a relation to another table, I got a message: You can't create a relationship between these two columns, because one of the columns must have unique values. Keep in mind that I imported one column and deduplicated it.

 

I have created a table visual and exported this visual to CSV. I have imported the previously exported CSV into Power BI and it appeared to have 2 (!) columns. This was caused by comma's in the text fields. An additional column was created containing data from after the comma. The comma itself was no longer in the data. 

 

By looking at the above steps, one could deduct that the PBI import does something strange here. It shows the imported Excel column as one column, but it in the engine something else happens. The Remove Duplicates seems to apply the distinct over the 2 columns. This makes it possible to have duplicate data in the first/main column. This column is then used to create a relation to another table and fails due to the duplicates.

 

I'm not sure if this is a known problem, I couldn't find anything similar.

 

Please let me know if you need a sample of the data.

 

Bye,

 

Rudi

Status: Needs Info
Comments
rudiklein
Advocate II

Sorry guys, it might not be the comma's after all. I'm not sure why the column seems to have problems being made distinct, but after replacing all comma's with dots in the source data, it still fails.

 

But, the comma issue might very well be AN issue, but might not be related to the fact that PBI has a problem with the distinct function. 

 

I'll try to get to the bottom of this and report back.

 

Bye,

 

Rudi

v-qiuyu-msft
Community Support

Hi @rudiklein,

 

I’m not able to reproduce the issue in the latest desktop version 2.44.4675.521. Here are my test Excel and .pbix file. Please try to test in the latest desktop use sample excel file. If the sample excel file works fine on your side, please share your sample Excel file for us to reproduce the issue.

 

Best Regards,
Qiuyun Yu

Vicky_Song
Impactful Individual
Status changed to: Needs Info
 
rudiklein
Advocate II

Hi,

 

I will try to find out exactly what is happening here and report back asap. However, it can take a couple of days. I will attempt to reproduce it with anonymized data and make the data set available.

 

Thanks for the quick reply.

 

Rudi

 

rudiklein
Advocate II

Hi @v-qiuyu-msft,

 

Looking at the test results (see my .zip file), it appears that there are multiple issues that prevent linking the distinct table to the main table. The only successful way is to clean the Excel input file manually, import the Excel input file, export the table visual data from that import and import the resulting CSV.
What I found during my tests:
1) Some fields in the Excel start with a minus sign. Row 52, 403 and 2420 (or 2421) return a #NAME error.
If you go into these fields in Excel and exit with an ENTER, the #NAME error occurs too. This looks similar to the output of PBI in the imported table.
2) Some fields in the Excel contain an @-sign.
3) Some fields in the Excel contain CRLF’s.
4) Some fields in the Excel contain commas.
5) Every test Excel file (containing 1 column), creates a second column during the export of table visual data of the previously imported Excel data.
6) None of the imported data from the Excel files will link to the main table (the tables/columns have errors, -blanks and duplicates removed in the query).
7) Only the imported CSV file, resulting from the table visual output of the PBI forum case - original table - clean.xlsx import, will link to the main table.

 

This lets me to believe that there are actual 2 issues that prevents PBI to make the column distinct. I guess the duplicate #NAME errors are one of them.
The CLEAN Excel file has been cleaned from CRLF, #NAME, commas and @-signs

 

I have created a set of files containing al input and output files and the .pbix I have used to test this.
I’m closer to the solution, but not there yet. The .zip file contains all data and a matrix of the test results. If I have overlooked the obvious, I will be embaressed, but happy with the solution

 

Here is a complete test set I have been using: Test set

 

Bye,

 

Rudi

 

 

v-qiuyu-msft
Community Support

Hi @rudiklein,

 

Would you please share the source Excel files? From your shared OneDrive link, there is only one .pbix file.

 

Best Regards,
Qiuyun Yu

rudiklein
Advocate II

Hi @v-qiuyu-msft,

 

Sorry, I shared the wrong file. Here is the zip.

 

bye,

 

Rudi

v-qiuyu-msft
Community Support

Hi @rudiklein,

 

Take the PBI forum case - original table minus rows 52 403 and 2421 (NAME error).xlsx as the sample, in Power BI desktop, the table “Without NAME errors” after use removed Duplicates feature still have “duplicates”, you can create a matrix visual place the [minus NAME] in Row Groups, [minus NAME] with count and Count(Distinct) aggregation in Values.

 

q1.PNG

 

You will find there are two values “t.v.v. Rxxx, xP”. When you go to the Query Editor, filter values with contains “t.v.v. Rxxx”, it will show you below:

 

q2.PNG

 

You can see the difference is “p” and “P”, lowercase letter and uppercase letter. After we remove one of those two values, then we can create the relationship between the “Full table” and “Without NAME errors”.

 

It seems that when we create the relationship, it will treat it as duplicates regardless the lowercase letter or uppercase letter. Regarding this issue, I already sent email internally, will update to you once get feedback.

 

Best Regards,
Qiuyun Yu

rudiklein
Advocate II

Hi @v-qiuyu-msft,

 

That was one angle I didn't look at: case sensitivity. It crossed my mind, but went on another track of investigation. I'm glad the problem has been identified. I can work with that for now.

 

I'm wondering why you used the PBI forum case - original table minus rows 52 403 and 2421 (NAME error).xlsx file. The reason why I'm asking this is that the #NAME errors also look wrong. I understand why they appear: they fire because some text in the inputfile is interpreted as a formula-with-an-error, but after importing and Removing Duplicates and Errors, I can still see two of them. Will they cause the link to fail?

 

Bye,

Rudi

 

 

v-qiuyu-msft
Community Support

Hi @rudiklein,

 

I got the information that data model isn't case sensitive while Power Query is. You can see this same thread: Creating Case Sensitive Table Relationships.

 

Based on my test in desktop 2.44.4675.521, when import data from the PBI forum case - original table minus rows 52 403 and 2421 (NAME error).xlsx , I didn't see the #NAME errors. Can you show me the image of that #NAME errors?

 

Best Regards,
Qiuyun Yu