Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Fix unexpected character deletion on load | Provide warning when data is altered on load

PROBLEM

Power BI load from text fails to load ASCII characters 1-31 and 127. These characters are simply dropped without warning on load.

They are supposed to be included in Western European 1252 (Windows) standard mapping.

 

IMPLICATION

Data corrupted on load without warning. Dangerous for any AN encrypted file data sources.

 

SCENARIO

123[ASCII character #7]4 loads as 1234. This is not the same value i.e. corrupt. No warning is displayed. 

 

Anyone else worried about data loss? We only discovered this because we use encrypted keys... and got duplicate key values manifesting in reports. (BTW PBI also changes the data model unexpectedly from 1:1 to 1:Many when this happens, without warning).

 

Looking forward to hearing ANY work-around or fix for this (other than telling us not to encrypt data), Thanks in advance!

Status: Needs Info
Comments
v-yuezhe-msft
Employee

@davemcnab ,

What is your Power BI Desktop version? I make a test with importing ASCII character into Power BI Desktop Feb release, everything works as expected.
1.PNG

In your scenario, please update your Power BI Desktop to latest version(2.66.5376.2161), and share sample data of your table so that we can reproduce.

In addition, maybe there are some duplicated values in your column due to data loss, which leads to 1:many relationship in Power BI Desktop. You can disable to auto-detect relationship in Power BI Desktop, then recreate relationship manually between tables in Power BI Desktop.
Capture.PNG



Regards,
Lydia

v-yuezhe-msft
Employee
Status changed to: Needs Info
 
davemcnab
Frequent Visitor

Thank you for responding.

Version is current: Version: 2.66.5376.2521 64-bit (February 2019)

 

Can't tell what chars you were trying with. Did you try an ASCII character that is not on the keyboard e.g. 7 BEL (bell) - the one we observed the fail on?

 

Anything on the keyboard works fine, it is the special characters outside keyboard range that fail to load and disappear... every time.

 

Regarding comment:

maybe there are some duplicated values in your column due to data loss, which leads to 1:many relationship in Power BI Desktop

Data loss is the problem. Changing relationships is not an acceptable solution to data loss. Preventing data loss is the requirement. Or more correctly, consuming all ASCII characters from 0 to 127 as valid data.

 

We look forward to learning how to fix this. Thank you

v-yuezhe-msft
Employee

@davemcnab ,

I make a test in the same Power BI Desktop version as yours, and I am unable to reproduce this issue. Could you please share the Excel file containing your Ascii characters here? I wil test it on my side.
Capture.PNG

Regards,
Lydia

davemcnab
Frequent Visitor

Thank you for your continued interest in this issue. I do not see a capability to attach a file in this forum, so have provisioned some screenshots that demonstrate the issue in detail. 

 

Screenshot#1 - Excel, saved as csv, showing all characters

 

testdatainExcel (csv).PNG

 

 

 

 

Screenshot #2 - Identical data saved as txt showing all characters

testdatainNotepad (txt).PNG

 

 

 

 

 

 

Screenshot #3 - Load csv to Power BI (split file, promote header) with data loss. The data loss is pparent immediately on load.

testdatainPBI (csv).PNG

 

 

 

 

 

 

 

 

 

 

Screenshot #4 - Load txt to Power BI with data loss.

testdatainPBI (txt).PNG

v-yuezhe-msft
Employee

@davemcnab ,

Please upload your file to OneDrive and post shared link of the file here.

Regards,
Lydia

davemcnab
Frequent Visitor

Here is a sample file with two columns and three records exemplifying the issue,

https://1drv.ms/u/s!AteUhXljf58Ei1rm2LfG-XZwjRtE

 

Cols are Before and After. They contain different values as you will see if you open it in any desktop tool such as Excel, Worpad, Notepad, Access, etc., yet when loaded into Power BI, the values change to identical Before and After values.

 

IN other words it drops the character on load, which is a data integrity issue.

v-yuezhe-msft
Employee

@davemcnab ,

I suspect the issue is related to CSV encoding. When I save your file as .xlsx file and import it to Power BI Desktop, I get the following table.
1.png

Regards,
Lydia

davemcnab
Frequent Visitor

Thank you for your continued attention to this issue, Lydia.  Saving as XLSX is not an option for us as the csv source files contain several million rows, exceding Excel row limits.  

 

The CSV source provided is encoded as MIcrosoft Excel Comma Separated Values according to Properties and all characters are part of the ASCII character set. We tried every encoding scheme in the Power BI load scripting and none of them produced the desired result - they always deleted the character on load. 

 

How exactly did you produce the _x0019 value represented in your image and get it into Power BI?

Specifically what was the source file encoded as and what encoding scheme did you use to import?

We can't reproduce that... is it binary or some other encoding? In our Excel/CSV etc the value shows as a clear rectangular box similar to this [] . Our encoding settings are out of the box Excel, nothing fancy.

 

Since excel is not a valid option due to size limitations, we still have a data integrity and useability issue here.  Our expectation is that Power BI should be able to consume all ASCII characters with no problems just like every other MS tool, but it does not.

 

Looking forward to your next thoughts. I am grateful  you are helping us with this.

davemcnab
Frequent Visitor

Still no satisfactory explanation on data corruption issue. Please advise - are special characters in standard ASCII set supported or not?