Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
KCinMelbourne
Resolver I
Resolver I

Data Format Causes Matching Data to Appear Inconsistent

Hi all, 


Seeking assistance on a data consistency issue.  We currently extract a department name field from a system that carries with it some sort of invisible formatting (possibly a HTML format??).  The data is copied from the system and pasted into Excel and is subsequently pulled into PowerBI.

 

When we type what appears to be the exact same name into the sheet, Excel/PowerQuery/PowerBI all believe they are different content even though they appear the same.  If you were to filter on a column with the extracted and manually typed names, you'll see this for example:

 

Capture Filter Error.PNG

 

For a range of reasons, this repeatedly comes up as an issue when we pull the data through or use the field for reporting.  Is there anyway to remove the formatting or convert the data to a text format so the system can read the data as consistent? Constantly trying to resolve 'invisible errors' is becoming unmanageable. 

A few notes:

  • This is not a TRIM issue. There is no 'extra space' on the end of the fields
  • I've tried using functions like Text.From but I get the same result in the new column
  • The data "Type" is "Text" in PowerBI already so this is not the issue either
  • I have an example excel file to show this but it seems I can only attach photos/videos so I can't share it????

Cheers

Kirsty

1 ACCEPTED SOLUTION

Thanks Yuliana @v-yulgu-msft!  Naturally I can't manually replace the apparent spaces but knowing they were "non-printable characters", helped with searching for a solution. 

 

For others with this problem: Lots of research indicated that a CLEAN (Dax/Excel) or Text.Clean (M Code) function would cleans the data by removing the non-printable character.  However, I found this did not work for me.  Feel free to try it for your problem  in PowerQuery by selecting the relevant column --> Transform menu --> Format menu --> Clean. 

 

For me, I had to replace the non'printable character specifically by using the code #(00A0) as what needed to be 'found' and 'repalced'.  Code as follows:

= Table.ReplaceValue(#"Prior step", "#(00A0)"," ",Replacer.ReplaceText,{"Column Being Validated"})

 

 

View solution in original post

4 REPLIES 4
v-yulgu-msft
Employee
Employee

Hi @KCinMelbourne,

 

You could upload the sample Excel file to OneDrive, and paste the shared link here so that I can download.

 

Regards,

Yuliana Gu

Community Support Team _ Yuliana Gu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Thanks @v-yulgu-msft,

I thought I was just having a dumb moment not being able to find the attachment. I've popped on google drive, it should be public but let me know if you can't access it. 

 

Cheers

KC

 

Example Excel Spreadsheet

Hi @KCinMelbourne,

 

I tested with your sample file there might existing some non-printable characters in column values. After removing all space, the duplicates disappeared.

1.PNG

 

Then, I typed space manually again, the problem didn't persist.

2.PNG

 

Best regards,

Yuliana Gu

Community Support Team _ Yuliana Gu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Thanks Yuliana @v-yulgu-msft!  Naturally I can't manually replace the apparent spaces but knowing they were "non-printable characters", helped with searching for a solution. 

 

For others with this problem: Lots of research indicated that a CLEAN (Dax/Excel) or Text.Clean (M Code) function would cleans the data by removing the non-printable character.  However, I found this did not work for me.  Feel free to try it for your problem  in PowerQuery by selecting the relevant column --> Transform menu --> Format menu --> Clean. 

 

For me, I had to replace the non'printable character specifically by using the code #(00A0) as what needed to be 'found' and 'repalced'.  Code as follows:

= Table.ReplaceValue(#"Prior step", "#(00A0)"," ",Replacer.ReplaceText,{"Column Being Validated"})

 

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.