Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Diana2022
Helper I
Helper I

PowerQuery merge issue

Hello,

I have two fairly simple table I want to merge in PowerQuery.  Each has a Period and an Item column and all four columns are Text data types.  Each column has been trimmed and cleaned in PowerQuery. 

 

If I merge only on Period then there's 10928 matches of 10928...so my issue is with Item.  If merging only on Item there's zero matches.    I've also "UPPERCASE" the Item columns in Format (inside PowerQuery).

 

Creating a relationship and then pulling data from each query results in nothing...so it feels fundamental to the item column.

 

I even created a Merge Key using Text.Trim for each field and then trimmed and cleaned the MergeKey column.  I'd expect about 8000 matches and I can see matches when comparing the two queries.   

 

No joy.  I'm at a complete loss as to what to try next.  Thoughts?

Diana2022_0-1666643780402.png

 

1 ACCEPTED SOLUTION

I found a solution...in PowerQuery I created a new column with the following formula:

 

Table.AddColumn(#"Cleaned ""Item"" column", "Part Number", each Text.Select([Item],{Character.FromNumber(48)..Character.FromNumber(57),Character.FromNumber(65)..Character.FromNumber(90)}))

 

This selects only numeric and capitalized alpha charcters from unicode table.  Then I checked length.

 

I really can't thank you enough for helping me get to this solution.

 

View solution in original post

14 REPLIES 14
Diana2022
Helper I
Helper I

All:

It's even weirder.   I decided to append the three datasets that contain the Item/Part Numbers into a table called "Part Numbers".  And then delete duplicates.   And the result is only one "10C410", etc.   But when loaded and visualized...doubled up again.

 

In Manage Relationships each of the three datasets has an active relationship to the Parts Number table and no relationship to one another.

 

Diana2022_0-1666906137458.pngDiana2022_1-1666906466234.png

I did try John's suggestion but it resulted in null values.   

Interestning... This is obviously because the Items are not equal. And, I think, this is because of some hidden/invisible characters. 

See my further testing suggestions in my post below (currently at the bottom of the thread). We will test equality and containment in this round.

Next step to compare number of charcters: kind of if Text.Length(Item) = Text.Length([Item.Bulk]) then true else Text.Length(Item) *100 + Text.Length([Item.Bulk]). if not equal will show number of charcters separated by 0 as I think you do not have any thing longer then 10 chars.

If we will find inequality on the length, then we can try using Text.Start([Item], List.Min({Text.Length(Item), Text.Length([Item.Bulk]}) = Text.Start([Item.Bulk], List.Min({Text.Length(Item), Text.Length([Item.Bulk]}) to see if we can get them to equate. And so on.

I found a solution...in PowerQuery I created a new column with the following formula:

 

Table.AddColumn(#"Cleaned ""Item"" column", "Part Number", each Text.Select([Item],{Character.FromNumber(48)..Character.FromNumber(57),Character.FromNumber(65)..Character.FromNumber(90)}))

 

This selects only numeric and capitalized alpha charcters from unicode table.  Then I checked length.

 

I really can't thank you enough for helping me get to this solution.

 

Hi- thank you for helping me!   Each Part Number in MADD_YTD table has two more characters than expected...even after TRIM and CLEAN.  This is true for every Part number examined regardless of its visual length.  

Diana2022_0-1666957376339.png

 

What kind of data source throws hidden characters into text? I thought spaces at the end of my SQL text fields was annoying enough! Glad you got it sorted!

emarc1
Advocate II
Advocate II

If you can't get matches for just the item codes on their own, I'd first check that you can actually see some that should match (in the example screenshot, none do).

Extra spaces are a very common issue, but you said you trimmed everything already. I'd still double-check both table's key columns. You can click on cells in Power Query and then select the text at the bottom to verify that the spaces are there or not. Try copying text that should match into excel and see if it matches there with a simple =IF(A1=A2... If it doesn't, try working out if particular characters are causing you issues. I'm not sure if there are look-alike letters or numbers in some encodings, but there are some things like different spaces. Depends if the data comes from two very different sources or not.

Different data types can sometimes stop matching, but it shouldn't be an issue between text strings.

yes, about 8000 would match.  From PowerQuery I copied each table into Excel and did a A2=B2 type bit and they're TRUE.  So, Excel is happy.  I'm absolutely stumped.

ValeriaBreve
Post Patron
Post Patron

Another thought: are you sorting and removing duplicates in any table before merging? This could cause this type of issue if you don't buffer your table when you sort as PowerQuery will not necessarily keep the sorting order (and all sort of weird things happen when you merge). Hope it helps!

No, cannot do that.  I'm basically adding data fields from one table to another.  That's why I'm using Date and Item to match records from each table...in Excel it'd be "=XLOOKUP(A2&B2, Sheet1!A:A&Sheet1!B:B,Sheet1!C:C)" type thing.

HotChilli
Super User
Super User

Is it something nutty like all the 0s are zeros in one table and capital O in the other table?

Trim and Clean usually sorts wacky merge problems so I think we need to look at the item codes at that point.

Can you create columns in both tables which are truncated versions of the item(maybe use Text.End or Text.Start) and see if you can start to match them, then work back, adding in more letters until it breaks?

--

Also, what's the data source?

Hi...both are coming from Excel.  From PowerQuery I copied each table and pasted into Excel and did a A2=B2 type bit and they're TRUE.  So, Excel is happy.  I'm absolutely stumped.  I do this sort of thing in powerQuery a lot and have never come across this unless the datatypes weren't the same.

Hi @Diana2022,

 

Hmm... this sounds interesting...

Do you mind trying this code:

let
    #"Added Custom" = Table.AddColumn(MADD_YTD, "Data", (x)=> Table.SelectRows(Bulk_Log_YTD, each [Item]=x[Item])),
    #"Expanded Data" = Table.ExpandTableColumn(#"Added Custom", "Data", {"Item", "Bulk Amt"}, {"Item.Bulk", "Bulk Amt"})
in
    #"Expanded Data"

Just add a blank query and replace the content in Advanced Editor to the above. This is a "manual" join on Item. This is threfore may be much slower than a stock Table.Join, but I wonder if this would work.

 

Cheers,

John

Null values I'm afraid.

 

Obvious next step, if you do not mind:

let
    #"Added Custom" = Table.AddColumn(MADD_YTD, "Data", each Bulk_Log_YTD),
    #"Expanded Data" = Table.ExpandTableColumn(#"Added Custom", "Data", {"Item", "Bulk Amt"}, {"Item.Bulk", "Bulk Amt"}),
    #"A=B" = Table.AddColumn(#"Expanded Data", "A=B", each [Item]=[Item.Bulk]),
    AEB = Table.AddColumn(#"A=B", "AEB", each Text.Contains([Item], [Item.Bulk])),
    BEA = Table.AddColumn(AEB, "BEA", each Text.Contains([Item.Bulk], [Item]))
in
    BEA

We check if there is any exact matches, Item contains Item.Bulk or Item.Bulk contains Item.

Can you see any "TRUE" in the added (last three) columns?

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors