Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
mbuhary
Helper I
Helper I

Power Query to Match Bank vs OR amounts within each Branch

Hello Everyone,
I have a situation where I am trying to use Power Query to match a list of data, see table below.
We have Columns ID, Match, Net, Branch & Dates which are important in this process:
Bank has Positive Numbers in Net Column
OR has Negative Numbers in Net Column
I would like to make a one to one match of Bank Numbers against OR Numbers and populate in Column B Match with value 1 for all those amounts matching.
If the numbers match exactly with each other that's fantastic, but in general it is not the case,
and I am prepared to allow a tolerance limit for the difference from minus 2 to plus 2.
So if the compared amounts throw out a difference within this range it can be assumed they are matched.
Some Rules
the matching should be between Bank vs OR on absolute values eg plus 100 against minus 100 assumed matched
numbers matching within each branch not against another branch number
for the moment we can leave out date criteria

IDMatchNet  TypeBranch Date
BANK 102.67 11804-Oct-19
BANK 183.42 11804-Oct-19
BANK 313.01 11805-Oct-19
OR 183.75 11806-Oct-19
OR -102.57 11806-Oct-19
OR -183.17 11806-Oct-19
OR -183.17 11806-Oct-19
OR -312.96 11806-Oct-19
OR -237.24 11827-Nov-19
BANK 237.25 11827-Nov-19
OR -249.11 11827-Nov-19
BANK 336.94 11827-Nov-19
OR -337.04 11827-Nov-19
BANK 348.9 11827-Nov-19
BANK 588.14 11827-Nov-19
OR -588.63 11827-Nov-19
BANK 1616.85 15710-Oct-19
OR -1637.54 15710-Oct-19
BANK 879.22 15721-Oct-19
BANK 336.94 15722-Oct-19
OR -337.04 15722-Oct-19
BANK 364.09 15722-Oct-19
OR -364.15 15722-Oct-19
OR -879.22 15722-Oct-19
19 REPLIES 19
Mariusz
Community Champion
Community Champion

Hi @mbuhary 

 

Please see the attached file with a solution.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rdLBDsIgDAbgVzE7j4a2UOCoV5Mt8brs5AN4MT6/bE4nSlYPXggkX/7SwjA0h313bNoGLYGEvNlNB4x5tc7056vB1Izt6iKDI90xMlgsnX93/WlJC75U8qXMdDkfdJbT8G+MkSCJyogDkCsYBdNdbh8DmZ3fcEucS4CoxzELpK2ySxO5qv3hduwiJJ35GAH1qhMT1uNQUCC+hjI/MdrKi0luw7sNtwTGkICocIS131lM7+Go8gWK6VXYM04c2KTHZYZeZbUmVjbeAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, #"Net " = _t, #" Type" = _t, Branch = _t, #" Date" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", type text}, {"Net ", type number}, {" Type", type text}, {"Branch", Int64.Type}, {" Date", type date}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID", "Branch"}, {{"tbl", each _, type table [ID=text, #"Net "=number, #" Type"=text, Branch=number, #" Date"=date]}, {"sum", each List.Sum([#"Net "]), type number}}),
    #"Grouped Rows1" = Table.Group(#"Grouped Rows", {"Branch"}, {{"tbl", each _, type table [ID=text, Branch=number, tbl=table, sum=number]}, {"sum", each List.Sum([sum]), type number}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows1", "Match", each if [sum] <= 100 and [sum] >= -100 then 1 else 0),
    #"Changed Type1" = Table.TransformColumnTypes(#"Added Custom",{{"Match", Int64.Type}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Changed Type1",{"Branch", "tbl", "Match"}),
    #"Expanded tbl" = Table.ExpandTableColumn(#"Removed Other Columns", "tbl", {"ID", "tbl"}, {"ID", "tbl.1"}),
    #"Expanded tbl.1" = Table.ExpandTableColumn(#"Expanded tbl", "tbl.1", {"Net ", " Type", " Date"}, {"Net ", " Type", " Date"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Expanded tbl.1",{{" Date", type date}, {"Net ", type number}})
in
    #"Changed Type2"
Best Regards,
Mariusz

If this post helps, then please consider Accepting it as the solution.


 

In your query you are allowing a difference of up to +/- 100 by grouped sum by Branch. That does't give us the required accuracy level at each distinct amount.

 

That is why I mentioned in my post that at each match the difference can only go up to + / - 2.

 

What we expect from the posted data is similar to the attached picture, we are left with four rows highlighted in RED not matched:

DataPOS_Matched.jpg

Mariusz
Community Champion
Community Champion

Hi @mbuhary 

 

Try This

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rdLBDsIgDAbgVzE7j4a2UOCoV5Mt8brs5AN4MT6/bE4nSlYPXggkX/7SwjA0h313bNoGLYGEvNlNB4x5tc7056vB1Izt6iKDI90xMlgsnX93/WlJC75U8qXMdDkfdJbT8G+MkSCJyogDkCsYBdNdbh8DmZ3fcEucS4CoxzELpK2ySxO5qv3hduwiJJ35GAH1qhMT1uNQUCC+hjI/MdrKi0luw7sNtwTGkICocIS131lM7+Go8gWK6VXYM04c2KTHZYZeZbUmVjbeAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Net = _t, Type = _t, Branch = _t, Date = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", type text}, {"Net", type number}, {"Type", type text}, {"Branch", Int64.Type}, {"Date", type date}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID", "Branch"}, {{"_tbl", each _, type table [ID=text, Net=number, Type=text, Branch=number, Date=date]}}),
    #"Pivoted Column" = Table.Pivot(#"Grouped Rows", List.Distinct(#"Grouped Rows"[ID]), "ID", "_tbl"),
    #"Renamed Columns1" = Table.RenameColumns(#"Pivoted Column",{{"BANK", "M"}, {"OR", "Records"}}),
    #"Renamed Columns2" = Table.RenameColumns(#"Pivoted Column",{{"BANK", "Records"}, {"OR", "M"}}),
    #"Appended Query" = Table.Combine({#"Renamed Columns1", #"Renamed Columns2"}),
    #"Added Custom" = Table.AddColumn(#"Appended Query", "MM", each ( [M] )[Net], type list),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Branch", "Records", "MM"}),
    #"Expanded Records" = Table.ExpandTableColumn(#"Removed Other Columns", "Records", {"ID", "Net", "Type", "Date"}, {"ID", "Net", "Type", "Date"}),
    #"Added Custom1" = Table.AddColumn(#"Expanded Records", "Match", each List.Count( List.Select( [MM], (c) => let c1 = c * -1 in c1 -2 <= [Net] and c1 +2 >= [Net] )) > 0 ),
    #"Changed Type1" = Table.TransformColumnTypes(#"Added Custom1",{{"Match", Int64.Type}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type1",{"MM"})
in
    #"Removed Columns"

 

Best Regards,
Mariusz

If this post helps, then please consider Accepting it as the solution.

 

sorry my reply got deleted

Mariusz
Community Champion
Community Champion

Hi @mbuhary 

 

Very sorry, I'm struggling to understand the requirement, can you explain why this 4 not matched?

Is there any rule to how you match Bank and Or?

 

Thanks

Mariusz

Yes, I will try to explain more.

If you see first amount in Red -249.11

-there is no exact match of a positive amount (or Bank amount) within the same branch (118);

- also there isn't any other positive amount  (or Bank amount) within the same branch (118) which would come as close as it would be only differ by plus or minus 2.

 

An exact match is like for example in Branch 118 if we had Bank  & Net 25.5 matched against OR & Net -25.5.

But in the data sheet we dont have this situation. All amounts come close but doesn't match to the 2nd decimal point.

 

To give you a an of examples where we matched with allowing our threshold for difference (+/- 2) is you can look at:

data row 1 +102.67 & 5 -102.57 (1 is "Bank" & 5 is "OR" while both are within same branch (118) and difference of the sum of both is within +/- 2 ( 102.57 - 102.67 = -0.10).

 

I hope its cleat enough.

Jimmy801
Community Champion
Community Champion

Hello @mbuhary 

 

and what should happen when a row can match 2 items form the table? BANK: 102 / OR: 101 OR:102?

 

Jimmy

We can choose the most close one if possible.

Meaning which could give the smallest difference.

Jimmy801
Community Champion
Community Champion

Hello @mbuhary 

 

check out this solution. It gives you every row that is within the threshold of +/-2.

Basically I select only the BANK-rows on add a column where do I select all OR-rows that are within your threshold.

You think this could suit your needs?

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rdLBDsIgDAbgVzE7j4a2UOCoV5Mt8brs5AN4MT6/bE4nSlYPXggkX/7SwjA0h313bNoGLYGEvNlNB4x5tc7056vB1Izt6iKDI90xMlgsnX93/WlJC75U8qXMdDkfdJbT8G+MkSCJyogDkCsYBdNdbh8DmZ3fcEucS4CoxzELpK2ySxO5qv3hduwiJJ35GAH1qhMT1uNQUCC+hjI/MdrKi0luw7sNtwTGkICocIS131lM7+Go8gWK6VXYM04c2KTHZYZeZbUmVjbeAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, #"Net " = _t, #" Type" = _t, Branch = _t, #" Date" = _t]),
    #"Geänderter Typ" = Table.TransformColumns(Source,{{"Net ",each Number.Abs(Number.From(_,"en-US")), type number}}),
    OrTable =  Table.SelectRows(#"Geänderter Typ", each [ID]="OR"),
    BankTable = Table.SelectRows(#"Geänderter Typ", each [ID]="BANK"),
    AddColumn = Table.AddColumn
    (
        BankTable,
        "Or-table",
        (add)=>
            Table.SelectRows
            (
                OrTable,
                (select)=>
                    select[#"Net "]>= add[#"Net "]-2 and 
                    select[#"Net "]<= add[#"Net "]  + 2 and 
                    select[Branch]= add[Branch]
            )
    )
    
in
    AddColumn

Copy paste this code to the advanced editor to see how the solution works

If this post helps or solves your problem, please mark it as solution.
Kudos are nice to - thanks
Have fun

Jimmy

 

It gives me the base for some idea.

Let me try to work more on your ways.

We might see some light at the end of the Tunnel : )

 

I will keep you posted !

Jimmy801
Community Champion
Community Champion

Hello @mbuhary 

 

a feedback would be much appreciated

 

Jimmy

@Jimmy801 

I managed to proceed with your codes to a certain extent.

We are missing something here very important. "One to One" & should not be "ONE to MANY"

Please see the below table output from your query (I have Filtered Branch 118 only)

BANK has 6 rows as matched and OR has 7 rows as matched. The expectation is it should be equal a one to one match. Once an amount is matched with another (Bank & OR) they cannot be used to match another. I think this is the hardest part of this process.

IDMatchNetTypeBranch Date
BANK1102.67 11804-Oct-19
BANK1183.42 11804-Oct-19
BANK1313.01 11805-Oct-19
OR0183.75 11806-Oct-19
OR1-102.57 11806-Oct-19
OR1-183.17 11806-Oct-19
OR1-183.17 11806-Oct-19
OR1-312.96 11806-Oct-19
OR1-237.24 11827-Nov-19
BANK1237.25 11827-Nov-19
OR0-249.11 11827-Nov-19
BANK1336.94 11827-Nov-19
OR1-337.04 11827-Nov-19
BANK0348.9 11827-Nov-19
BANK1588.14 11827-Nov-19
OR1-588.63 11827-Nov-19

 

Jimmy801
Community Champion
Community Champion

Hello @mbuhary 

 

this is what i meant when I asked you about what happens when 2 rows are found. Keep only one and keep the other one for matching to other...

but in such a scenario it all would depend what row is processed first... so depending on your sort order there would be different outcomes. I don't think that would in your interested. The solution I provided is a kind of fuzzy join where a row of table B can suit multiple (or none) rows from table A.

For the result your are asking for I have no solution right away and this behavior was not specified in your initial request.

How important is this for you having this? Is this a one time shot or a daily process? How big is your database?

 

If this post helps or solves your problem, please mark it as solution.
Kudos are nice to - thanks
Have fun

Jimmy

I am sorry if I have not made it clear enough earlier.

 

But the one to one match is very important for us.

 

And the data is processed at the moment manually on a monthly basis.

 

Each month's data can accumulate up to 20,000 to 30,000 rows on average.

Jimmy801
Community Champion
Community Champion

Hello @mbuhary 

this seems quite complicated to achieve

 

BR

Jimmy

@Jimmy801 

Yes True. And thank you so much for your time and efforts.

Jimmy801
Community Champion
Community Champion

Hello @mbuhary 

 

you're welcome.

 

BR

 

Jimmy

Thank you so much. The codes are working fine in the test file.

 

Now I am trying to apply the same in my live data. I am facing couple of issues:

 1. In some rows I have Branch Name as Text or Empty fields, so I changed the column data type to Text and replaced the Empty fields with "Blanks". I hoep this doesn't affect any of the steps to follow.

 2. There are some times the Pivoted Column Tables are null, so it returns an error in the later step where we try to list each ( [M] )[Net ], type list. So I tried to embed an if statement before this to execute if RECORDS<>null and M<>null still I get an error. I am sure I am doing something wrong here.

Please advise. If i can attach my file here it would be best, But I dont know how to do it : (

Jimmy801
Community Champion
Community Champion

Hello @mbuhary,

 

the blank-rows shouldn't be of any problem. In my solution I filter for Blanks and ORs and try to match the ORs to the blank (so no full outer join!!).

About your second question.. pivoting was never part of the question. So I would suggest to mark the post as solution that is working for you and create for other question a new topic.

It would also always help if you try to explain your whole problem / goal at the beginning, what exactly you want to achieve. Otherwise we are proposing a micro step help without nowing what the real goal and then we get stuck on every step. 🙂 This only as general advice.


If this post helps or solves your problem, please mark it as solution.
Kudos are nice to - thanks
Have fun

Jimmy

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors