Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
BobBonner
New Member

Iterative matching using weighting of multiple columns

I have to come up with an algorithm to find suitable matches for mentors and proteges.  To simplify it (maybe), we have a table with employee id, role (mentor or protegee), 5 columns of answers to preferences (where the weighting will happen).  In the columns for preferences, an employee has ranked his/her preferences 1 through 5.  E.g. PreferenceTopic1 = 4, PreferenceTopic2 = 3, PreferenceTopic3 = 1, PreferenceTopic4 = 5, and PreferenceTopic5 = 2.  If this person were a protege, a good mentor would have rank similarly.  So, I have to iterate through combinations of employee ids who are proteges against the employee ids who are mentors and weight them on the shortest differences in preference topics.  E.g. If possible mentor rates PT1 = 4, PT2 = 3, PT3 = 2, PT4 = 5, PT5 = 1, then they would be a really good match for the original example.  There low number (good match) would be 0+0+1+0+1 = 2.  I'm not worried about perfectly matching every mentor to a protege, but care about ranking them as "best fits".  Not sure if this is best suited for m language in editor or DAX in formula bar.

3 REPLIES 3
OwenAuger
Super User
Super User

@BobBonner

Assuming you use @ImkeF's method to calculate the "difference" between all possible pairs of Mentors and Proteges, do you also want to come up with an optimal pairing of Mentors and Proteges?

 

This would be an example of the Assignment Problem (assuming you define your objective as minimizing the sum of differences between pairs selected), which you could solve using lp.assign in the lpSolve R package, within Power Query.

I have previously used the lpSolve package in the Power BI service to solve linear programming problems, so there shouldn't be a problem using that package in the service.

 

It would be possible but difficult to write your own M code to solve the Assignment Problem in Power Query without using R, so would certainly recommend R.

 

Post back if you need help with that.

 

Regards,

Owen


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

Thanks @OwenAuger for pointing towards this pattern! 

As this seems to be a general pattern, I thought it is worth a try in M 😉 

 

 
let
    CatFields = {"pref1","pref2","pref3","pref4","pref5"}, 
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUcpNzSvJLwIyQBwjIDYGYhMgNlWK1YkGi8CVwKQg0jAlxuhKjKDYEIxBSkAaCoryS1LTU6FaTZDUQtSYoqjB7hozFDXGUGwIxUA1sQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"employee id" = _t, role = _t, pref1 = _t, pref2 = _t, pref3 = _t, pref4 = _t, pref5 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"employee id", Int64.Type}, {"role", type text}, {"pref1", Int64.Type}, {"pref2", Int64.Type}, {"pref3", Int64.Type}, {"pref4", Int64.Type}, {"pref5", Int64.Type}}),
    Mentors = Table.Buffer(Table.SelectRows(#"Changed Type", each ([role] = "mentor"))),
    Proteges = Table.SelectRows(#"Changed Type", each ([role] = "protege")),

	IterativeCombine = List.Skip(List.Generate( ()=>
		[Proteges = Proteges, Counter = 0],
		each [Counter]<=Table.RowCount(Mentors),
		each [	SelectedMentor = Record.FieldValues(Record.SelectFields(Mentors{[Counter]}, CatFields)),
				SelectVals = Table.AddColumn([Proteges], "CatVals", each Record.FieldValues(Record.SelectFields(_, CatFields))),
				CombineWithSelectedMentor = Table.AddColumn(SelectVals, "Combine", (x) => List.Zip({x[CatVals],List.Repeat(SelectedMentor, List.Count(x[CatVals]))})),
				Score = Table.AddColumn(CombineWithSelectedMentor, "Score", each List.Sum(List.Transform([Combine], each Number.Abs(_{0}-_{1})))),
				EmpID = Table.SelectRows(Score, each ([Score] = List.Min(Score[Score]))){0}[employee id],
				Proteges = Table.SelectRows([Proteges], each [employee id] <> EmpID),
				Counter = [Counter]+1],
		each [EmpID]
	)),
    CreateTable = Table.FromColumns({Mentors[employee id], IterativeCombine}, {"Mentor", "Protege"})
in
    CreateTable

It doesn't use the algorithm you mentioned, just iterates through the (table of) mentors, checks who is closest, and returns the first of that result. Then it eliminates the allocated protege from the table of proteges that goes into the next round of matching.

So not a real optimization, as it is dependent on the order of the elements, but a reasonalbe result.

 

Would be interested if that solves your problem @BobBonner and how the performance of this code is. Thx!

(You have to replace the code in step "Source" by a reference to your table and eventually adjust the values in step "CatFields")

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

ImkeF
Super User
Super User

You can do this in the query editor:

1) create a list with the profiles attributes for both

2) do a cartesian join, so that each item will be compared with all of the others.

3) For the comparison use List.Sum(List.Transform(List.Zip({protegeList, mentorList}), each Number.Abs(_{0}-_{1}))). That will add up all deviations

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors