Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Appreciable all
I require your support for the following, I need to assign clients to the closest manager by location, but these assignments should be distributed evenly, if the closest manager is already saturated, the client should be assigned to the next closest one.
I appreciate any suggestions you have ..
This is what I have done so far
This is the manager’s table
Client’s table
With the manager’s table i Grouped By location
To this table I add another column containing the records that belong to each location
To this new table, I add another column using a function that calculates the distances that exist between the client and each possible administrator. It is a ordered table from less to more distance.
Distances’ Table
Having this last table I can assign the closest manager, as it will be in the first record of the table of distances
Finally I expand this last column and remove auxiliary columns
This is how I assign the closest administrator, however, I do not find how to limit it so that it is distributed equally. The maximum number of records each manager can have would be the total number of clients in the location divided by the number of managers there are.
I thank you in advance for all your help
The code:
//The code with the functions
let
// Functions
Calculate_Distance=(Latitude1 as number, Longitude1 as number, Latitude2 as number, Longitude2 as number) =>
let
EarthRadius = 6378.1,
constante = Number.PI / 180,
DifferenceLat = Latitude1 - Latitude2,
DifferenceLon = Longitude1 - Longitude2,
a = Number.Power(Number.Sin(DifferenceLat * constante / 2) , 2) + Number.Cos(Latitude1 * constante) * Number.Cos(Latitude2 * constante) * Number.Power(Number.Sin(DifferenceLon * constante / 2) , 2),
b = 2 * Number.Asin(Number.Sqrt(a)),
FinalDistance = b * EarthRadius
in FinalDistance,
#"Data with Distance" = (Locations as table, DATA as table) as table =>
let
DataWithDistance=Table.AddColumn(DATA,"Distances", each
Table.Sort(
Table.AddColumn(Locations,"Distance", (RECORD)=>
Calculate_Distance(RECORD[latitude], RECORD[longitude],_[latitude],_[longitude])
), {"Distance"} )
)
in DataWithDistance,
// Processing
Managers = Manager,
Locations = Table.Group(Managers, {"Location"}, {{"Count", each Table.RowCount(_), type number}, {"Managers", each _, type table}}),
Records = Table.AddColumn(Locations, "Data",
each Table.SelectRows(Data,(MainTable) => if MainTable[Location]=[Location] then true else false)
,type table),
#"Records with distance" = Table.AddColumn(Records,"Records_Distance", each #"Data with Distance"(_[Managers],_[Data]), type table),
Assigned = Table.AddColumn(#"Records with distance","Assigned", each
Table.AddColumn([Records_Distance],"Manager", each _[Distances][Manager]{0})),
Expand = Table.ExpandTableColumn(Assigned, "Assigned", {"ID", "Manager"}, {"ID", "Manager"}),
Final = Table.RemoveColumns(Expand,{"Count", "Managers", "Data", "Records_Distance"})
in
Final
clients
managers
distrClientsByLocation (via ListAccumulate)
let
distr=(loc) =>
let
manTabLoc=managers{[location=loc]}[tabLoc],
listClients=clients{[location=loc]}[clientsLoc][id],
recsCl=Table.ToRecords(clients{[location=loc]}[clientsLoc]),
m=Table.RowCount(manTabLoc),
r=Number.RoundUp(List.Count(listClients)/m),
exManTab=Table.AddIndexColumn(Table.FromRecords(List.Combine(List.Transform(Table.ToRecords(manTabLoc), each List.Repeat({_},r)))),"idx",0,1),
lacc= List.Accumulate(recsCl,[cl={},man={},manT=exManTab],
(s,c)=> s& [cl=s[cl]&{c}, man=s[man]&{closestManager(c,s[manT])},manT=Table.RemoveMatchingRows(s[manT],{closestManager(c,s[manT])})])
in Table.FromColumns({lacc[cl],lacc[man]})
in
distr
closestManager
let
closest=(client, managers) =>
let
distances=List.Transform(List.Zip({managers[lat],managers[long]}), each distance(client[lat],client[long],_{0},_{1}))
in managers{List.PositionOf(distances, List.Min(distances))}
in
closest
invoked function for location 1
let
Source = distrClientsByLocLA("l1"),
#"Expanded Column1" = Table.ExpandRecordColumn(Source, "Column1", {"id"}, {"id"}),
#"Expanded Column2" = Table.ExpandRecordColumn(#"Expanded Column1", "Column2", {"man", "location", "lat", "long"}, {"man", "location", "lat", "long"})
in
#"Expanded Column2"
results in:
PS
I dont't have enough time now to give some important (I think) comment.
I'll come back to it as soon as possible.
Thank you so much @Anonymous ,
After doing a lot of tests and learning how to use the List.Generetate function, -thank you for your codes, it helped me a lot to understand how does it work - i did the following code
Although, I have tried to improve the algorithm, it only works with few records, otherwise, it takes too much time that is impossible to process.
I also tested your last code, and likewise, it takes a lot of time.
Any suggestions to make it faster? ,
I am not sure if the process of removing records is the one that slows down the process.
BTW I'm reading the data from an Excel file
PS. I share the code of the whole process in case someone helps , You only need to paste it in the advanced query editor
// Manager's Table
let
Managers = Table.TransformColumnTypes(Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("XdA9CsMwDAXgqxTPqdCvJd+hPUHIEErp1kLvP9RxCEGdZPD34EnzXO7re309vxcqU7l9HtvQBoRSnfv76grC1tzLMp2ak2Y1atgGF3DG4Ehcdr4NI0CyCI3BCdg4FGvy+udDpIYcgUrRmHLC9sQWtIDmnY8+gcBObpp0PXVFQBbEUScIwpyIk/ZdbzsbA1PvO05THfoaqi3pyNqEXA/NpJqbtKSl/5P7waMG9zWXHw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Manager = _t, Location = _t, latitude = _t, longitude = _t]),{{"latitude", type number}, {"longitude", type number}}),
// Client's Table
DATA = Table.TransformColumnTypes(Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("lZQ5coNAFETvQixTf1/u4BuoFPkAvn/mBkZIowgroQaePt09Pdzvi7IQRy235fv3R3FxWk2YvGNbfSWvkmLeqcvjdl8aT9J54rW4rXjg+Cdpau14UgpJTnhqS1X54DUiuDpPnvc3X+OJBK/ODzllZVqD78pQjqE+RY0m3KXYVGzglVgmycEzpfE83txg9pQTaUJhevLsH3KEUkn15Btxag89ZbAz8VLUgSmDz3aYH3EW8ZbGrN9wq+KpnzmIpWXwzU764bddsmTnaS3hDbcnDzc980QKQJ88GxFMH/GjPU528NslCHkble6hfRXsNCRIHfkzCUHSi/dYyy2Qz7FdvSJOTvcXz0/9l/ite/LG5+pIv1GTwbcSu482MxVZ5j/0o/82+8UEA7DnWbImZyET3vlNPTFPfCgq6Of8LlZFyQbvJPHBp9VxbxtPjqVZv/CsKZ5m/JJj2A1pFRyYF9962e7BX99e3bbX+6p8JUOfr6e/8+1Xx+MQe/B7eXplw8chqcb46hIVnK3HHw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Location = _t, latitude = _t, longitude = _t]),{{"latitude", type number}, {"longitude", type number}}),
// Function to calculate the distance
Calculate_Distance = (Latitude1 as number, Longitude1 as number, Latitude2 as number, Longitude2 as number) =>
let
EarthRadius = 6378.1,
constante = Number.PI / 180,
DifferenceLat = Latitude1 - Latitude2,
DifferenceLon = Longitude1 - Longitude2,
a = Number.Power(Number.Sin(DifferenceLat * constante / 2) , 2) + Number.Cos(Latitude1 * constante) * Number.Cos(Latitude2 * constante) * Number.Power(Number.Sin(DifferenceLon * constante / 2) , 2),
b = 2 * Number.Asin(Number.Sqrt(a)),
FinalDistance = b * EarthRadius
in FinalDistance,
//Function Asign Managers
Asign = (ManagersP as table, Records as table) =>
let
TotalMan= Table.RowCount(ManagersP),
ClientbyMan= Number.RoundUp(Table.RowCount(Records)/TotalMan),
TableDistances=Table.Sort(Table.Buffer(Table.ExpandTableColumn(Records,"Distances", {"Manager", "Distance"}, {"Manager", "Distance"})),"Distance"),
//Function to get the available manager
Next= (TableO,Asigned)=>
List.Last(List.Generate(()=>
[Continue = 1, TableI=TableO]
, each [Continue]=1
, each [
Continue= if Asigned{List.PositionOf(ManagersP[Manager], [TableI]{0}[Manager])} < ClientbyMan then 0 else 1,
TableI= if Continue = 1 then Table.RemoveRows([TableI],0) else [TableI]
]
, each [TableI]
)),
// Loop the table asigning the closest manager
Final=List.Generate( ()=>
[ Record=Table.First( TableDistances,[ID=null, Manager=null,Distance=null]),
Dispo= List.PositionOf(ManagersP[Manager],Record[Manager]),
TableSort= Table.RemoveMatchingRows(TableDistances,{[ID=Record[ID]]},"ID"),
Counter= List.ReplaceRange(List.Repeat({0},Table.RowCount(ManagersP)),Dispo, 1,{1}),
RecordOut=[ Id=Record[ID], Manager=Record[Manager], Distance=Record[Distance] ],
Continue= if Table.RowCount( TableSort)>0 then true else false
]
, each [Continue]
, each
[ Record=Table.First( Next([TableSort],[Counter]),[Manager=null] ),
Dispo=List.PositionOf(ManagersP[Manager],Record[Manager]),
TableSort= Table.RemoveMatchingRows([TableSort],{[ID=Record[ID]]},"ID"),
Counter= List.ReplaceRange([Counter],Dispo, 1,{[Counter]{Dispo}+1}),
RecordOut = [ Id=Record[ID],Manager=Record[Manager], Distance=Record[Distance] ],
Continue= if Table.RowCount( [TableSort])>0 then true else false
]
, each [RecordOut]
)
in Table.FromList(Final,Record.FieldValues,{"Id","Manager","Distance"}),
//Funcion to add a table with the calculate distances
#"Data with Distance" = (Locations as table, DATA as table) as table =>
let
DataWithDistance=Table.AddColumn(DATA,"Distances", each
Table.Sort(
Table.AddColumn(Locations,"Distance", (RECORD)=>
Calculate_Distance(RECORD[latitude],RECORD[longitude],_[latitude],_[longitude])
), {"Distance"} )
)
in DataWithDistance,
// Final Process
Locations = Table.Group(Managers, {"Location"}, {{"Count", each Table.RowCount(_), type number}, {"Managers", each _, type table}}),
Records = Table.AddColumn(Locations, "Data", each Table.SelectRows(DATA,(MainTable) => if MainTable[Location]=[Location] then true else false),type table),
#"Records with distance" = Table.AddColumn(Records,"Records_Distance", each #"Data with Distance"(_[Managers],_[Data]), type table),
AsignarProcess = Table.AddColumn(#"Records with distance","Final Asign", each Asign(_[Managers],_[Records_Distance])),
#"Remove Columns" = Table.RemoveColumns(AsignarProcess,{"Location", "Count", "Managers", "Data", "Records_Distance"}),
Final = Table.ExpandTableColumn(#"Remove Columns", "Final Asign", {"Id", "Manager", "Distance"}, {"Id", "Manager", "Distance"})
in
Final
to deepen the performance aspects (certainly the use of list.accumulate is not highly recommended), it is necessary to know:
the number of locations?
for each location how many managers? 10 to 15 managers, for example
for each location how many customers? 100 to 250 customers, for example
PS
What data did you test the code on and how long did it take?
what is a time you reasonably expect?
@Anonymous
Searching on the net, i found this article,
I made the change in the code and it takes less time than before.
The data set I used for the test is about 300 records, the data is in the code in my previous answer
In my first test it took me more than 5 minutes for only 200 records...
Now I can process 1500 in 6 minutes.
I am still looking for improvements because in my final data set there are 33,000 records (every month it's increase) divided in 20 places each one has between 1 and 4 managers having each one on average 1,200 customers; I also have to do other transformations to get the final result, so I need the fastest code , plus I'm not a very patient person 😄
Hi @Dihros
my code, although more concise than yours, it is no more efficient.
try to implement the following scheme if it is acceptable as a solution.
Once the distances matrix has been obtained, having the managers as columns and the distances from the various customers as values, sort by increasing values on the first column (man1, for example) and take the first [customers / man].
remove the first column and the first [customers / man] from the matrix and continue to the last column.
This scheme has the advantage that heavy control operations are performed by functions (therefore with more efficient algorithms and probably faster because they are made in a lower level language)
a script that deals with a table of 1700 customers on 8 managers in 4 locations, in a short time
per scaricarli, fai click sul seguente link e segui le istruzioni.
manTab and clTab are the data tables.
ClientsDistribution the query that produce (using the custom functions provided) the resulting distribution table.
####edit#####
I didn't pay attention to the fact that there were more than ten managers in the data and so I used a formula
Text.End (c, 1) which takes only the last character of the string, so "Manager 1" and "Manager 11" get mixed up.
The following is the correct (and also much simpler) formulation of the function:
Thank you for your help which I appreciate very much, I was very surprised to see the speed of your code,
While understanding the code I tested it with my test records and noticed that some customer numbers are repeated
It seems to me that they repeat when they have the same distance
I'll keep checking the code,
Thank you so much
Greetings
It is not very clear what you mean by the following expression:
"While understanding the code I tested it with my test records and noticed that some customer numbers are repeated"
But if it means what I understand, maybe you should read better the rest of my message, the part following the header #### edit ##### and make the indicated changes to the code
if i understand well i had done the change in the code already
Also my test data set has only 9 manager in total
What i mean is :
this is the result of your code using my test data set
Checking the data, I found that some IDs are duplicated and which I mark in red,
Taking for example the ID:3100615, it is located in row 287 and 1673 with manager 7 and 8.
I also checked your Table but it doesn't have duplicated. im no sure what's happening
mmhhhh ... as far as I remember the code makes use of the list of customer ids only to associates the various managers. Each customer id is only treated once, and the list is not manipulated.
Have you checked that the duplicates are not already in your original list?
Could you just provide the list of IDs you use?
i don't know what happend with my last reply. i edited it to remove empty lines and now it deseapeared
Yes.. you'r correct, one client has only one location and will be asigned to only one manager
Here's my Test Dataset https://we.tl/t-HJUVuUihTZ
Also If we asign clients in blocks to each manager, we have the problem that some clients will not be assigned to their closest manager. Using your code (thanks for it) i made changes trying to reduce this issue sorting the whole table first
disTabLoc2 = (loc as text)=> let
tabLoc = clTab{[Location=loc]}[clLoc],
//Distances Matrix
ac=List.Accumulate(
manTab{[Location=loc]}[manLoc][Manager],
tabLoc,
(s,c)=>Table.AddColumn(s, c,
each distance([latitude],[longitude],
manTab{[Location=loc]}[manLoc]{[Manager=c]}[latitude],
manTab{[Location=loc]}[manLoc]{[Manager=c]}[longitude]))),
// Get Manager ordered by shortest Distances
cols= Table.Sort(
Table.AddColumn(manTab{[Location=loc]}[manLoc], "Min Dis",each List.Min(Table.Column(ac,_[Manager])))
,"Min Dis")[Manager],
size=Number.RoundUp(Table.RowCount(ac)/List.Count(cols)),
//Sort the Table by Manager Ordered and get ID
Ids = Table.Sort(ac,cols)[ID],
// Asign Manager
Loop = List.Accumulate(cols,
[row=0,
IdsbyMan=List.Range(Ids,row*size,size),
Man=List.Repeat({cols{row}},List.Count(IdsbyMan)),
Final = Table.FromColumns({IdsbyMan,Man},{"ID","Manager"})
]
,(s,c)=>
[row=s[row]+1,
IdsbyMan=List.Range(Ids,row*size,size),
Man=List.Repeat({cols{row}},List.Count(IdsbyMan)),
Final = s[Final]&Table.FromColumns({IdsbyMan,Man},{"ID","Manager"})
]
)
in Loop[Final]
The setting of the last code assigns each manager the group of closest customers, starting from the "first", continuing assigning the group of the closest of the remaining customers to the "second" and so on.
IN EVERY CASE the result depends on the order in which the managers are chosen and IN EVERY CASE some clients will not be assigned to their closest.
Of this it is not difficult to give examples.
Hence the following observation makes very limited sense as is limited the position of the problem .
"Also If we asign clients in blocks to each manager, we have the problem that some clients will not be assigned to their closest manager. Using your code (thanks for it) i made changes trying to reduce this issue sorting the whole table first"
If it is a concrete problem, as I imagine it is, try to find the solution that best suits your needs. As a mathematical theoretical problem it is not well posed.
I would like to suggest the following strategy, if you or someone else would like to implement it.
In a given locatlity, for each customer the closest manager is associated.
There will be, in general, an imbalance with respect to the desired balanced distribution. So for managers who have had more customers associated with them than they were entitled to, the extra customers are put into play and the manager is eliminated from subsequent processing. So in the next step, there are surplus customers and underemployed managers.
One proceed with the assignment of unallocated customers to the closest underemployed managers and so on until all the managers are occupied.
Thank you very much for your suggestion.
After struggling trying to figure out how to implement your suggestion (I'm not used to programming anymore) i share the code; also, I think I know why the IDs were duplicated. After sorting the table, Power "optimizes" the table and changes the original order of the IDs,
This is what i mean .
The step Just sorting the data
The same step Using Buffer
Then, when we take the first N records and they have the same distance it seems to choose the same IDs twice,
Anyway Here's the code
FinalAsign =
(Clients as table, Managers as table ) => let
TotalManager = Table.RowCount(Managers),
size = Number.RoundUp(Table.RowCount(Clients)/TotalManager),
ClientswithDistance =
List.Accumulate(
Managers[Manager],
Clients,
(s,c)=>Table.AddColumn(s, c,
each distance([latitude],[longitude],
Managers{[Manager=c]}[latitude],
Managers{[Manager=c]}[longitude]))),
Man=Managers[Manager],
TableProcess =
Table.TransformColumns(
Table.Group(
Table.SelectColumns(
Table.Unpivot(ClientswithDistance,Man,"Manager","Distance")
,{"ID","Manager","Distance"})
, "ID", {{"Managerss",each _}})
, {"Managerss",each Table.Sort(_,"Distance")}),
Asigning = (RestofClients as table, RestofManagers as list) => let
Loop = if List.Count(RestofManagers)>1 then
List.Accumulate(RestofManagers,
[
Pending = #table( Table.ColumnNames(RestofClients), {} ) ,
Final=#table(List.RemoveItems(Table.ColumnNames(RestofClients),{"Managerss"})&{"Manager", "Distance"},{} ),
IterManager = RestofManagers
]
, (s,c)=>
[
ClientsSorted = Table.Buffer(Table.Sort(Table.SelectRows(
Table.ExpandRecordColumn(Table.AddColumn(RestofClients,"Ideal",each _[Managerss]{0} ), "Ideal", {"Manager", "Distance"}, {"Manager", "Distance"})
,each [Manager] = c[Manager])
,{"Manager", "Distance"}
)),
Dispo = c[Capacity],
Asigned = Table.Buffer( Table.FirstN(ClientsSorted,Dispo)),
NumAsigned = Table.RowCount(Asigned),
NewCounter = Dispo - NumAsigned,
Final = s[Final]&Table.RemoveColumns(Asigned,"Managerss" ),
Pending = s[Pending] & Table.Skip(Table.SelectColumns(ClientsSorted,{"ID","Managerss"}),NumAsigned),
IterManager = if NewCounter > 0 then List.Skip(s[IterManager]) & {[Manager = s[IterManager]{0}[Manager], Capacity = NewCounter]} else List.Skip(s[IterManager])
]
)
else
[ Final=Table.RemoveColumns(
Table.ExpandRecordColumn(Table.AddColumn(RestofClients,"Ideal",each _[Managerss]{0} ), "Ideal", {"Manager", "Distance"}, {"Manager", "Distance"})
,"Managerss" ),
Pending = #table( Table.ColumnNames(RestofClients), {} )
]
in if Table.RowCount(Loop[Pending]) > 0 then
Loop[Final] &
@Asigning(Table.TransformColumns(Loop[Pending],{"Managerss",(T_Man)=>Table.Buffer(Table.SelectRows(T_Man,each List.Contains(Table.FromRecords(Loop[IterManager])[Manager], _[Manager] )))}),Loop[IterManager])
else
Loop[Final]
in Asigning(TableProcess,List.Transform(Man, each [Manager=_, Capacity=size]))
Another thing that made me struggle was that Powerquery did not correctly calculate the number of records in a table
i had to use Table.Buffer before counting the number of records because otherwise it did not return the correct value
As in my previous code, we only call the function on the grouped tables of both clients and managers
I used this method, and it is faster than my previous ones, although I don't know if it would be faster to recalculate the entire distance table each time the clients to be reassigned instead of affecting the table by removing the records with the saturated managers.
Greetings
Hi @Dihros ,
I'm having trouble remembering the details of the last code. I have to find time to look at it again.
Of course it would be better if those who make a request were more punctual in interactions with those who take the trouble to respond.
Can you attach the files with which you do the tests, to have a situation as close as possible to the real one?
This seems to me an unusual, albeit very interesting, problem for the context.
the link of a pbix file with a draft solution:
there is a function that allows to obtain the manager-client associations for each location, invoking it with the name of the location.
per scaricarli, fai click sul seguente link e segui le istruzioni.
I'm not sure if I have interpreted the need correctly.
The idea is this (assuming that customers are more than managers :)):
I group both customers and managers by location;
for each location I cycle through the list of managers and find the closest customer in the list of customers, which I gradually exclude from the list. This is until the customer list is completely empty.
A different possible way would be to scroll through the list of customers and assign each one to the closest manager, excluding from the list of managers those who gradually become saturated.
an aesthetic change, to make the control of the end of the cycle less brutal.
I would like to go back to the literal meaning of the specification for what I understand, to share the following observation.
Assigning to the various customers the closest manager still available (not saturated) without any other specification, implies that the result is highly dependent on the order in which the customers are chosen to be assigned to the closest manager.
If all customers are equally privileged and you choose, starting from the first on the list to the last, you could have a situation in which the sum of the distances thus obtained is much greater than the sum of distances that would be obtained if you scrolled the list of customers on the contrary from the last to the first ,for example.
Thank you very much @Anonymous
I was able to analyze your proposal, which I appreciate very much.
In order to be a little clearer what I require is to distribute the total number of clients among the total number of administrators that exist for each location, but this assignment should be based on the shortest distance (or any other factor).
Taking the following example:
All the clients could be assigned to manager 1, since this one is the closest to all the clients, however, it is necessary to take advantage of the resources and assign clients to the rest of the managers, therefore, when the manager already has the number of clients that correspond to him, the client is assigned to the next closest manager and so on.
Clients for each administrator = Total of clientes / Total of managers
Thank you very much in advance
Greetings
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.