Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
actually doing this in powerpivot but the code woudl be the same. We have an issue with our employees creating duplicated cusotmers as not bothering to search for thme properly. We aslo have fragmented data as not every custoemr has the same fields populated. I've writen some code to locate people sharing mobile number sor email addresses which all works fine as the FILTEr is using all positives for the EARLIER comparison.
I've got a bit stuck on the alst step where i want to find cuostmers that sahre address and surname, but not if we also have mobile or email data that says that are different (most commonly husband and wife say)
so for example the Barkers are clearly difffent people as we have mobiles that are all different, whereas for Daley the last 2 are differnet, but the first one has a blank mobile and email so could be a duplicate of one of the others. Ideally my code would only be flagging the 4th line
Customer No | Salutation Code | First Name | Surname | Address | Address 2 | Post Code | Mobile Phone No | |
1390568 | MRS | Rachel | Barker | 2 Nettleton Close | BD4 0SS | 7929088569 | ||
2025565 | MRS | Anna | Barker | 2 Nettleton Close | BD4 0SS | 7508610410 | ||
1392564 | MS | Victoria | Barker | 2 Nettleton Close | BD4 0SS | barker.ve@hotmail.com | 7923240056 | |
2004798 | MR | Daley | 28 Speedwell Road | BS5 7SB | ||||
2004796 | MR | M | Daley | 28 Speedwell Road | BS5 7SB | yaris246@blueyonder.co.uk | 7776261368 | |
1302226 | MR | Daley | 28 Speedwell Road | BS5 7SB | wesleydaley.wd@gmail.com | 7768382597 |
The code I have so far is
=CALCULATE( COUNTROWS(Customer), FILTER( Customer, Customer[Post Code]<>"DONT USE" && Customer[Post Code]=EARLIER(Customer[Post Code]) && Customer[Post Code]<>BLANK() && Customer[Address]=EARLIER(Customer[Address]) && Customer[Sex Code]=EARLIER(Customer[Sex Code]) && Customer[Mobile Phone No]<>EARLIER(Customer[Mobile Phone No]) && OR( LEFT(Customer[Surname],3)=LEFT(EARLIER(Customer[Surname]),3), RIGHT(Customer[Surname],3)=RIGHT(EARLIER(Customer[Surname]),3)) ) )
but the mobile no bit doesn't do what I need. Also we have lots of blanks. I cannot work out anyway to automatically exclude the Barkers (as the mobiles are all different). If one mobile is blank and other is populated that would need ot be flagged, so they are onyl not counted if both have the data AND it is different.
Hopefully someone more intelligent can point me in the right direction.
Thnaks for any help
Mike
Hi @masplin,
This calculated column might be a start. It is a way of identifying the barkers as being safe to exclude based on the number of distinct mobiles to the number of rows. This column could then be used in a new filter to help you fine tune.
Customers To Exclude = var ColAddress = 'Customer'[Address] var ColPostCode = 'Customer'[Post Code] var RowsPerCustomer = CALCULATE( COUNTROWS('Customer'), FILTER( ALL(Customer), 'Customer'[Address] = ColAddress && 'Customer'[Post Code] = ColPostCode ) ) var DistinctMobilesPerCustomer = CALCULATE( DISTINCTCOUNT('Customer'[Mobile Phone No]), FILTER( ALL(Customer), 'Customer'[Address] = ColAddress && 'Customer'[Post Code] = ColPostCode && 'Customer'[Mobile Phone No] <> BLANK() ) ) RETURN IF(RowsPerCustomer= DistinctMobilesPerCustomer,1,0)
Hi Phil
Is the clause
'Customer'[Address] = ColAddress
doing somethnig similar to the use of EARLIER i.e grouping them by these fuields with common entries?
So i think if ALL the cusotmers in the rowspercusmter group are differnet it would exclude them e.g. Barker, but if there are either 2 mobiles the same OR one mobile blank it would not?
Seems brilliant!!!!
HI @masplin
Yeah, the syntax I have used is just the same as using the EARLIER function. I use it because I find it slightly more readable but it's functionally the same.
Hi Phil. i see how it works in principal but it doesn't work in practice. i have 113,000 customers 32GB of RAM 64 bit and it just can't complete the distinct calc. I just tried this simple version and it never completes. Thisi trying find cusotmers with same mobile, gender and surname, but not where the initial is the non-blank and the same.
Guess no other optins on this?
= CALCULATE( DISTINCTCOUNT(Customer[Initial]), FILTER( Customer, Customer[Mobile Phone No]=EARLIER(Customer[Mobile Phone No]) && Customer[Mobile Phone No]<>BLANK() && Customer[Sex Code]=EARLIER(Customer[Sex Code]) && OR( LEFT(Customer[Surname],3)=LEFT(EARLIER(Customer[Surname]),3), RIGHT(Customer[Surname],3)=RIGHT(EARLIER(Customer[Surname]),3)) && Customer[Initial]<>BLANK() ) )
Hi @masplin,
Maybe you can try to add more than one Calculated Columns to get this work, which would calculate it step by step rather than one big step. Maybe this could work once a step.
1. Add a Calculated Column "Tag".
Tag = IF ( Customer[Mobile Phone No] <> BLANK () && Customer[Initial] <> BLANK (), 1, 0 )
2. Add another Calculated Column.
= VAR mpn = Customer[Mobile Phone No] VAR sc = Customer[Sex code] RETURN CALCULATE ( DISTINCTCOUNT ( Customer[Initial] ), FILTER ( Customer, Customer[Mobile Phone No] = mpn && Customer[Sex Code] = sc && Customer[Tag] = 1
&& OR ( LEFT ( Customer[Surname], 3 ) = LEFT ( [Surname], 3 ), RIGHT ( Customer[Surname], 3 ) = RIGHT ( [Surname], 3 ) ) ) )
Best Regards!
Dale
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
109 | |
95 | |
77 | |
65 | |
53 |
User | Count |
---|---|
144 | |
105 | |
102 | |
89 | |
63 |