The ultimate Microsoft Fabric, Power BI, Azure AI & SQL learning event! Join us in Las Vegas from March 26-28, 2024. Use code MSCUST for a $100 discount. Register Now
I get a flatfile from my online store with hundreds of orders. Some customers - the repeating ones - create a profile and purchase always from the same username. Others buy sporadically and don't bother to create a profile, instead purchasing as a guest.
I've got a lot of their data, such as addresses and phone numbers from the shipping details, but addresses are sometimes also not the same, eg. Unit 1 at 22 apple st, vs U 1 at 22 Apple street.
My data entry team is happy to do some manual data cleaning for a few records, but looking backwards we have a lot of records to clean and ideally we should try to do fix it with some Power Query M code.
My question is: as we identify different clients, particularly with different names, should I keep this on a separate table, outside Power query.
Anyone can share how they handle similar situations?
Hi @FalconOz
It is recommend to have a separate customer table as a dimension table. There should be a unique ID column in the customer table to distinguish every customer. The customers who don't bother to create a profile should also have unique IDs.
When cleaning data, you may try the Fuzzy merge feature. It is part of Merge Queries feature. You need a standard table in advance, then merge columns like address to it with fuzzy merge.
Best Regards,
Community Support Team _ Jing
Join us at our first-ever Microsoft Fabric Community Conference, March 26-28, 2024 in Las Vegas with 100+ sessions by community experts and Microsoft engineering.