Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and a 50 percent discount on exams.
Get startedEarn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hi all,
I've spent all morning creating detailed sample data to replicate what people do when they enter their address into a form. I have 38 examples. I very much doubt it covers all possibilities but it feels like it does.
There are people that put their town as the County, their county as the City and their locality as the Town. They even put their County as their Town! They put commas instead of spaces, they put commas and spaces and generally make our lives challenging.
This is what I have:-
This is what I would like to achieve:-
And this is my code so far:-
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("1Zhrc+I2FIb/iiafGWoZc/G3AklguyHNhGwzne1+ELaCNcgSI8lhvb++R7IxzrbTBbmX7Ux28YWEh3N73+OPH6/wVe8qDIbw/6Q/QLutPRigJLeX69dhUL1WPzgI0IJTgd5pTtADJwmtbsyJ4sXOntz/MK0ufRDM0BS9Z2KbSvsnFrPqhpA0J4z/WL/2E3f33X04GARXn3o1Vmix4n5UYcVfYw3eYmH0k8wEWhbCUIWmr1QUFuZJSgOfD0d3UqRSwMGN2HIiUn/G8MQ4ivtxbD9/0MfncD5lFM0UyTecaji9pnojlSy2GXqUxCKtiNpRg5ZE1TcsOWUJ1fC1dMYU9ecenLjfpvxbsQ2BWcqdjZo+gsLPMy03iqVb2rzVhypqZTysonk2WITuKddSoCeqVFOK7ykzNm4rqUjpDzb6/sHGwSVQQ7Qm/EBKdK3Yq0WqTom2TDPI5F4qY6tSKk2NP9+44cOjfjS5BHGEVoxz9JwR9WIrjLxSZVzXzjPXAvVRt0aIT5kdXJrZMVobmew0eiZlD60zuue0zuWySFNo0hdGeeraQxv0q1S7brQRPoUzjC+knczRnLMXCGGTdDtQSpMR+7Y7IhLtT9Zq3aiO45nz2s1BtkVLWWjaQziI0UwKwksESJZyVSgG8XOpv2OvQFWfuKjeSZMxIvzJW70d4H48ccE6M6g4mKJfCKetSQjjMYcy6KFGZPzRTt0T4ksUENtfmUFC0YJyziCQum7rLXVqt2IpSDfJpdp2CN2wrdAXqh/GoR0vNnYtgU6BKKWdgjY8SfLkXGW7l67QMB6gJQTsFf6Vwna1u6dMhm6ILTUmEsnFsYcXiuV689fv8fsOo9BrrD8WOitzmjrnBaMc3dGmMqcl2IxNocousQ3H3bjQXBpDnEm4VUSDOfunMS92ZbPiyxeiUt1D86xQSYaWIEJw3X6+FjbPllJnL1Co9jIV/uo4wW3Oi5onmoH2QOBgUKraiKFpqarbzYEn1uCrsXOuxOBhb6HkpnQ4NoBScXMcgSsYR41uvznxpJz4DscRhI6A60bPUqagiVLbgnS29mCvPIDvdhlXO5DGHnpQjfoANunW2pPYb6v5IMBPWLyrSvcO1JVl67BTKcaBf8vMOfw1w6zEPGdMGw7G56jda8bfJNqT7uQucHxBRT5JBZrnUofx2CU2Ld1iujZM8WoZ9Kc6jZkouMR94wm6JrCb1gpTpxHd2ET2bvZa5h2woqDVvtEFFbZQtGwN6Hr8Va0Meyrk8Fh7VFWWsUvFRbjlGJ1Qn1Vut4QnYGCngAafmBjt4hk7A7k2RcqkvWIXBypYkaP6/bW63MLQBsRmU12SfN+pnSM89lykw8ApC/r5tyIIwvEtUY0GWuMGo9OWxnRDwQ/Rtp3w4ww7bA4hnqNHWBiUZmkF6awO3LqnB7tO3MJiC3fhwgMvc1k4jbymr9LbWkbhwPPhTxiiW2UDRvKWKW+8+Ix+bnY0H6z2mjO6aM3BcNkORXQj0iPZHHZbq5BoSUoKluOBmCQ7uKEAZh1Ex871tQ0ouHVZ/A1PfiII5h++wdmFEH3rG/zbX+DcypjJA6wVaF3s9xx2SXDpOavmGQ6HsKkdwIo+UliODEwRRamBkUePqm8LHcYiMFZjzx995I9uH3+8wR79KfZ3Rj13PquNPe79L7il0EYViWFQ2m38Se8/oL/EhN+QLW/vWTDtQWi2GTAvVJFvOLEdCQoD4K6BpRIH4gykJ+jXDynOniYxWlGSHo4eaEnUnopu23/Uep6HJ43ODc+ysXK/hyhBLq1jtbVAaQay18iyIslOVF57Rjs+yovsY6dPvwM=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Product title" = _t, #"Payment total" = _t, #"Shipping Weight" = _t, Length = _t, Width = _t, Height = _t, #"Full name (delivery)" = _t, #"Company name (delivery)" = _t, #"Address line 1 (delivery)" = _t, #"Address line 2 (delivery)" = _t, #"Town / city (delivery)" = _t, #"County (delivery)" = _t, #"Postcode (delivery)" = _t, #"Country (delivery)" = _t, #"Country code (delivery)" = _t, #"Telephone number (delivery)" = _t, #"Email address" = _t, #"Order #" = _t]),
#"Split Column by Delimiter" = Table.SplitColumn(Source, "Shipping Weight", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.None, false), {"Shipping Weight.1", "Shipping Weight.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Shipping Weight.1", type number}, {"Shipping Weight.2", type text}}),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type1", "Length", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.None, false), {"Length.1", "Length.2"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Length.1", Int64.Type}, {"Length.2", type text}}),
#"Split Column by Delimiter2" = Table.SplitColumn(#"Changed Type2", "Width", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.None, false), {"Width.1", "Width.2"}),
#"Changed Type3" = Table.TransformColumnTypes(#"Split Column by Delimiter2",{{"Width.1", Int64.Type}, {"Width.2", type text}}),
#"Split Column by Delimiter3" = Table.SplitColumn(#"Changed Type3", "Height", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.None, false), {"Height.1", "Height.2"}),
#"Changed Type4" = Table.TransformColumnTypes(#"Split Column by Delimiter3",{{"Height.1", Int64.Type}, {"Height.2", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type4",{"Shipping Weight.2", "Length.2", "Width.2", "Height.2"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Shipping Weight.1", "Shipping Weight"}, {"Length.1", "Length"}, {"Width.1", "Width"}, {"Height.1", "Height"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "Girth", each [Length]+(2*([Width]+[Height]))),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom", each Table.SelectRows(#"Added Custom", each [Girth] < 300)),
Custom = #"Added Custom1"{0}[Custom],
#"Split Column by Delimiter4" = Table.SplitColumn(Custom, "Address line 1 (delivery)", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Address line 1 (delivery).1", "Address line 1 (delivery).2"}),
#"Changed Type6" = Table.TransformColumnTypes(#"Split Column by Delimiter4",{{"Address line 1 (delivery).1", type text}, {"Address line 1 (delivery).2", type text}}),
#"Renamed Columns1" = Table.RenameColumns(#"Changed Type6",{{"Product title", "Item Name"}, {"Payment total", "Value"}, {"Shipping Weight", "Weight"}, {"Full name (delivery)", "Name"}, {"Company name (delivery)", "Organisation"}, {"Address line 1 (delivery).1", "Property"}, {"Address line 1 (delivery).2", "Street"}, {"Address line 2 (delivery)", "Locality"}, {"Town / city (delivery)", "Town"}, {"County (delivery)", "County"}, {"Postcode (delivery)", "PostCode"}, {"Country code (delivery)", "Country"}, {"Telephone number (delivery)", "Telephone"}, {"Email address", "Email"}, {"Order #", "Reference"}}),
#"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns1",{"Item Name", "Value", "Reference", "Weight", "Length", "Width", "Height", "Name", "Organisation", "Property", "Street", "Locality", "Town", "County", "PostCode", "Country (delivery)", "Country", "Telephone", "Email", "Girth"}),
#"Removed Columns1" = Table.RemoveColumns(#"Reordered Columns",{"Country (delivery)"})
in
#"Removed Columns1"
I have thought that perhaps I should split Address Line 1 and Address Line 2 into as many columns as there are delimiters but then fall flat on my face thinking how can you intelligently reconstruct those columns into the two available columns, Property and Street?
Perhaps I need to some reference tables for the likes of UK County Names so some function can be written to extract any County name from any of the cells that it shouldn't be in.
I am sure some great brain out there will think of something I've not considered so far.
Copying the sample data I had created into an existing query was fun. I had to think what applied steps to delete to correct the code and then apply some renames at the end. I feel I am beginning to think a bit more laterally. There's probably a simple tip or trick I am not aware of?
Thanks in advance.
Hi @Rhothgar
Based on the picture you have offered, i have some questions, the country column.
e.g in your first picutre, it has the country, it has England, but in the second picture, it does not have, I don't know the logic about it, can you provide some logic about it?
Best Regards!
Yolo Zhu
Hi Yolo.
OK. So basically this data is roughly sanitised data which customers have entered into a form (over which I have NO control). I have changed House Number and House names to protect their identity. The rest of the data is how they entered it.
If you look carefully, that particularly person entered "England" as a County and NOT a Country (even then it would technically be incorrect as GB is the ISO code not England.
So in the second photo I deleted it because that should not appear as a "country" in a "county" header.
Any code would ideally need some kind of function to either remove a country from a county entry, maybe convert it using a reference table elsewhere and then enter it into the correct column which you cannot see in the above example as, at the moment, customers are technically only UK / GB based.
Hope that makes sense.