Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Rhothgar
Helper III
Helper III

They say nothing is IMPOSSIBLE! I don't believe it is BUT I cannot see a solution to this quandry!

Hi all,

I've spent all morning creating detailed sample data to replicate what people do when they enter their address into a form.  I have 38 examples.  I very much doubt it covers all possibilities but it feels like it does.

There are people that put their town as the County, their county as the City and their locality as the Town.  They even put their County as their Town!  They put commas instead of spaces, they put commas and spaces and generally make our lives challenging.

This is what I have:-

What I have.PNG

This is what I would like to achieve:-

What I would like to ahieve.PNG

And this is my code so far:-

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("1Zhrc+I2FIb/iiafGWoZc/G3AklguyHNhGwzne1+ELaCNcgSI8lhvb++R7IxzrbTBbmX7Ux28YWEh3N73+OPH6/wVe8qDIbw/6Q/QLutPRigJLeX69dhUL1WPzgI0IJTgd5pTtADJwmtbsyJ4sXOntz/MK0ufRDM0BS9Z2KbSvsnFrPqhpA0J4z/WL/2E3f33X04GARXn3o1Vmix4n5UYcVfYw3eYmH0k8wEWhbCUIWmr1QUFuZJSgOfD0d3UqRSwMGN2HIiUn/G8MQ4ivtxbD9/0MfncD5lFM0UyTecaji9pnojlSy2GXqUxCKtiNpRg5ZE1TcsOWUJ1fC1dMYU9ecenLjfpvxbsQ2BWcqdjZo+gsLPMy03iqVb2rzVhypqZTysonk2WITuKddSoCeqVFOK7ykzNm4rqUjpDzb6/sHGwSVQQ7Qm/EBKdK3Yq0WqTom2TDPI5F4qY6tSKk2NP9+44cOjfjS5BHGEVoxz9JwR9WIrjLxSZVzXzjPXAvVRt0aIT5kdXJrZMVobmew0eiZlD60zuue0zuWySFNo0hdGeeraQxv0q1S7brQRPoUzjC+knczRnLMXCGGTdDtQSpMR+7Y7IhLtT9Zq3aiO45nz2s1BtkVLWWjaQziI0UwKwksESJZyVSgG8XOpv2OvQFWfuKjeSZMxIvzJW70d4H48ccE6M6g4mKJfCKetSQjjMYcy6KFGZPzRTt0T4ksUENtfmUFC0YJyziCQum7rLXVqt2IpSDfJpdp2CN2wrdAXqh/GoR0vNnYtgU6BKKWdgjY8SfLkXGW7l67QMB6gJQTsFf6Vwna1u6dMhm6ILTUmEsnFsYcXiuV689fv8fsOo9BrrD8WOitzmjrnBaMc3dGmMqcl2IxNocousQ3H3bjQXBpDnEm4VUSDOfunMS92ZbPiyxeiUt1D86xQSYaWIEJw3X6+FjbPllJnL1Co9jIV/uo4wW3Oi5onmoH2QOBgUKraiKFpqarbzYEn1uCrsXOuxOBhb6HkpnQ4NoBScXMcgSsYR41uvznxpJz4DscRhI6A60bPUqagiVLbgnS29mCvPIDvdhlXO5DGHnpQjfoANunW2pPYb6v5IMBPWLyrSvcO1JVl67BTKcaBf8vMOfw1w6zEPGdMGw7G56jda8bfJNqT7uQucHxBRT5JBZrnUofx2CU2Ld1iujZM8WoZ9Kc6jZkouMR94wm6JrCb1gpTpxHd2ET2bvZa5h2woqDVvtEFFbZQtGwN6Hr8Va0Meyrk8Fh7VFWWsUvFRbjlGJ1Qn1Vut4QnYGCngAafmBjt4hk7A7k2RcqkvWIXBypYkaP6/bW63MLQBsRmU12SfN+pnSM89lykw8ApC/r5tyIIwvEtUY0GWuMGo9OWxnRDwQ/Rtp3w4ww7bA4hnqNHWBiUZmkF6awO3LqnB7tO3MJiC3fhwgMvc1k4jbymr9LbWkbhwPPhTxiiW2UDRvKWKW+8+Ix+bnY0H6z2mjO6aM3BcNkORXQj0iPZHHZbq5BoSUoKluOBmCQ7uKEAZh1Ex871tQ0ouHVZ/A1PfiII5h++wdmFEH3rG/zbX+DcypjJA6wVaF3s9xx2SXDpOavmGQ6HsKkdwIo+UliODEwRRamBkUePqm8LHcYiMFZjzx995I9uH3+8wR79KfZ3Rj13PquNPe79L7il0EYViWFQ2m38Se8/oL/EhN+QLW/vWTDtQWi2GTAvVJFvOLEdCQoD4K6BpRIH4gykJ+jXDynOniYxWlGSHo4eaEnUnopu23/Uep6HJ43ODc+ysXK/hyhBLq1jtbVAaQay18iyIslOVF57Rjs+yovsY6dPvwM=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Product title" = _t, #"Payment total" = _t, #"Shipping Weight" = _t, Length = _t, Width = _t, Height = _t, #"Full name (delivery)" = _t, #"Company name (delivery)" = _t, #"Address line 1 (delivery)" = _t, #"Address line 2 (delivery)" = _t, #"Town / city (delivery)" = _t, #"County (delivery)" = _t, #"Postcode (delivery)" = _t, #"Country (delivery)" = _t, #"Country code (delivery)" = _t, #"Telephone number (delivery)" = _t, #"Email address" = _t, #"Order #" = _t]),
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "Shipping Weight", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.None, false), {"Shipping Weight.1", "Shipping Weight.2"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Shipping Weight.1", type number}, {"Shipping Weight.2", type text}}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type1", "Length", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.None, false), {"Length.1", "Length.2"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Length.1", Int64.Type}, {"Length.2", type text}}),
    #"Split Column by Delimiter2" = Table.SplitColumn(#"Changed Type2", "Width", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.None, false), {"Width.1", "Width.2"}),
    #"Changed Type3" = Table.TransformColumnTypes(#"Split Column by Delimiter2",{{"Width.1", Int64.Type}, {"Width.2", type text}}),
    #"Split Column by Delimiter3" = Table.SplitColumn(#"Changed Type3", "Height", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.None, false), {"Height.1", "Height.2"}),
    #"Changed Type4" = Table.TransformColumnTypes(#"Split Column by Delimiter3",{{"Height.1", Int64.Type}, {"Height.2", type text}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type4",{"Shipping Weight.2", "Length.2", "Width.2", "Height.2"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Shipping Weight.1", "Shipping Weight"}, {"Length.1", "Length"}, {"Width.1", "Width"}, {"Height.1", "Height"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns", "Girth", each [Length]+(2*([Width]+[Height]))),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom", each Table.SelectRows(#"Added Custom", each [Girth] < 300)),
    Custom = #"Added Custom1"{0}[Custom],
    #"Split Column by Delimiter4" = Table.SplitColumn(Custom, "Address line 1 (delivery)", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Address line 1 (delivery).1", "Address line 1 (delivery).2"}),
    #"Changed Type6" = Table.TransformColumnTypes(#"Split Column by Delimiter4",{{"Address line 1 (delivery).1", type text}, {"Address line 1 (delivery).2", type text}}),
    #"Renamed Columns1" = Table.RenameColumns(#"Changed Type6",{{"Product title", "Item Name"}, {"Payment total", "Value"}, {"Shipping Weight", "Weight"}, {"Full name (delivery)", "Name"}, {"Company name (delivery)", "Organisation"}, {"Address line 1 (delivery).1", "Property"}, {"Address line 1 (delivery).2", "Street"}, {"Address line 2 (delivery)", "Locality"}, {"Town / city (delivery)", "Town"}, {"County (delivery)", "County"}, {"Postcode (delivery)", "PostCode"}, {"Country code (delivery)", "Country"}, {"Telephone number (delivery)", "Telephone"}, {"Email address", "Email"}, {"Order #", "Reference"}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns1",{"Item Name", "Value", "Reference", "Weight", "Length", "Width", "Height", "Name", "Organisation", "Property", "Street", "Locality", "Town", "County", "PostCode", "Country (delivery)", "Country", "Telephone", "Email", "Girth"}),
    #"Removed Columns1" = Table.RemoveColumns(#"Reordered Columns",{"Country (delivery)"})
in
    #"Removed Columns1"

 

 


I have thought that perhaps I should split Address Line 1 and Address Line 2 into as many columns as there are delimiters but then fall flat on my face thinking how can you intelligently reconstruct those columns into the two available columns, Property and Street?

Perhaps I need to some reference tables for the likes of UK County Names so some function can be written to extract any County name from any of the cells that it shouldn't be in.

I am sure some great brain out there will think of something I've not considered so far.

Copying the sample data I had created into an existing query was fun.  I had to think what applied steps to delete to correct the code and then apply some renames at the end.  I feel I am beginning to think a bit more laterally.  There's probably a simple tip or trick I am not aware of?

Thanks in advance.

2 REPLIES 2
v-xinruzhu-msft
Community Support
Community Support

Hi @Rhothgar 

Based on the picture you have offered, i have some questions, the country column.

e.g in your first picutre, it has the country, it has England, but in the second picture, it does not have, I don't know the logic about it, can you provide some logic about it?

vxinruzhumsft_0-1714628995623.png

vxinruzhumsft_1-1714629108968.png

Best Regards!

Yolo Zhu

 

 

 

 

Hi Yolo.

OK. So basically this data is roughly sanitised data which customers have entered into a form (over which I have NO control).  I have changed House Number and House names to protect their identity.  The rest of the data is how they entered it.

If you look carefully, that particularly person entered "England" as a County and NOT a Country (even then it would technically be incorrect as GB is the ISO code not England.

So in the second photo I deleted it because that should not appear as a "country" in a "county" header.

Any code would ideally need some kind of function to either remove a country from a county entry, maybe convert it using a reference table elsewhere and then enter it into the correct column which you cannot see in the above example as, at the moment, customers are technically only UK / GB based.

Hope that makes sense.

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

Top Solution Authors
Top Kudoed Authors