Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hey everyone,
My first time on the forum so please excuse me if I've missed a solution to this somewhere.
I am busy cleaning address data collected from e-commerce transactions. The address data is collected as free text and so is prone to typos.
In particular, people often fail to add a space between the number of their house and the name of their road (e.g. 23Wiltshire Place).
I'm trying to find any string where a number is subsequently followed by any letter and add a space, but for all the functions I have worked with so far, I don't have the knowledge on how I could reasonably accomplish this.
Can anyone point me in the right direction?
Thanks,
Callum
Solved! Go to Solution.
Try this Custom Column
Please see attached file as well
It works with your sample data 🙂
=let mylist=Text.ToList([Data]),
mycount=List.Count(mylist),
num={"0".."9"},
alpha={"A".."Z","a".."z"}
in
Text.Combine(List.Generate(()=>[a=0,b=mylist{a}],
each [a]< mycount,each [a=[a]+1,b= if
List.Contains(num,mylist{a})
and
List.Contains(alpha,mylist{a+1}) then mylist{a} & " " else mylist{a}],each [b] ))
Hi @cfraser,
another solution (2 lines of code) which also works with your sample. It splits the column by the last transition from digit to char and then combine these 2 new columns again together.
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("LYpLDgIhDECv0rB2kkFjjMsKVapMUSjqhHD/a0jU5fu0Zuy0n3azVZRLqFcWj5HhxBKQ34O4aGanoIwyqmNdTd80c7AeKa4CPvOTYDxhSeL/LWDW8krJQ074k+eICkcIVG6kAVyqWWFr50clkgL3nBYS9PSdwfT+AQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Data = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Data", type text}}), SplitColumn = Table.SplitColumn(#"Changed Type", "Data", Splitter.SplitTextByCharacterTransition({"0".."9"}, {"A".."Z"}), 2), CombineColumn = Table.CombineColumns(SplitColumn, {"Data.1", "Data.2"}, Combiner.CombineTextByDelimiter(" ", QuoteStyle.None), "Data")in CombineColumn
These are both amazing solutions. I'll have to spend a bit of time digesting exactly how they work, but thank you VERY much indeed! @Zubair_Muhammad and @Nolock you are the best!
Hi @cfraser ,
Do you want to achieve your desired output with dax or power query?
With Dax, you could have a try with format function.
With Power query, you could create a custom column in query editor.
If you still need help, please share your data sample so that we could help further on it.
Best Regards,
Cherry
@v-piga-msft I was hoping to achieve it using Power Query, but I simply don't know what functions to use to achieve it.
I have some additional examples below of actual data.
Data | Desired output |
1-5-301TANGHUJINDALI BINHAIXINDISTRICT TIANJINCITY | 1-5-301 TANGHUJINDALI BINHAIXINDISTRICT TIANJINCITY |
71DAELYN DRIVE RICHMOND | 71 DAELYN DRIVE RICHMOND |
71HARTSWOOD ROAD | 71 HARTSWOOD ROAD |
FLAT 9 HESKETH COURT 210QUEENS PROMENADE | FLAT 9 HESKETH COURT 210 QUEENS PROMENADE |
Thanks,
Callum
Try this Custom Column
Please see attached file as well
It works with your sample data 🙂
=let mylist=Text.ToList([Data]),
mycount=List.Count(mylist),
num={"0".."9"},
alpha={"A".."Z","a".."z"}
in
Text.Combine(List.Generate(()=>[a=0,b=mylist{a}],
each [a]< mycount,each [a=[a]+1,b= if
List.Contains(num,mylist{a})
and
List.Contains(alpha,mylist{a+1}) then mylist{a} & " " else mylist{a}],each [b] ))
Hi @cfraser,
another solution (2 lines of code) which also works with your sample. It splits the column by the last transition from digit to char and then combine these 2 new columns again together.
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("LYpLDgIhDECv0rB2kkFjjMsKVapMUSjqhHD/a0jU5fu0Zuy0n3azVZRLqFcWj5HhxBKQ34O4aGanoIwyqmNdTd80c7AeKa4CPvOTYDxhSeL/LWDW8krJQ074k+eICkcIVG6kAVyqWWFr50clkgL3nBYS9PSdwfT+AQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Data = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Data", type text}}), SplitColumn = Table.SplitColumn(#"Changed Type", "Data", Splitter.SplitTextByCharacterTransition({"0".."9"}, {"A".."Z"}), 2), CombineColumn = Table.CombineColumns(SplitColumn, {"Data.1", "Data.2"}, Combiner.CombineTextByDelimiter(" ", QuoteStyle.None), "Data")in CombineColumn
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.