Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I need help removing spaces between letters in abbreviations in a value while keeping spaces between words using Power Query. Here's an example:
Value | New Value |
P B I | PBI |
P B I Embedded | PBI Embedded |
A Wonderful Life | A Wonderful Life |
A & A Services | A & A Services |
A & A | A&A |
Anyone have some clever M ideas for me?
Solved! Go to Solution.
Saying "there seems to be no feasible way" sounds like a challenge to me. 🙂
This should group single characters into one cluster:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WClBwUvBUitWBshRcc5NSU1JSU8BCjgrh+XkpqUVppTkKPplpqVDBmFIDAyMzBUeF4NSisszk1GI0cTA3VCFYwTE3tSgzORHMD8lIVfBTcFPwUYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Value = _t]),
GroupSingles = Table.AddColumn(Source, "Shorten", each
let
TextToList = Text.Split([Value], " "),
TextLengths = List.Transform(TextToList, each Text.Length(_)),
Zipped = List.Zip({TextToList, TextLengths, List.Skip(TextLengths)}),
Accum = List.Accumulate(Zipped, "",
(state, current) => if List.Skip(current) = {1,1}
then state & List.First(current)
else state & List.First(current) & " ")
in
Text.Trim(Accum), type text
)
in
GroupSingles
Example:
TextToList = Text.Split("P B I Embedded", " ")
= {"P", "B", "I", "Embedded"}
TextLengths = {1, 1, 1, 8}
Zipped = List.Zip({{"P", "B", "I", "Embedded"}, {1, 1, 1, 8}, {1, 1, 8}})
= {{"P", 1, 1}, {"B", 1, 1}, {"I", 1, 8}, {"Embedded", 8, null}}
Accum = "" & "P" & "B" & "I" & " " & "Embedded" & " "
= "PBI Embedded "
Text.Trim(Accum) = "PBI Embedded"
(@BA_Pete, you might enjoy this one.)
Hi @arpost ,
If you want to be more dynamic, there seems to be no feasible way.
It is suggested to replace the English abbreviations one by one.
Best Regards,
Stephen Tao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Saying "there seems to be no feasible way" sounds like a challenge to me. 🙂
This should group single characters into one cluster:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WClBwUvBUitWBshRcc5NSU1JSU8BCjgrh+XkpqUVppTkKPplpqVDBmFIDAyMzBUeF4NSisszk1GI0cTA3VCFYwTE3tSgzORHMD8lIVfBTcFPwUYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Value = _t]),
GroupSingles = Table.AddColumn(Source, "Shorten", each
let
TextToList = Text.Split([Value], " "),
TextLengths = List.Transform(TextToList, each Text.Length(_)),
Zipped = List.Zip({TextToList, TextLengths, List.Skip(TextLengths)}),
Accum = List.Accumulate(Zipped, "",
(state, current) => if List.Skip(current) = {1,1}
then state & List.First(current)
else state & List.First(current) & " ")
in
Text.Trim(Accum), type text
)
in
GroupSingles
Example:
TextToList = Text.Split("P B I Embedded", " ")
= {"P", "B", "I", "Embedded"}
TextLengths = {1, 1, 1, 8}
Zipped = List.Zip({{"P", "B", "I", "Embedded"}, {1, 1, 1, 8}, {1, 1, 8}})
= {{"P", 1, 1}, {"B", 1, 1}, {"I", 1, 8}, {"Embedded", 8, null}}
Accum = "" & "P" & "B" & "I" & " " & "Embedded" & " "
= "PBI Embedded "
Text.Trim(Accum) = "PBI Embedded"
(@BA_Pete, you might enjoy this one.)
I like what you did here @AlexisOlson. I'm totally stealing this. 😆
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |
Go for it!
I'm not sure how broadly useful this particular code is but it's cool to find another use for List.Zip since I saw someone else use it.
Greetings all, and great questions. I was thinking it would be a combo platter and could involve multiple transform steps.
First, I envision an initial pass (Process #1) at the abbreviation-solving logic that is only applied in cases where there aren't other non-abbreviated words, meaning if there are words that are longer than single letters. If there are only single letters, remove spaces; just received help on a related thread here that looks like it might be a viable solution.
Then, I imagine a second pass (Process #2) that would be similar to what @KNP and @ronrsnfld mentioned, which would be using a list of "acceptable" abbreviations and then anything not in that list would need to be corrected:
Acceptable Abbreviation |
A & A |
A R C |
Original Value | Adjusted Value | Process (For Demo. Only) |
A & A Towing | A & A Towing | 2 |
O & M | O&M | 1 |
A & R Shipping | A&R Shipping | 2 |
B & K Investments | B&K Investments | 2 |
R & P | R&P | 1 |
Hopefully that makes sense. I'm open to any suggestions.
Simple enough if you have some knowledge of regex,
</Script>
var regex = /(?<! ) (?![A-Z]{2,})/gmi; var str = "P B I Embedded"; var res = str.replace(regex, ''); alert(res)
</Script>
but a pain in the axx is that PQ doesn't support lookbehind assertion for embedded js.
Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension! |
DAX is simple, but NOT EASY! |
It all depends on how you define what is an abbreviation and what is not. If you don't have a clear rule or existing list, then there can't really be a good solution.
You show that "P B I" is treated as an abbreviation whether it is alone in the string or not. However "A & A" is treated as an abbreviation ONLY if it alone. I think we need more examples of this kind of inconsistency to help you. But if that is common, then you will need to develop a list of acceptable abbreviations.
In this scenario, do you, or could you have a maintained list of possible abbeviations?
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
99 | |
50 | |
19 | |
12 | |
11 |