cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
arpost
Helper IV
Helper IV

Is it possible to remove spaces only in abbreviations in a value using Power Query?

I need help removing spaces between letters in abbreviations in a value while keeping spaces between words using Power Query. Here's an example:

 

ValueNew Value
P B IPBI
P B I EmbeddedPBI Embedded
A Wonderful LifeA Wonderful Life
A & A ServicesA & A Services
A & AA&A

 

Anyone have some clever M ideas for me?

1 ACCEPTED SOLUTION

Saying "there seems to be no feasible way" sounds like a challenge to me. 🙂

 

This should group single characters into one cluster:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WClBwUvBUitWBshRcc5NSU1JSU8BCjgrh+XkpqUVppTkKPplpqVDBmFIDAyMzBUeF4NSisszk1GI0cTA3VCFYwTE3tSgzORHMD8lIVfBTcFPwUYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Value = _t]),
    GroupSingles = Table.AddColumn(Source, "Shorten", each
        let
            TextToList = Text.Split([Value], " "),
            TextLengths = List.Transform(TextToList, each Text.Length(_)),
            Zipped = List.Zip({TextToList, TextLengths, List.Skip(TextLengths)}),
            Accum = List.Accumulate(Zipped, "",
                (state, current) => if List.Skip(current) = {1,1}
                                    then state & List.First(current)
                                    else state & List.First(current) & " ")
        in
            Text.Trim(Accum), type text
    )
in
    GroupSingles

 

AlexisOlson_0-1637954277845.png

 

Example:

 

TextToList  = Text.Split("P B I Embedded", " ")
            = {"P", "B", "I", "Embedded"}
TextLengths = {1, 1, 1, 8}
Zipped      = List.Zip({{"P", "B", "I", "Embedded"}, {1, 1, 1, 8}, {1, 1, 8}})
            = {{"P", 1, 1}, {"B", 1, 1}, {"I", 1, 8}, {"Embedded", 8, null}}
Accum       = "" & "P" & "B" & "I" & " " & "Embedded" & " "
            = "PBI Embedded "

Text.Trim(Accum) = "PBI Embedded"

 

 

(@BA_Pete, you might enjoy this one.)

View solution in original post

9 REPLIES 9
v-stephen-msft
Community Support
Community Support

Hi @arpost ,

 

If you want to be more dynamic, there seems to be no feasible way.

It is suggested to replace the English abbreviations one by one.

vstephenmsft_2-1637894082475.png

vstephenmsft_3-1637894100504.png

vstephenmsft_4-1637894111877.png

 

 

 

 

Best Regards,

Stephen Tao

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Saying "there seems to be no feasible way" sounds like a challenge to me. 🙂

 

This should group single characters into one cluster:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WClBwUvBUitWBshRcc5NSU1JSU8BCjgrh+XkpqUVppTkKPplpqVDBmFIDAyMzBUeF4NSisszk1GI0cTA3VCFYwTE3tSgzORHMD8lIVfBTcFPwUYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Value = _t]),
    GroupSingles = Table.AddColumn(Source, "Shorten", each
        let
            TextToList = Text.Split([Value], " "),
            TextLengths = List.Transform(TextToList, each Text.Length(_)),
            Zipped = List.Zip({TextToList, TextLengths, List.Skip(TextLengths)}),
            Accum = List.Accumulate(Zipped, "",
                (state, current) => if List.Skip(current) = {1,1}
                                    then state & List.First(current)
                                    else state & List.First(current) & " ")
        in
            Text.Trim(Accum), type text
    )
in
    GroupSingles

 

AlexisOlson_0-1637954277845.png

 

Example:

 

TextToList  = Text.Split("P B I Embedded", " ")
            = {"P", "B", "I", "Embedded"}
TextLengths = {1, 1, 1, 8}
Zipped      = List.Zip({{"P", "B", "I", "Embedded"}, {1, 1, 1, 8}, {1, 1, 8}})
            = {{"P", 1, 1}, {"B", 1, 1}, {"I", 1, 8}, {"Embedded", 8, null}}
Accum       = "" & "P" & "B" & "I" & " " & "Embedded" & " "
            = "PBI Embedded "

Text.Trim(Accum) = "PBI Embedded"

 

 

(@BA_Pete, you might enjoy this one.)

I like what you did here @AlexisOlson. I'm totally stealing this. 😆

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
chrome-9xf-Zagzel-B

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
KIfp67uy-Sr
Proud to be a Super User!PBI-Super-User-Rank-30x30-1x

Go for it!

 

I'm not sure how broadly useful this particular code is but it's cool to find another use for List.Zip since I saw someone else use it.

arpost
Helper IV
Helper IV

Greetings all, and great questions. I was thinking it would be a combo platter and could involve multiple transform steps.

 

First, I envision an initial pass (Process #1) at the abbreviation-solving logic that is only applied in cases where there aren't other non-abbreviated words, meaning if there are words that are longer than single letters. If there are only single letters, remove spaces; just received help on a related thread here that looks like it might be a viable solution.

 

Then, I imagine a second pass (Process #2) that would be similar to what @KNP  and @ronrsnfld mentioned, which would be using a list of "acceptable" abbreviations and then anything not in that list would need to be corrected:

 

Acceptable Abbreviation
A & A
A R C

 

Original ValueAdjusted ValueProcess (For Demo. Only)
A & A TowingA & A Towing2
O & MO&M1
A & R ShippingA&R Shipping2
B & K InvestmentsB&K Investments2
R & PR&P1

 

Hopefully that makes sense. I'm open to any suggestions.

Simple enough if you have some knowledge of regex,

</Script>
var regex = /(?<! ) (?![A-Z]{2,})/gmi; var str = "P B I Embedded"; var res = str.replace(regex, ''); alert(res)
</Script>

Screenshot 2021-11-23 224700.png

but a pain in the axx is that PQ doesn't support lookbehind assertion for embedded js.


Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is beyond their comprehension!

DAX is simple, but NOT EASY!

AlexisOlson
Super User
Super User

It all depends on how you define what is an abbreviation and what is not. If you don't have a clear rule or existing list, then there can't really be a good solution.

ronrsnfld
Impactful Individual
Impactful Individual

You show that "P B I" is treated as an abbreviation whether it is alone in the string or not.  However "A & A" is treated as an abbreviation ONLY if it alone.  I think we need more examples of this kind of inconsistency to help you.  But if that is common, then you will need to develop a list of acceptable abbreviations.

KNP
Super User
Super User

In this scenario, do you, or could you have a maintained list of possible abbeviations?

 

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
chrome-9xf-Zagzel-B

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
KIfp67uy-Sr
Proud to be a Super User!PBI-Super-User-Rank-30x30-1x

Helpful resources

Announcements
Microsoft Build 768x460.png

Microsoft Build is May 24-26. Have you registered yet?

Come together to explore latest innovations in code and application development—and gain insights from experts from around the world.

charticulator_carousel_with_text (1).png

Charticulator Design Challenge

Put your data visualization and design skills to the test! This exciting challenge is happening now through May 31st!

Power BI Dev Camp Session 22 768x460.jpg

Check it out!

Mark your calendars and join us on Thursday, May 26 at 11a PDT for a great session with Ted Pattison!