cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
arpost
Helper III
Helper III

Is it possible to remove spaces only in abbreviations in a value using Power Query?

I need help removing spaces between letters in abbreviations in a value while keeping spaces between words using Power Query. Here's an example:

 

ValueNew Value
P B IPBI
P B I EmbeddedPBI Embedded
A Wonderful LifeA Wonderful Life
A & A ServicesA & A Services
A & AA&A

 

Anyone have some clever M ideas for me?

1 ACCEPTED SOLUTION

Saying "there seems to be no feasible way" sounds like a challenge to me. 🙂

 

This should group single characters into one cluster:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WClBwUvBUitWBshRcc5NSU1JSU8BCjgrh+XkpqUVppTkKPplpqVDBmFIDAyMzBUeF4NSisszk1GI0cTA3VCFYwTE3tSgzORHMD8lIVfBTcFPwUYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Value = _t]),
    GroupSingles = Table.AddColumn(Source, "Shorten", each
        let
            TextToList = Text.Split([Value], " "),
            TextLengths = List.Transform(TextToList, each Text.Length(_)),
            Zipped = List.Zip({TextToList, TextLengths, List.Skip(TextLengths)}),
            Accum = List.Accumulate(Zipped, "",
                (state, current) => if List.Skip(current) = {1,1}
                                    then state & List.First(current)
                                    else state & List.First(current) & " ")
        in
            Text.Trim(Accum), type text
    )
in
    GroupSingles

 

AlexisOlson_0-1637954277845.png

 

Example:

 

TextToList  = Text.Split("P B I Embedded", " ")
            = {"P", "B", "I", "Embedded"}
TextLengths = {1, 1, 1, 8}
Zipped      = List.Zip({{"P", "B", "I", "Embedded"}, {1, 1, 1, 8}, {1, 1, 8}})
            = {{"P", 1, 1}, {"B", 1, 1}, {"I", 1, 8}, {"Embedded", 8, null}}
Accum       = "" & "P" & "B" & "I" & " " & "Embedded" & " "
            = "PBI Embedded "

Text.Trim(Accum) = "PBI Embedded"

 

 

(@BA_Pete, you might enjoy this one.)

View solution in original post

9 REPLIES 9
v-stephen-msft
Community Support
Community Support

Hi @arpost ,

 

If you want to be more dynamic, there seems to be no feasible way.

It is suggested to replace the English abbreviations one by one.

vstephenmsft_2-1637894082475.png

vstephenmsft_3-1637894100504.png

vstephenmsft_4-1637894111877.png

 

 

 

 

Best Regards,

Stephen Tao

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Saying "there seems to be no feasible way" sounds like a challenge to me. 🙂

 

This should group single characters into one cluster:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WClBwUvBUitWBshRcc5NSU1JSU8BCjgrh+XkpqUVppTkKPplpqVDBmFIDAyMzBUeF4NSisszk1GI0cTA3VCFYwTE3tSgzORHMD8lIVfBTcFPwUYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Value = _t]),
    GroupSingles = Table.AddColumn(Source, "Shorten", each
        let
            TextToList = Text.Split([Value], " "),
            TextLengths = List.Transform(TextToList, each Text.Length(_)),
            Zipped = List.Zip({TextToList, TextLengths, List.Skip(TextLengths)}),
            Accum = List.Accumulate(Zipped, "",
                (state, current) => if List.Skip(current) = {1,1}
                                    then state & List.First(current)
                                    else state & List.First(current) & " ")
        in
            Text.Trim(Accum), type text
    )
in
    GroupSingles

 

AlexisOlson_0-1637954277845.png

 

Example:

 

TextToList  = Text.Split("P B I Embedded", " ")
            = {"P", "B", "I", "Embedded"}
TextLengths = {1, 1, 1, 8}
Zipped      = List.Zip({{"P", "B", "I", "Embedded"}, {1, 1, 1, 8}, {1, 1, 8}})
            = {{"P", 1, 1}, {"B", 1, 1}, {"I", 1, 8}, {"Embedded", 8, null}}
Accum       = "" & "P" & "B" & "I" & " " & "Embedded" & " "
            = "PBI Embedded "

Text.Trim(Accum) = "PBI Embedded"

 

 

(@BA_Pete, you might enjoy this one.)

View solution in original post

I like what you did here @AlexisOlson. I'm totally stealing this. 😆

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
chrome-9xf-Zagzel-B

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
KIfp67uy-Sr
Proud to be a Super User!PBI-Super-User-Rank-30x30-1x

Go for it!

 

I'm not sure how broadly useful this particular code is but it's cool to find another use for List.Zip since I saw someone else use it.

arpost
Helper III
Helper III

Greetings all, and great questions. I was thinking it would be a combo platter and could involve multiple transform steps.

 

First, I envision an initial pass (Process #1) at the abbreviation-solving logic that is only applied in cases where there aren't other non-abbreviated words, meaning if there are words that are longer than single letters. If there are only single letters, remove spaces; just received help on a related thread here that looks like it might be a viable solution.

 

Then, I imagine a second pass (Process #2) that would be similar to what @KNP  and @ronrsnfld mentioned, which would be using a list of "acceptable" abbreviations and then anything not in that list would need to be corrected:

 

Acceptable Abbreviation
A & A
A R C

 

Original ValueAdjusted ValueProcess (For Demo. Only)
A & A TowingA & A Towing2
O & MO&M1
A & R ShippingA&R Shipping2
B & K InvestmentsB&K Investments2
R & PR&P1

 

Hopefully that makes sense. I'm open to any suggestions.

CNENFRNL
Community Champion
Community Champion

Simple enough if you have some knowledge of regex,

</Script>
var regex = /(?<! ) (?![A-Z]{2,})/gmi; var str = "P B I Embedded"; var res = str.replace(regex, ''); alert(res)
</Script>

Screenshot 2021-11-23 224700.png

but a pain in the axx is that PQ doesn't support lookbehind assertion for embedded js.

AlexisOlson
Super User
Super User

It all depends on how you define what is an abbreviation and what is not. If you don't have a clear rule or existing list, then there can't really be a good solution.

ronrsnfld
Responsive Resident
Responsive Resident

You show that "P B I" is treated as an abbreviation whether it is alone in the string or not.  However "A & A" is treated as an abbreviation ONLY if it alone.  I think we need more examples of this kind of inconsistency to help you.  But if that is common, then you will need to develop a list of acceptable abbreviations.

KNP
Super User
Super User

In this scenario, do you, or could you have a maintained list of possible abbeviations?

 

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
chrome-9xf-Zagzel-B

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
KIfp67uy-Sr
Proud to be a Super User!PBI-Super-User-Rank-30x30-1x

Helpful resources

Announcements
2022 Release Wave 1 760x460.png

2022 Release Wave 1 Plan

Power Platform release plan for the 2022 release wave 1 describes all new features releasing from April 2022 through September 2022.

Power BI December 2021 Update_carousel 768x460.jpg

Check it Out!

Click here to read more about the December 2021 Updates!

User Group Leader Meeting January 768x460.png

Calling all User Group Leaders!

Don't miss the User Group Leader meetings on January, 24th & 25th, 2022.

Jan 2022 Dev Camp 768x460 copy.png

Power BI Dev Camp- January 27th, 2022

Mark your calendars and join us for our next Power BI Dev Camp!

Top Solution Authors