Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
TMielke
New Member

Count substrings in comma-separated string of varying length

Hello,

 

I have a special problem which can be simplified to a table like this:

 

Number of ItemsItemsDesired measure
1Apple_xxyyy 41
3Banana_xxyxy 2, Melon_xxxyy 3, Orange_xxy 60
5Banana_xxyyx 2, Apple_yyyy 2, Apple_xxxx 3, Apple_yxyx 5,  Orange_yxyx 13

 

Now what I need is to count the occurence of special substrings (let's say "Apple_" for example) considering the comma delimiters. So within each comma delimited substring, I need to do a check (e.g. contains "Apple_") if true, sum it up. The problem is, that the number of items can be huge (let's say up to 500). How can I do this? I would prefer a solution without having to split the original column, however if there's no other way, it could be done.

 

Thanks a lot in advance!

1 ACCEPTED SOLUTION
Ahmedx
Super User
Super User

5 REPLIES 5
Hema_Gupta
Frequent Visitor

Hi,

Hope this helps.

Hema_Gupta_0-1708426480973.png

 

danextian
Super User
Super User

Hi @TMielke ,

 

Assuming that you have a column that indicates what text to find and that for each delimiter two instances of the text to find is to be counted just once  (say Apple_xxxx and apple_yyyy are counted once), this can be fairly done in Power Query. Here's a sample custom column formula

let 
texttofind = Text.Lower([text to find]),
substrings = Text.Split([Items],","),
lowercased = List.Transform(substrings, Text.Lower),
filtered = List.Select(lowercased, each Text.Contains(_, texttofind))
in List.Count(filtered)

here's a sample M code

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXIsKMhJja+oqKysVDAB8kFiiSAxpVidaCVjIM8pMQ8IQUoqKhWMdBR8U3Py84BcoBYFYx0F/6LEvHSwCQpmQNUGQJwLUgHWb4qiv7ICpB9iYyXIQjgPaFoFyDCoHNAqBVOdQwtghoMFQC4zRrguFgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Number of Items" = _t, Items = _t, #"Desired measure" = _t, #"text to find" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Number of Items", Int64.Type}, {"Items", type text}, {"Desired measure", Int64.Type}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Count", each let 
texttofind = Text.Lower([text to find]),
substrings = Text.Split([Items],","),
lowercased = List.Transform(substrings, Text.Lower),
filtered = List.Select(lowercased, each Text.Contains(_, texttofind))
in List.Count(filtered))
in
    #"Added Custom"

danextian_0-1708426247162.png

 










Did I answer your question? Mark my post as a solution!


Proud to be a Super User!









"Tell me and I’ll forget; show me and I may remember; involve me and I’ll understand."
Need Power BI consultation, get in touch with me on LinkedIn or hire me on UpWork.
Learn with me on YouTube @DAXJutsu or follow my page on Facebook @DAXJutsuPBI.
Ahmedx
Super User
Super User

pls try this

Screenshot_2.png

this one seems to work for me exactly as expected! thanks a lot!

Arul
Super User
Super User

@TMielke ,

try this measure,

Count Measure =
LEN ( SUBSTITUTE ( MAX ( 'Table'[Items] ), "Apple_", "Apple_0" ) )
    - LEN ( MAX ( 'Table'[Items] ) )

Arul_0-1708424931610.png

 





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!


LinkedIn


Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.