Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
gopalv
Employee
Employee

Count unique strings in column containing list of strings

I have data like the following:

Timestamp

Subscriptions

2020-06-11sub1, sub2
2020-06-11sub2, sub3
2020-06-12sub5, sub6
2020-06-12sub5

 

I'd like to be able to plot a chart which shows the number of unique subscriptions at each time stamp. So for timestamp 2020-06-11, the value would be 3, and for timestamp 2020-06-12, the value would be 2. Is this possible?

1 ACCEPTED SOLUTION

Hi @gopalv ,

 

Based on the requirements you have posted, i think you will need to get your table in the correct format.

 

For calculating it only once, you can create a CC

 

CAL = CALCULATE(MAX('Table'[Usage]),FILTER('Table','Table'[Usage] = EARLIER('Table'[Usage]) && 'Table'[Compute Type] = EARLIER('Table'[Compute Type]) && 'Table'[Timestamp] = EARLIER('Table'[Timestamp]) && 'Table'[Subscription] > EARLIER('Table'[Subscription])))
 
1.jpg
 
 
Regards,
Harsh Nathani
WOULD appreciate a KUDOS 🙂 🙂

View solution in original post

8 REPLIES 8
harshnathani
Community Champion
Community Champion

Hi @gopalv ,

 

 

You can use Power Query

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjIwMtA1MNM1NFTSUSouTTLUUQCSRkqxOphyRmA5Y1Q5I4icKVjODLscCaImSrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Timestamp = _t, #"(blank)" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Timestamp", type date}, {"(blank)", type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"(blank)", "Subscriptions"}}),
SplittedCol = Table.TransformColumns(#"Renamed Columns",{{"Subscriptions", each Text.Split(_,",")}}),
#"Expanded Subscriptions" = Table.ExpandListColumn(SplittedCol, "Subscriptions"),
#"Trimmed Text" = Table.TransformColumns(#"Expanded Subscriptions",{{"Subscriptions", Text.Trim, type text}})
in
#"Trimmed Text"

 

 

 

 

Original Table : Have Added some Values

1.jpg

 

 

 

2.JPG

 

 

 

3.JPG

 

 

 

Create a simple measure 

 

Sub = DISTINCTCOUNT('Table (2)'[Subscriptions])

 

4.JPG

 

 

Regards,

Harsh Nathani

Did I answer your question? Mark my post as a solution! Appreciate with a Kudos!! (Click the Thumbs Up Button)

Hi @gopalv ,

 

 

You can also use this code in Power Query 

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjIwMtA1MNM1NFTSUSouTTLUUQCSRkqxOphyRmA5Y1Q5I4icKVjODLscCaImSrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Timestamp = _t, #"(blank)" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Timestamp", type date}, {"(blank)", type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"(blank)", "Subscriptions"}}),
#"Grouped Rows" = Table.Group(#"Renamed Columns", {"Timestamp"}, {{"a", each Text.Combine([Subscriptions], ", "), type text}}),
#"Renamed Columns1" = Table.RenameColumns(#"Grouped Rows",{{"a", "Combined"}}),
#"RemovedDuplicates" = Table.TransformColumns (#"Renamed Columns1",{{"Combined", each Text.Combine(List.Distinct(List.Transform(Text.Split(_,","), Text.Trim)),", ")}})
in
#"RemovedDuplicates"

 

1.jpg

 

 

Create a measure

 

CountOfItems = LEN(MAX('Table (3)'[Combined])) - LEN(SUBSTITUTE(MAX('Table (3)'[Combined]),",","")) + 1
 
 
Regards,
HN
az38
Community Champion
Community Champion

Hi @gopalv 

try a measure

Measure = CALCULATE(
SUMX('Table', 
LEN('Table'[Subscriptions])-LEN(SUBSTITUTE('Table'[Subscriptions], ",", "")) + 1
), 
ALLEXCEPT('Table','Table'[Timestamp])
)

do not hesitate to give a kudo to useful posts and mark solutions as solution
LinkedIn

In my example, I have multiple rows per timestamp. Would this code still work in that case?

Hi @gopalv ,

 

Yes, it will.

 

The first post.. Removes the delimiter and creates a new row for each time stamp and then does a DISTINCT COUNT.

 

 

The second post .. Combines the rows for a particular time stamp, then removes the duplicates and Counts the no. of words.

 

 

Regards,

Harsh Nathani

 

I'd prefer a DAX solution, if possible. The given DAX solution doesn't appear to be correct and the problem with splitting subscriptions in PowerQuery is that the resulting table ends up double- or triple-counting values in another column (not shown) that represent usage across all subscriptions at each timestamp.

 

For example, I start with:

TimestampSubscriptionsComputetypeUsage (across all subscriptions)
2020-06-11sub1, sub2Compute135
2020-06-11sub3, sub4Compute289
2020-06-12sub5, sub6Compute132
2020-06-12sub7, sub8Compute282

 

Your solution will give me a table that looks like:

 

TimestampSubscriptionComputeUsage
2020-06-11sub1Compute135
2020-06-11sub2Compute135
2020-06-11sub3Compute289
2020-06-11sub4Compute289
...   
2020-06-12sub8Compute282

 

Now, if I try to plot usage over time without any filters, I will suddenly have double as much usage at every time point.

Hi,

To plot usage over time, drag Year/Month from the Calendar Table and write this mwasure

=min(Data[Usage])

Hope this helps.


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

Hi @gopalv ,

 

Based on the requirements you have posted, i think you will need to get your table in the correct format.

 

For calculating it only once, you can create a CC

 

CAL = CALCULATE(MAX('Table'[Usage]),FILTER('Table','Table'[Usage] = EARLIER('Table'[Usage]) && 'Table'[Compute Type] = EARLIER('Table'[Compute Type]) && 'Table'[Timestamp] = EARLIER('Table'[Timestamp]) && 'Table'[Subscription] > EARLIER('Table'[Subscription])))
 
1.jpg
 
 
Regards,
Harsh Nathani
WOULD appreciate a KUDOS 🙂 🙂

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.