Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Count Keywords in Hashtags

Hello folks, 

 I have a dataset like the below mentioned and I need to calculate occurances of Hashtags, lets say a single hashtag has occured how many times, in a row and across the column. I need to write a DAX which could help me solve this issue.

 

IDStrengths
1 
2#expectations-fulfilled ,#open-communication,#teamwork,#inclusivity ,#my-extended-team,#project-collaboration,#design-efficiency,#quality-achieved
3#open-communication,#teamwork,#positive-attitude,#technically-proficient,#quality-achieved,#GO-iCRC,#expectations-fulfilled ,#proactive-team,#process-compliance
4#timeliness-at-par,#positive-attitude,#expectations-fulfilled 
5#positive-attitude,#inclusivity ,#teamwork,#my-extended-team,#open-communication,#thumbs up-project
6#project-collaboration,#quality-achieved,#proactive-team
7#project-collaboration,#quality-achieved,#proactive-team
8#thumbs up-project,#teamwork,#positive-attitude
9#thumbs up-project,#teamwork,#technically-proficient
10 
4 REPLIES 4
Anonymous
Not applicable

It's not a job for DAX, but Power Query. First, you have to put each tag into its own row (keeping the link to ID, of course). Once you have a column where each row contains only 1 hashtag, you can start thinking about calculations. By the way, if a hashtag can appear multiple times in one row in the above table, you'll also need a column in the table I suggest you build that will store the number of occurences. So 3 fields: ID, Hashtag, Count.

Anonymous
Not applicable

Hello @Anonymous ,

 

Thanks for the quick reply, I had this thing in mind to break each hashtags into rows but I guess that would go against the Start Schema thing, which we generally try to achieve and moreover goes against the concept of normalisation. But still i agree to your point, it is so smooth and easy with Python but within power bi, it does requires lots of critical thinking and effort.

Anonymous
Not applicable

On the contrary, what I described is the right thing to do and it's in the spirit of dimensional modeling, including the star schema. Normalization is a concept from the OLTP world. PBI lives in the OLAP world where denormalization is what you want to achieve. But here we have a problem that has nothing to do with these concepts. Here we have a problem of atomicity of data. And columns should always store atoms, not compounds, be in in OLTP or OLAP.

 

Incidentally, to get the data into the right shape in PQ only takes several clicks of the mouse.

Tanushree_Kapse
Impactful Individual
Impactful Individual

Hi @Anonymous ,

 

You could create a column to know thw count of # across each row:
ROWcount= LEN(Table[Strengths])-LEN(SUBSTITUTE(Table[Strengths],"#",""))

After this:
You can create a measure to count #s in a column:
Columncount= Sum(ROWcount) 

I guess this is what you want.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors