Shawn_Eary
Advocate IV

Using AI to Categorize Titles, Subjects and Equipment Names

This isn’t my exact problem but I have one very similar to it. Suppose I have the following data set:

| Book Title | Publisher | Sales |
| --- | --- | --- |
| How to Get Your X-Wife to Come Back | Helpmasters of Chicago | $2,389,394 |
| Fire from Space | Destiny Publishing | $389,390 |
| How to Look Great at 45 | Helpmasters of Chicago | $9,392,393 |
| Nuclear Engineering for The Total Dummkopf | IEEE Entertainment Division | $3,938,293 |
| Using Retrologue 4 in Cubase 14.5 | zSounds Press | $3,293,300 |
| Why I Should Have Never Sold my Commodore Amiga | Commodore Incorporated | $300,200 |
| The Alien from Galaxy 384 | Psychopathic Publishers | $700,320 |
| How to Storm Area 51 Without Getting Caught, Arrested or Killed… | Psychopathic Publishers | $76,382,393 |

 

I’m sorry for the strange titles. I’m just in one of those moods. Is there some way Power BI could easily use trained Artificial Intelligence to look at just the book titles and publishers above and then guess whether each book is Sci-Fi, Self Help or Instructional? After using and possibly training the AI, the rendered results should hopefully look something like this…

| Book Title | Publisher | Sales | Classification |
| --- | --- | --- | --- |
| How to Get Your X-Wife to Come Back | Helpmasters of Chicago | $2,389,394 | Self Help |
| Fire from Space | Destiny Publishing | $389,390 | Sci-Fi |
| How to Look Great at 45+ | Helpmasters of Chicago | $9,392,393 | Self Help |
| Nuclear Engineering for The Total Dummkopf | IEEE Entertainment Division | $3,938,293 | Instructional |
| Using Retrologue 4 in Cubase 14.5 | zSounds Press | $3,293,300 | Instructional |
| Why I Should Have Never Sold my Commodore Amiga | Commodore Incorporated | $300,200 | Self Help (Mental Illness) |
| The Alien from Galaxy 384 | Psychopathic Publishers | $700,320 | Sci-Fi |
| How to Storm Area 51 Without Getting Caught, Arrested or Killed… | Psychopathic Publishers | $76,382,393 | Instructional |


Perhaps the above example is a bit too goofy. In a similar vein, I would like Power BI to incorporate an AI that makes categorizations based upon the following Equipment Names:

| Equipment Name |
| --- |
| Video Camera with Desk Stand |
| Gemeinhardt flUtE   (Blue) |
| Gemeinhardt Flute   (Yellow) |
| Flute - Yammaha |
| Cubase 11.4 – Yammaha - Steinberg |
| Cubase 10.0 (DAW) |
| Cubase 5.0 |
| 3.0 Cubase |
| Steinway piano |
| Yammaha – MotoRcYycle |
| Yammaha Motorcycle (free Version of Cubase 10.5 with Purchase 😁 ) |

 

The AI would hopefully be trained to usually recognize the equipment as being in the following categories:

| Equipment Name | Category |
| --- | --- |
| Video Camera with Desk Stand | Electronics |
| Gemeinhardt flUtE   (Blue) | Instrument |
| Gemeinhardt Flute   (Yellow) | Instrument |
| Flute - Yammaha | Instrument |
| Cubase 11.4 – Yammaha - Steinberg | Software |
| Cubase 10.0 (DAW) | Software |
| Cubase 5.0 | Software |
| 3.0 Cubase | Software |
| Steinway Piano | Instrument |
| Yammaha – MotoRcYycle | Automotive |
| Yammaha Motorcycle (free Version of Cubase 10.5 with Purchase 😁 ) | Automotive |

 

I know I used comical titles above, but I’m serious. I actually do have a ton of “garbage” data coming into Power BI and it’s difficult for me to categorize it. I wound up spending several hours writing a dumb C# “script” that uses naive regular expressions to categorize the data. I suspect I’ve made some poor decisions along the way, though, and that an AI (if trained properly) could potentially use a neural network to do the above-mentioned categorizations based upon “word” positions and weights. I personally, however, do not know how to write or even use such a neural network, so I’m wondering if the Power BI team has thought about building AI into Power BI so that it can “lump things into categories” like I mentioned above.
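To make the "words and weights" idea concrete, here is a minimal sketch in plain Python. It is not a neural network and not anything built into Power BI: it swaps in a much simpler technique, a multinomial naive Bayes classifier over word counts, trained on the equipment rows above (emoji omitted). The names `tokenize`, `NaiveBayes`, and `TRAINING` are made up for illustration, and word positions are ignored entirely.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its runs of a-z letters (digits, punctuation, emoji are dropped)."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Hypothetical helper: multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of training rows
        self.vocab = set()                       # every word seen in training

    def train(self, examples):
        for text, category in examples:
            self.doc_counts[category] += 1
            for word in tokenize(text):
                self.word_counts[category][word] += 1
                self.vocab.add(word)

    def classify(self, text):
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category, docs in self.doc_counts.items():
            # log prior + sum of smoothed log likelihoods for each word
            score = math.log(docs / total_docs)
            total_words = sum(self.word_counts[category].values())
            for word in words:
                count = self.word_counts[category][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best, best_score = category, score
        return best


# Training rows copied from the equipment table in this post.
TRAINING = [
    ("Video Camera with Desk Stand", "Electronics"),
    ("Gemeinhardt flUtE   (Blue)", "Instrument"),
    ("Gemeinhardt Flute   (Yellow)", "Instrument"),
    ("Flute - Yammaha", "Instrument"),
    ("Cubase 11.4 – Yammaha - Steinberg", "Software"),
    ("Cubase 10.0 (DAW)", "Software"),
    ("Cubase 5.0", "Software"),
    ("3.0 Cubase", "Software"),
    ("Steinway piano", "Instrument"),
    ("Yammaha – MotoRcYycle", "Automotive"),
    ("Yammaha Motorcycle (free Version of Cubase 10.5 with Purchase)", "Automotive"),
]

model = NaiveBayes()
model.train(TRAINING)
print(model.classify("Cubase 9.5 Elements"))         # Software
print(model.classify("Flute, Gemeinhardt (Green)"))  # Instrument
```

With only eleven training rows this is a toy; real "garbage" data would need far more labeled examples, and a misspelling the model has never seen carries no weight at all.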

Any suggestions? I really don’t want to have to learn a whole bunch of new material as I tend to be a lazy Haskell beginner (bad pun intended)…

 

NOTE: The thoughts contained in this post are in no way reflective of any of my current or future employers…

1 ACCEPTED SOLUTION
V-pazhen-msft
Community Support

@Shawn_Eary 

Interesting idea, but unfortunately the only built-in option is to categorize them with DAX: check whether a cell contains certain text and, if so, return a given category. That is quite far from what you are hoping for. Please refer to: https://community.powerbi.com/t5/Desktop/DAX-IF-contains-text-wildcard/td-p/649248

 

Power BI was basically created to generate reports with a little bit of data modelling. It is not the kind of software that includes AI for automatically reading and categorizing text. I do not think this is a feasible feature to add to Power BI in the near future, but I would suggest raising an idea here

 

Paul Zheng _ Community Support Team
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.


3 REPLIES 3

Well, as I said, I wound up using a bunch of naive Regular Expression (RegEx) patterns via C#, but I think I can do better. I'm sure there is a way to apply AI from C#; I just don't know how to do it yet. I was really hoping Power BI would learn to intelligently sort highly similar strings with slight permutations into "buckets".
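The "buckets" idea can be roughed out without any AI at all. The sketch below is only an illustration, using Python's standard `difflib` module: it normalizes each name (lowercase, letters only, words sorted so order doesn't matter) and then greedily groups names whose normalized forms are similar enough. The function names and the 0.8 threshold are my own assumptions, not anything Power BI provides.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, keep only runs of a-z letters, and sort the words so word order doesn't matter."""
    words = re.findall(r"[a-z]+", name.lower())
    return " ".join(sorted(words))


def bucket(names, threshold=0.8):
    """Greedily group names whose normalized forms are at least `threshold` similar.

    `threshold` is an assumed starting point and would need tuning on real data.
    """
    buckets = []  # list of (normalized representative, [original names])
    for name in names:
        key = normalize(name)
        for representative, members in buckets:
            if SequenceMatcher(None, key, representative).ratio() >= threshold:
                members.append(name)
                break
        else:
            # No existing bucket was close enough, so start a new one.
            buckets.append((key, [name]))
    return [members for _, members in buckets]


names = [
    "Cubase 5.0",
    "3.0 Cubase",
    "Steinway piano",
    "Yammaha Motorcycle",
    "Yammaha – MotoRcYycle",
]
for group in bucket(names):
    print(group)
```

Here the two Cubase rows normalize to the same string and the two motorcycle spellings differ by one letter, so each pair shares a bucket. Extra qualifiers such as colors or long parenthetical notes lower the similarity score, so they would probably need a lower threshold or a list of words to ignore.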

If AI isn't a possibility for Power BI categorization in the near future, though, then at a minimum Power BI should probably include native RegEx support. I understand Mr. Seamark to say in this post that the R programming language can be used within Power BI to do that:
https://dax.tips/2017/05/23/power-bi-and-regular-expressions/
That means you can probably do the same thing with Python since Python also has RegEx support. 
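As a rough idea of what the Python route might look like, here is a sketch of ordered RegEx rules where the first match wins. The patterns and the `RULES` and `categorize` names are made up for the equipment example in this thread. As I understand it, a "Run Python script" step in Power Query exposes the current table as a pandas DataFrame named `dataset`, so something like `dataset["Category"] = dataset["Equipment Name"].apply(categorize)` should wire it in, but treat that line as an untested assumption.

```python
import re

# Ordered rules: the first pattern that matches decides the category.
# "Motorcycle" is checked before "Cubase" so a motorcycle that ships with
# free software is still filed under Automotive.
RULES = [
    (re.compile(r"motorc\w*cle", re.IGNORECASE), "Automotive"),  # also matches "MotoRcYycle"
    (re.compile(r"flute|piano", re.IGNORECASE), "Instrument"),
    (re.compile(r"cubase|\bdaw\b", re.IGNORECASE), "Software"),
    (re.compile(r"camera", re.IGNORECASE), "Electronics"),
]


def categorize(name, default="Unknown"):
    """Return the category of the first matching rule, or `default` if none match."""
    for pattern, category in RULES:
        if pattern.search(name):
            return category
    return default


for equipment in [
    "Gemeinhardt flUtE   (Blue)",
    "3.0 Cubase",
    "Yammaha Motorcycle (free Version of Cubase 10.5 with Purchase)",
    "Tuba",
]:
    print(equipment, "->", categorize(equipment))
```

This is essentially the same naive approach as the C# script, so it has the same weakness: every new misspelling or product needs another hand-written pattern.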

Of course, given Mr. Seamark's post, it would stand to reason that you could farm Power BI logic out to Python or R and then possibly use AI, but it would be nice if Power BI attempted to make the categorizations I suggested automatically, assuming the user flipped a switch to give Power BI permission.

My personal preference, however, is to do most of the analysis before the data gets into Power BI since I find DAX very difficult to understand. I find text processing of TSV files much easier in C# via Visual Studio (where I have a nice responsive editor with debugging support) than in that smaller DAX editor window, which is slow and cumbersome. Also, if my data is already in something like MS SQL Server, I find analysis in SQL Server easier than using DAX. In fact, this statement is likely a little out of scope, but I personally think Haskell would make a good candidate for pre-processing data before it gets into Power BI since Haskell might be considered the pure "granddaddy" of F#. In my case, though, C# did the trick.

Don't get me wrong, there are some filters in Power BI that make certain data observations very easy without the need to write any code, but doing advanced analysis in Power BI feels rather cumbersome to me.  I was looking around yesterday, and I stumbled across this:
https://monkeylearn.com/blog/how-to-use-ai-in-excel-for-automated-text-analysis/
Unfortunately, I'm not sure my current employer will approve of me using MonkeyLearn, for reasons of cost and security. I think there is a free option, but I would still possibly have to get approval to use it since it would most likely involve surrendering data to an unapproved third-party provider.

On the other hand, I already have "implicit" consent to process small amounts of non-sensitive data with Azure AI. I could most likely do that via C#, but I don't know how to do it yet... I skipped that class in school and the field has changed considerably since I graduated. There were no cloud AI providers when I graduated and I'm not sure there were very many "decent" AIs either. Back then, many of the "AI" solutions may have been "roll your own" types.

BTW: Please ask the admins of this forum to consider replacing the spell check feature for posts on this website with something that is easier to use, like this:
https://www.wufoo.com/html5/spellcheck-attribute/
https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/spellcheck
It's free and works really well.   All you have to do in most cases is "flip a switch..."


