cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
dw700d
Helper V
Helper V

Extract numerous Key words from sentences in a cloumn

I have a description column that contains descriptions of items. These descriptions are in a sentence format and contain one of over 30  key words.  A few of the key words are below. I would like to extract the key words from the description. For Example in the table below the Description for row 1 would only show MATCode and exclude the rest of the sentence. How can I accomplish this? I am open to doing it as a measure or in the query editor. whichever is most effective because I have over 30 key words

 

 

Key Words

MATCode

LNCMCA

LMTDWN

DECWEEM@AEEG

DECEESM@HDDC

DEFFMM@LKKDDI

DSFDFMM@SSSS

NfdIS@CffffVID-19@AffS37F

NgfgDIS@gdgdVID-19@A01S37F

 

 

PO Number

Descripton                                                     

1

Truck Lift MATCode update

2

Data LNCMCA Review

3

Update LMTDWN Analysis Integrity

4

DECWEEM@AEEG Modification

5

Title Revised DECEESM@HDDC

6

System Processor Technology DEFFMM@LKKDDI

7

DSFDFMM@SSSS Update

8

NfdIS@CffffVID-19@AffS37F Technology

9

NgfgDIS@gdgdVID-19@A01S37F

1 ACCEPTED SOLUTION
Greg_Deckler
Super User
Super User

@dw700d This is a variation on Text to Table. PBIX is attached below sig.

 

Keyword = 
    VAR __Separator = " "
    VAR __SearchText = MAX('Table'[Description])
    VAR __Len = LEN(__SearchText)
    VAR __Count = __Len - LEN(SUBSTITUTE(__SearchText,__Separator,"")) + 1
    VAR __Table = 
        SELECTCOLUMNS(
            ADDCOLUMNS(
                GENERATESERIES(1,__Count,1),
                "__Word",
                    VAR __Text = SUBSTITUTE(__SearchText,__Separator,"|",IF([Value]=1,1,[Value]-1))
                    VAR __Start =
                        SWITCH(TRUE(),
                            __Count = 1,1,
                            [Value] = 1,1,
                            FIND("|",__Text)+1
                        )
                    VAR __End = 
                        SWITCH(TRUE(),
                            __Count = 1,__Len,
                            [Value] = 1,FIND("|",__Text) - 1,
                            [Value] = __Count,__Len,
                            FIND(__Separator,__Text,__Start)-1
                        )
                    VAR __Word = MID(__Text,__Start,__End - __Start + 1)
                RETURN __Word
            ),
            "__Word", [__Word]
        )
    VAR __Keywords = INTERSECT(__Table,SELECTCOLUMNS('Keywords',"__Word",[Key Words]))
RETURN
    CONCATENATEX(__Keywords,[__Word],",")

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

6 REPLIES 6
Greg_Deckler
Super User
Super User

@dw700d This is a variation on Text to Table. PBIX is attached below sig.

 

Keyword = 
    VAR __Separator = " "
    VAR __SearchText = MAX('Table'[Description])
    VAR __Len = LEN(__SearchText)
    VAR __Count = __Len - LEN(SUBSTITUTE(__SearchText,__Separator,"")) + 1
    VAR __Table = 
        SELECTCOLUMNS(
            ADDCOLUMNS(
                GENERATESERIES(1,__Count,1),
                "__Word",
                    VAR __Text = SUBSTITUTE(__SearchText,__Separator,"|",IF([Value]=1,1,[Value]-1))
                    VAR __Start =
                        SWITCH(TRUE(),
                            __Count = 1,1,
                            [Value] = 1,1,
                            FIND("|",__Text)+1
                        )
                    VAR __End = 
                        SWITCH(TRUE(),
                            __Count = 1,__Len,
                            [Value] = 1,FIND("|",__Text) - 1,
                            [Value] = __Count,__Len,
                            FIND(__Separator,__Text,__Start)-1
                        )
                    VAR __Word = MID(__Text,__Start,__End - __Start + 1)
                RETURN __Word
            ),
            "__Word", [__Word]
        )
    VAR __Keywords = INTERSECT(__Table,SELECTCOLUMNS('Keywords',"__Word",[Key Words]))
RETURN
    CONCATENATEX(__Keywords,[__Word],",")

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

@Greg_Deckler Thank you for this solution. I've managed to adjust it to my table but now I get an expression error on line 32 not a valid table expression. 

I'm sure I forgot a step just no idea what. 

 

I tried opening your keywords.pbix but unfortunately we using an older version and I don't have permission to upgrade.

@LeVa Hard to say, can you post your code?

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

@Greg_Deckler I'm sorry I added a screenshot with the code, must have done something wrong there. 

This is what I have now. 

 

Keyword =
VAR __Separator = " "
VAR __SearchText = MAX('Servicedesk'[JobDescr])
VAR __Len = LEN(__SearchText)
VAR __Count = __Len - LEN(SUBSTITUTE(__SearchText,__Separator,"")) + 1
VAR __Table =
SELECTCOLUMNS(
ADDCOLUMNS(
GENERATESERIES(1,__Count,1),
"__Word",
VAR __Text = SUBSTITUTE(__SearchText,__Separator,"|",IF([Value]=1,1,[Value]-1))
VAR __Start =
SWITCH(TRUE(),
__Count = 1,1,
[Value] = 1,1,
FIND("|",__Text)+1
)
VAR __End =
SWITCH(TRUE(),
__Count = 1,__Len,
[Value] = 1,FIND("|",__Text) - 1,
[Value] = __Count,__Len,
FIND(__Separator,__Text,__Start)-1
)
VAR __Word = MID(__Text,__Start,__End - __Start + 1)
RETURN __Word
),
"__Word", [__word]
)
VAR __Keywords = INTERSECT(__Table,SELECTCOLUMNS('Servicedesk',"__Words", Servicedesk[JobDescr]))
RETURN
CONCATENATEX(__Keywords,[__Word],",")
 
Schermafbeelding 2021-11-28 om 16.18.03.png

Thanks @Greg_Deckler for the prompt response I am not familiar with the text to table concept. What does this measure allow me to do?

@dw700d It extracts the keywords from the sentences and lists them. You can see it in action here:

Keywords - Microsoft Power BI Community

 

Pretty sure it does exactly what you want to be done. Basically the way it works, it converts the sentence into a table of words. It then INTERSECTS the table of words with the table of keywords. It then concatenates the results.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
Vote for T-Shirt Design

Power BI T-Shirt Design Challenge 2023

Vote for your favorite t-shirt design now through March 28.

March 2023 Update3

Power BI March 2023 Update

Find out more about the March 2023 update.

March Events 2023A

March 2023 Events

Find out more about the online and in person events happening in March!