cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
sifar786 Member
Member

Match Word List with another Table and get the Count of Words

Hi @ImkeF / @MarcelBeug ,

 

A) How do i find the Item Counts?

 

I have a single column table named WordList containing a unique list of values, whose each word value i want to do a partial match with phrases in multiple Attribute columns (more than 255 columns) in another table named Data Extract and get a count of the Item codes that show these word values in their rows. I want to count only the First occurence of a word value for each item in a row.

 

e.g. A word value may occur in a single phrase multiple times OR across multiple Attributes for that Item in a row, but it should be counted as 1. So we will be counting Items and not how many times a value was found in an item row.

 

WordList Table:

 

Words

auto

bus

ingenius

natural

color

 

 

DataExtract Table:

 

ItemID    Attr1                       Attr2                        Attr3                      Attr4

10001      natural and soft     put in auto mode      so natural!               get auto; get auto mode

10002      bus at stop             natural icecream       boy is ingenius        more power

10003      let me out!             promote goodwill     simple simon           auto model

10004      matching color      natural color              natural essence       figure it!

10005      god will                  meet at bus stop       similar cases            rosetta stone

10006      mud ringworm       helper columns         rainbow color          geneva convention

10007      masked man          tenor voice                silica gel                   natural smell

10008      microcosm             guns of navarone      bloated cell              enigmatic man

10009      cellular sky             natural color              color coated            ingenius of him!

10010      ingenius body        maker chambers       old story                  minor changes

 

Question: Is it good to transform i.e. unpivot the Attribute columns first as column Attributes and Value?


Output Table 1
:

Words         ItemCount          %ItemCount

auto                   2                              =2/10

bus                     2                              =2/10

ingenius             3                              =3/10

natural               5                              =5/10

color                  3                              =3/10

 

 

 

 

B) If i click on any of the words in the WordList, each word should drilldown to the following Output Table:

 

Output Table 2:

 

e.g. If i clicked on the word Natural in above list, i should get the Attribute names that contained it, the unique Attribute Values that contained it and the Total Count of the Attribute Values found under that Attribute Name.

 

AttributeName         AttributeValue               AttributeValueCount

Attr1                           natural and soft                         1

Attr2                           natural icecream                        1

Attr2                           natural color                               2

Attr3                          so natural!                                   1

Attr3                          natural essence                           1

Attr4                          natural smell                               1

 

1 ACCEPTED SOLUTION

Accepted Solutions
Super User
Super User

Re: Match Word List with another Table and get the Count of Words

This is the query for your first part, where you want to see a 5 for the "natural" keyword:

 

let
    Source = DataExtract,
    #"Inserted Lowercased Text" = Table.AddColumn(Source, "Words", each List.Transform(Text.Split([AttributeValue], " "), Text.Lower), type list),
    #"Expanded Words" = Table.ExpandListColumn(#"Inserted Lowercased Text", "Words"),
    #"Merged Queries" = Table.NestedJoin(#"Expanded Words",{"Words"},Table.TransformColumns(WordList, {"Words", Text.Lower}),{"Words"},"WordList",JoinKind.LeftOuter),
    #"Expanded WordList" = Table.ExpandTableColumn(#"Merged Queries", "WordList", {"Words"}, {"Words.1"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded WordList", each ([Words.1] <> null)),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"ItemID"}, {{"Result", each Table.Distinct(_, {"Words"}), type table}}),
    #"Expanded Result" = Table.ExpandTableColumn(#"Grouped Rows", "Result", {"AttributeName", "AttributeValue", "Words"}, {"AttributeName", "AttributeValue", "Words"})
in
    #"Expanded Result"

For the second part (where you expect a 7), just delete the last 2 steps of the query above.

 

 

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




29 REPLIES 29
Super User
Super User

Re: Match Word List with another Table and get the Count of Words

@sifar786 

 

Please see attached file. It hopefully solves both problems.

 

Steps are as follows

 

1) Unpivot Attribute columns

2) Use Calculated Column for item Count as

 

ItemCount_ =
VAR myword = [Words]
RETURN
    COUNTROWS (
        FILTER (
            VALUES ( DataExtract[ItemID] ),
            NOT (
                ISEMPTY (
                    FILTER (
                        CALCULATETABLE ( VALUES ( DataExtract[AttributeValue] ) ),
                        SEARCH ( [Words], [AttributeValue], 1, 0 )
                    )
                )
            )
        )
    )

 

 

sifar1.png

Super User
Super User

Re: Match Word List with another Table and get the Count of Words

@sifar786 

 

And this is the MEASURE which is used as VISUAL filter to get OUTPUT B

 

Measure =
IF (
    SEARCH (
        SELECTEDVALUE ( WordList[Words] ),
        SELECTEDVALUE ( DataExtract[AttributeValue] ),
        1,
        0
    ) > 0,
    1,
    0
)
sifar786 Member
Member

Re: Match Word List with another Table and get the Count of Words

Thanks @Zubair_Muhammad. I will have to go thru this to understand better how you achieved it, as i using Excel 2016 Pro 64 bit. Is it possible to recreate this solution in Excel using Slicer as a visual Filter?

 

Also, is it possible to achieve the same using Power Query transformation in Excel, as the DataExtract is huge (20+ million Item Rows & more than 255 Attribute columns)? Will there be a performance hit using the DAX measures?

 

Super User
Super User

Re: Match Word List with another Table and get the Count of Words

@sifar786 

 

I will try it later tonight with Power Query alone

But I think this solution which i provided above will be more efficient than a whole sole Power Query Solution

Super User
Super User

Re: Match Word List with another Table and get the Count of Words

Hi @sifar786 ,

still the open questions:

1) Shall the match be case sensitive?

2) What matches should be returned on "Autobus"?

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




sifar786 Member
Member

Re: Match Word List with another Table and get the Count of Words

Hi @ImkeF ,

 

1) Case-insensitive

2) no match. No fuzzy logic needed.

 

Super User
Super User

Re: Match Word List with another Table and get the Count of Words

This is the query for your first part, where you want to see a 5 for the "natural" keyword:

 

let
    Source = DataExtract,
    #"Inserted Lowercased Text" = Table.AddColumn(Source, "Words", each List.Transform(Text.Split([AttributeValue], " "), Text.Lower), type list),
    #"Expanded Words" = Table.ExpandListColumn(#"Inserted Lowercased Text", "Words"),
    #"Merged Queries" = Table.NestedJoin(#"Expanded Words",{"Words"},Table.TransformColumns(WordList, {"Words", Text.Lower}),{"Words"},"WordList",JoinKind.LeftOuter),
    #"Expanded WordList" = Table.ExpandTableColumn(#"Merged Queries", "WordList", {"Words"}, {"Words.1"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded WordList", each ([Words.1] <> null)),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"ItemID"}, {{"Result", each Table.Distinct(_, {"Words"}), type table}}),
    #"Expanded Result" = Table.ExpandTableColumn(#"Grouped Rows", "Result", {"AttributeName", "AttributeValue", "Words"}, {"AttributeName", "AttributeValue", "Words"})
in
    #"Expanded Result"

For the second part (where you expect a 7), just delete the last 2 steps of the query above.

 

 

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




sifar786 Member
Member

Re: Match Word List with another Table and get the Count of Words

Hi @Zubair_Muhammad ,

 

myword is defined by not used anywhere. SELECTEDVALUE is not present in Excel Power Query. Also, when i enter the ItemCount_ calculated column, it goes on calculating and Excel PowerPivot screen freezes.

 

  • Can an optimized Measure be used instead of a calculated column?
  • Why is ItemCount suffixed with a "_" ?
  • ItemCount is also not used anywhere.

 

Super User
Super User

Re: Match Word List with another Table and get the Count of Words

@sifar786 

 

You can use MIN, MAX in place of selectedvalue
You can remove the "_"
I will write a MEASURE for you