Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi @ImkeF / @MarcelBeug ,
A) How do i find the Item Counts?
I have a single column table named WordList containing a unique list of values, whose each word value i want to do a partial match with phrases in multiple Attribute columns (more than 255 columns) in another table named Data Extract and get a count of the Item codes that show these word values in their rows. I want to count only the First occurence of a word value for each item in a row.
e.g. A word value may occur in a single phrase multiple times OR across multiple Attributes for that Item in a row, but it should be counted as 1. So we will be counting Items and not how many times a value was found in an item row.
WordList Table:
Words
auto
bus
ingenius
natural
color
DataExtract Table:
ItemID Attr1 Attr2 Attr3 Attr4
10001 natural and soft put in auto mode so natural! get auto; get auto mode
10002 bus at stop natural icecream boy is ingenius more power
10003 let me out! promote goodwill simple simon auto model
10004 matching color natural color natural essence figure it!
10005 god will meet at bus stop similar cases rosetta stone
10006 mud ringworm helper columns rainbow color geneva convention
10007 masked man tenor voice silica gel natural smell
10008 microcosm guns of navarone bloated cell enigmatic man
10009 cellular sky natural color color coated ingenius of him!
10010 ingenius body maker chambers old story minor changes
Question: Is it good to transform i.e. unpivot the Attribute columns first as column Attributes and Value?
Output Table 1:
Words ItemCount %ItemCount
auto 2 =2/10
bus 2 =2/10
ingenius 3 =3/10
natural 5 =5/10
color 3 =3/10
B) If i click on any of the words in the WordList, each word should drilldown to the following Output Table:
Output Table 2:
e.g. If i clicked on the word Natural in above list, i should get the Attribute names that contained it, the unique Attribute Values that contained it and the Total Count of the Attribute Values found under that Attribute Name.
AttributeName AttributeValue AttributeValueCount
Attr1 natural and soft 1
Attr2 natural icecream 1
Attr2 natural color 2
Attr3 so natural! 1
Attr3 natural essence 1
Attr4 natural smell 1
Solved! Go to Solution.
This is the query for your first part, where you want to see a 5 for the "natural" keyword:
let Source = DataExtract, #"Inserted Lowercased Text" = Table.AddColumn(Source, "Words", each List.Transform(Text.Split([AttributeValue], " "), Text.Lower), type list), #"Expanded Words" = Table.ExpandListColumn(#"Inserted Lowercased Text", "Words"), #"Merged Queries" = Table.NestedJoin(#"Expanded Words",{"Words"},Table.TransformColumns(WordList, {"Words", Text.Lower}),{"Words"},"WordList",JoinKind.LeftOuter), #"Expanded WordList" = Table.ExpandTableColumn(#"Merged Queries", "WordList", {"Words"}, {"Words.1"}), #"Filtered Rows" = Table.SelectRows(#"Expanded WordList", each ([Words.1] <> null)), #"Grouped Rows" = Table.Group(#"Filtered Rows", {"ItemID"}, {{"Result", each Table.Distinct(_, {"Words"}), type table}}), #"Expanded Result" = Table.ExpandTableColumn(#"Grouped Rows", "Result", {"AttributeName", "AttributeValue", "Words"}, {"AttributeName", "AttributeValue", "Words"}) in #"Expanded Result"
For the second part (where you expect a 7), just delete the last 2 steps of the query above.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi @ImkeF,
Sorry for bothering you again. I know i said no partial matches before. But is it possible to do a partial match above instead of an exact match? How will the above query change in such scenario?
e.g. the Word ALLERGEN may be found as ALLERGENICS, ALLERGENIC, ALLERGENS, ALLERGEN, HYPOALLERGENIC, HYPO-ALLERGENIC.
Thanks.
Hi @Anonymous,
the solution for the partial match would look like so:
let Source = Input, #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"UPC"}, "Attribute", "Value"), #"Changed Type" = Table.TransformColumnTypes(#"Unpivoted Other Columns",{{"UPC", type text}, {"Attribute", type text}, {"Value", type text}}), #"Added Custom" = Table.AddColumn(#"Changed Type", "Words", each List.Select(WordList[Words], (l) => Text.Contains([Value], l, Comparer.OrdinalIgnoreCase))), #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Words] <> {})), #"Expanded Words" = Table.ExpandListColumn(#"Filtered Rows", "Words"), #"Grouped Rows" = Table.Group(#"Expanded Words", {"UPC"}, {{"Result", each Table.Distinct(_, {"Words"}), type table}}), #"Expanded Result" = Table.ExpandTableColumn(#"Grouped Rows", "Result", {"Attribute", "Value", "Words"}, {"Attribute", "Value", "Words"}) in #"Expanded Result"
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi @ImkeF ,
I tried your concise solution, but not getting the partial match results in 2nd pivot table. I was able to write some M code that does the job, though it is not so quick, concise and ingenius as your solution. I observed that your code does not exclude the columns from the ExcludeColumns table but my code excludes.
I am attaching both Tools (your solution & my solution) for your perusal. Also attaching the Dummy Data.xlsx file that i have used with both the Tools. You can see the difference of results in the 2nd pivot table. I calculated the partial matches for my result and they are ok. But your code is somehow excluding some of the results, but don't know why as i didn't completely understand your code.
Finally the Grand Total in 1st Pivot should be 20 distinct UPCs, but it is showing only 15. Also the % is coming incorrect.
Thanks.
The Exclude-aspect is new to me. Could you please elaborate what you're trying to achieve with it?
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Sure @ImkeF ,
Using a Left-Anti Merge, i am able to remove from InputTable columns, the unwanted columns listed in the ExcludeColumns table e.g. Char8 column to be excluded from analysis.
P.S: As per my original query, the GrandTotals should be showing 20 (but showing 15) and the % should be calculated based on UPC count for a word / 20 (but showing UPC count for a word / 15). This is wrong in both the files.
I think after power query transforms the table, the rows reduce from 20 to 15 and thats the Distinct Count of UPC thats being picked up instead of the Original Distinct Count of UPC i.e. 20.
How to solve this problem with Power Query or DAX?
e.g. for the 1st pivot it should show:
Row Labels Distinct Count of UPC %Distinct Count of UPC2
OIL | 8 | 40.00% |
NO | 7 | 35.00% |
GLUTEN | 5 | 25.00% |
ALLERGEN | 5 | 25.00% |
HYPO | 4 | 20.00% |
PARABEN | 3 | 15.00% |
SODIUM | 2 | 10.00% |
JOJOBA | 1 | 5.00% |
Grand Total 20 100.00%
Excluding column has nothing to do with with the request to detect the words, so these are steps that could be executed everywhere.
Your application is becoming too complex for me here, so I hope that someone else steps in.
I cannot help you any further here.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi @ImkeF ,
Thank you very much for your proactive help. I would really appreciate if you take a last look at the latest version of the Word Partial Match Tool and let me know how it could be optimized to run better.
I have written 2 DAX formulae to get the correct Grand Totals and Percentage of Grand Totals for distinct UPCs.
I also found out why the Grand Total was showing 15 Distinct UPCs instead of 20; the original distinct count of UPCs. At the time of Unpivoting the Other columns, any UPC rows that had all NULL values in them, were being removed automatically by Unpivot. So the count was showing less.
Current Issues to tackle :
The VBA code in the tool does an import of excel file, stores path and filename to the Parameter Table cell and only does a Refresh All (Data menu > Refresh All).
Appreciate all your help.
Performance tuning can bedome a very time-consuming task, that I don't do for free.
You might find some guidance to do it yourself here: https://www.thebiccountant.com/speedperformance-aspects/
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
@Anonymous
Please see attached file. It hopefully solves both problems.
Steps are as follows
1) Unpivot Attribute columns
2) Use Calculated Column for item Count as
ItemCount_ = VAR myword = [Words] RETURN COUNTROWS ( FILTER ( VALUES ( DataExtract[ItemID] ), NOT ( ISEMPTY ( FILTER ( CALCULATETABLE ( VALUES ( DataExtract[AttributeValue] ) ), SEARCH ( [Words], [AttributeValue], 1, 0 ) ) ) ) ) )
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
114 | |
99 | |
83 | |
70 | |
60 |
User | Count |
---|---|
150 | |
115 | |
104 | |
89 | |
65 |