Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
apralou
Advocate II
Advocate II

Filter records based on multiple strings in text column using OR or AND

Hello good people,

Is the following achievable in Power BI?

Context:

We have a system which allows users to select values to reflect multiple aspects of the record. in a single field e.g.

 

CodeDescription
AGreen
BBlue
CTurqoise
DRed
EBlack

 

When exported, the data renders in the column with all values separated with a comma e.g.

 

Record SelectedCodes
Record 1 A, B
Record 2 B,C
Record 3 A, D, E
Record 4 D

 

Problem:

We can set up a filter on the records by "Codes", but the filter would end up being the distinct values in that colum. ie.

apralou_0-1646611728452.png

 

Requirement:

We need users to be able to filer records on the "Selected Codes" column

  1. using the individual values (codes) that contribute to the column "SelectedCodes" in the record.
  2. being able to filter based on "and" (all selected values) / "or" (any selected values).

e.g. for the following dataset

Record SelectedCodes
Record 1 A, B
Record 2 B,C
Record 3 A, D, E
Record 4 D

 

SCENARIO 1 (AND FILTER)

this filter

apralou_1-1646612103122.png

would produce

RecordID
Record 3

 

SCENARIO 2 (OR FILTER)

this filter

apralou_2-1646612245144.png

would produce

RecordID
Record 1
Record 3
Record 4

 

I hope I haven't been unclear but happy to answer any questions if anyone thinks they can help.

Many thanks in advance.

1 ACCEPTED SOLUTION
apralou
Advocate II
Advocate II

Thank you @jianlong for your suggestions.

The most efficient solution tested, found to be fit for purpose and successfully deployed was found here.

https://community.powerbi.com/t5/DAX-Commands-and-Tips/partial-string-filter-based-multiple-selected...

View solution in original post

8 REPLIES 8
apralou
Advocate II
Advocate II

Thank you @jianlong for your suggestions.

The most efficient solution tested, found to be fit for purpose and successfully deployed was found here.

https://community.powerbi.com/t5/DAX-Commands-and-Tips/partial-string-filter-based-multiple-selected...

Yes, indeed this is the best.  Thank you for letting me know and this is the charm of the forum 🙂 @apralou 

jianlong
Resolver I
Resolver I

Just clarify, I mentioned two tables,  they are slicer tables, not datatable. 

For datatable, one will be good,    create relationship between ABC cominbination slicers table  to datatable directly.

                                                        A, B, C individual slicer does not have relationship to datatable. 

 

Anyhow, if you find or figure out the solution for mulitiple selections, please also post it here, thank you.

Yep - I understand. But maybe my concern is unfounded...

 

Consider this:

In the solution for your problem your original dataset:

apralou_0-1646625777003.png

 

is being transformed to 

apralou_1-1646625823610.png

 

If we use character count across the whole dataset to be a reflection of size:

  • Original dataset: 57 characters
  • Transformed dataset: 84 charcters.
  • Increase of 27 (147%) *(obviously highly dependent on how many ICDs against a client.)

This table essentially consititues a duplication in my eyes as it's a table based on the original dataset specifically to segment and filter the ICD values by attributing them individually to a Client record.  Presumably the rest of the columns for the client records still had to exist on its own table?

 

In any case, you indicated that your real dataset was around 50M rows. 

Did you have any efficiency problems using the solution that was suggested to you?

If not, maybe this is not such a bad way to go...

yeah, this is just a trial for the code.   In reality, it's 250M rows if useing codes without string, it's too much.

Here is the whole story:

1) three tables, with different sets of codes, image  Code_A,  Code_B, Code_C

2) each type has 10,000 different codes, suchas  Code_A_0000 to Code_A_10000

3) each client has different Codes A,  Code_B, and Code_C combination from three different source table table. 

4) I tested with small trial, the performance is a disaster, as we need to crossfilter for selected codes and client_ID. 

 

So

we created this table, in which each client has one row, and three types of codes together, and the combinations are concatenated together.   

such as 

ClientCode_ACode_BCode_C
a;kd;kfj;akj;dA000,A0001,A0099B04, B07…C11,C235…

So if we convert it back, it's basically our previous solution, with a big join table..  

 

btw, the performance is way better with all string together, the only limitation is I can only choose one code, for each code column..

 

Anyway, good discussion.

jianlong
Resolver I
Resolver I

I was trying to solve kind of the same problem. 

 

Basically there are two levels of filter, what I would do is:

1) two filter table,   table 1 with cominbinations of such as A, ABC, ABE, etc, to have relationship with the datatable;

                               table 2 with A, B, C, D individual only, no relationship to data table. 

 

2) for table 1, it's kind of easy

3) for table 2, 

    You can refer to   the following link (posted by me to ask the question).   

However, there are two problems for this:   

a)  You can only choose a single filter, such as only A, or only C.  I have not figure out how to filter directly.

b) although I accepted the solution, the solution is based on simple data, basically create single rows for combination of ABC,  i.e.   "A, B, C" to three rows for A, B, C individually. 

 

https://community.powerbi.com/t5/Desktop/select-multiple-values-in-slicers-to-search-find-or/m-p/236...

Thank you @jianlong . I'm hoping for a more elegant solution to this that can apply the filtering directly.

Depending on what else comes back, I may have to embrace your suggested methodology but I can see this solution adding processing overhead to the report as well as bloat the report with an extra table just to duplicate records to assign individual code values to them which will impact refresh times. 

Thank you very much for your input though. If no-one can offer a more streamlined approach, I may have to solution per the logic shown in the post you shared. Many thanks again.

@apralou 

I have hoped a loop would work, but that did not work out.  So here is the silly solution, the logic is:

1, Decide what's the maximum selection you can choose, let's say 5

2, Create five measures for selections, one for each by kind of ranking them,

3, count total selected, should be 0-5

4, create five measures to assign "Y" label if searched successfully;

5, comimbination of labels: 

var com_l = switch(true(),
sel_counts=1&& top_1_l="Y", "Y",
sel_counts=2 && or(top_1_l="Y",bot_1_l="Y"), "Y",
 
sel_counts=3&&or(or(top_1_l="Y",bot_1_l="Y"),top_2_l="Y"), "Y",
sel_counts=4&&or(or(top_1_l="Y",bot_1_l="Y"),or(top_2_l="Y",bot_2_l="Y")), "Y",
sel_counts=5&&OR(or(or(top_1_l="Y",bot_1_l="Y"),or(top_2_l="Y",bot_2_l="Y")),mid_1_l="Y"), "Y",
"N"
)
 
It works well, and of course, this can expand to more selections, but need to add more measures in a silly way.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.