Re: How to "double" filter?

kyle_l · ‎03-23-2016

I am visualizing a table (using force directed visual) that contains three columns: From_City, To_City, Visitor_Count

I want to create a single filter (on the visual) that filters both From_City and To_City in the same time, so that users of the visual don't have to deal with two filters: a From_City filter and a To_City filter.

What I tried but didn't work: Created a lookup table with a single column, City, and then relate City to both From_City and To_City, but PowerBI doesn't allow both relations to be active in the same time, so a City-based filtered cannot filter both columns in the same time.

Does anyone have a solution to this? (any help will be appreciated)

Note: I do understand that function-wise such a "double" filter is not the same as two seperate filters, but it is appropriate in my case.

----------------------------------------------------

Update:

To articulate, I want to use the city filter to select cities so that:

1) connections between selected cities show up in the visual, AND

2) connections that involve non-selected cities don't show up

So far I was able to achieve 1) thanks to ImkeF's help but still can't achieve 2)

Anonymous · ‎03-25-2016

Can you simply create a custom "From-To" column?

E.g. From-To Column = [Name of your From City Column] & " - " & [Name of your To City Column].

Now, if you had Berlin as "from" city, and Sydney as "to" city, the value in the new column would read "Berlin-Sydney".

It is pretty basic, but doesn't it solve the problem? (as long as you don't care about the direction, because obviously you'll also have "Sydney - Beriln" value for the reverse direction)

kyle_l · ‎03-25-2016

Thaks to taraskaduk for chiming in.

Indeed I don't care about direction of connection, but the challenge remains: How to use one filter to filter cities from two columns? the columns are From and To if I cared about the direction, or City_A and City_B if I didn't care about the direction.

If I understand your suggestion correctly-- correct me if I didn't-- You are suggesting creating a From-To column and let users filter based on this column. While this is certainly a solution, practically there will be too many From-To pairs in the filter box. For instance, if the user wants to filter out connections with Berlin, s/he will have to uncheck Berlin-London, Berlin-Seattle, Berlin-Sydney, and so on so forth.

Ideally, the user only needs to uncheck Berlin from the filter box to filter out all connections with Berlin. (and this is where I'm pulling my hairs trying to figure out...)

Anonymous · ‎03-25-2016

OK, I guess I didn't quite understand what exactly you were trying to accomplish with this.

If you want the users to see all the connections with Berlin (as opposed to a specific lane of traffic), you may want to create an exact same table as the one you have right now, and then create that lookup table with cities you were talking about. Now, connect both tables to the lookup table, one by the from city, one - by to city.

Now, when you click on, say, Berlin, in your lookup table, you can get both connections from Berlin and connecions to Berlin.

kyle_l · ‎03-25-2016

taraskaduk-

Sorry for the confusion. Let me try again--

I want to use the city filter to select cities so that:

1) connections between selected cities show up in the visual, and

2) connections that involve non-selected cities don't show up

With ImkeF's help I was able to achieve 1) but not 2).

I had actually tried the same lookup table method as your suggested above, but it worked funky. Connections would show up or not show up depending on if the selected city is connected to the City_Lookup through the From column or the To column.

Here's a small PowerBI example that illustrates the situation (in the "taraskaduk solution" tab).

ImkeF · ‎03-25-2016

Thanks, thats clear now.

To solve this, we need a '"KeepDuplicate"- measure: As we only want to keep those selected rows, who appear in the From-table as well as in the To-table. This is easiest to accomplish if we add an Index-Column in the first table that we can reference:

Measure = CALCULATE(COUNT('To'[Index]), FILTER(ALLEXCEPT('To', 'To'[City]), 'To'[Index]=MAX('To'[Index])))

This counts the number of same indexes. So this has to be set as a filter in your chart being 2.

Link to File

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

kyle_l · ‎03-30-2016

Thanks ImkeF.

I'd like to report back that your solution works perfectly!

Didn't respond back sooner because I tried to understand how your DAX formula works, and being a DAX newbie it took me a few days.

kyle_l · ‎03-31-2016

Question on MAX:

Having played with the formula, I found that I can achieve the same effect by putting MAX on both side or neither side of the euqation, like:

Measure = CALCULATE(COUNT('To'[Index]), FILTER('To',MAX('To'[Index])= MAX('To'[Index])))

OR

Measure = CALCULATE(COUNT('To'[Index]), FILTER('To','To'[Index]='To'[Index]))

Whereas ImkeF puts MAX only on the right hand side:

Measure = CALCULATE(COUNT('To'[Index]), FILTER('To','To'[Index]= MAX('To'[Index])))

Can anyone enlighten me why putting MAX only on one side?

I'm new to DAX coming from the SQL world, the way I picture how this formula works is like this:

SELECT COUNT() AS kount
FROM
	(SELECT From_City, To_City
	FROM table 'To'
	WHERE MAX(Index) == MAX(Index)
	GROUP BY From_City, To_City
	)
HAVING kount == 2

The smart thing about ImkeF's DAX formula is that the GROUP BY columns (From_City and To_City) are dynamically determined by the visual so you don't have to hardcode them like in the SQL above.

To articulate my question-- IF this is how the DAX formula works, wouldn't it make sense to put MAX on both sides because there's a (implicit) GROUP BY?

ImkeF · ‎04-01-2016

Yes, this is one of the core ''strange characteristics" of DAX. I needed around 20 attempts to get my head around it, maybe you will only need half of it 🙂

At the left side of the equation we have a row context that has already been converted into filter context, therefore no need to reduce/specify this further.

@MattAllington is much better at explaining this, maybe he will be so kind and step in here.

Can only recommend to read his book !

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

MattAllington · ‎04-01-2016

Yes, this stuff is complex when you first look at a formula. I cover this in Chapter 14 of my book, but here is the inside running.

First some Definitions

Aggregate function means any of the aggregate functions including SUM, MAX, MIN, AVERAGE, COUNT. They operate over columns of data providing typically a single scalar value result. https://en.wikipedia.org/wiki/Aggregate_function

When I say "Naked Column", I mean any reference to column in the form TableName[Column Name] without any aggregate function wrapped around it.

So with these definitions in mind, let's move on.

Explanation

If I write a FILTER formula in pseudo code, it would read like this.

=Filter(my table, where my table[column] has a filter applied that is equal to the Maximum value in the current filter context )

Inside a filter (or any other iterator for that matter), you need to interpret the formula as follows:

When you see a “naked column”, it is referring to the column in the table.
When you see an aggregate function, it is referring to the current filter context. So in effect the Aggregate functions can “read” the current filter context from a visualisation.

Reading the Current Filter Context

You need to learn to “read filter context” from your visualisations as one of the first and most important skills in learning to write DAX. Take the following image and take note of the highlighted cell.

You need to understand which filters (if any) are impacting this cell BEFORE the calculation of the cell is executed. So what is the filter context of this cell? There are 2 filters applied in this case, one is calendar year = 2002 (from Columns in the pivot) and another on Calendar Month = “September” (from Rows in the pivot). They work together to create a filter context of “September 2002”. So given the initial filter context is September 2002, what is MAX(Calendar[Date]) in this filter context? Well because an Aggregate function will respect the filter context, you can infer that it will return the maximum of Date for the current filter context “September 2002”, which of course is 30th Sept 2002.

So new question - what is MAX(Calendar[Month Number]) in the current filter context? Well given the filter context is September 2002, there is only a single Month Number in the current filter context, and hence MAX(Calendar[Month Number]) = 9. Note that MIN(Calendar[Month Number]) also = 9, as does AVERAGE and SUM.

New Question. what is MAX(Calendar[Year]) in the current filter context? Once again, there is only 1 year, so MAX, MIN, SUM, AVERAGE all will result in 2002.

This is what I mean by Aggregate functions can “read” the current filter context. In the case where there is only 1 value in the current filter context, you can use any Aggregate function to “read” the value of the current filter context.

Now, imagine you are applying this new filter you have "read" to a table

Imagine I asked you to apply a filter to the calendar table so that Calendar Year = 2002 and a second filter Calendar Month = September. Well you can actually simulate this by going into the back end in Power Pivot, go to the calendar table, find the “year” column and place a filter on “2002”. Then go to the Month Name column and filter it for “September”. What will you have left? You would have 30 days from 1 Sept 2002 through 30 Sept 2002 visible in the table. So what is MAX(Calendar[Date]) of this filter context? Clearly it is 30 Sept 2002. You should learn to "Imagine" this filtering in the back end when you read the filter context from your visualisation. I am using Excel, but it is the same with Power BI.

So back to your formula

Measure = CALCULATE(COUNT('To'[Index]), FILTER('To','To'[Index]= MAX('To'[Index])))

This formula says

First go and filter the "TO" table. When you get there, I want you to apply a new filter to the TO[Index] Column to be equal to the Maximum Value you can “See” in the current filter context. The naked column on the left is referring to which column to apply the filter to, and the MAX() is “reading” the value from the current filter context that needs to be applied. In the case where there is only a SINGLE value in the current filter context, you could use MIN, MAX, AVERAGE, SUM to “read” the value of the INDEX column.

Note

The following formula is equivalent.

Measure = CALCULATE(COUNT('To'[Index]), FILTER('To',MAX('To'[Index]) = To'[Index]))

The point is, it doesn't matter which is on the left or right. What matters is the naked columns and the aggregate functions.

Now I haven't read all the previous posts, but based on the OP, I would suggest the following approach.

1. Create a table of all the locations

2. Don't join this table to your data table.

3. Write a formula that applies 2 filters in a logical OR. Something like this

=if(hasonevalue(CitiesTable[City]),
     calculate(Sum(DataTable[VisitorCount],
          filter(DataTable,
                  DataTable[From_City]=Values(CitiesTable[City] || 
                       DateTable[To_City]=Values(CitiesTable[City]
                 )
     )
)

The first filter uses VALUES to harvest the currently selected City from the current filter context. It then applies a filter on the FROM column. The second does the same on the TO column. After both filter are applied, the remaining rows in the data table are used in the formula. The HASONEVALUE first checks to make sure you have a single filter applied to the CitiesTable. There is no point trying to "harvest" VALUES(CitiesTable[City]) if you have more than 1 city selected.

* Matt is an 8 times Microsoft MVP (Power BI) and author of the Power BI Book Supercharge Power BI.

ImkeF · ‎03-23-2016

You've described the dilemma well, so we need some creative modelling here:

If we can't consolidate 2 filters to 1 data-table, we're going to connect 1 filter to 2 consolidated (appended) data-tables instead.

let
    Source = From,
    #"Replaced Value" = Table.ReplaceValue(Source,"From","To",Replacer.ReplaceText,{"FromTo"}),
    #"Appended Query" = Table.Combine({#"Replaced Value", From}),
    #"Added Custom" = Table.AddColumn(#"Appended Query", "City", each if [FromTo]="To" then [To_City] else [From_City])
in
    #"Added Custom"

Check out the file.

You duplicate your fact tables, but one of them having the From-City (From) as the consolidating attribute and the other one the To-City (current query of the code above).

The visual you're using is ignoring duplicates, so the new consolidating column will act as your required filter that shows only the selected cities - and their connections, which is essential here.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

kyle_l · ‎03-23-2016

ImkeF, Thanks for the prompt response.

I understand your idea (clever!) but don't quite understand how to impliment (I'm new to PowerBI), so bear with me here--

1) What does consolidation attribute do? Where and how do I set consolidation attribute for a table?

2) Does the code create the 2nd table (i.e. the appended table), or do you have to duplicate the first table first and then modify it with the code?

3) I noticed that in your example file the From table and the To table have a many-to-one relation on Visitor_Count. Is it OK when my data has duplicates across To-From pairs (i.e. multiple To-From city pairs have the same number of Visitor_Counts)?

Thanks,

ImkeF · ‎03-23-2016

kyle_I,

You (and your questions) are welcome 🙂

1) It's the new column 'City' that acts as you (new consolidated) filter-column

2) I did all this using the UI only, so this is how it should work for you as well

add a column 'FromTo' to your original table (you don't have to rename this query, ours is called "From" in this example)
create a new query by referencing your original table
check column 'FromTo' and replace 'From' by 'To'
append the first query (From) to this query
create your new filter column 'City' with the logic in the formula

3) I didn't realize there was a connection between these tables - must have been created automatically in the background. You can delete it.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

kyle_l · ‎03-25-2016

Thank you ImkeF. Your instructions were easy to follow.

There's one little thing though-- this filter hack works great when a user wants to show every city that are connected with selected cities, but it doesn't let the user to filter out cities that are not selected.

For instance, if we checked Berlin and London in the filter box, the visual still shows the two cities' connections with Sydney and Seattle. This is OK when the number of cities is small like in this example, but when the data contains many cities-- if I select 5 cities from the filter (wanting to see where the strong connections are between these 5 cities) the visual will show dozens other cities that are connected to the 5 selected cities, making it hard to see connections between the 5 cities of interest.

This is not a fault of your solution but what people normally want a filter to do-- to filter down the number objects in a visual so they can see better.

I guess one (partial) solution is to add another filter on Visitor_Count, but then, connections between the 5 cities are not necessarily the stronger connections that will still show up after this second filter is applied.

ImkeF · ‎03-25-2016

Hm - to me it looks as if I didn't understand your original request right (thought it was about to exclude those cities who don't have a connection to each of the selected cities)

Just to reassure: The filter shall work in a way that only those citiy From-To-Combinations are shown, where both cities are included in the filter?

So if just one city is selected, nothing would be shown.

If 2 or more cities are selected who don't have a relationship with each other, nothing will be shown as well.

Is that correct?

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

kyle_l · ‎03-25-2016

ImkeF- You did understand my request perfectly, which is to filter out cities that are not selected.

I initially thought that your solution will do just that, but later realized that it doesn't work that way. For instance, when only one city, say Berlin, is checked in the filter box, two connections (i.e. two rows from the data) will show:

1) From=Seattle, To= Berlin, FromTo = To, City= Berlin

2) From= London, To= Berlin, FromTo= To, City= Berlin

...because City= Berlin, which is what the filter specifies.

The number of connections that the filter allows to show depends on how many rows where City is in [list of selected cities].

In other words, it shows every city that are connected to the selected city(ies) instead of filtering out cities that are not selected.

Alas, good thing it's a slow Friday and I'll have time to find ways to crack it 😉