Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
sfmike99
Advocate II
Advocate II

Using Intersect function

I have a table of survey data with four fields of interest: 

  • Respondent: unique identifier for each participant
  • QuestionRoot: fieldname to link data with questions
  • RowIndex: an index to the specific response chosen for that question
  • Value: the individual response to that question/response pair (0 or 1)

We have created a dashboard to allow exploration of the data. One question that often comes up however is of the form: "How many people who chose reponse 1 to question A also chose response 1 to question B?"

 

Here is an example which uses temporary tables and the Intersect function to answer the question: "What % of respondents who collect data on paper also store the data on paper?"

 

ManualCollectAndStore = 
    VAR ManualCollect = 
        CALCULATETABLE(
          DISTINCT('Raw Data'[Respondent]),
          FILTER('Raw Data',
            'Raw Data'[QuestionRoot] = "CollectMethod" && 'Raw Data'[RowIndex] = 1 && 'Raw Data'[Value] > 0
          )
    )
    VAR ManualStore = 
        CALCULATETABLE(
          DISTINCT('Raw Data'[Respondent]),
          FILTER('Raw Data',
            'Raw Data'[QuestionRoot] = "StorageMethod" && 'Raw Data'[RowIndex] = 1 && 'Raw Data'[Value] > 0
          )
    )
    RETURN
        COUNTROWS(INTERSECT(ManualCollect, ManualStore)) / 
        COUNTROWS(ManualCollect)

 

(Answer is 43.6% in case you are interested)

 

Question 1: Make sense? Any suggestions?

 

Question 2: Is it possible in Power BI to generalize this to the entire dataset? In other words, allow the user to select any two question/response pairs and see the resulting intersect value?

Thanks for any guidance!

 

 

1 ACCEPTED SOLUTION

Armed with the suggestion for disconnected tables and some head scratching time, I think I figured this out.

 

Solution was to clone the data table (QuestionA and QuestionB) and give each of them separate slicers for the questions and response options. I use DAX to create calculated tables based on the current selection for each, and use Intersect to get the numbers I need. The calculation works like this:

  • Numerator: everyone who chose selected question/response options for BOTH questions
  • Denominator: everyone who chose selected question/response for A and ANY response for B question

Here's the DAX: 

PctRespAthenB = 
    VAR ThisResponseA = CALCULATETABLE(
        DISTINCT(QuestionA[Respondent]),
        ALLEXCEPT(QuestionA, QuestionA[QuestionRoot], QuestionA[RowIndex])
    )
    VAR ThisResponseB = CALCULATETABLE(
        DISTINCT(QuestionB[Respondent]),
        ALLEXCEPT(QuestionB, QuestionB[QuestionRoot], QuestionB[RowIndex])
    )
    VAR AllResponseB = CALCULATETABLE(
        DISTINCT(QuestionB[Respondent]),
        ALLEXCEPT(QuestionB, QuestionB[QuestionRoot])
    )
RETURN
   COUNTROWS(INTERSECT(ThisResponseA, ThisResponseB)) /
   COUNTROWS(INTERSECT(ThisResponseA, AllResponseB))

 

And here's how it looks in a rough version of report. I will swap in text for questions & responses later but you get the idea.

 

sfmike99_1-1615158234333.png


I tested this against some crosstab matrices I created before and the numbers lined up. Nifty that you can even select multiple response options to essentially bin the data.

Make sense? Any suggestions?

View solution in original post

7 REPLIES 7
Ashish_Mathur
Super User
Super User

Hi,

With disconnected tables this should be possible to solve.  Share some data to work with.


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

Armed with the suggestion for disconnected tables and some head scratching time, I think I figured this out.

 

Solution was to clone the data table (QuestionA and QuestionB) and give each of them separate slicers for the questions and response options. I use DAX to create calculated tables based on the current selection for each, and use Intersect to get the numbers I need. The calculation works like this:

  • Numerator: everyone who chose selected question/response options for BOTH questions
  • Denominator: everyone who chose selected question/response for A and ANY response for B question

Here's the DAX: 

PctRespAthenB = 
    VAR ThisResponseA = CALCULATETABLE(
        DISTINCT(QuestionA[Respondent]),
        ALLEXCEPT(QuestionA, QuestionA[QuestionRoot], QuestionA[RowIndex])
    )
    VAR ThisResponseB = CALCULATETABLE(
        DISTINCT(QuestionB[Respondent]),
        ALLEXCEPT(QuestionB, QuestionB[QuestionRoot], QuestionB[RowIndex])
    )
    VAR AllResponseB = CALCULATETABLE(
        DISTINCT(QuestionB[Respondent]),
        ALLEXCEPT(QuestionB, QuestionB[QuestionRoot])
    )
RETURN
   COUNTROWS(INTERSECT(ThisResponseA, ThisResponseB)) /
   COUNTROWS(INTERSECT(ThisResponseA, AllResponseB))

 

And here's how it looks in a rough version of report. I will swap in text for questions & responses later but you get the idea.

 

sfmike99_1-1615158234333.png


I tested this against some crosstab matrices I created before and the numbers lined up. Nifty that you can even select multiple response options to essentially bin the data.

Make sense? Any suggestions?

Hi  @sfmike99 ,

 

Could you pls upload your .pbix file if there' s no confidential information?It would be beneficial for test.

 

Best Regards,
Kelly

Did I answer your question? Mark my post as a solution!

 

Thanks @v-kelly-msft but I think I'm good. This was a really fun project to work on, and an interesting alternative to crosstab / matrices. Did a demo for the client today and they loved it. 

Also I had an aha this morning. Instead of duplicating the large unpivoted dataset (3m+ rows) I think I can just duplicate the Question/Response table (already used for reports) and join both to the data. The two slicers should then work to produce two different views of the same underlying data.

LMK if anyone has any questions.

Great - here's a sample with four questions and ten respondents. (source table is > 3m rows):

https://1drv.ms/x/s!AvmRZllooYiOnyuQ3NNFa7laXSL9?e=giUbQ9

 

If I understand your comment, then treating this data as two different tables would enable any intersection to be calculated - letting you have two differents slicers for the same data. But how would that work with the measure?

 

Thanks for looking at this.

Hi,

What is the use of the RowIndex column?  Not only do i fund that confusing, i see that it has the same values as those shown in the Value column.  Don't we need just the first 3 columns to answer your question?


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

Ah sorry - I see where that is confusing. 

 

There are multiple question types in this dataset, of which this is one example. The meaning of the Value field varies according to the question type. It could be a 0/1 or a ranking for example. For simplicity here I just included one type where the Value and RowIndex are the same. 

 

The purpose of RowIndex is a lookup for the specific response option chosen, while value contains their response (which may or may not be the same). For this sample data you can ignore one of them. 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.