Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Suus
Regular Visitor

Co-occurrence matrix

Hi,

I have a dataset of several thousand companies that have memberships in work groups. There are 16 different work groups (A-P). Companies can be member of one or several groups. Most are members in one or two groups. I am trying to figure out the most common combinations of work groups. I think this would be called a co-occurrence matrix.

 

My original dataset looks like this:

Work groups

A

B

C

D

Company 1

1

0

0

0

Company 2

0

1

1

0

Company 3

1

1

1

0

Company 4

1

0

0

1

 

 

 

 

 

Company 1 is member of work group A only. Company 3 has membership to work groups A, B and C.

 

My expected output would be:

 

A

B

C

D

A

3

0

1

1

B

0

2

2

0

C

1

2

2

0

D

1

0

0

1

 

 

 

 

 

The diagonal shows the total membership in a given work group, all other field show the “overlap” as in how many companies are in  both those work groups. It doesn’t matter to me whether the output is in a full matrix (where all combinations appear twice – once above and once below the diagonal) or a triangle shape.

 

My approach (that doesn’t work 😞 ) was to create three tables:

 

table1: source

(this is the unpivoted original dataset with an extra copy of the Workgroup column)

 

Company

Workgroup

Workgroup copy

Value

Company1

A

A

1

Company1

B

B

0

Company1

C

C

0

Company1

D

D

0

Company2

A

A

0

Company2

B

B

1

Company2

C

C

1

Company2

D

D

0

 

 

 

 

table 2: Workgrouplist1

Workgroups

A

B

C

D

 

table 3: Workgrouplist2

Workgroups

 A

B

C

D

 

There are no relationships between these three tables.

 

And here is the DAX code of my measure:

countifs =

VAR CurrentX = SELECTEDVALUE('Workgrouplist2'[Workgroups])

VAR CurrentY = SELECTEDVALUE('Workgrouplist1'[Workgroups])

VAR Cooccurrence =

            CALCULATE (

            COUNTROWS('source'),

            FILTER(

                'source',

                'source'[Workgroup] = CurrentX

                    &&

                'source'[Workgroup copy] = CurrentY

                    && 'source'[Value] = 1

                )

            )

RETURN

    Cooccurrence

 

 

The actual output is

 

A

B

C

D

A

3

 

 

 

B

 

2

 

 

C

 

 

2

 

D

 

 

 

1

 

What is my mistake?

2 REPLIES 2
Suus
Regular Visitor

Hi @v-yaningy-msft,

Thank you for asking. The numbers represent the number of companies that participate in the respective two workgroups. No company is a member of both A and B, one company is in both A and C, two are in both B and C etc. The two red triangles show in fact the same numbers. So a way to display only one of these triangles would be perfect (look better than two that replicate the same information).

 

Thank you!! Best, Suus

v-yaningy-msft
Community Support
Community Support

Hi, @Suus 

Didn't get the logic of the implementation of the numbers in the red box, can you explain it in more detail

vyaningymsft_0-1713414225454.png

Best Regards,
Yang
Community Support Team

 

Helpful resources

Announcements
PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.