cancel
Showing results for
Did you mean:
Highlighted
Frequent Visitor

## Weighted median as a measurement in DAX

I'm working with weighted data and I would like to use slicers with the weights so I'm creating custom measures for weighted values. I already got the weighted average:

```weighted_mean =
SUMX ( 'data', [value] * [weight] )
/ CALCULATE (
SUM ( 'data'[weight_age_group] ),
FILTER ( 'data', NOT ( ISBLANK ( [value] ) ) )
)```

Now I'm having difficulties calculating a weighted median. I think I have a clue, but I have not made any progress in hours.

This is what I have came up this far:

```weighted_median =
VAR weight_sum_half =
CALCULATE (
SUM ( 'data'[weight] ),
FILTER ( 'data', NOT ( ISBLANK ( [value] ) ) )
)
/ 2
VAR values = 'data'[values]
VAR ascending =
CALCULATE (
SUM ( 'data'[weight] ),
FILTER ( 'data', 'data'[value] <= values )
)
VAR descending =
CALCULATE (
SUM ( 'data'[weight] ),
FILTER ( 'data', 'data'[value] >= values )
)
RETURN
CALCULATE (
AVERAGE ( [arvo] ),
FILTER (
'data',
ascending >= weight_sum_half
&& descending >= weight_sum_half
)
)```

The idea is to calculate cumulative sum of weights in ascending and descending order by values and then to take the average of the values that meet up in the middle.

I first tried this idea by creating columns for each of the variables. This gave the right observation (the right row) the unweighted median of the values. So it found the right observations for the weighted median but gave the wrong value. When I tries this as written above as a measurement, Power Bi complains "A single value for column 'value' in table 'data' cannot be determined".

Some example data:

value       weight

 1 0.2 2 0.5 3 1.5 5 0.3 6 1.5 7 1 null 0.2 2 0.8 4 2 null 2

The real median is 5.5 and the weighted median is 4. There are some missing values just like in the real data.

5 REPLIES 5
Frequent Visitor

## Weighted median measurement in DAX

I'm working with weighted data and I would like to use slicers with the weights so I'm creating custom measurements for weighted values. I already got the weighted average done:

```weighted_mean =
SUMX ( 'data', [value] * [weight] )
/ CALCULATE (
SUM ( 'data'[weight_age_group] ),
FILTER ( 'data', NOT ( ISBLANK ( [value] ) ) )
)```

Now I'm having difficulties with the weighted median. I think I have a clue, but I have not made any progress in hours.

This is what I have came up this far:

```weighted_median =
VAR weight_sum_half =
CALCULATE (
SUM ( 'data'[weight] ),
FILTER ( 'data', NOT ( ISBLANK ( [value] ) ) )
)
/ 2
VAR values = 'data'[values]
VAR ascending =
CALCULATE (
SUM ( 'data'[weight] ),
FILTER ( 'data', 'data'[value] <= values )
)
VAR descending =
CALCULATE (
SUM ( 'data'[weight] ),
FILTER ( 'data', 'data'[value] >= values )
)
RETURN
CALCULATE (
AVERAGE ( [arvo] ),
FILTER (
'data',
ascending >= weight_sum_half
&& descending >= weight_sum_half
)
)```

The idea is to calculate cumulative sum of weights in ascending and descending order by values and then to take the average of the values that meet up in the middle.

I first tried this idea by creating columns for each of the variables. This gave the right observation the unweighted median of the values. When I tries this as written above as a measurement Power Bi complains "A single value for column 'value' in table 'data' cannot be determined.

Some example data:

value       weight

 1 0.2 2 0.5 3 1.5 5 0.3 6 1.5 7 1 null 0.2 2 0.8 4 2 null 2

The real median is 5.5 and the weighted median is 4. There are some missing values just like in the real data.

Moderator

## Re: Weighted median measurement in DAX

@tuomo_kareoja

In this scenario, your weighted value is [Value]*[Weight], you can create a calculate column like:

`Column = [Value]*[Weight]`

Then just use MEDIAN() function on this calculated column. If this calculation need to be based on filtered context, you can use MEDIANX() function.

Regards,

Frequent Visitor

## Re: Weighted median measurement in DAX

Unfortunately this problem needs a bit more complicated solution, because the weights represent observations. So a row with value 5 and weight 3 is equal to 3 observations of value 5 and weight 1. The weights are not all integers so simply adding the row times its weight to the dataset will not work (this would also make the dataset huge).

Taking the median of value * weight doesn't take this account and gives a wrong answer. This can be seen clearly from an extreme example:

 value weight value * weight 1 10000 10000 2 1 2 3 1 3

If we take the median by MEDIANX([value]*[weight]) we get 3 as the median. But as the weights are actually number of rows, the right answer would obviously be 1.

Visitor

## Re: Weighted median as a measurement in DAX

I have been using the following DAX code for weighted medians. It works but is a bit slow calculating. If anyone has a more efficient formula, I would love to know it.

Weighted Median = (

MINX(

FILTER(

VALUES([value]),

CALCULATE(

SUM([weight]),

[value]

<= EARLIER([value])

)

> SUM([weight]) /2),

[value]

)

+ MINX(

FILTER(

VALUES([value]),

CALCULATE(

SUM([weight]),

[value]

<= EARLIER([value])

)

> (SUM([weight]) - 1) /2),

[value]

)) /2

Occasional Visitor

## Re: Weighted median as a measurement in DAX

I came up with this formula, it seems to work but I'm a DAX newbie so I'm not at all confident that it will always work. I'd love to get feedback on the expression and if it works for others. This gives the "upper weighted median" and doesn't attempt to average upper and lower when the boundary is exactly in the middle.

`wgtmedian:= MINX( FILTER( table, SUMX( filter( table, table[value]<= earlier( table[value])) , table[weight]) >=sum([weight])/2), table[value])`