Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Using AVERAGE function against Direct Query source calculates incorrect results

I am re-creating a currently existing dashboard in Power BI.

The dashboard includes a KPI for the average number of hours worked by the workforce of the company for the current YTD and the previous year. The workforce consists of employees, contractors, temps, interns etc.

To keep this simple I have limited my repro to 1 Fact and 1 Dim table. The data is in SQL Server 2016 and I’m using Direct Query to enable the access.

CreatingRelationship.jpg

When establishing the relationship between the tables I checked “Assume referential integrity” so that the queries would use an INNER JOIN for efficiency.

Power BI very helpfully discovered that there are some missing rows in the Dim that we weren’t aware of.


After a little investigation it was determined there are 6 rows in the Fact that do not have a matching row in the Dim. Definitely not the end of the world because there are 3.45 million rows in the Fact. 6 missing rows shouldn’t make a significant difference in the calculated average.

 

The creation of the new measures & dashboard was easy to complete.

Here is the simple dashboard.dashboardNoSlicer-ResultOK.jpg

So far, so good.

The problem occurs when I use the slicer to filter on ‘Workforce Type’.

Slicing on ‘Contractor produces an average of 1,924.33 hours per day!?

 

dashboard-ContractorSlicer-ResultBAD.jpg

 

Slicing on ‘Employee’ produces an average of 2,221.07 hours per day!?

dashboard-EmployeeSlicer-ResultBAD.jpg 

Before the last update to the On Premise Data Gateway, which broke the connection to our SQL Server warehouse, I turned on tracing and captured the SQL being generated. When I found the SQL for the query with the slicer I noticed that the WHERE clause had gotten very complex without a discernable reason. Unfortunately I no longer have the gateway logs as I deleted them during the latest software update.

 

The current YTD measure doesn’t calculate the AVERAGE correctly but the prior Year measure AVERAGE does.

The difference is a second FILTER in the CALCULATE.

 

Here are the two measures.

 

_avgWorkHoursCurrYTD =

  VAR varCurrentYear = 'FactWorkHours'[_CurrentYear]

  VAR varMAXPeriodOfYear = 'FactWorkHours'[_MAXPeriodOfYear]

RETURN

  CALCULATE(

  AVERAGE( 'FactWorkHours'[WhAvg Allocated Hours]),

  FILTER( 'FactWorkHours','FactWorkHours'[WhAvg Pay Period Year] = varCurrentYear),

  FILTER( 'FactWorkHours','FactWorkHours'[WhAvg Pay Period Of Year] < varMAXPeriodOfYear)

  )

 

_avgWorkHoursPriorYear =

  VAR thePriorYear = 'FactWorkHours'[_PriorYear]

RETURN

  CALCULATE(

  AVERAGE( 'FactWorkHours'[WhAvg Allocated Hours]),

  FILTER( 'FactWorkHours','FactWorkHours'[WhAvg Pay Period Year] = thePriorYear)

  )

If the FILTER shown in Bold is removed the amount calculates correctly.

If  the data is Imported into the .pbix the current YTD average measure is calculated correctly.

Status: Needs Info
Comments
v-haibl-msft
Employee

@BobHaze

 

How abuot the result if we change the FILTER part of current YTD average measure as below?

 

_avgWorkHoursCurrYTD =
VAR varCurrentYear = 'FactWorkHours'[_CurrentYear]
VAR varMAXPeriodOfYear = 'FactWorkHours'[_MAXPeriodOfYear]
RETURN
    CALCULATE (
        AVERAGE ( 'FactWorkHours'[WhAvg Allocated Hours] ),
        FILTER (
            ALLSELECTED ( 'FactWorkHours' ),
            'FactWorkHours'[WhAvg Pay Period Year] = varCurrentYear
                && 'FactWorkHours'[WhAvg Pay Period Of Year] < varMAXPeriodOfYear
        )
    )

 

Best Regards,
Herbert

Vicky_Song
Impactful Individual
Status changed to: Needs Info
 
BobHaze
Frequent Visitor

@v-haibl-msft

 

Thank you for your suggestion. 

I made the change to the measure and it did not result in correct values.

The results did change slightly, Employee = 2,217.89 and Contractor=1,934.61 but the average should be somewhere around 8.5 hours/ day.

 

This may be very difficult to diagnose until we get our Gateway installation working again.

The culprit is definitely the SQL being produced and I think once it is available from the trace it will be easier to figure out.