Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
GTRailey
Frequent Visitor

Standard Deviation and Z-Score calculation depending on time slicer

I am looking to calculate the z-scores for the number of calls answered by a given call center agent.  Each Agent can have either a standard or non-standard role.  

I have an Agent_Data table containing each of the agent names and their roles.  I also have a Call_Data table which contains a listing for each incoming call.  Pertinent columns include the date of the call and the agent who answered.  I also 

 

I have two slicers in my report.  One is for the agent role.  The other is for the date range.  I have mapped the date slicer to the date column in a calendar table.

 

In order to calculate the mean, I added a column to my Agent_Data table to count the calls answered by that agent:
Calls_Taken = COUNTA(Call_Data[StartDate])+0

 

The 0 in the above formula is so that the mean will take into consideration Agents who did not take any calls.  I then have a measure to calculate the mean, another to calculate the Standard Deviation:

Mean_Calls_Taken = calculate(average(Agent_Data[Calls_Taken]),allselected(Agent_Data))

StDev_Calls_Taken = calculate(STDEV.P(Agent_Data[Calls_Taken]),allselected(Agent_Data))

 

 

These allow me to create a third measure to calculate an individual Agent's Z-score based on the mean and stdev of the group:

Z-Score_Calls_Taken = (sum(Agent_Data[Calls_Taken])-[Mean_Calls_Taken])/[StDev_Calls_Taken]

 

All of this works if I filter based on Role.  But I am unable to find a way to enact a date-based filter.  If I use a time slicer, the Calls_Taken column does not change.  If I move the Calls_Taken column under the Call_Data table, the numbers DO change as desired.  The problem is that all of the Agents with 0 calls appear blank instead of Zero.  This changes the mean.

See the screenshots below.  Notice how when I change the time frame, the first two columns change (As desired).  The Third column removes the zero values and the Non-zero values remain constant (not desired).  The mean and stdev change, but only because the previous zero values have become blank instead.  
calls_taken.jpgcalls_taken_time.jpg

Any help in this would be appreciated.  Thank you.

1 ACCEPTED SOLUTION

I fixed the problem.  In case anyone out there is curious, here is how I did it.

 

PBIX - https://1drv.ms/u/s!Ar0HG-aZJh_ZhTPiiyKo0VDhPdLa

 

I started with two queries, Agent_Data, and Call_Data.  Agent_Data has three columns:
Name - The Agent's full name
id - The Agent's unique id
Role - Standard or Nonstandard

 

Call_Data has two Columns:

Agent - The Agent's full name
StartTime - The date and time of the inbound call

 

Using PowerQuery:

1. I created a new Date column in Call_Data that extracts the date from StartTime using Date.From()

2. I made a copy of Agent_Data by right-clicking>Duplicate.  I called this table "Totals."
3. I removed all columns from "Totals" except for the id column.

4. I added a custom Date column to "Totals" using the List.Dates() function.  I called this "Date."

5. I expanded the "Date" column and changed the type to Date.

 

This gave me a table with each agent name listed once for every date in the range.

 

Using Dax:

1. I created a table called "Calendar" by using the Calendar() function.  I created the following relationships:

a. Agent_Data[Name] <> Call_Data[Agent]

b. Agent_Data[id] > Totals[id]

c. Calendar[Date] > Call_Data[StartDate]

d. Calendar[Date] > Totals[Date]

2. I created a Measure M_Calls_Taken = COUNTA(Call_Data[StartDate])+0

3. I created a column 'Totals'[Calls_Taken] = M_Calls_Taken

4. I created the following additional Measures:

COUNT_Calls_Taken = calculate(DISTINCTCOUNT('Totals'[id]),allselected('Totals'))
SUM_Calls_Taken = calculate(sum('Totals'[Calls_Taken]),allselected('Totals'))+0

Mean_Calls_Taken = [SUM_Calls_Taken]/[COUNT_Calls_Taken]

Var_squaredif = Power(sum('Totals'[Calls_Taken])-[Mean_Calls_Taken],2)

Var_Calls_Taken = var difsquare = SUMX(ALLSELECTED(Agent_Data),[Var_squaredif]) return difsquare/[COUNT_Calls_Taken]

StDev_Calls_Taken = sqrt([Var_Calls_Taken])

Z-Score_Calls_Taken = ([M_Calls_Taken]-[Mean_Calls_Taken])/[StDev_Calls_Taken]

 

This allowed me to select various dates within my range along with different roles at the report level and adjust the Z-Scores on the fly.  I posted this reply immediately after fixing the issue, so there is still a bit of cleanup/consolidating to do, but hopefully the above steps will help with thinking through any similar issues.

 

Thank you.

View solution in original post

4 REPLIES 4
v-ljerr-msft
Employee
Employee

Hi @GTRailey,

 

What's the relationship between your Agent_Data table, Call_Data table and your Calendar table? Could you share a sample pbix file which can reproduce the issue? You can upload it to OneDrive or Dropbox and post the link here. Do mask sensitive data before uploading.Smiley Happy

 

Regards

https://1drv.ms/u/s!Ar0HG-aZJh_ZhTFQdh0NcfwPRxce

 

Above is a link to a sample PBIX file that can reproduce the issue.

 

My only real objective is getting the correct Z-Score values.  Any way we can acheive this is open at this point.

 

Below are screenshots from the file above depicting what it is displaying vs what I want it to display.

 

Current:

All Dates&All Agents (Z-Score values are correct)

Now1.JPG

 

 

All Dates&Standard Agents (Z-Scores are correct)

Now2.JPG

 

 

Dates1/2-1/5&All Agents (Z-Scores are Incorrect)

Now3.JPG

 

 

Dates1/2-1/5&Standard Agents (Z-Scores are Incorrect)

Now4.JPG

 

_______________________________________________________

Desired:

AllDates&AllAgents (Same as above, this currently works)

AllDates&Standard Agents (Same as Above, this currently works)

 

Dates1/2-1/5&All Agents (Z-Scores are correct)

Des3.JPG

 

Dates1/2-1/5 & Standard Agents (Z-Scores are correct)

Des4.JPG

 

I understand that due to the nature of calculated column values being computed during database processing, I may not be able to use them in a date slicer.  The problem is that I need to calculate the Mean and Standard Deviation over the total values per Agent and those formulas require me to feed them a column.  For this reason, I am also venturing down the route of only using Measures and calculating the Mean and Standard Deviations "manually" as opposed to using their formulas.  This is giving me a different issue which I will post in a separate reply.

 

I was able to create some measures that replace the calculated column which solves the issue with the Mean and Standard Deviation.  However, the method relies on a summary table that breaks down the number of calls each Agent received by date.  I created a manual table (Called Totals_Manual) in order to fix the Mean and Standard Deviation issues.  Now that I have solved that issue, I need to be able to generate this table using Dax as opposed to manual input.

 

The table should have a list of dates with each agent name appearing once per date, then the number of calls taken on that date by that agent.  If the Agent did not take any calls that day, their name would still appear, but it would have a 0 in the calls_taken column.

 

A sample PBIX file is here https://1drv.ms/u/s!Ar0HG-aZJh_ZhTLheBmCj5b4b_uB

 

Note that "Calls_Taken" is really only one of several different items for which I will need to calculate Z-Scores.  I plan to use the same method for each item and add them on to the summary table as needed.  I would imagine the summary table to look similar to the below table when all is said and done.

 

DateAgentIdCalls_TakenCalls_MadeCalls_MissedTickets_CreatedTickets_Resolved
1/1/2017Aatkins21112
1/1/2017Bbenning13213
1/1/2017Ccastor21434
1/1/2017Ddurant04412
1/1/2017Eesterhouse02411
1/1/2017ffrankson02042
1/1/2017ggoldsmith02310
1/2/2017Aatkins13102
1/2/2017Ddurant22320
1/2/2017Bbenning00404
1/2/2017Ccastor03421
1/2/2017Eesterhouse04013
1/2/2017ffrankson04133
1/2/2017ggoldsmith04421
1/4/2017Aatkins30313
1/4/2017Bbenning22332
1/4/2017Ccastor11143
1/4/2017Eesterhouse10310
1/4/2017Ddurant03310
1/4/2017ffrankson02333
1/4/2017ggoldsmith00033
1/5/2017Aatkins44004
1/5/2017Bbenning21201
1/5/2017Ccastor23400
1/5/2017Ddurant21333
1/5/2017Eesterhouse12140
1/5/2017ffrankson03323
1/5/2017ggoldsmith03200
1/8/2017Ccastor10033
1/8/2017Ggoldsmith11141
1/8/2017Aatkins02301
1/8/2017Bbenning03001
1/8/2017Ddurant00213
1/8/2017ffrankson01413
1/8/2017Eesterhouse03201

 

 

Again, any help you can provide is appreciated.

 

Thank you!

I fixed the problem.  In case anyone out there is curious, here is how I did it.

 

PBIX - https://1drv.ms/u/s!Ar0HG-aZJh_ZhTPiiyKo0VDhPdLa

 

I started with two queries, Agent_Data, and Call_Data.  Agent_Data has three columns:
Name - The Agent's full name
id - The Agent's unique id
Role - Standard or Nonstandard

 

Call_Data has two Columns:

Agent - The Agent's full name
StartTime - The date and time of the inbound call

 

Using PowerQuery:

1. I created a new Date column in Call_Data that extracts the date from StartTime using Date.From()

2. I made a copy of Agent_Data by right-clicking>Duplicate.  I called this table "Totals."
3. I removed all columns from "Totals" except for the id column.

4. I added a custom Date column to "Totals" using the List.Dates() function.  I called this "Date."

5. I expanded the "Date" column and changed the type to Date.

 

This gave me a table with each agent name listed once for every date in the range.

 

Using Dax:

1. I created a table called "Calendar" by using the Calendar() function.  I created the following relationships:

a. Agent_Data[Name] <> Call_Data[Agent]

b. Agent_Data[id] > Totals[id]

c. Calendar[Date] > Call_Data[StartDate]

d. Calendar[Date] > Totals[Date]

2. I created a Measure M_Calls_Taken = COUNTA(Call_Data[StartDate])+0

3. I created a column 'Totals'[Calls_Taken] = M_Calls_Taken

4. I created the following additional Measures:

COUNT_Calls_Taken = calculate(DISTINCTCOUNT('Totals'[id]),allselected('Totals'))
SUM_Calls_Taken = calculate(sum('Totals'[Calls_Taken]),allselected('Totals'))+0

Mean_Calls_Taken = [SUM_Calls_Taken]/[COUNT_Calls_Taken]

Var_squaredif = Power(sum('Totals'[Calls_Taken])-[Mean_Calls_Taken],2)

Var_Calls_Taken = var difsquare = SUMX(ALLSELECTED(Agent_Data),[Var_squaredif]) return difsquare/[COUNT_Calls_Taken]

StDev_Calls_Taken = sqrt([Var_Calls_Taken])

Z-Score_Calls_Taken = ([M_Calls_Taken]-[Mean_Calls_Taken])/[StDev_Calls_Taken]

 

This allowed me to select various dates within my range along with different roles at the report level and adjust the Z-Scores on the fly.  I posted this reply immediately after fixing the issue, so there is still a bit of cleanup/consolidating to do, but hopefully the above steps will help with thinking through any similar issues.

 

Thank you.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.