Solved: Pre-Aggregated Data

John_D · ‎08-01-2016

Hi,

I wonder whether I could get any thoughts from more experienced Power BI users on dealing with pre-aggregated data.

I'm working with results from a survey, and I have a simple hierarchy - Regions (at the top), Districts, and Stores (at the bottom). I want to see the survey feedback firstly at region level, and then be able to drill down into the districts, and finally the stores. If I supply PBI with the raw response data then I can do exactly that, but there are difficulties.

For example, one problem is that there may be insufficient respondents at a single store. For the sake of individual anonymity, we don't report on groups of people unless there are at least 10 respondents. Their feedback should still feed into the upper levels of the hierarchy, but it shouldn't be possible to view the feedback for that particular store. Obviously, if we simply filter such stores out of the data set, then their input will also be filtered out of their district and region, which isn't what we want.

The above is just one illustration (there are others) as to why it might be better if we could get PBI to work with pre-aggregated data at each level of our hierarchy. I have tried that approach, but it seems to go against PBI's natural way of working, and it certainly introduces other difficulties - particularly when it comes to navigating the hierarchy.

So, I'd like to ask whether other people have had any similar issues or ideas, particularly on passing ready-aggregated data to PBI?

Cheers

John

KGrice · ‎08-01-2016

Would it be sufficient for what you're doing to use measures that report based on the count of your surveys? It doesn't solve the overall question of importing aggregated vs non-aggregated data, but maybe it will help for the particular example you used.

If you've got a measure that caculates the number of responses you have:

ResponseCount = DISTINCTCOUNT(ResponseTable[ResponseID])

You could use that as a buffer for what you report. If you want an average of the response scores for question 1 on your survey, you could use:

Q1Avg = IF([ResponseCount] < 10, BLANK(), AVERAGE(ResponseTable[Q1]))

That will still show your rollup scores and allow you to use the drill-down features nicely, and any stores (or any level of your hierarchy, really) that have fewer than 10 responses will show up on your chart axis, but won't show any scores for that level.

View solution in original post

jordanlevine · ‎10-28-2016

Any other idea on this one folks?

Unlike the OP, I'm utilizing economic data from the bureau of economic analysis and bureau of labor statistics, which only comes pre-aggregated, so I don't have the option of setting up a rule on when to show the data.

Also, some of the sub-components of over-arching sectors are suppressed, so I can't just sum up across sub-pieces to get the whole.

What I really need is a drill down to actually work like a filter, so it filters down to Category = "Subsector" when I drill down, and when I drill back up, it needs to filter on Category = "Sector". Sector != sum(Subsector) because some subsectors are suppressed to protect confidentiality, so it seems like only filtering would work, but I don't want people to have to change the slicer back and forth every time they drill up or down.

gabrielmota · ‎06-27-2018

@jordanlevine, could you find any solution to your problem? I'm having a similar problem here where most of my aggregated calculations come from a statistical model and stored on a database. My report needs to be able to access information both from top and bottom of my hierarchy without any calculation.

KGrice · ‎08-01-2016

Would it be sufficient for what you're doing to use measures that report based on the count of your surveys? It doesn't solve the overall question of importing aggregated vs non-aggregated data, but maybe it will help for the particular example you used.

If you've got a measure that caculates the number of responses you have:

ResponseCount = DISTINCTCOUNT(ResponseTable[ResponseID])

You could use that as a buffer for what you report. If you want an average of the response scores for question 1 on your survey, you could use:

Q1Avg = IF([ResponseCount] < 10, BLANK(), AVERAGE(ResponseTable[Q1]))

That will still show your rollup scores and allow you to use the drill-down features nicely, and any stores (or any level of your hierarchy, really) that have fewer than 10 responses will show up on your chart axis, but won't show any scores for that level.

John_D · ‎08-02-2016

Thank you - that is very helpful. It wasn't (and, if I'm totally honest, still isn't) intuitively obvious that measures would behave this way - that is, although there's nothing in the statement:

Q1Avg = IF([ResponseCount] < 10, BLANK(), AVERAGE(ResponseTable[Q1]))

to explicitly restrict that average to just the stores within a selected district, it does it anyway. So I've learned something very useful there.

While there are some areas where I still think I need to use my own aggregates, this helps enormously.

Thanks again.

Cheers

John

Pre-Aggregated Data

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024

How to Get Your Question Answered Quickly