Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
danxshap
Helper I
Helper I

Google Analytics goal completions data inconsistency

Hi all,

I've done some research and am aware of a couple known issues with using Google Analytics & Power BI:

  • The Google Analytics API will sample data for high volume sites, and so the results for high volume sites may not match the actual GA platform because of this.
  • Certain metrics and dimensions cannot be combined or added together to get consistent results.

Unless I'm mistaken, I don't believe the problem I'm running into is related to either of those...

 

I'm trying to get data to answer the question: For each month, how many Goal 8 Completions did we get for each channel grouping?

 

The Google Analytics Query Explorer, which uses the Core Reporting API, is giving me accurate/expected results. See below for recent months' data and a highlighted example for Nov 2020, Organic Search:

Markup 2020-12-16 at 13.33.49.png

 

 

 

 

 

However when I select the exact same metrics and dimensions in Power Query (within Power BI Desktop), I get incorrect results.

Markup 2020-12-16 at 13.34.55.png

 

 

 

 

 

Shouldn't these results be the same because they're both based on the Core Reporting API?

Finally, if I filter for a single month (e.g. Nov 2020), then the metric values change to be correct:

Markup 2020-12-16 at 13.35.55.png

 

 

 

 

 

Could someone please help me understand why adding a simple filter like this would change the metric values in other columns?:

 

Table.SelectRows(#"Sorted Rows", each ([Month of Year] = "202011"))

 

I thought that Table.SelectRows line would just select rows that match the condition and so the result would only be that certain rows are excluded, but perhaps there's something I don't understand about how "cubes" work. Is a different API call being made to Google Analytics as a result of that filter?

 

Thanks for any help in advance!

1 ACCEPTED SOLUTION

Hi @Jimmy801 thank you very much for the reply!

 

I contacted Power BI support about this and their response, combined with yours, explains what's going on.

 

The TLDR solution is:

  1. Get your request working in the Google Analytics Query Explorer with the appropriate date range filters and make sure it says the data is not sampled.
  2. Apply the same date range filters via a Power Query filter step, and the data should match the query explorer.

For anyone else who runs into this problem, the issue has to do with date range filtering (or lack thereof) with the Google Analytics API. By default, Power BI doesn't pass any date filtering parameters when it makes the API request, and that's what results in inconsistent data — in my testing, it can actually be sampled data even when you're not working with big numbers, I guess because it's asking Google Analytics to return data for potentially years worth of data without any date filtering.

 

The key is to add a filter step in Power Query that limits the date range, which somehow makes its way into the resulting API request (I guess through the MDX that you mentioned, which is a concept I was not familiar with).

 

I also found it very useful to try the same request in the Google Analytics Query Explorer to see how different filtering parameters (often larger date ranges) result in the tool telling you whether the returned data is sampled or not.

View solution in original post

4 REPLIES 4
Jimmy801
Community Champion
Community Champion

Hello @danxshap 

 

try to add a Table.Buffer before your sorting step. This should fix the issue

 

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

Hi @Jimmy801,

 

Thank you for the reply! I just tried that but it didn't change the results.

 

Here's the buffer step I added:

Markup 2020-12-17 at 10.22.42.png

 

 

 

 

And here's the resulting data after I sort it so I can see recent months:

Markup 2020-12-17 at 10.23.44.png

 

 

 

 

 

To rule out the possibility that the sorting step is somehow messing up the data, I checked the example row even without sorting and it still has the wrong value:

Markup 2020-12-17 at 10.28.21.png

 

 

 

 

 

 

Any other ideas on how I could fix this?

 

Thank you!

Hello @danxshap 

 

I saw that Power Query is using cube-functions here. Meaning that in the background an MDX is created to give you a result. It's difficult to analyse / understand why on a selectrows the result it's okay. So you would need to understand what this functions is changing in the query made from Power Query to Google. Maybe there are happing some groupings. You could try to apply a grouping to the first two columns and appliying a sum-function to your measure column (not applying Table.Buffer anymore) or to write a MDX/DAX? on your own.

Check out this possibilities and let me know if this changed something.

 

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

Hi @Jimmy801 thank you very much for the reply!

 

I contacted Power BI support about this and their response, combined with yours, explains what's going on.

 

The TLDR solution is:

  1. Get your request working in the Google Analytics Query Explorer with the appropriate date range filters and make sure it says the data is not sampled.
  2. Apply the same date range filters via a Power Query filter step, and the data should match the query explorer.

For anyone else who runs into this problem, the issue has to do with date range filtering (or lack thereof) with the Google Analytics API. By default, Power BI doesn't pass any date filtering parameters when it makes the API request, and that's what results in inconsistent data — in my testing, it can actually be sampled data even when you're not working with big numbers, I guess because it's asking Google Analytics to return data for potentially years worth of data without any date filtering.

 

The key is to add a filter step in Power Query that limits the date range, which somehow makes its way into the resulting API request (I guess through the MDX that you mentioned, which is a concept I was not familiar with).

 

I also found it very useful to try the same request in the Google Analytics Query Explorer to see how different filtering parameters (often larger date ranges) result in the tool telling you whether the returned data is sampled or not.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors