kendash
Advocate I

Wrong data from Google Analytics

Hi everyone,

We are receiving data from Google Analytics in Power BI that differs from the data shown in the Google Analytics online interface. Does anyone know what the problem is?

It happens when we try to get Page Tracking data, for example when we download the dimensions Date, Page, and Previous Page Path with the metric Pageviews. We get no combinations with 0 pageviews, although there most surely are pages with 0 pageviews. Moreover, several combinations share exactly the same pageview count, such as "27" or "53".

The same thing happens when we request the dimensions Date, Page tracking, and Event tracking.

Does anyone have any idea why this happens?

1 ACCEPTED SOLUTION
fso
Advocate II

Hi kendash,
If several blocks of rows share the same value, as you describe, this is a strong indication that your data is sampled.
(https://support.google.com/analytics/answer/2637192?hl=en)
"If the number of sessions in the property over the given date range exceeds 500k sessions (25M for Premium), Analytics will employ a sampling algorithm"

Here's how you can verify whether that's the case:
- Check the GA data you have imported into PBI and note the date range you have imported (earliest and latest date)
- Go to https://ga-dev-tools.appspot.com/query-explorer/
- Set up exactly the same query and use the same dates that you have in PBI
- Hit "Run Query" and have a look at the header section of the result

If it says "Contains sampled data: Yes", then you know that sampling is the root of the issue.
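
The same check can be scripted. This is only a minimal sketch, assuming you fetch the query result yourself as JSON from the Core Reporting API (v3), whose response carries a `containsSampledData` flag together with `sampleSize` and `sampleSpace`. The `response` dictionary below is made-up illustrative data, not real output, and `describe_sampling` is a hypothetical helper name:

```python
def describe_sampling(response: dict) -> str:
    """Summarize whether a Core Reporting API (v3) style response was sampled."""
    if not response.get("containsSampledData", False):
        return "unsampled"
    # sampleSize / sampleSpace are session counts and may arrive as strings.
    size = int(response.get("sampleSize", 0))
    space = int(response.get("sampleSpace", 0))
    if space == 0:
        return "sampled (sample size unknown)"
    return f"sampled: based on {size / space:.1%} of sessions"


# Hypothetical response fragment, for illustration only.
response = {
    "containsSampledData": True,
    "sampleSize": "250000",
    "sampleSpace": "1000000",
}
print(describe_sampling(response))
```

A low percentage here would explain why many rows end up with identical, scaled-up values.
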
The only way around it is to request smaller date ranges of data, which you cannot do in PBI itself. So you will have to do that programmatically elsewhere and use the result as a source for PBI.
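
As a rough sketch of the "smaller date ranges" idea, the helper below splits one long range into short consecutive windows. The function name, the window length, and the dates are assumptions chosen for illustration; each window would then be sent as the start and end date of a separate query, and the right window length depends on how much traffic the property has:

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def split_date_range(start: date, end: date, days_per_chunk: int) -> Iterator[Tuple[date, date]]:
    """Yield consecutive, non-overlapping (start, end) windows covering start..end inclusive."""
    if days_per_chunk < 1:
        raise ValueError("days_per_chunk must be at least 1")
    chunk_start = start
    while chunk_start <= end:
        # The last window is clipped so it never runs past the overall end date.
        chunk_end = min(chunk_start + timedelta(days=days_per_chunk - 1), end)
        yield chunk_start, chunk_end
        chunk_start = chunk_end + timedelta(days=1)


# Illustrative dates only: 20 days split into windows of at most 7 days.
for window_start, window_end in split_date_range(date(2016, 1, 1), date(2016, 1, 20), 7):
    print(window_start.isoformat(), window_end.isoformat())
```

If a window still comes back sampled, shrinking `days_per_chunk` and retrying is the obvious next step.
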

5 REPLIES
Anonymous
Not applicable

Hi,

My data in Power BI Desktop is higher than the numbers shown on the Google Analytics site. I have checked all the filters, tried setting up a new connection, and verified the data, but I cannot figure out the problem.

Please help.

cv
New Member

Hi kendash,

You could also try some of the tools that automatically eliminate GA sampling by breaking your query down into a number of smaller unsampled queries and then aggregating the results back together, e.g. Analytics Canvas and Unsampler.io. Another option is to get all the raw data out of GA using the tool from scitylana.com. This should also eliminate sampling. Apparently they also offer a PBI Desktop template.
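
To illustrate the "aggregate them back together" step, here is a small sketch that sums pageviews per page across the results of several smaller queries. It is not how any of the tools above work internally; the row layout, the `merge_chunks` name, and the sample numbers are all assumptions. Summing is only valid for additive metrics such as pageviews, not for metrics like users, which would be double-counted across windows:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, str, int]  # (date, page, pageviews), a hypothetical row layout


def merge_chunks(chunks: Iterable[List[Row]]) -> Dict[str, int]:
    """Sum pageviews per page across the rows of several query results."""
    totals: Dict[str, int] = defaultdict(int)
    for rows in chunks:
        for _day, page, pageviews in rows:
            totals[page] += pageviews
    return dict(totals)


# Made-up rows standing in for two unsampled query results.
chunk_a = [("20160101", "/home", 27), ("20160102", "/pricing", 5)]
chunk_b = [("20160108", "/home", 53)]
print(merge_chunks([chunk_a, chunk_b]))
```
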

pqian
Employee

@kendash There was a known issue before the February update in which the GA connector could trigger unwanted sampling when aggregating over the ga:date dimension. We've fixed this issue in the latest Desktop update. Do you have that update?

If so, then your data may be sampled by GA on the service side. One thing you can do to verify is use Fiddler (http://www.telerik.com/fiddler) to capture the outgoing request when you refresh the query. Analyze the URL and parameters to see if there is anything odd.
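
Once a request URL has been captured, its parameters can be pulled apart for inspection. The sketch below assumes the captured request looks like a Core Reporting API (v3) call; the URL and view ID are invented for illustration, and what the connector actually sends may differ:

```python
from urllib.parse import parse_qs, urlsplit

# Invented example URL in the style of a Core Reporting API (v3) request.
captured_url = (
    "https://www.googleapis.com/analytics/v3/data/ga"
    "?ids=ga%3A12345678&start-date=2016-01-01&end-date=2016-03-31"
    "&metrics=ga%3Apageviews&dimensions=ga%3Adate%2Cga%3ApagePath"
)


def query_parameters(url: str) -> dict:
    """Return the decoded query-string parameters of a URL (first value of each)."""
    return {name: values[0] for name, values in parse_qs(urlsplit(url).query).items()}


params = query_parameters(captured_url)
for name in ("start-date", "end-date", "metrics", "dimensions", "samplingLevel"):
    print(f"{name}: {params.get(name, '(not set)')}")
```

A very wide date range or a missing dimension in this output would be the kind of oddity worth comparing against the query you built in the GA interface.
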

Rossy
Frequent Visitor

What is the best way to reduce the data sample size before getting it into PBI? All I want to see is dates and channels as the dimensions, but because it pulls in every date on record before I have a chance to do any transformation, I'm having to work with sampled data. What's the best way to get around this?

Thanks
