Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Power Query Merge yields inconsistent results

Power BI Desktop Version  2.42.4611.701 64-bit (January 2017) - though earlier versions seem to have the same issue

Data Source SQL Server 2016

 

When using a Power Query merge in my Power BI Desktop file, the results in the data model are inconsistent every time I refresh.  No error is reported - the data is just never correct.

 

Some rows are dropped, and others are duplicated.  Every refresh returns a different set of rows, so it does not consistently include/exclude/duplicate specific records.  There is always either 0, 1 or 2 of each row.  Never more than 2.

 

 

I created the model and handed over to a business user late November 2016 year.  About 2 weeks ago the user reported that some historical results kept changing.  I am not sure at which point exactly this issue started, but I am now troubleshooting the issue on Power BI Desktop version 2.42.4611.701 64-bit (January 2017), and getting this strange behaviour consistently.  The user reporting the issue is running the November release, so it seems to be across different versions of PBI Desktop.

 

Similar issues were reported in the links below, but I cannot see a resolution on any of them.

https://community.powerbi.com/t5/Desktop/Each-data-refresh-gives-different-data-results/td-p/60989

https://ideas.powerbi.com/forums/360879-issues/suggestions/16403065-inconsistent-results-when-using-...

https://social.technet.microsoft.com/Forums/en-US/7ce7cfb7-1379-486e-91d0-2dfabb53d7f7/merging-queri...

 

 

Status: Needs Info
Comments
v-haibl-msft
Employee

@InaNortje

 

Could you please provide some sample data which can be used to repro this issue? I’d like to try it locally to see if I can repro the same issue as you.

 

Best Regards,

Herbert

Vicky_Song
Impactful Individual
Status changed to: Needs Info
 
InaNortje
Frequent Visitor


Hi @v-haibl-msft,

 

In getting sample data for you to test, I found that the issue is related to:
- high volume source data (note - not high volume data actually imported!)
- combined with a merge (DataMashup)

 

Behaviour and Impact:
1. On smaller source data sets the results are accurate, so initial testing doesn't show the problem
2. Once the issue starts to occur, there is no error or even warning showed when the refresh is done.
3. Because only a relatively small percentage of records are dropped or duplicated (seems 3-6%) it may not become apparent
4. When a record with a high value measure is dropped or duplicated, it can cause significantly incorrect reporting.

 

Since this type of issue can easily pass by undetected and cause loss of confidence in the tool, could you please investigate and see if there is a way to prevent this using internal controls in PowerBI?


Here's a description of the issue as I found it, and what I did to bypass it:

 

Source data is 1 record per Meter every 10 seconds
- MeterNumber [varchar](20) NOT NULL,
- ReadDate [date] NOT NULL,
- ReadTime [varchar](8) NOT NULL,
- Reading [numeric](20,8) NULL
Source data is currently at over 200M records

 

Query1 (MeteredUsage)

Query is: "SELECT MeterNumber, ReadDate, SUM(Reading) AS Reading FROM MeteredUsage GROUP BY MeterNumber, ReadDate"
Result set is around 13K records

 

Query2 (CustomerList) data is 1 record per Meter (300 records)
- MeterNumber
- CustomerNumber
- CustomerGroup

 

Merge is left outer - include all from Query1 (MeteredUsage), related from Query2 (CustomerList)

 

Inconsistent results starts when the data source rows reach around 36M records.  I have enabled diagnostics, and studied the trace files. It seems the query is run in 2 steps, with the result sets overlapping.  The number of rows returned is correct, but this is because some records are dropped while others are duplicated.


Having tracked down the root cause, found that by adding an ORDER BY clause to the original SQL Query, you can prevent the overlap that occurs when the query is split in 2 steps.

 

Trace Extract:
No error is reported - only a warning if you enable diagnostics and investigate the trace. See in extract below that finally reports loadEntirelySuccessful = True

 

DataMashup.Trace Warning: 24579 : {"Start":"2017-02-07T01:15:01.5522421Z","Action":"QueryLoadToModelCompleted","Message":"{\"loadReason\":\"RefreshedQuery\",\"numberOfSteps\":\"3\",\"loadSuccessful\":\"True\",\"loadErrorReason\":\"\",\"loadId\":\"9dafa7a6-0c74-484f-a96c-2d8edbbb2d61\",\"parentId\":\"f3c3efb4-9af5-4710-960c-de57a8ae0a65\",\"isError\":\"False\",\"sessionId\":\"7d551c29-9d94-440f-80d0-7f0f42814a42\",\"deviceId\":\"bf624581\",\"userId\":\"0be7c855-1b67-43e9-8b05-82f95b5ede68\",\"isInternal\":\"False\",\"authenticatedUserId\":\"88c7d9c5-6751-480f-8ed7-3e7dfdc00d1f\"}","ProductVersion":"2.42.4611.701 (PBIDesktop)","ActivityId":"2ace5823-f0a2-4661-82ae-ffcee5a82eae","Process":"PBIDesktop","Pid":9844,"Tid":35,"Duration":"-00:00:00.0443317"}
DataMashup.Trace Warning: 24579 : {"Start":"2017-02-07T01:15:01.5532422Z","Action":"QueryLoadToModelCompleted","Message":"{\"loadReason\":\"RefreshedQuery\",\"numberOfSteps\":\"1\",\"loadSuccessful\":\"True\",\"loadErrorReason\":\"\",\"loadId\":\"9dafa7a6-0c74-484f-a96c-2d8edbbb2d61\",\"parentId\":\"f3c3efb4-9af5-4710-960c-de57a8ae0a65\",\"isError\":\"False\",\"sessionId\":\"7d551c29-9d94-440f-80d0-7f0f42814a42\",\"deviceId\":\"bf624581\",\"userId\":\"0be7c855-1b67-43e9-8b05-82f95b5ede68\",\"isInternal\":\"False\",\"authenticatedUserId\":\"88c7d9c5-6751-480f-8ed7-3e7dfdc00d1f\"}","ProductVersion":"2.42.4611.701 (PBIDesktop)","ActivityId":"9d230bee-eba3-471b-a896-aa67fda5f9b2","Process":"PBIDesktop","Pid":9844,"Tid":35,"Duration":"-00:00:00.0450725"}
DataMashup.Trace Warning: 24579 : {"Start":"2017-02-07T01:15:01.5532422Z","Action":"LoadToModelCompleted","Message":"{\"queriesCount\":\"2\",\"newQueriesCount\":\"0\",\"editedQueriesCount\":\"0\",\"refreshedQueriesCount\":\"2\",\"loadId\":\"9dafa7a6-0c74-484f-a96c-2d8edbbb2d61\",\"successfulQueryLoadsCount\":\"2\",\"failedQueryLoadsCount\":\"0\",\"loadEntirelySuccessful\":\"True\",\"parentId\":\"f3c3efb4-9af5-4710-960c-de57a8ae0a65\",\"isError\":\"False\",\"sessionId\":\"7d551c29-9d94-440f-80d0-7f0f42814a42\",\"deviceId\":\"bf624581\",\"userId\":\"0be7c855-1b67-43e9-8b05-82f95b5ede68\",\"isInternal\":\"False\",\"authenticatedUserId\":\"88c7d9c5-6751-480f-8ed7-3e7dfdc00d1f\"}","ProductVersion":"2.42.4611.701 (PBIDesktop)","ActivityId":"86d33ff7-fe3d-4d33-ac7a-0a0b916fd089","Process":"PBIDesktop","Pid":9844,"Tid":35,"Duration":"-00:00:00.0449700"}


Thanks

Ina

Anonymous
Not applicable

Thanks a lot Ina, forcing an explicit Order by in the source query seems to have fixed it. Even though I am loading in the entire table and don't use a SQL query.

 

I find it absolutely ridiculous that this issue is still occurring, nobody has a clue that something is wrong in the report as we don't get any error. I noticed it by accident that the numbers seemed off, but if this gets spotted by the business they immediatly loose faith in Power BI.

 

In my case we aren't even talking about millions of records, we are talking about ten-thousands...