Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
edhans
Super User
Super User

Data Model in Service duplicating and replacing rows

I have a ticket open with MS on this but am interested to see if anyone is noticing an issue where their reports in the Service from published PBIX desktop files are not right. Here is what I've found:

  • The record count is always correct. For example, we have 877 records in May 2018. That is correct.
  • Some records are duplicated, but something in the data model is telling it to maintain the same record count, so other records are removed. This is 100% in the data model (DAX) portion. Not the query.
  • I've seen reports refresh ok (most of the time) and then randomly exhibit this behavior.
  • Started late last week. (towards the end of the June 11, 2018 week)
  • I have verified there is no corruption in the PBIX file itself as I redownloaded it from the service and refreshed it on my desktop. The numbers work.
  • If I download the PBIX from the service when it shows wrong numbers, and don't refresh it I can pinpoint which rows are duplicated and which are removed by using the new filter views in the PBI Desktop app and copying the filtered tables to Excel for further analysis. THat is how I know the record count is always correct.

As I said, I have a ticket open and have started the initial conversations, but wanted to see if anyone here was experiencing the same issue recently.

 On the June 2018 PBID and June 2018 Gateway. These are SQL Server tables. I've not seen this issue on other data types, but most of our data is SQL Server.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
1 ACCEPTED SOLUTION

Hi Ed. As I said over email, we've identified the root cause of this issue. We're currently targeting the fix for our July release.

 

In the meantime, you can work around the issue by ensuring that your folded queries are always sorted. This can be done by always including (and preserving in downstream queries) all primary keys for a given table. Or, if this isn't possible (as in your case when doing a Group By operation), you can explicitly sort by a set of unique keys before combining. (Also note that it may be necessary to buffer after sorting and before combining to ensure the sort is applied by the folding layer. You'll have to experiment to see whether or not this is necessary.)

 

Ehren

View solution in original post

43 REPLIES 43
Kasunpathirana
Resolver I
Resolver I

I publish PBIX from desktop to service and use a gateway to SQL Server to retreive data to refresh daily. I too came across that the data was not refreshed properly and was random and since then i did not got this issue. But I also faced the same issue earlier. 

Thanks. A good confirmation I am not nuts. If you have any data, please open a ticket. It may help MS track this issue down. 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

ha ha yup but later i didn't face this issue. Sure I will check and if have will open a ticket. But thanks for this information because I was intially thinking it was my issue in model. But frankly i didn't change anything in the model and the issue fixed by itself. 

Yup. I've seen it fix itself, then come back, then fix, etc. Seems a bit random.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Hi all. Do you ever see duplicated rows after refreshing locally in PBIDesktop, or only when refreshing via the Gateway?

Ehren

Only via the gateway. I have the June 2018 version 14.16.6724.2 installed, which is what https://www.microsoft.com/en-us/download/details.aspx?id=53127&751be11f-ede8-5a0c-058c-2ee190a24fa6=... is pushing. (it shows a different version there, but it sends the version I listed.)



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Ok. Does the duplication only happen for rows coming from a SQL Server data source? Or does it happen for other data sources as well?

Ehren

Most of our data is from SQL Server, so I've only seen it happen there.

 

Edit: Actually, 100% of our data through the gateway is SQL Server. In the few cases where we use CSV files or something else, that isn't via the gateway. I'm not mashing up any gateway+non-gateway data.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

I ran a test and this is what I am seeing:

  • The table that shows the error is a result of a Table.Combine in Power Query.
  • I took the two source queries to that and loaded them into the data model.
  • I then used a DAX UNION() statement to generate a new table in Power BI.
  • I build a new measure in the new DAX table to replicate a mesaure in the table loaded from the Table.Combine query and stuck it on a card.
  • Refreshed everything in Power BI Desktop. Both the measure from the orginal table and the measure from the DAX UNION function show the same result.
  • Published to my workspace, connected to the Gateway and refreshed.
  • The measure in the DAX UNION table works fine.
  • The measure in the original table is off by these random rows.

Something isn't right with how the Table.Combine statement is being loaded either by the Gateway or being handled by the service. The desktop app is fine.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Hi Ed. I've tried reproducing this locally (using SQL data refreshed via a Gateway) and am unfortunately not seeing any duplication. Would it be possible for you to try some other things for me?

 

  • Do your SQL queries involve passing native SQL statements (e.g. "select * from myTable"), or do they simply connect to the tables? What transforms (if any) are you doing on top of the SQL data?
  • Try duplicating the query that does the Table.Combine operation, and then buffering the result (i.e. add a step at the end that does = Table.Buffer(previousStep)). Does this version of the query exhibit the same behavior?
  • Try duplicating the query that does the Table.Combine operation, and buffering each individual table being combined (e.g. = Table.Combine({Table.Buffer(Query1), Table.Buffer(Query2)})). Does this version of the query exhibit the same behavior?
  • Try duplicating the query that does the Table.Combine operation and doing a group by and count of the unique field you're seeing duplicated, then filter for Count > 0 (which should return an empty table when run locally in PBIDesktop). When refreshed via the gateway, does this query produce any results?

I'm trying to pinpoint if the issue is happening on the M query side of things or somewhere in the AS pipeline.

 

Thanks for your help.

Ehren

Also, if you're available to live debug this, please ask the folks handling your ticket to schedule a Skype meeting with me. Might be easier than going back-and-forth here on the forum.

Ehren

Another question: what version of SQL Server are you pulling from?

Ehren

@Ehren

 

The SQL Queries are all pulling from SQL 2008 and simply connect to the tables. So:

let
    Source = SQL_DATA,
    Custom1 = Source{[Schema="dbo",Item="TableName"]}[Data]
in
    Custom1

My source is simply = Sql.Database("SQLSERVER", "DATABASE") - I do that so can change the server name in one place to do development work off of a different server.

 

Then there are a number of transformations, merges, etc. 

 

I will tried the Buffer statement after the combine. After a few tests, it seems to have resolved it. The UNION and BUFFER tables consistenly report the same info, but the unbuffered one randomly gets out of whack. I ran 10 tests. F means the unbuffered table failed while the BUFFER and UNION tables worked, P means all 3 matched.

  1. P
  2. F
  3. F
  4. F
  5. F
  6. P
  7. P
  8. F
  9. P
  10. F

So you can see it is pretty random as to when it fails. These refreshes were done over a 15min period. Takes less than a minute usually for this to refresh via the Gateway.

This also just started the week of June 11, 2018 at the earliest. No changes to the reports here. Was published back in March and has been running fine. Only changes are updated gateway, and whatever MS does on the backend to its service.

 

Using table.buffer here could work as query folding was stopped a few stages before the Table.Combine, but I understand that kills query folding. Not looking forward to having to use table.buffer after appends if it kills query folding thereafter. 

If that is good enough for helping you, let me know. If not, I will do the more detailed buffering on specific table and other tests.

 

I am avail to live debug. I ran some yesterday with tech support and they gathered Fiddler data. Is that what you mean or is there a more detailed process. The ticket is Ticket #:10593214 and is being tracked under emails with REG:118062218446002 in the subject.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Thanks, @edhans. That's useful info.

 

Here's one more thing you could try: when running locally in PBIDesktop, you can (sort of) simulate what happens in the service by ensuring the Data Privacy firewall is enabled and marking all your data sources as Private. If you get rid of the Table.Buffer and get your queries back into their prior, refreshing-in-the-service-produces-bad-results state, what happens when you enable the firewall locally, set all your data sources to private, and refresh? I'm curious if this will cause the issue to happen locally in PBIDesktop.

 

Ehren

Here is what I did:

  1. Changed Global Options|Privacy to "Combine data according to each file's privacy levels." (it is just SQL Server here, one database, so one source)
  2. Changed Current File privacy to "Combine data according to your priacy level settings for each source"
  3. Edit Queries: Data Source Settings
    1. Cleared Permissions
    2. Edited the permissions for the Server/Database to Private
  4. Hit Refresh
  5. Authenticated again (domain, so just hit connect for the Windows Current Credentials)
  6. Refreshed about 10 times. Never failed the way it does in the service.

 

REdid the above but #1 at the top, changed global options to "Always Combine data according to privacy level for each source" (again, 1 source, so only one privacy level I suspect)

Repeated steps 2-6 as above.

About 10 refreshes. Worked every time. Never failed.

 

Changed everything back to "ignore privacy levels." 🙂



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Ok. Just to clarify: when you did the buffering, were you buffering the result of the Table.Combine, or buffering each query before combining them?

Ehren

One buffer, immediately after the Table.Combine. Not the whole thing, but this is the M code:

 

let
    Source = Table.Combine({qryOESLSHST, SalesCurMonth}),
    Buffer = Table.Buffer(Source),
    #"Trimmed Cus_No and Item_No" = Table.TransformColumns(Buffer,{{"cus_no", Text.Trim}, {"item_no", Text.Trim}}),


Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Hoping the team got in touch with you @Ehren. Spent nearly an hour convincing them it wasn't the workspace, my browser,  and renaming and republishing the report also wasn't the fix.

Honestly, if you need me to dump these tables into a different database so you can restore it into SQL 2008 let me know.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

I haven't seen anything. Can you please point them to this thread and ask them to add me to the email thread with you? I've met with two other customers who are experiencing this and would like to live debug some things with you.

 

Ehren

I'll do it again, but I sent the link to them already yesterday and we even read through some of it today.

 

She and her manager said there were a lot of "ehren" in the Skype directory and stopped there. :shrug emoticon:

 

Edit: Done. Sent 3:45pm pacific. Hopefully they get in touch with you.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors