DavidCousins
Frequent Visitor

Gen2 Dataflow Failed to insert a table

I have a dataflow that reads a .json file landed by a data pipeline into a lakehouse in the same workspace. I've flattened out the data to a single table, given the columns sensible data types and set the lakehouse as the destination. Everything works up to the point where the dataflow runs and tries to save the data to a table in the lakehouse.

 

workspaces-personal: Error Code: Mashup Exception Data Source Error, Error Details: Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: DataSource.Error: Failed to insert a table. Details: [Value = {...}, Type = text] (Request ID: 82fa63e1-66a3-4cb1-a6a7-47b61c54e936).

Session ID is da9dc6a9-6c1a-40c1-a6ed-a6766f95611d

 

The JSON file is small, so I've been able to confirm there isn't anything obvious that shouldn't be there, other than some trailing whitespace in field names, which I fixed in the query. It's actually the output from this API: Admin - Groups GetGroupAsAdmin - REST API (Power BI Power BI REST APIs) | Microsoft Learn
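For reference, the returned body looks roughly like this (heavily trimmed, with illustrative values); the nested arrays are what later surface as [List] columns in Power Query:

{
  "@odata.context": "...",
  "value": [
    {
      "id": "2f09faa3-...",
      "name": "Sales Workspace",
      "reports":  [ { "id": "...", "name": "Monthly Sales", "datasetId": "..." } ],
      "users":    [ { "emailAddress": "...", "groupUserAccessRight": "Admin" } ],
      "datasets": [ { "id": "...", "name": "Sales Model" } ]
    }
  ]
}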

 

I've also tried changing the destination to a warehouse in the same workspace and I get the same error message. And I've tried it in a completely new workspace, too, just in case the staging tables were broken. I'm not using a gateway (as this is lakehouse source to destination). Bit stumped.

 

1 ACCEPTED SOLUTION

So I think I've figured it out. It seems to happen when you have a referenced query that still has values in it that show as [List], even when that query isn't set to output to a destination.

 

Disabling staging on the main (referenced) query fixes it. Hopefully in the future the staging storage will support more record types, but at the very least the error message could be cleaned up, as it isn't helpful at all.

 

Explanation of what I was trying to do:

 

The JSON has a list of Power BI workspaces and all of their content, so I have the main query that points to the JSON and unpicks the OData response. It returns the Workspace ID and then the lists of data for Reports, Users, Datasets, etc.
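In M terms, the main query is roughly this shape (a sketch only; the source step and field names here are illustrative, and the real query reads the .json file from the lakehouse):

let
    // Parse the raw JSON file landed by the pipeline (this source step is illustrative)
    Source = Json.Document(File.Contents("GetGroupsAsAdmin.json")),
    // The OData envelope keeps the workspace records in the "value" list
    Workspaces = Table.FromList(Source[value], Splitter.SplitByNothing(), {"Workspace"}),
    // Expand the scalar ID plus the nested lists; the list columns display as [List]
    Main = Table.ExpandRecordColumn(Workspaces, "Workspace",
        {"id", "reports", "users", "datasets"},
        {"Workspace ID", "Reports", "Users", "Datasets"})
in
    Main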

 

I then reference the main query to create a table of reports. I keep the Workspace ID but delete everything other than the Reports column, which I then expand. Repeat that for Users and Datasets. Workspace ID appears in every one of these alongside the expanded info.
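A sketch of the Reports query (Main is the query above; the expanded field names are illustrative):

let
    // Reference the main query and keep only the key plus the one list to expand
    Source = Main,
    Kept = Table.SelectColumns(Source, {"Workspace ID", "Reports"}),
    // Expand the list so there is one row per report...
    Rows = Table.ExpandListColumn(Kept, "Reports"),
    // ...then expand the record fields into real columns
    Reports = Table.ExpandRecordColumn(Rows, "Reports",
        {"id", "name", "datasetId"},
        {"Report ID", "Report Name", "Dataset ID"})
in
    Reports

The Users and Datasets queries follow the same pattern with their own columns.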

 

I only set a destination for the flattened queries, not anything that still contains lists/records.

 

If I flatten the entire thing in a single query, then it will load happily. But that's a stupid setup, as it means massively duplicating everything: each report row would repeat for every user and dataset in the same workspace.

 


8 REPLIES
ELANG
Frequent Visitor

Having the same issue, except I'm using a CSV, not a JSON. If I remove the "change column type" step it will load, but then all column types are set to Any and I'm unable to build visuals until they are typed properly.

 

Results 2019-2022: Error Code: Mashup Exception Data Source Error, Error Details: Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: DataSource.Error: Failed to insert a table. Details: TX (Request ID: 0fa9af4c-4048-4b20-8077-4b45753c646d)
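For reference, the step I'm removing is just a Table.TransformColumnTypes call along these lines (a sketch; the source step, column names and types are illustrative, not my actual schema):

let
    // Read the CSV and promote the first row to headers (source step is illustrative)
    Source = Csv.Document(File.Contents("results-2019-2022.csv"), [Delimiter = ","]),
    Promoted = Table.PromoteHeaders(Source),
    // The "change column type" step that triggers the failure;
    // removing it leaves every column as type Any
    Typed = Table.TransformColumnTypes(Promoted, {
        {"Year", Int64.Type},
        {"Result", type number},
        {"Region", type text}
    })
in
    Typed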

miguel
Community Admin

Hi!

Are you using a gateway by any chance? What's your data source? Are you able to reproduce this error in a different workspace?

Hi Miguel,

 

No gateway

Data source is a JSON file in the lake in the workspace

Yes, I have created an entirely new workspace, with a new pipeline to call the API and save the output as a JSON file and a dataflow to flatten out the JSON, and I get the same result

Could you share the schema of the table that you have as the output in your Dataflow Gen2? What data types does it have?

The article below might help you understand some of the limitations that we have in Dataflows Gen2 today. Please check if you might be hitting one of them:

Data Factory Dataflow Gen2 limitations - Microsoft Fabric | Microsoft Learn


DennesTorres
Post Prodigy

Hi,

Did you run the JSON through an online JSON validator?

I know it sounds obvious, but I found some references to the same error message with problems related to the source: https://community.fabric.microsoft.com/t5/Power-Query/Microsoft-Mashup-error-message-on-refresh/td-p...

As a debugging strategy, maybe it would be worth making the JSON static somewhere, running the pipeline over it and, if that works, adding the connection to the API?

That way you could narrow down the root cause, identifying whether the problem is in the JSON you can see returned, or in the API access returning something different and unexpected.

Kind Regards,

Dennes

Yeah, it's valid JSON and Power Query parses it fine. The data pipeline API call writes it to the lake as a .json file, so it's already static. It just can't save the output (a single table for the main data object) into the lake or warehouse.

Hi,

If there is no better solution, I would suggest breaking down the JSON until you find what is causing the problem.

What I mean is: I imagine the JSON has many records. Even if the data is hierarchical, it may still have many records. What if you create a fake JSON with one single record from the original JSON you have? Does the pipeline work?

If it works, that would prove the problem is in one of the records. You could break down the JSON (halve the records? add them back in small batches?) until you locate exactly which record is causing the problem.
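For example, a cut-down test file (contents illustrative) could keep the OData envelope but only a single workspace record:

{
  "value": [
    { "id": "test-id", "name": "Test Workspace", "reports": [], "users": [], "datasets": [] }
  ]
}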

Anyway, it's just a debugging suggestion, in case no better option is available.

Kind Regards,

 

Dennes
