009co
Helper IV

Dataflow Gen 2 query or table changes appear to cause errors or failures

My scenario is pretty simple:

 

* csv files that have been manually uploaded to Lakehouse Files

* Dataflow Gen 2 queries sourcing from these Lakehouse csv files

* Dataflow Gen 2 Data destination to Lakehouse tables

 

I am finding that changes made after a Dataflow Gen 2 is created, such as adding or renaming query columns, or adding or renaming destination Lakehouse tables, can cause errors or refresh failures. The only apparent way to get past this is to delete the entire workspace and start from scratch.

 

My guess is that something in the staging artifacts cannot adapt to these changes, or that the new query and table artifacts do not connect to new staging artifacts. This might be because the data pipeline artifacts that appear to be created automatically behind the scenes are not updated when new dataflows or tables are created.

 

 

 

1 ACCEPTED SOLUTION
SidJay
Employee

If I'm understanding your scenario correctly:

- You're setting the output destination of queries to the Lakehouse (tables)

- You're modifying the queries in ways that change the schema

- After those changes, refresh is failing

 

If so, the way to address this is to reconfigure the output destination after the schema changes. In the Query Settings pane of the Query Editor, you will see the Output Destination section at the bottom. Clicking the "X" will remove the current settings, after which you can reconfigure the destination (including specifying a new column mapping).

 

We are looking into a future mode that does not require explicit remapping for disruptive schema changes.
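To make the failure mode concrete: the destination stores a column mapping captured when it was configured, and a later rename or addition in the query leaves that mapping pointing at columns that no longer exist. This is a toy Python model of that idea, not Fabric's actual implementation (the column names are made up for illustration):

```python
# Illustrative model (not Fabric's code) of why a stored column mapping
# breaks when the source query's schema changes after configuration.

def find_stale_mappings(mapping: dict[str, str], query_columns: set[str]) -> list[str]:
    """Return the mapped source columns that no longer exist in the query.

    `mapping` maps source (query) column names to destination (table)
    column names, as captured when the destination was first configured.
    """
    return sorted(src for src in mapping if src not in query_columns)

# Mapping captured when the output destination was configured.
mapping = {"OrderId": "order_id", "Amount": "amount"}

# The query is later edited: "Amount" is renamed to "TotalAmount".
current_columns = {"OrderId", "TotalAmount"}

stale = find_stale_mappings(mapping, current_columns)
print(stale)  # ['Amount'] -- the stale entry is why the refresh fails
```

Removing and reconfiguring the destination is effectively rebuilding this mapping from the query's current schema.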


3 REPLIES
SidJay
Employee

Note: you will also have to delete the previous Lakehouse table (if you'd like to use the same table for the new schema).

 

If you're using the "Existing Table" flow (instead of "New Table"), the schema of the existing table will not be altered.
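The "Existing Table" behavior can be pictured like this: the destination table's schema is fixed, so rows carrying a new column cannot land in it until the table is dropped and recreated. A minimal Python sketch of that behavior (table and column names are hypothetical, and this is a model, not Fabric's code):

```python
# Illustrative model of the "Existing Table" flow: the destination
# table's schema is fixed, so writes with extra columns are rejected
# rather than altering the schema.

class LakehouseTable:
    def __init__(self, name: str, columns: list[str]):
        self.name = name
        self.columns = columns          # fixed at creation time
        self.rows: list[dict] = []

    def append(self, row: dict) -> None:
        extra = set(row) - set(self.columns)
        if extra:
            # The existing schema is never widened to fit the row.
            raise ValueError(f"Schema mismatch: unexpected columns {sorted(extra)}")
        self.rows.append(row)

table = LakehouseTable("orders", ["order_id", "amount"])
table.append({"order_id": 1, "amount": 10.0})             # fits the schema: OK

try:
    table.append({"order_id": 2, "amount": 5.0, "tax": 0.5})
except ValueError as e:
    print(e)  # rejected; drop and recreate the table to adopt the new schema
```

This is why reusing the same table name for a new schema requires deleting the old table first.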

 

Thanks


Hey Sidjay,

 

Thanks for the reply.

 

I will explicitly try this, though I'm pretty sure I already tried it as one of my troubleshooting steps.

 

I've noticed that the Dataflow Gen 2 destination process has a step that presents the incoming and outgoing schemas, which seems to confirm that the schema is OK. I have even seen it flag new columns with an unchecked checkbox beside them (which I then checked).

 

But it appears that this apparent confirmation doesn't actually update the schema mapping behind the scenes, hence your recommendation to remove and recreate the Output Destination settings by clicking the "X".
