Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

Reply
elavery95
New Member

Updating

Hello,

Currently I have two tables in a lakehouse and I am trying to complete some updates to the main table. Let's call them metadata and d_metadata.

 

Metadata contains all data to date, and is the main table our teams use for reporting. d_metadata contains data loaded from source, where the records were updated in the previous day.

 

Both tables are currently in a lakehouse, and right now, we are using a Dataflow Gen2 to load the metadata table where rows do not match rows in the d_metadata table (joined based on a unique id).

This is returning the old records from metadata, and then I am appending updated records from d_metadata and pushing the final results the metadata table.

 

I would like to appraoch this differently by simply deleting the updated records in metadata, and then appending d_metadata to avoid doing a full overwrite of what is a very large table.

I cannot find a way to achieve this within a Fabric lakehouse, and other Spark SQL suggestions have not worked.

 

Can someone help guide me in the right direction? Thanks in advance 🙂

1 ACCEPTED SOLUTION
miguel
Community Admin
Community Admin

Dataflows only offers two data destination operations:

  • Append
  • Replace

In that sense, it doesn't support any sort of delete operations or "Upserts" to update existing records.

If your destination was a SQL Database, you could perhaps leverage a combination of a Data Pipeline,  a Dataflow Gen2 and a SQL Stored procedure to perhaps load the data inside of a "staging table" in your SQL Database and then run a stored procedure that will take that table and figure out, based on your own logic, what records to delete, update or add to your main table. 

I'm not sure how feasible that is with a Lakehouse as a destination, but you could try asking more questions specific to Lakehouse in the Lakehouse community forum (link below):

Synapse forums - Microsoft Fabric Community

View solution in original post

4 REPLIES 4
v-gchenna-msft
Community Support
Community Support

Hi @elavery95 ,

Thanks for using Fabric Community.
Can you please check this - Pattern to incrementally amass data with Dataflow Gen2 - Microsoft Fabric | Microsoft Learn


I suggest you incremental refresh rather than maintaining 2 tables.

FYI: We will have more features in upcoming releases.

vgchennamsft_0-1712949105335.png


Docs to refer - What's new and planned for Data Factory in Microsoft Fabric - Microsoft Fabric | Microsoft Learn



Hope this is helpful. Please let me know incase of further queries.

Hi @elavery95 ,

We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet.
In case if you have any resolution please do share that same with the community as it can be helpful to others.
Otherwise, will respond back with the more details and we will try to help .

Hi @elavery95 ,

We haven’t heard from you on the last response and was just checking back to see if your query was answered.
Otherwise, will respond back with the more details and we will try to help .

Thanks

miguel
Community Admin
Community Admin

Dataflows only offers two data destination operations:

  • Append
  • Replace

In that sense, it doesn't support any sort of delete operations or "Upserts" to update existing records.

If your destination was a SQL Database, you could perhaps leverage a combination of a Data Pipeline,  a Dataflow Gen2 and a SQL Stored procedure to perhaps load the data inside of a "staging table" in your SQL Database and then run a stored procedure that will take that table and figure out, based on your own logic, what records to delete, update or add to your main table. 

I'm not sure how feasible that is with a Lakehouse as a destination, but you could try asking more questions specific to Lakehouse in the Lakehouse community forum (link below):

Synapse forums - Microsoft Fabric Community

Helpful resources

Announcements
Europe Fabric Conference

Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.

RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayFBCUpdateCarousel

Fabric Monthly Update - May 2024

Check out the May 2024 Fabric update to learn about new features.