Shawn_Eary
Advocate IV

Version Control for Lakehouse File Uploads

When I upload files into an MS Fabric backed lakehouse, I don't think any version control is used:
Manually Upload Large CSV files to a Microsoft Fabric Lakehouse - YouTube: https://www.youtube.com/watch?v=Ln4mpuknuco

I'm worried that someday, out of confusion, someone will accidentally replace one of my uploaded CSV files with a corrupt or blank version. With Git or SharePoint, when a user checks in a corrupt file over a good version of the same file, you have a way to revert to the previous good version, but I don't see any way to do that with MS Fabric.

How do I configure my MS Fabric Lakehouse to create a new version of hello_world.csv each time it is uploaded?

Example: If someone uploads a file named hello_world.csv into my lakehouse 7 times, then I want a repo to save all 7 versions with the latest version being the one that stays on top until I invoke a Git or SharePoint command to revert to an older version.

1 ACCEPTED SOLUTION

Hi @Shawn_Eary ,
The internal team has updated me regarding version control in Fabric.

Git integration in Fabric is for version control of code, not data files. OneLake isn't connected to Git integration; it is essentially ADLS Gen2.
For reference, see this thread: Solved: Delta lake time travel in Fabric SQL endpoints? - Microsoft Fabric Community
If you want file versioning, you can use either of the following options instead:

1) You can connect to Blob Storage using the Dataflows or Data Factory connectors, or in a Python notebook using the Storage APIs directly.

2) You can use Git to version your files. Just put them in a repo and then connect to that repo from a Python notebook.
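
As a rough illustration of option 2, here is a minimal sketch of keeping every upload of a file as a Git commit and reverting to an older version later. It is not an official Fabric feature or API: it assumes the `git` command-line tool is available wherever the Python code runs, and the function names (`commit_upload`, `list_versions`, `restore_version`), the repo location, and the commit identity are all hypothetical choices made for this example.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical commit identity, passed per command so the sketch does not
# depend on any global git configuration.
GIT = ["git", "-c", "user.name=lakehouse-uploader",
       "-c", "user.email=uploader@example.invalid"]


def git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its standard output."""
    result = subprocess.run([*GIT, *args], cwd=repo, check=True,
                            capture_output=True, text=True)
    return result.stdout


def _commit_if_changed(repo: Path, name: str, message: str) -> None:
    """Stage `name` and commit it, unless it is identical to the last version."""
    git(repo, "add", "--", name)
    if git(repo, "status", "--porcelain", "--", name).strip():
        git(repo, "commit", "-m", message)


def commit_upload(repo: Path, source: Path, name: str) -> None:
    """Copy an uploaded file into the repo and record it as a new version."""
    if not (repo / ".git").exists():
        repo.mkdir(parents=True, exist_ok=True)
        git(repo, "init")
    shutil.copyfile(source, repo / name)
    _commit_if_changed(repo, name, f"Upload {name}")


def list_versions(repo: Path, name: str) -> list[str]:
    """Return the commit hashes that changed `name`, newest first."""
    return git(repo, "log", "--format=%H", "--", name).split()


def restore_version(repo: Path, name: str, commit: str) -> None:
    """Bring back the content of `name` from `commit` as a new latest version."""
    git(repo, "checkout", commit, "--", name)
    _commit_if_changed(repo, name, f"Restore {name} from {commit[:8]}")
```

With this in place, calling `commit_upload(repo, uploaded_path, "hello_world.csv")` after each upload keeps the full history, and `restore_version(repo, "hello_world.csv", list_versions(repo, "hello_world.csv")[1])` would bring back the version before the latest one. Copying the restored file back into the lakehouse is left out here because it depends on how your notebook accesses the lakehouse files.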

Hope this helps. Please let us know if you have any further queries.


3 REPLIES
v-nikhilan-msft
Community Support

Hi @Shawn_Eary ,

Thanks for using the Fabric community and reporting this.

I have reached out to the internal team for help on this. I will update you once I hear from them.

Appreciate your patience.

Hi @Shawn_Eary ,
It is great to know that you were able to get to a resolution. We hope you will keep using this forum and encourage others to do the same.

Thanks
Nikhila N