arpost
Advocate V

Does loading to Delta table from Files duplicate data in a lakehouse?

Greetings, community. I have a bunch of files I'm planning to load into a Lakehouse in CSV format. From there, I have considered loading them as Delta tables where possible. Does this duplicate the data, since it is persisted once in its "raw" file format and then written again in Parquet format for the Delta table?

1 ACCEPTED SOLUTION
AndyDDC
Solution Sage

Hi @arpost, yes, this will duplicate the data, but you are transforming it into a far better and more efficient format when saving as Delta. In addition, the underlying Parquet files will be compressed and will likely be smaller than the source CSVs.
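To check how much extra storage the second copy actually takes, one option is to compare the on-disk size of the raw CSV folder with that of the Delta table folder. The following is a minimal sketch using only the Python standard library. The helper names `dir_size_bytes` and `compare_footprint` are made up for illustration, and the paths in the usage comment are assumptions about where a notebook's default lakehouse is mounted, so adjust them to your own setup.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Return the total size in bytes of all regular files under root (recursive)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def compare_footprint(raw_dir: str, table_dir: str) -> dict:
    """Compare the storage used by the raw files and by the table folder."""
    raw = dir_size_bytes(raw_dir)
    table = dir_size_bytes(table_dir)
    return {
        "raw_bytes": raw,
        "table_bytes": table,
        "total_bytes": raw + table,  # both copies are stored
        # Ratio is undefined when the raw folder is empty or missing.
        "table_to_raw_ratio": (table / raw) if raw else None,
    }


# Hypothetical usage (both paths are assumptions, not taken from the thread):
# stats = compare_footprint(
#     "/lakehouse/default/Files/raw_csv",
#     "/lakehouse/default/Tables/my_table",
# )
# print(stats)
```

Note that the table folder size includes the `_delta_log` directory and any older Parquet files that have not yet been cleaned up, so the ratio can overstate the size of the current table version.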
