How can I control the number of partitions created when I output a parquet file?
Hi @saveenrMSFT,
If you are using PySpark, you can control the number of partitions by calling the repartition or coalesce method on your DataFrame before writing it to Parquet. repartition(n) performs a full shuffle and can either increase or decrease the partition count, while coalesce(n) can only reduce it and avoids a full shuffle. By default Spark writes one file per non-empty partition, so the partition count at write time determines how many Parquet files are generated.
Thanks,
Chetna