Fabric Data Pipeline Activities with Fabric Notebo...

prabhatnath · ‎05-17-2024

Hello Friends,

I need suggestions on how to achieve this using a Fabric Data Pipeline calling a Fabric Jupyter Notebook that needs to return values.

Details:

1) I have a Jupyter Notebook (PySpark) that has Parameters for the Source Table and Target Table in the Lakehouse. It reads the data from the Source Table and writes to the Target Table after doing some transformations. I have 2 variables, rows_inserted and rows_updated, which store the resulting counts.

2) The Pipeline has below activities:

2.1) Lookup - Read the metadata JSON file for table names.

2.2) ForEach - Iterate the table list and call the Notebook and pass the Parameters

Help me on below things:

1) How can I have the Notebook return the rows_inserted and rows_updated values to the caller Pipeline Activity?

2) From the Pipeline how can I capture the Result for Each Call to the Notebook, as each call will be for a different table.

3) Once the ForEach Activity is done, I need to create a final summary showing, for each table, how many rows were inserted and updated. This summary will be sent to a Teams Chat.

Please suggest on this.

Thanks,

Prabhat

GilbertQ · ‎05-19-2024

Hi @prabhatnath

It sounds like you need to loop through your data and, within each loop, save to the table. Here are details on how to loop in the notebook: How to loop through each row of dataFrame in PySpark ? - GeeksforGeeks
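
On the question of returning rows_inserted and rows_updated to the calling pipeline, here is a minimal sketch. It assumes the notebook finishes by handing one JSON string to the notebook utility's exit call (`mssparkutils.notebook.exit`, which only exists inside the Fabric runtime, so it is left commented out). The helper name `build_exit_payload`, the table name, and the counts are made up for illustration. The pipeline can then usually read the value from the Notebook activity's output with an expression along the lines of `@activity('Notebook1').output.result.exitValue`, where `Notebook1` is an assumed activity name; please verify the exact path against your own activity output.

```python
import json


def build_exit_payload(target_table: str, rows_inserted: int, rows_updated: int) -> str:
    """Serialize the per-table counts into a single JSON string (hypothetical helper)."""
    return json.dumps(
        {
            "table": target_table,
            "rows_inserted": rows_inserted,
            "rows_updated": rows_updated,
        }
    )


# Illustrative values only; in the real notebook these come from the load logic.
payload = build_exit_payload("sales_target", 120, 15)

# Assumption: inside a Fabric notebook, the last cell returns the string to the caller.
# mssparkutils.notebook.exit(payload)
```

Returning one JSON string rather than two separate values keeps the exit value self-describing, so each ForEach iteration carries its own table name.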

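
For the final report after the ForEach, one option is to collect each notebook exit value into an array (for example with an Append variable activity inside the loop, which is an assumption about the pipeline design) and then format that array into a message for Teams. A rough sketch of the formatting step only; `format_summary` and the sample data are hypothetical, and each exit value is assumed to be a JSON string with `table`, `rows_inserted`, and `rows_updated` keys:

```python
import json


def format_summary(exit_values):
    """Build a plain-text report from a list of JSON exit strings, one per table."""
    lines = ["Load summary:"]
    total_inserted = 0
    total_updated = 0
    for raw in exit_values:
        result = json.loads(raw)
        total_inserted += result["rows_inserted"]
        total_updated += result["rows_updated"]
        lines.append(
            f"- {result['table']}: {result['rows_inserted']} inserted, "
            f"{result['rows_updated']} updated"
        )
    lines.append(f"Total: {total_inserted} inserted, {total_updated} updated")
    return "\n".join(lines)


# Made-up sample data standing in for the values collected by the pipeline.
sample_exit_values = [
    json.dumps({"table": "customers", "rows_inserted": 10, "rows_updated": 2}),
    json.dumps({"table": "orders", "rows_inserted": 250, "rows_updated": 40}),
]
summary_text = format_summary(sample_exit_values)
```

The resulting text could be passed as the message body of whatever activity posts to Teams.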
