SergioTorrinha
Resolver II

Data Pipeline Fail on Notebook due to unexplainable Spark Error

Hi everyone!

 

Today my data pipeline failed because a Spark SQL notebook that I developed last month (and tested to run without any issues) failed to run.

The failure is apparently caused by the Spark error highlighted in the image below:

SergioTorrinha_0-1704877425944.png

 

Weirdly enough, I can confirm that the referenced table exists in my lakehouse and holds data, either by simply inspecting it or by querying it with T-SQL in the respective SQL Endpoint.

 

At this point, I can only conclude this might be a bug (similar to what is described here: https://community.fabric.microsoft.com/t5/Dataflows/Error-using-data-imported-through-DataFlow-when-... ).

I would like to know whether anyone else is experiencing (or has experienced) this, and whether @Microsoft is aware of the issue.

 

Thank you.

1 ACCEPTED SOLUTION

Hi @Corar !

I just got my issue solved today. Microsoft support informed me that there was a bug in Spark runtime 1.2, which is now fixed.
In my case, I resolved the issue by completely deleting the table that I was not able to query via Spark SQL and then re-running the pipeline that generates the table.

 

I hope this helps fix your issue.


14 REPLIES 14
matkvaid
Helper III

This is marked as solved, but I get the same error randomly. I recreated the data lake tables for which the notebook was not running; they work once and then start showing the same errors again, so is the bug really fixed? I have created a support ticket, but judging by how the replies are going, it will take months to solve. How is it that the trial has already started counting down the days while Fabric as a product still seems so unfinished?

prom
Frequent Visitor

Hello

This is a bug. As a workaround, you can try:

  • use runtime 1.1
  • manually delete the checkpoint file in _delta_log
  • switch to parquet/ext tables as a destination for CP
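For the second workaround, a minimal Python sketch of locating (and optionally deleting) checkpoint files might look like the following. It assumes the table's _delta_log folder is reachable as a local directory, and the function names and the path in the usage comment are hypothetical. It defaults to a dry run that only lists the files; as noted elsewhere in this thread, the checkpoint tends to be recreated by later loads, so this is a temporary measure at best.

```python
import re
from pathlib import Path
from typing import List

# Delta checkpoint files are named "<20-digit version>.checkpoint.parquet";
# multi-part checkpoints add ".<part>.<total parts>" before ".parquet".
CHECKPOINT_RE = re.compile(r"^\d{20}\.checkpoint(\.\d+\.\d+)?\.parquet$")


def find_checkpoint_files(delta_log_dir: str) -> List[Path]:
    """Return the checkpoint parquet files in a _delta_log directory, sorted by name."""
    log_dir = Path(delta_log_dir)
    return sorted(
        p for p in log_dir.iterdir()
        if p.is_file() and CHECKPOINT_RE.match(p.name)
    )


def delete_checkpoint_files(delta_log_dir: str, dry_run: bool = True) -> List[Path]:
    """List checkpoint files and delete them only when dry_run is False."""
    files = find_checkpoint_files(delta_log_dir)
    if not dry_run:
        for path in files:
            path.unlink()
    return files


# Hypothetical usage (the path is a placeholder, not a real location):
# for path in delete_checkpoint_files("/path/to/Tables/my_table/_delta_log"):
#     print("would delete", path.name)
```

Commit files (the .json entries) are left untouched, so the table history stays intact.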

matkvaid
Helper III

Any news on that? I got the same error. I have 3 identical tables, and the notebook script was not working for one of them. I had that table recreated and everything was fine for one day. The next day, the other two tables got the same error.

Corar
Frequent Visitor

Hi @v-cboorla-msft,

 

Is there any information on whether there was a fix and whether it has been rolled out globally?

 

Unfortunately, the error is still happening for me (the last time was 2 minutes ago), and I need to recreate the table every time.

 

Best

Corar
Frequent Visitor

I am experiencing exactly the same behaviour:

 

I have an error involving data pipelines and a notebook. I write raw logs to a table in the bronze lakehouse. After that, in the notebook, I read the data incrementally, clean it, and process it into the silver and gold lakehouses.

 

Corar_0-1705503446827.png

 

For some time this works without problems, but after some runs Spark can no longer read the logs delta table in the notebook. On read, the following error appears:

 

Notebook execution failed at Notebook service with http status code - '200', please check the Run logs on Notebook, additional details - 'Error name - SparkRuntimeException, Error value - Error while decoding: java.lang.IllegalArgumentException: requirement failed: Mismatched minReaderVersion and readerFeatures.
newInstance(class scala.Tuple3).' :

 

I use the workspace's default Spark and Delta table runtime settings, and I also tested different ones. I executed the data pipeline and the notebook independently.

 

The only thing I found was a report where Dataflow Gen2 caused a problem, but the delta log looks fine in my case:

 

Re: Error using data imported through DataFlow whe... - Microsoft Fabric Community

 

I hope I am not doing something stupid here, and I am thankful for any guidance and support.

Hello,

Did you resolve the problem? We have a similar problem: we are using Copy Activity to copy data from on-premises to the lakehouse. Every five runs we get the same error: Mismatched minReaderVersion and readerFeatures.

 

After 5 runs, a file is created in the _delta_log folder: 00000000000000000010.checkpoint.parquet. It contains a protocol column with minReaderVersion: 1 and readerFeatures: [] (and similar settings for the writer). According to the delta.io docs this is not correct: readerFeatures can only be used with minReaderVersion >= 3.
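The rule described above can be sketched as a small Python check. This is an illustration of the consistency condition as stated in this post, not Delta's actual implementation; the function name is hypothetical, and it assumes that an empty list counts as the readerFeatures field being present, which would match the error seen here.

```python
from typing import Optional, Sequence


def protocol_is_consistent(min_reader_version: int,
                           reader_features: Optional[Sequence[str]]) -> bool:
    """Check that readerFeatures is present exactly when the reader version allows it."""
    # Rule as described in this thread: readerFeatures needs minReaderVersion >= 3.
    supports_features = min_reader_version >= 3
    # Assumption: an empty list still counts as "present"; only None means absent.
    has_features_field = reader_features is not None
    return supports_features == has_features_field


# Values reported for the checkpoint's protocol column in this post.
checkpoint_protocol = {"minReaderVersion": 1, "readerFeatures": []}
print(protocol_is_consistent(
    checkpoint_protocol["minReaderVersion"],
    checkpoint_protocol.get("readerFeatures"),
))  # prints False: version 1 combined with a readerFeatures field is a mismatch
```

Under the same assumption, a protocol entry with minReaderVersion 1 and no readerFeatures field at all would pass the check.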

 

One solution is to delete the whole table. You can remove the checkpoint, but it will be recreated during the load activity. We set the overwrite option for the destination table in the Lakehouse, in the Copy Activity, but checkpoints are still generated. There should be an option in the Copy Activity to let somebody replace the delta table (with the delta log removed). We are using Pipelines to copy staging tables from on-premises. We don't want to mix parquet tables with delta for performance reasons.

 

I also played with the delta.checkpointInterval parameter (default 10?) for the table. It gets removed from tblproperties at every checkpoint; besides, it is not documented in Fabric.
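To illustrate why an interval of 10 could surface the error every five runs, here is a rough Python sketch of the arithmetic. It assumes, purely for illustration, that the table is created at version 0, that each Copy Activity run adds two commits, and that a checkpoint is written whenever the committed version is a non-zero multiple of the interval. The function names are hypothetical, and the two-commits-per-run figure is a guess that happens to fit the numbers reported in this post.

```python
def checkpoint_file_name(version: int) -> str:
    """Name of a single-file checkpoint for a table version (20-digit, zero-padded)."""
    return f"{version:020d}.checkpoint.parquet"


def writes_checkpoint(version: int, interval: int = 10) -> bool:
    """True when a commit at this version would trigger a checkpoint."""
    return version != 0 and version % interval == 0


def first_run_with_checkpoint(commits_per_run: int, interval: int = 10) -> int:
    """Smallest run count whose commits reach the first checkpoint version."""
    # Ceiling division: run r ends at version r * commits_per_run (creation is version 0).
    return -(-interval // commits_per_run)


print(checkpoint_file_name(10))
print(first_run_with_checkpoint(commits_per_run=2))  # 5 under the stated assumptions
```

With a different number of commits per run the failing run would shift accordingly, which may explain why other posters in this thread see the error at different points.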

 

Best Regards

 

 

Hi @SergioTorrinha ,

 

thank you for the update.

 

I have resolved the problem in a similar way, multiple times since the weekend, by creating a backup as parquet and recreating the delta table afterward.

 

Unfortunately, after multiple runs the error reappears and the table cannot be read anymore; the last time was 2 hours ago (Mismatched Version error).

 

Nevertheless, maybe the fix is not yet rolled out in my region, so I am waiting a bit.

 

Thank you again !

@Corar 
No problem, glad I could help.

In your case, I think it would be best to open a support ticket. Perhaps the cause of your issue is somewhat different from mine. I don't know for sure when Microsoft rolled out the bug fix, but I would guess it was a couple of days ago, hence my suggestion.

Hope you see your issue fixed. 🙂
Thanks for chipping in.

SergioTorrinha
Resolver II

Hi everyone!

This issue is still occurring on my end, and I have not been contacted by support so far.
Can someone from @Microsoft have a look at this, please? Or, at least, help with the support ticket 2401110020000508?

I have a pipeline that I developed last month, which was working fine until this error was thrown. I am trying to sell Fabric internally, and for that I need this pipeline to be operational; otherwise my 'internal selling operation' will be a failure.


Thank you.

SergioTorrinha
Resolver II

Hi @v-cboorla-msft !

I have had no contact from support since I created the support ticket, and the issue still persists.

I wonder if you could check internally whether it's possible for someone to contact me about this?

Please let me know.

 

Thank you.

v-cboorla-msft
Community Support

Hi @SergioTorrinha 

 

Thanks for using Fabric Community.

Apologies for the issue that you are facing here.

This might require a deeper investigation from our engineering team about your workspace and the logic behind it to properly understand what might be happening. 

Please go ahead and raise a support ticket to reach our support team:

https://support.fabric.microsoft.com/support
Please provide the ticket number here so that we can keep an eye on it.


Thanks.

Hi @v-cboorla-msft !

 

I followed your suggestion and raised a support ticket. The support ticket id is: 2401110020000508

Please keep me posted.

Thank you.
