Re: Databricks very slow

joshua1990 · ‎08-08-2021

Hello all,

I recently got access to a Databricks cluster. In PBI Desktop I use the Azure Databricks connection type to access this data.
Is it possible that Databricks is not primarily intended for importing large tables, but rather for working through DirectQuery? Compared to Oracle, ODBC, or Impala sources, Databricks is really very slow here.

Does anyone know what could be the reason for this?

Would it be helpful to access the Databricks cluster via ODBC?

Using an enterprise gateway and a corresponding dataflow, the refresh is faster than in PBI Desktop.

v-xulin-mstf · ‎08-11-2021

Hi @joshua1990

Maybe you can access the Databricks cluster via ODBC.

Please refer to:

https://mauridb.medium.com/powerbi-and-azure-databricks-193e3dc567a

If you still have any questions, please don't hesitate to let me know.

Best Regards,

Link

Is that the answer you're looking for? If this post helps, please consider accepting it as the solution. Much appreciated!

daxer-almighty · ‎08-08-2021

The answer is, as usual: it depends. It's no secret that the best and fastest data source for PBI is a relational database. But the speed depends on many factors, among them the throughput of the network and the quality of the driver. I have no experience with Databricks as a source for PBI, so I can't really comment on this any more than I have. I can just add that getting data from a plain CSV file is much faster than from other data sources (apart from the relational database mentioned above). It might also be faster to get data from a Parquet file if the data is BIG.
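
To make the network-throughput point concrete, here is a rough back-of-envelope sketch in Python. It only computes a lower bound on transfer time from table size and link speed; the function name and the figures (a 5 GiB table, a 100 Mbit/s link) are hypothetical and chosen purely for illustration. If a measured refresh takes far longer than this bound, the bottleneck is more likely the driver, the connector, or the query itself than the network.

```python
def min_transfer_seconds(size_bytes: float, throughput_bytes_per_s: float) -> float:
    """Lower bound on the time needed to move size_bytes over a link.

    Ignores driver overhead, serialization, and query execution time,
    so a real import will always take longer than this.
    """
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_bytes / throughput_bytes_per_s


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    table_size = 5 * 1024**3            # 5 GiB table
    link_speed = 100 * 1_000_000 / 8    # 100 Mbit/s expressed in bytes per second
    seconds = min_transfer_seconds(table_size, link_speed)
    print(f"At least {seconds / 60:.1f} minutes just to move the bytes")
```

Comparing this bound against the refresh time observed in PBI Desktop and through the gateway gives a first hint of where the time is going, but it is no substitute for actually timing each path.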
