Shadow229
Frequent Visitor

Notebook load data from a Lakehouse that's not default

Hey all,

 

I've loaded my sources and my medallion stages into their own lakehouses for scalability and to keep things tidy from the get-go. I'm now at a point where it would be REALLY useful, within a notebook, to pull data in from a source, run some SQL on it, and load it into a layer. A pretty standard, simple transformation step.

 

I can add those lakehouses to my notebook so they're both there, but I can't get data from any lakehouse that isn't the default for that notebook. If we can't get data from it, why can we add it? Is this something that's coming soon, or am I doing something wrong?

 

When I select 'Load data' on a table in a lakehouse that isn't the notebook's default, it auto-populates the cell, starting with this comment:

 

# With Spark SQL, Please run the query onto the lakehouse which is from the same workspace as the current default lakehouse.
df = spark.sql("SELECT * FROM SilverLayerLakehouse.myTable LIMIT 10")
display(df)
 
and trying to run that cell yields the error:
 
AnalysisException: Table or view not found: SilverLayerLakehouse.myTable; line 1 pos 14; 'GlobalLimit 10 +- 'LocalLimit 10 +- 'Project [*] +- 'UnresolvedRelation [SilverLayerLakehouse, myTable], [], false
 
If I switch it to the default it runs fine, but I need to load from source and store in Silver with some SQL in between. With OneLake I assumed my data was visible from anywhere, and that this was the whole idea behind it, but I can't seem to get my notebook to read anything from a different lakehouse. Am I missing something, or is this functionality coming further down the line?
4 REPLIES
alxdean
Advocate V

P.S. You could also try an absolute approach to loading from the correct lakehouse by using its abfss path:

 

from delta.tables import DeltaTable

tableName = "myTable"  # the table you want to load
lakehousePath = "abfss://yourpathhere"  # abfss path copied from the lakehouse properties
deltaTablePath = f"{lakehousePath}/Tables/{tableName}"
deltaTable = DeltaTable.forPath(spark, deltaTablePath)
deltaTable.toDF().show()

 

You can get the abfss path from the lakehouse: click the triple dots on the Tables node in the left menu and open Properties. If you don't see Properties, make sure you're in Lakehouse mode and not SQL mode.
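If you prefer `spark.read` over the DeltaTable API, the same abfss path works there too. A minimal sketch, where `MyWorkspace` and `SourceLakehouse` are placeholder names (copy the real path from the Properties pane rather than building it by hand):

```python
# Sketch: loading a table from a non-default lakehouse by its ABFSS path.
# "MyWorkspace" and "SourceLakehouse" are placeholder names (assumptions).
workspace = "MyWorkspace"
lakehouse = "SourceLakehouse"
table = "myTable"

# OneLake ABFSS paths generally follow this shape:
lakehouse_path = f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/{lakehouse}.Lakehouse"
delta_table_path = f"{lakehouse_path}/Tables/{table}"

# In a Fabric notebook, `spark` is predefined, so a plain Delta read works:
# df = spark.read.format("delta").load(delta_table_path)
# df.show()
```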

alxdean
Advocate V

From our experience, you can access any lakehouse and warehouse in the same workspace. Cross-workspace access will require shortcuts.
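For the same-workspace case, the two-part name from the auto-generated cell should resolve once the lakehouse is attached. A minimal sketch, using the lakehouse and table names from this thread:

```python
# Sketch: querying a non-default lakehouse with two-part naming
# (lakehouse.table). Per the reply above, this works only when both
# lakehouses live in the same workspace.
query = "SELECT * FROM SilverLayerLakehouse.myTable LIMIT 10"

# `spark` and `display` are predefined in a Fabric notebook:
# df = spark.sql(query)
# display(df)
```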

darshanpv
Regular Visitor

The medallion architecture suggests having different zones within a lakehouse.

If I understand your issue, you could create the file structure (zones) within the workspace, in the same lakehouse.

Having a single lakehouse with all layers contained within it goes against the access and security guidance. Individual lakehouses in their own workspaces allow for role-based / workspace access permissions, keeping Bronze safe, Silver accessible to data scientists, and Gold accessible to reporting analysts.

The solution Microsoft gave me for this was to create shortcuts in the active lakehouse to any data I want to read in. Writing out can be done from anywhere.
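Put together, the read-via-shortcut / write-by-path pattern might look like this sketch; the shortcut name `src_myTable` and the target path are hypothetical:

```python
# Sketch of the pattern described above: read through a shortcut created
# in the notebook's default lakehouse, transform with SQL, then write the
# result to a different lakehouse by ABFSS path. All names are placeholders.
read_sql = "SELECT * FROM src_myTable LIMIT 10"  # src_myTable is a shortcut
target_path = (
    "abfss://MyWorkspace@onelake.dfs.fabric.microsoft.com"
    "/SilverLayerLakehouse.Lakehouse/Tables/myTable"
)

# In a Fabric notebook (`spark` predefined):
# df = spark.sql(read_sql)
# df.write.format("delta").mode("overwrite").save(target_path)
```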
