robkerrdm
Frequent Visitor

Unable to call sparksql methods in VSCode-Fabric integrated Notebook

I'm trying to get my VSCode -> Fabric integration working.  I've been through this thread  many times, and it was instructive and helpful. 

 

-- I've installed the prereqs listed in the docs, have JAVA_HOME setup, conda and java in the path

-- After installing the VSCode synapse extension, on first use it creates the two conda environments (fabric-synapse-runtime-1-1 and fabric-synapse-runtime-1-2), and on reviewing those initialization logs I see no errors

-- I can browse my workspaces and open notebooks
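As a sanity check for the prerequisites above (JAVA_HOME set, java and conda on the PATH), a quick script can confirm what the extension will see. This is a hypothetical helper for illustration only, not part of the extension:

```python
import os
import shutil

def check_prereqs() -> dict:
    """Report the prerequisites mentioned above: JAVA_HOME set, and the
    java/conda executables resolvable on PATH."""
    return {
        "JAVA_HOME": os.environ.get("JAVA_HOME"),
        "java_on_path": shutil.which("java") is not None,
        "conda_on_path": shutil.which("conda") is not None,
    }

if __name__ == "__main__":
    for key, value in check_prereqs().items():
        print(f"{key}: {value}")
```

If `JAVA_HOME` prints as `None` or either `*_on_path` flag is `False`, the extension's kernels won't be able to launch Spark locally.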

 

The problem is when I try to use a spark command: there is a long pause (session starting in Fabric?), followed by an error:

- Example 1-

%%sparksql

SELECT * FROM TableName

--

 UsageError: Cell magic `%%sql` not found.

- Example 2-

df = spark.sql("SELECT * FROM LakehouseName.TableName LIMIT 1000")
--
AttributeError: 'NoneType' object has no attribute 'sql'
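For context on the second error: in the Fabric notebook runtime the `spark` variable is normally pre-populated, and the `AttributeError` on `NoneType` means session startup failed and left it as `None`. A minimal sketch of a guard that surfaces this more clearly (`run_query` is a hypothetical helper, not part of the extension):

```python
def run_query(spark, query: str):
    """Fail fast with a clearer message when the injected `spark` session
    is missing. If session startup failed, `spark` is left as None and any
    attribute access raises the opaque AttributeError shown above."""
    if spark is None:
        raise RuntimeError(
            "Spark session was not initialized - check the SparkLighter / "
            "PySparkLighter logs for startup errors")
    return spark.sql(query)
```

Calling `run_query(None, "SELECT 1")` raises the RuntimeError with a pointer to the logs, instead of `'NoneType' object has no attribute 'sql'`.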
 
I've pulled the logs from the local Fabric notebook folder. There are messages in the logs, but I don't know whether these are "normal" or not, since I've not yet used a system where this feature works. Logs are below.
 
I suspect the issue is in the PySparkLighter.log, but I can't connect the error with any configuration step I may have missed.
 
Failed to initialize Spark Lighter variables. An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. <-- See below for context
 
Any ideas?
 
******************* SparkLighter.log ******************
[WARN ] 2024-01-26 16:52:01.302 [main] Shell: Did not find winutils.exe: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
[INFO ] 2024-01-26 16:52:01.963 [Thread-2] SparkContext: Running Spark version 3.4.1
[WARN ] 2024-01-26 16:52:02.013 [Thread-2] NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[INFO ] 2024-01-26 16:52:02.083 [Thread-2] ResourceUtils: ==============================================================
[INFO ] 2024-01-26 16:52:02.084 [Thread-2] ResourceUtils: No custom resources configured for spark.driver.
[INFO ] 2024-01-26 16:52:02.084 [Thread-2] ResourceUtils: ==============================================================
[INFO ] 2024-01-26 16:52:02.085 [Thread-2] SparkContext: Submitted application: Data Cloud: Spark Lighter Connection 2024-01-26 16:51:58.578904
[INFO ] 2024-01-26 16:52:02.099 [Thread-2] ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
[INFO ] 2024-01-26 16:52:02.104 [Thread-2] ResourceProfile: Limiting resource is cpu
[INFO ] 2024-01-26 16:52:02.105 [Thread-2] ResourceProfileManager: Added ResourceProfile id: 0
[INFO ] 2024-01-26 16:52:02.141 [Thread-2] SecurityManager: Changing view acls to: RobertKerr
[INFO ] 2024-01-26 16:52:02.143 [Thread-2] SecurityManager: Changing modify acls to: RobertKerr
[INFO ] 2024-01-26 16:52:02.143 [Thread-2] SecurityManager: Changing view acls groups to:
[INFO ] 2024-01-26 16:52:02.143 [Thread-2] SecurityManager: Changing modify acls groups to:
[INFO ] 2024-01-26 16:52:02.144 [Thread-2] SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: RobertKerr; groups with view permissions: EMPTY; users with modify permissions: RobertKerr; groups with modify permissions: EMPTY
[INFO ] 2024-01-26 16:52:02.357 [Thread-2] Utils: Successfully started service 'sparkDriver' on port 49959.
[INFO ] 2024-01-26 16:52:02.381 [Thread-2] SparkEnv: Registering MapOutputTracker

 

************* PySparkLighter.log ******************

16:51:36,461 root INFO current_directory c:\Users\RobertKerr\source\CrabShack\4fe867d4-8b9c-451f-94cc-b0b950a52f55\SynapseNotebook\c2564846-6518-4589-8682-db06c92f76f8\Load and Enrich Reviews
16:51:36,461 root INFO workspace_path c:\Users\RobertKerr\source\CrabShack\4fe867d4-8b9c-451f-94cc-b0b950a52f55
16:51:36,462 root INFO log_path c:\Users\RobertKerr\source\CrabShack\4fe867d4-8b9c-451f-94cc-b0b950a52f55\logs\c2564846-6518-4589-8682-db06c92f76f8
16:51:36,462 root INFO Using synapse remote kernel ...
16:51:36,462 root INFO Should attach session in dev mode False
16:51:37,633 root INFO Starting session 0feb87f2-4d8a-4de0-ad12-af0ed67c5a1d...
16:51:38,664 root INFO Getting refresh token...
16:51:41,985 root INFO <session_management.SessionStatus object at 0x0000026C4DCEE830>
16:51:44,201 root INFO Trident session states: starting
16:51:48,750 root INFO Trident session states: busy
16:51:53,240 root INFO Trident session states: busy
16:51:57,654 root INFO Trident session states: idle
16:51:57,656 root INFO Starting session 0feb87f2-4d8a-4de0-ad12-af0ed67c5a1d finished...
16:51:57,656 root INFO Attaching spark session 0feb87f2-4d8a-4de0-ad12-af0ed67c5a1d for Spark Lighter ...
16:51:58,578 root INFO log4j.properties variable LOG_FILE_PATH c:\Users\RobertKerr\source\CrabShack\4fe867d4-8b9c-451f-94cc-b0b950a52f55\logs\c2564846-6518-4589-8682-db06c92f76f8\SparkLighter.log
16:52:02,399 root ERROR Failed to initialize Spark Lighter variables. An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.ExceptionInInitializerError
at org.apache.spark.unsafe.array.ByteArrayMethods.<clinit>(ByteArrayMethods.java:52)
at org.apache.spark.memory.MemoryManager.defaultPageSizeBytes$lzycompute(MemoryManager.scala:306)
at org.apache.spark.memory.MemoryManager.defaultPageSizeBytes(MemoryManager.scala:296)
at org.apache.spark.memory.MemoryManager.$anonfun$pageSizeBytes$1(MemoryManager.scala:315)
at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.memory.MemoryManager.<init>(MemoryManager.scala:315)
at org.apache.spark.memory.UnifiedMemoryManager.<init>(UnifiedMemoryManager.scala:58)
at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:215)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:328)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:199)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:287)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:505)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:75)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:53)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
at py4j.Gateway.invoke(Gateway.java:238)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.lang.IllegalStateException: java.lang.NoSuchMethodException: java.nio.DirectByteBuffer.<init>(long,int)
at org.apache.spark.unsafe.Platform.<clinit>(Platform.java:113)
... 27 more
Caused by: java.lang.NoSuchMethodException: java.nio.DirectByteBuffer.<init>(long,int)
at java.base/java.lang.Class.getConstructor0(Class.java:3761)
at java.base/java.lang.Class.getDeclaredConstructor(Class.java:2930)
at org.apache.spark.unsafe.Platform.<clinit>(Platform.java:71)
... 27 more
 
16:52:02,400 root INFO Registering Spark Lighter magics for IPython...
16:52:02,400 root INFO Registered Spark Lighter magics for IPython.

 

 

1 ACCEPTED SOLUTION

Hi @robkerrdm 
Please note that you should use OpenJDK 8 instead of OpenJDK 21. Download OpenJDK 8 and set the JAVA_HOME environment variable accordingly.

vnikhilanmsft_0-1706610454659.png


You can refer to this document for more information: 
https://learn.microsoft.com/en-us/fabric/data-engineering/setup-vs-code-extension

Hope this helps. Please let me know if you have any further questions.


6 REPLIES
v-nikhilan-msft
Community Support

Hi @robkerrdm 

Thanks for using Fabric Community.

1) We only support %pip in the current version; other magic commands are not supported yet.

2) Can you please confirm that the Java version you are using is OpenJDK 8?

3) Select the right Spark runtime version. You can set the environment for the notebook in the portal; if you use Fabric Spark 1.2, you need to choose the fabric-synapse-runtime-1-2 kernel when running code in VS Code.

vnikhilanmsft_1-1706514764707.png

 



vnikhilanmsft_0-1706514536387.png

 

 

Hope this helps. Please let us know if you have any further questions.

Hi Guys,

 

When will %%sql magic commands start working in VS Code when running Fabric notebooks locally?

 

Thanks for the response, nikhilan!

 

Below is a screen grab of my VS Code selection, Java & conda versions, and notebook selection in the portal. The OpenJDK version is 21.0.2 LTS.

 

I did have 1.2 selected in both the notebook and VSCode (at the time this was the only environment in portal). 

 

I noticed in your screen grab you're using Runtime 1.1. I added an environment for 1.1 and selected it in the Portal and VS Code, but the error is the same.

 

Also noticed in your screen grab you have a magic command in the cell, "%%spar..." (can't read all of it). What is the rest of it, and is it required?

 

Last Q: you mentioned %%sql is not supported. Thanks for letting me know. This limitation isn't stated in the "current limitations" section of the docs. Are there other limitations we should know about?

 

Here's a grab of my notebook (portal), notebook (VSCode), versions.  

 

Thanks!

Rob

 

robkerrdm_0-1706541173109.png

 

 

 


Thanks! Indeed that was the problem. I uninstalled OpenJDK 21, installed version 1.8 instead, and the remote execution now works as expected.
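For anyone who lands here with the same stack trace: Spark 3.4 (the version shown in the SparkLighter log above) documents support for Java 8, 11, and 17, which is why JDK 21 fails during `JavaSparkContext` construction. A small, hypothetical pre-flight check along these lines could catch the mismatch up front (the regex assumes standard `java -version` output; the supported set is from the Spark 3.4 docs):

```python
import re

# Java versions supported by Spark 3.4.x, per the Spark documentation
SPARK_34_SUPPORTED_JAVA = {8, 11, 17}

def java_major_version(version_output: str) -> int:
    """Parse the major version out of `java -version` output, handling both
    the legacy "1.8.0_392" scheme and the modern "21.0.2" scheme."""
    m = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not m:
        raise ValueError("unrecognized java version output")
    major = int(m.group(1))
    if major == 1 and m.group(2):  # legacy scheme: "1.8" means Java 8
        major = int(m.group(2))
    return major

def java_ok_for_spark_34(version_output: str) -> bool:
    """True if the reported JDK is one Spark 3.4 supports."""
    return java_major_version(version_output) in SPARK_34_SUPPORTED_JAVA
```

Feeding it the first line of `java -version` output (note: that command writes to stderr) would have flagged `openjdk version "21.0.2"` as unsupported before any session was started.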

Hi @robkerrdm 
Glad that your query got resolved. Please continue using Fabric Community for any help regarding your queries.
