lk-u1248
New Member

SparkR in Synapse in Fabric

Hi All, 

 

I'm a beginner with Fabric. I'm trying to follow tutorials like https://microsoftlearning.github.io/mslearn-fabric/Instructions/Labs/03-delta-lake.html

 

By default my notebook is in PySpark, so I can run

df = spark.read.format("csv").option("header","true").load("Files/products.csv")
display(df)

as expected. I can also easily do all the basic R things right out of the box, which is amazing:

 

 

%%sparkr

print('hello world')
library(tidyverse)

data.frame(a = 1:3, b = letters[c(1,1,2)], c = Sys.Date() - 1:3) %>%
  group_by(b) %>%
  summarise(n())

 

 

But how do I load my file in the workspace into R? 

 

 

%%sparkr
read_csv("Files/products.csv")

 

 

...returns "Error: 'Files/products.csv' does not exist in current working directory" so I'm guessing I have a different working directory for my R session than PySpark?
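A rough way to picture the mismatch behind that error: Spark resolves a relative path such as Files/products.csv against the notebook's default lakehouse, while an ordinary file reader (readr's read_csv, or Python's open) resolves it against the process working directory. The sketch below is in Python and uses only the standard library. A temporary directory stands in for the lakehouse mount, the helper name to_local_path is made up for illustration, and the mount point /lakehouse/default is an assumption to confirm in your own workspace (for example with the "Copy File API path" option on the file).

```python
import os
import tempfile
from pathlib import Path

# Assumption: the notebook's default lakehouse is mounted at this local path.
DEFAULT_MOUNT = "/lakehouse/default"


def to_local_path(relative_path, mount_root=DEFAULT_MOUNT):
    """Turn a lakehouse-relative path into an absolute local file path."""
    return str(Path(mount_root) / relative_path)


def demo():
    """Compare relative and absolute lookups, using a stand-in mount directory."""
    with tempfile.TemporaryDirectory() as fake_mount, \
            tempfile.TemporaryDirectory() as workdir:
        # Create <fake_mount>/Files/products.csv to play the role of the lakehouse file.
        csv_path = Path(fake_mount) / "Files" / "products.csv"
        csv_path.parent.mkdir(parents=True)
        csv_path.write_text("ProductID,ProductName\n1,Widget\n")

        previous = os.getcwd()
        os.chdir(workdir)  # the session's working directory is somewhere else
        try:
            # Relative lookup: resolved against the working directory.
            relative_found = Path("Files/products.csv").exists()
            # Absolute lookup: resolved against the (stand-in) mount root.
            absolute_found = Path(
                to_local_path("Files/products.csv", fake_mount)
            ).exists()
        finally:
            os.chdir(previous)
        return relative_found, absolute_found


if __name__ == "__main__":
    print(demo())
```

If the mount assumption holds, the R-side equivalent would be to pass the absolute path, for example read_csv("/lakehouse/default/Files/products.csv"), or to stay with Spark's own readers, which accept the relative Files/... form.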

 

For bonus points, how do I load a delta table into R?

 

Thanks very much for any insights! 

1 ACCEPTED SOLUTION

Hi @lk-u1248 ,

I apologize for the misunderstanding. Here are a few examples using SparkR:

 

Lakehouse structure: [screenshot]

 



Example 1:


 

 

%%sparkr

# Load data into a SparkDataFrame from a table

# Method 1:
df <- tableToDF("gopi_lake_house.abc")

display(df)

# Method 2:
results <- sql("SELECT * FROM gopi_lake_house.abc LIMIT 1000")

head(results)

 

 
Example 2:


 

 

%%sparkr

# Load data into a SparkDataFrame from a file

df <- loadDF(
        path        = "Files/raw/Customer.csv",
        source      = "csv",
        header      = "true",
        inferSchema = "true"
      )

display(df)

 

 
Example 3:


 

 

%%sparkr

# Save data into a table from a SparkDataFrame

# New Table
tableName <- "gopi_lake_house.abcd"

data   <- list(
            list(1L, "Raymond", "green",  "apple"),
            list(2L, "Loretta", "purple", "grape"),
            list(3L, "Wayne",   "yellow", "banana")
          )

schema <- structType(
            structField("id",    "integer"),
            structField("name",  "string"),
            structField("color", "string"),
            structField("fruit", "string")
          )

df <- createDataFrame(
        data   = data,
        schema = schema
      )

saveAsTable(
  df        = df,
  tableName = tableName
)

# Verify that the table was successfully saved by
# displaying the table's contents.
display(sql(paste0("SELECT * FROM ", tableName)))

 


Reference documentation:
Tutorial: Work with SparkR SparkDataFrames on Azure Databricks - Azure Databricks | Microsoft Learn

I hope this is helpful. Please let me know in case of further queries.


5 REPLIES
v-gchenna-msft
Community Support

Hi @lk-u1248 ,

Thanks for using Fabric Community.
Unfortunately, I was unable to find any way to save a DataFrame as a table using SparkR, even after searching extensively online. It looks like this cannot be done with SparkR.

I suggest using PySpark to load the data into tables; you can combine PySpark and SparkR in the same notebook.

Code snippet:

 

 

df = spark.read.format("csv").option("header","true").load("Files/year/month/date/sales.csv")
# df now is a Spark DataFrame containing CSV data from "Files/year/month/date/sales.csv".
display(df)

df.write.format("delta").save("Tables/actual_weather")

 

 

 

 

The code above can run alongside your existing code, but make sure it is in a PySpark cell, not a SparkR cell.
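If the PySpark and SparkR cells need to work on the same data, one commonly used handoff is a temporary view: the PySpark cell registers the DataFrame under a name, and the SparkR cell reads it back with sql(). The helper below is a hypothetical sketch (share_with_sparkr is not a Spark or Fabric API) and assumes both cells run against the same Spark session, which is worth verifying in your own notebook. The only real Spark call it relies on is the DataFrame method createOrReplaceTempView.

```python
import re

# Accept only simple SQL identifiers so the generated query stays well-formed.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def share_with_sparkr(df, view_name):
    """Register df as a temp view and return the SparkR line that reads it.

    df is expected to be a PySpark DataFrame; view_name is a name you choose.
    """
    if not _IDENTIFIER.fullmatch(view_name):
        raise ValueError(f"not a simple identifier: {view_name!r}")
    # Real PySpark DataFrame method: exposes df to Spark SQL under view_name.
    df.createOrReplaceTempView(view_name)
    # Line to paste into a %%sparkr cell (assumes a shared Spark session).
    return f'df <- sql("SELECT * FROM {view_name}")'
```

In a PySpark cell this would be called as share_with_sparkr(df, "sales_view"), and the returned line would then be run in a %%sparkr cell.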

Post from Reddit:

He seemed to suggest "you can do all this in R" but then didn't know the specifics and said to use Python anyway.

[Screenshot of the Reddit post]

 




I hope this is helpful. Please let me know in case of further queries.

Sorry, but your message amounts to "use PySpark" while the point of my question is how to use R. I'm afraid your response misses the point, but thank you for your time.


Great, very helpful! Thanks a lot.

Hi @lk-u1248 ,

We haven't heard back from you on the last response and are just checking to see whether your query was answered.
If not, please reply with more details and we will try to help.
