JMCDOWELL
New Member

mssparkutils file operations

I am having difficulty moving files from one location to another using mssparkutils in a Fabric notebook. The error seems random: sometimes the commands execute, other times they return an error. I am moving files from one location in a shortcut to an ADLS Gen2 storage account to another location in the same account, using the full "abfss://.." file path for both the source and the destination folder.

I have confirmed that the documentation for mssparkutils.fs.mv is incorrect. Using @param createPath generates an error saying that parameter createPath is undefined. Passing a value of True does make the function work, but if the destination folder does not exist it is NOT created; the call generates an error instead. I have worked around this by calling mkdirs each time, since it does not return an error if the directory already exists.

These commands are issued in a loop: I ingest every csv file in a folder and move each file to another folder after it is ingested. Periodically the mkdirs or mv command fails with what appears to be a permissions issue that I cannot figure out how to resolve. It seems random. As an example, the error below was thrown on the 5th pass through the loop (specifics replaced with xxx for privacy):

 

If it matters the Storage account shortcut was created using my credentials and I am the one running the notebook.

 

Failed in processing data from raw folder with exception:An error occurred while calling z:mssparkutils.fs.mkdirs. : java.nio.file.AccessDeniedException: Operation failed: "Forbidden", 403, PUT, "http://onelake.dfs.fabric.microsoft.com/xxxFiles/xxx/2024-01-27?resource=directory&timeout=90,"  AuthorizationPermissionMismatch, "This request is not authorized to perform this operation using this permission. RequestId:1ce9a06f-001f-0003-594f-515e63000000 Time:2024-01-27T18:31:55.5350440Z"
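
The loop described above (mkdirs, then mv, once per ingested file) can be sketched with a retry around each call. This is only a mitigation sketch for failures that really are intermittent, not a confirmed fix for the 403. The helper names `with_retry` and `move_ingested_file` are hypothetical, and `fs` stands for any object exposing `mkdirs(path)` and `mv(src, dest)`, which is assumed here to match how `mssparkutils.fs` is called in a Fabric notebook.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retry(op: Callable[[], T], attempts: int = 3, base_delay: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> T:
    """Call op(); on an exception, wait and try again, up to `attempts` total calls."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return op()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error unchanged
            # Exponential backoff: 1x, 2x, 4x, ... the base delay.
            sleep(base_delay * 2 ** (attempt - 1))


def move_ingested_file(fs, src: str, dest_dir: str, **retry_kwargs) -> None:
    """Ensure dest_dir exists, then move src into it, retrying each call.

    `fs` is assumed to behave like mssparkutils.fs (hypothetical wiring).
    """
    with_retry(lambda: fs.mkdirs(dest_dir), **retry_kwargs)
    file_name = src.rstrip("/").rsplit("/", 1)[-1]
    # Pass the full destination file path rather than only the folder.
    dest = f"{dest_dir.rstrip('/')}/{file_name}"
    with_retry(lambda: fs.mv(src, dest), **retry_kwargs)
```

In a notebook this might be called as `move_ingested_file(mssparkutils.fs, src_path, dest_folder)` inside the ingestion loop. If the 403 persists after several attempts, the cause is more likely a real permission problem than a transient one.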

PedroJunqueira
Frequent Visitor

Hi @v-gchenna-msft 

 

I am working in a Fabric Workspace and I want to check if a directory exists in the Files folder and if not I want to create a new directory.

 

I am the admin of the workspace and I can manually create and delete any file or folder in the Files path.

 

However, when the operation is done using the mssparkutils.fs exists and mkdirs methods, I get:

Py4JJavaError: An error occurred while calling z:mssparkutils.fs.exists. : Operation failed: "Bad Request", 400, HEAD,

Is this a known error?

 


 

 

Hi @v-gchenna-msft and @JMCDOWELL 

 

I found the issue I was having.

 

In Fabric, the top-level folders of a workspace are considered "managed OneLake folders".

In the structure of a workspace these are the "Files" and "Tables" folders: /MyLakehouse.lakehouse/Files and /MyLakehouse.lakehouse/Tables

At this level one cannot create or delete folders using mssparkutils; that is only possible from the "second" level inside those folders.

For example: the code below will fail

 

import pandas as pd

data = {'name':['John']}
df = pd.DataFrame(data)

path = 'ManagedLevel'

try:
    if not mssparkutils.fs.exists(path):
        print(f'path does not exit, creating path: {path}')
        mssparkutils.fs.mkdirs(path)
    else:
        print(f'path: {path} already exists')

    print('trying to save pandas df in a just created folder')
    df.to_csv(f'/lakehouse/default/{path}/able_to_save.csv')
    print('save csv successfully')
except Exception as e:
    print(e)

 

It fails because I am trying to create a folder at the managed level, that is, at the same level as Files and Tables.

 

The error is a Bad Request (code 400), not an authorization error:

 

 Operation failed: "Bad Request", 400, HEAD,

 

 

However, if I create a folder below the managed level, the folder is created without problems but saving the pandas dataframe fails.

 

 

import pandas as pd

data = {'name':['John']}
df = pd.DataFrame(data)

path = 'Files/second_level'

try:
    if not mssparkutils.fs.exists(path):
        print(f'path does not exit, creating path: {path}')
        mssparkutils.fs.mkdirs(path)
    else:
        print(f'path: {path} already exists')

    print('trying to save pandas df in a just created folder')
    df.to_csv(f'/lakehouse/default/{path}/able_to_save.csv')
    print('save csv successfully')
except Exception as e:
    print(e)

path does not exit, creating path: Files/second_level
trying to save pandas df in a just created folder
[Errno 2] No such file or directory: '/lakehouse/default/Files/second_level/able_to_save.csv'

 

 

I think this is because of the rule that you can perform CRUD (Create, Read, Update and Delete) operations on any folder or file created within these managed folders, but only read-only operations on workspace and item folders.
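
The [Errno 2] in the second example is raised by the local write through /lakehouse/default, not by mssparkutils. One possible workaround, offered only as a sketch, is to create the directory through the same local path immediately before writing. It assumes that the default lakehouse is mounted at /lakehouse/default (the path used in the examples above) and that standard Python file APIs such as `os.makedirs` work through that mount; the helper name `prepare_local_target` is hypothetical, and this has not been verified as the cause of the failure.

```python
import os


def prepare_local_target(relative_dir: str, file_name: str,
                         mount_root: str = "/lakehouse/default") -> str:
    """Create relative_dir under mount_root if needed and return the full file path.

    mount_root defaults to the local mount used in the examples above;
    that default is an assumption about the notebook environment.
    """
    target_dir = os.path.join(mount_root, relative_dir.strip("/"))
    # exist_ok=True means no error is raised if the directory already exists.
    os.makedirs(target_dir, exist_ok=True)
    return os.path.join(target_dir, file_name)
```

With this helper, the save step would become `df.to_csv(prepare_local_target('Files/second_level', 'able_to_save.csv'))`.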

 

Finally, for a folder within second_level, everything works:

 

 

import pandas as pd

data = {'name':['John']}
df = pd.DataFrame(data)

path = 'Files/second_level/third_level'

try:
    if not mssparkutils.fs.exists(path):
        print(f'path does not exit, creating path: {path}')
        mssparkutils.fs.mkdirs(path)
    else:
        print(f'path: {path} already exists')

    print('trying to save pandas df in a just created folder')
    df.to_csv(f'/lakehouse/default/{path}/able_to_save.csv')
    print('save csv successfully')
except Exception as e:
    print(e)

 

 

 

path does not exit, creating path: Files/second_level/third_level
trying to save pandas df in a just created folder
save csv successfully
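
The three outcomes above differ only in how deep the path sits relative to the managed folders. A small check, sketched here under the assumption that paths are given relative to the lakehouse root as in the examples, makes that depth explicit before calling mkdirs. The function name `depth_below_managed_folder` is hypothetical and not part of mssparkutils.

```python
from typing import Optional

# The two managed folders named in this thread.
MANAGED_FOLDERS = ("Files", "Tables")


def depth_below_managed_folder(path: str) -> Optional[int]:
    """Return how many levels `path` sits below Files/ or Tables/.

    Returns None when the path does not start with a managed folder,
    and 0 when the path is the managed folder itself.
    """
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] not in MANAGED_FOLDERS:
        return None
    return len(parts) - 1


for candidate in ("ManagedLevel", "Files/second_level", "Files/second_level/third_level"):
    print(candidate, depth_below_managed_folder(candidate))
```

In the tests above, the path with no managed prefix failed with the 400 error, depth 1 allowed mkdirs but not the local save, and depth 2 worked end to end.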

 

 

 

v-gchenna-msft
Community Support

Hi @JMCDOWELL ,

Apologies for the delay in replying.
As I understand it, you are facing some issues while working with mssparkutils.fs.mv(source_path, destination_path).

Can you help me understand in which scenarios you are facing the above issue?
What is your source path? What is the destination path? Are there any other observations?

I can help you better if you can provide a few more details about your scenario and the issue.

Hello @JMCDOWELL ,

We haven’t heard back from you since our last response and are checking in to see whether you have found a resolution yet.
If you have, please share it with the community, as it can be helpful to others.
Otherwise, please respond with more details and we will try to help.

