rabihbadr
Frequent Visitor

Most convenient storage with Powerbi Service

Hello,

I have a slowly growing dataset consisting of JSON files (358 files) totaling about 9 GB; the largest file is 150 MB.

I currently copy them from a server to OneDrive for Business on a daily basis via a scheduled Graph API Python script; they are then pulled into my dataset on my PPU workspace during the daily refresh.

 

This worked until I started getting the error "An existing connection was forcibly closed by the remote host" in Power BI Desktop, while making changes to my Power Query after Apply & Close.

I learned that this happens because the server (OneDrive/SharePoint) cannot handle multiple long copy sessions (and yes, my internet is slow).

I am thinking of moving to another storage solution instead of OneDrive, and I am considering Azure Storage. That way I can copy my files to a blob container daily via the REST API, and my dataset can then read them from Blob Storage.

I welcome any comments, tips, and advice.

1 ACCEPTED SOLUTION
hackcrr
Solution Sage

Hi, @rabihbadr 

Given the problems you've experienced with OneDrive and the need for a more robust solution to handle large and slow-growing JSON files, migrating to Azure Storage is a smart decision. Azure Storage, specifically Azure Blob Storage, offers a variety of benefits, including scalability, reliability, and the ability to handle large files and frequent access.

Steps to migrate to Azure Blob Storage:
1. Create an Azure Storage account if you do not already have one, and create a blob container in it to store the JSON files.
2. Upload the files to Azure Blob Storage:
   - For a one-off manual upload, use Azure Storage Explorer.
   - For automation, use the Azure Blob Storage REST API or an Azure SDK (for example, the Azure Storage Blob SDK for Python).

For example, modify your Python script to upload the files to Blob Storage:

from azure.storage.blob import BlobServiceClient
import os

connect_str = "your_connection_string"  # obtain from the Azure portal (Access keys)
container_name = "your_container_name"
local_folder_path = "path_to_your_local_folder"

# Connect to the storage account and the target container
blob_service_client = BlobServiceClient.from_connection_string(connect_str)
container_client = blob_service_client.get_container_client(container_name)

# Upload every JSON file in the folder, overwriting any existing blob of the same name
for filename in os.listdir(local_folder_path):
    if filename.endswith(".json"):
        local_file_path = os.path.join(local_folder_path, filename)
        blob_client = container_client.get_blob_client(filename)

        with open(local_file_path, "rb") as data:
            blob_client.upload_blob(data, overwrite=True)

print("Upload complete.")
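Since only a few files change each day, the loop above could skip files that are already up to date instead of re-uploading all 9 GB. A minimal sketch of that decision logic (the `needs_upload` helper is illustrative, not part of the Azure SDK):

```python
from datetime import datetime, timezone
from typing import Optional


def needs_upload(local_mtime: float, remote_last_modified: Optional[datetime]) -> bool:
    """Return True when the local file is newer than the stored blob,
    or when the blob does not exist yet."""
    if remote_last_modified is None:
        return True  # no blob yet: always upload
    local_dt = datetime.fromtimestamp(local_mtime, tz=timezone.utc)
    return local_dt > remote_last_modified


# A file modified after the blob's last upload should be re-sent
blob_time = datetime(2024, 6, 1, tzinfo=timezone.utc)
local_time = datetime(2024, 6, 2, tzinfo=timezone.utc).timestamp()
print(needs_upload(local_time, blob_time))  # newer local file -> True
```

In the real script, `local_mtime` would come from `os.path.getmtime(local_file_path)` and `remote_last_modified` from `blob_client.get_blob_properties().last_modified`, wrapped in a try/except for blobs that do not exist yet.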

3. Data transfer using AzCopy
Download and install AzCopy (https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10), then use the following command to upload the files to Azure Blob Storage:

azcopy copy "local_folder_path\*.json" "https://<storage_account>.blob.core.windows.net/<container_name>?<SAS_token>" --recursive
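If you prefer the raw REST API route mentioned in step 2 over the SDK, each file maps to a single Put Blob request authorized with a SAS token. A sketch of how such a request is formed, using only the standard library (the account, container, and token values are placeholders, and the request is only built here, not sent):

```python
import urllib.request


def build_put_blob_request(account: str, container: str, blob_name: str,
                           sas_token: str, body: bytes) -> urllib.request.Request:
    """Build (but do not send) a Put Blob request for Azure Blob Storage."""
    url = f"https://{account}.blob.core.windows.net/{container}/{blob_name}?{sas_token}"
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "x-ms-blob-type": "BlockBlob",       # required by the Put Blob operation
            "Content-Type": "application/json",  # our files are JSON
        },
    )


req = build_put_blob_request("mystorageacct", "json-files", "data.json",
                             "sv=...&sig=...", b'{"example": true}')
print(req.get_method(), req.full_url)
```

Sending it would then be a matter of `urllib.request.urlopen(req)` (or using `requests`), with retries around the call for a slow connection.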

Since your data grows slowly and incrementally, consider using incremental refresh in Power BI to reduce refresh time. Implement logging in your Python script to record successful uploads and any errors; Azure Blob Storage also provides detailed diagnostic logs that can help with troubleshooting. And because your internet connection is slow, AzCopy (above) is worth trying first: it is a command-line utility designed specifically for high-performance uploads and downloads.
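The logging suggestion can be as simple as wrapping each upload in a try/except block. A minimal sketch, where the `upload` callable stands in for the real `blob_client.upload_blob(...)` call:

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("blob_upload")


def upload_all(files, upload):
    """Attempt to upload each file, logging successes and failures.

    `upload` is any callable taking a filename; in the real script it would
    open the file and call blob_client.upload_blob(...).
    Returns (uploaded, failed) filename lists for an end-of-run summary.
    """
    uploaded, failed = [], []
    for name in files:
        try:
            upload(name)
            log.info("uploaded %s", name)
            uploaded.append(name)
        except Exception:
            log.exception("failed to upload %s", name)
            failed.append(name)
    return uploaded, failed


# Example with a fake uploader that rejects one file
def fake_upload(name):
    if name == "bad.json":
        raise IOError("connection reset")


ok, bad = upload_all(["a.json", "bad.json", "b.json"], fake_upload)
print(ok, bad)
```

Failed files stay in the `failed` list, so the next scheduled run (or a retry loop) can pick them up instead of the whole batch failing silently.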
By migrating to Azure Blob storage, you'll benefit from better performance, scalability, and more reliable data management, which should solve the problems you're experiencing with OneDrive.

 

 

Best Regards,

hackcrr

If this post helps, please consider accepting it as the solution to help other members find it more quickly.


