Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

Reply
messengineer2
Frequent Visitor

Speed of linking Power BI to SharePoint via SharePoint.Files versus SharePoint.Tables

My company has a SharePoint for file attachments for an application.
There are almost 80,000 files taking up 111 GB.

I am trying to get a list in Power BI of all the files, their size, their created/modified date plus two custom SharePoint columns associated with each file to classify it. I don't care about the actual contents of the file. The SharePoint ID is a bonus but not essential.

If I link via SharePoint.Files I can get all the required file information but I cannot see the two custom SharePoint columns.
The only columns are Content (which I dont need), Name, Extension, Date Accessed (which I don't need), Date Modified, Date Created, Attributes (I can get file size from here), and Folder Path.
The advantage of this method if speed ... the entire Query Refresh takes under 2 minutes and it does it by rows.

If I link via SharePoint.Tables it displays the two custom SharePoint columns I want, plus the ID, Created, Modified, and a File column which I can expand to get the file name, size and extension. There is a huge number of other columns which can be expanded most of which I don't care about. There are URL links to the file itself which could be useful.
The problem with this is speed ... it sits there for over 20 minutes progressively loading heaps of MB and I cancelled the refresh as it was going to be too slow.

Am I missing something?
Is there a hybrid method between these two where I can also read the custom SharePoint column values associated with each file ... without the time blowing out?

Edit : I am aware of the 3rd Option Sharepoint.Contents but have not managed to try that yet.

1 ACCEPTED SOLUTION
messengineer2
Frequent Visitor

The end result of this was I had to use two methods to get the speed to be acceptable.
Step 1
I used OData method to get everything I wanted apart from File Size.
Then create a Primary Key based on Folder and File Name.
Step 2
I used SharePoint Files method to get half the data but also the File Size.
Then create a Primary Key based on Folder and File Name.
Step 3
Merge Step 1 and Step 2 to get the full data set.


This query took only about 10 minutes to run on Power BI Desktop.
Every other method timed out after 4 hours.

View solution in original post

6 REPLIES 6
messengineer2
Frequent Visitor

The end result of this was I had to use two methods to get the speed to be acceptable.
Step 1
I used OData method to get everything I wanted apart from File Size.
Then create a Primary Key based on Folder and File Name.
Step 2
I used SharePoint Files method to get half the data but also the File Size.
Then create a Primary Key based on Folder and File Name.
Step 3
Merge Step 1 and Step 2 to get the full data set.


This query took only about 10 minutes to run on Power BI Desktop.
Every other method timed out after 4 hours.

christinepayton
Super User
Super User

I like the OData connector for getting custom library column data. Here's a tutorial on it for SP: https://youtu.be/yeIQ9jKln_I?si=9Mi4AUvVqYtzrsCZ

I've used this on about 100k files; the file sizes shouldn't matter if you are just trying to pull metadata. 

Thanks @christinepayton that works a lot faster than the other method.

Via the SharePoint.Files method (which is the fastest) I can't get out the custom columns or the id ... but I can get all the data I want including file size.

Via the OData method (which is slower but under 5-10 minutes) I seem to be able to get everything I need ... including the custom columns and id .... but I can't see where to get the file size. Is this included somewhere?

I may end up having to merge the two query results using Folder Path and File Name as Primary Key if there is no other alternative.

Yeah, you're right... I would have said just use the SPO List connector, because that will get the fields and the filesize column, but I just tested it now and the file size column is blank for some reason on my library (?! idk). 

lbendlin
Super User
Super User

Create a custom view that only contains the columns you are interested in. Consume that view.

Thanks.
I only have read access to the SharePoint site so I don't think I have the required rights to do that. I will talk to the team next week.
Create a custom view of a document library - Microsoft Support

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel1

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.