Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
hjaf
Advocate I
Advocate I

SharePoint list query alternative or optimization

Hello everyone!

 

Because sharepoint list queries quickly comes too slow to work with in Power BI, I have experimented with the query that sharepoint generates for excel. This is vastly more effective in Excel compared to getting the same information using the sharepoint query in Power BI. I think the excel export query is basically a query of the list view, where all the lookup values are flattened.  I can't seem to find the way to recreate this query in Power BI, It looks like its using a OLE DB method, but not sure how to continue from there, and some posts suggest that this is not supported by PowerBI.
Connectionstring for excel: "Provider=Microsoft.Office.List.OLEDB.2.0;Data Source="";ApplicationName=Excel;Version=12.0.0.0"
Anyone have experience dealing with this?

 

I think the reason for queries to sharepoint-resources become extremely slow, is due to all the lookup and choice columns, to get the value of the column I have to expand them. An alternative to exploring these view queries is optimizing the queries, suggestions are most welcome! 😄

1 ACCEPTED SOLUTION
mahoneypat
Employee
Employee

@hjaf FYI that I finally made a video to describe this approach, and am adding it here for others that may find this post.  It also gets the count of items and makes the right number of API calls.

Get SharePoint List Data with Power BI ... Fast - YouTube

 

Also, a reminder to mark one of these as the solution.

 

Regards,

Pat





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


View solution in original post

39 REPLIES 39

Thank you, your video helped me. After I moved to version 2 everything was faster. MicrosoftTeams-image.png

If you think version 2 is fast, wait until you try the "good" method using the REST API.  I have a list with more than 150,000 items, and it refreshes in about 8 seconds.

 


CmdrKeene

Thank you times a bazillion.


CmdrKeene
mahoneypat
Employee
Employee

Sharepoint List data sources can be slow.  Fortunately, there is a much faster way.  Try a blank query with this formula as your Source.

= Json.Document(Web.Contents("https://<YourTenantName>.sharepoint.com/sites/<YourSiteName>/_api/web/lists/GetByTitle('<YourListNam...", [Headers=[Accept="application/json"]]))

 

Replace all the parts in < >, including the < >.  You will get a JSON response.

 

There is another version that does pagination if your list is >5000 items.  Please let me know if you need that one.

 

If this solution works for you, please mark it as the solution.  Kudos are appreciated too.  Please let me know if not.

Regards,

Pat

 

 

 





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


@mahoneypat Awesome!
yes, I most definately need to use pagination 🙂 near 20k items in the lists 🙂 

Replace after the ? with the following

 

?$skipToken=Paged=TRUE%26p_ID=30&$top=5000", [Headers=[Accept="application/json"]]))

 

I would make a list with = {0, 5000, 10000, 15000, 20000}  or something more dynamic for when the list gets bigger.  Convert that to a table and add a custom column that concatenates the list value in place of the 30 in red text above.  Then expand the table to get all your data.

 

Sharepoint Lists can be slow.  This approach has saved much refresh time.

 

If this works for you, please mark it as solution.  Kudos are appreciated too.  Please let me know if not.

Regards,

Pat

 





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


@mahoneypat  you are a true hero! Thank you for the video and for the comment! Worked for me like a charm 😊

@munchkin666 Please see this article I wrote on this topic.

Updated – Get SharePoint List Data … Fast – Hoosier BI

 

Pat





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


@mahoneypat  you are awesome!

I have tried replying and asking for guidance only to have my post marked as spam. But I eventually came up with a solution. Do you concur with this? Quote or correct it, then I'll mark mark your reply as a solution so that other people may get the whole picture 🙂

The query i ended up at that seems to be working(replaced RED values):

  • Column1 values is created with:  List.Generate(() => 0each _ < 120000each _ + 5000)
  • SPItems: Json.Document(Web.Contents("https://TennantShortName.sharepoint.com/sites/SiteName/_api/web/lists(guid'ListGUID')/items?$skipToken=Paged=TRUE%26p_ID="&Text.From([Column1])&"&$top=5000", [Headers=[Accept="application/json"]])))


Complete commented query:

 

let
    Source = List.Generate(() => 0, each _ < 120000, each _ + 5000),  // Generate a list that increments 5000 up to max value 120 000
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error), // Convert the list into a table
    #"Added Custom" = Table.AddColumn(#"Converted to Table", "SPItems", each Json.Document(Web.Contents("https://<TennantShortName>.sharepoint.com/sites/<SiteName>/_api/web/lists(guid'<ListGUID>')/items?$skipToken=Paged=TRUE%26p_ID="&Text.From([Column1])&"&$top=5000", [Headers=[Accept="application/json"]]))), //custom column that does the actual query to sharepoint 
    //Note: replace <TennantShortName>, <SiteName> and <ListGUID> 
    //Instead of using list guid, you can use list names with GetByTitle(): "https://<TennantShortName>.sharepoint.com/sites/<SiteName>/_api/Web/Lists/GetByTitle('<List Title>')/items?$skipToken=Paged=TRUE%26p_ID="&Text.From([Column1])&"&$top=5000"
    #"Expanded SPItems" = Table.ExpandRecordColumn(#"Added Custom", "SPItems", {"odata.metadata", "odata.nextLink", "value"}, {"odata.metadata", "odata.nextLink", "value"}), //expand the results 
    #"Removed Duplicates1" = Table.Distinct(#"Expanded SPItems", {"odata.nextLink"}), // Remove the duplicated nextLink items to get unique items
    #"Expanded value" = Table.ExpandListColumn(#"Removed Duplicates1", "value"), // Expand the results from queries into new rows
    #"Expanded value1" = Table.ExpandRecordColumn(#"Expanded value", "value", {"Id", "Title"}), // Expand wanted columns in the sharepoint list
    #"Removed Columns" = Table.RemoveColumns(#"Expanded value1",{"Column1", "odata.metadata", "odata.nextLink"}),  // Remove columns from the initial query.
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Id", Int64.Type}, {"Title", type text}}) // Type setting
in
    #"Changed Type"

 




I really like this method, it is super fast and super easy to configure. However, I have found that whilst using this method, it doesn't pick up all the fields. I have tried expanding all the rows and columns, selecting all values but if simple does not pick up the one field I need. I really don't want to use a v2 connector as it is slow and terrible. 

 

Any help would be appreciated. 

I can try to poke around and see if I can find a solution. What type of field is missing? 

 

I will mention that person fields did not come in in a format that was great for me, so in my power query I also pulled in the user information table and then used the user ID to relate to it.

 

 


CmdrKeene

The field in questrion is called Name, which appears to be a text field. This field is required as it links back to a SharePoint attachment so I wish to use the values contained to connect to the corresponding attachment via its URL. 

 

The field in question IS exposed if I use a v2 connector, but I did not want to go down this route and would much rather use your solution. 

That's really odd, for me all the text fields just show up.  Is it maybe a lookup field on the front-end of SharePoint?  Or you're sure it's either "single line of text" or "multiple lines of text" on the list itself?


CmdrKeene

Hello CmdrKeene,

 

I hope you are well?

 

Thank you for looking into this. I have spoken to the developer and have been advised that the name is indeed a lookup. Apparently it does not appear on the underlying dataset, which I understand. It's a shame as the v2, whilst extremely slow, does pick up the field. Maybe because it is interogating the UL rather than the data itself?

 

Is there a way to GetByForm or GetByURL?

I can't think of a great workaround for lookup field really. Is it possible to retreive the data and then use a merge or relationship to perform the lookup in the query?


CmdrKeene

Not sure, but I shall certainly go back to the developers. 

 

Thanks for the info Cmdr, you need a promotion to CAPT.

That looks good to me. I'm glad it works for you. I'm curious, how much of an upgrade time improvement did you see?

I wanted to write a blog with this hoosierbi.com (my blog about making Power BI pro bono with non-profit benefits). Your question exactly on this subject. I ended up doing a query version and function of it which makes it easier to modify/use. The function could be used if there were multiple lists that had the same columns and one had a tenant table, site, list (tenant, of course, would not change within a company).

Here they are:

As a function -

Leave

Origin (tenant name, name, name, and so on) >

Leave

site : site name,

tenant - tenant's name,

list : list name,

getdata á Json.Document(Web.Contents("https://" & tenant & ".sharepoint.com/sites/" & site & "/_api/web/lists/GetByTitle('" & list & "')/items?$top-5000", [Headers-[Accept"application/json"]]))

In

Getdata

In

Source

As a consultation

Leave
Source?

Leave

"NameOfMySite" website,

tenant : "NameOfMyTenant",

list of names "NameOfMyList",

getdata á Json.Document(Web.Contents("https://" & tenant & ".sharepoint.com/sites/" & site & "/_api/web/lists/GetByTitle('" & list & "')/items?$top-5000", [Headers-[Accept"application/json"]]))

In

Getdata
In
Source

If this works for you, mark it as a solution. Praise is also appreciated. Please let me know if you don't.

Best regards

Pat





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


I probably did a lot sub-optimal steps in the previously used standard method, so I went from literally 3-4 hours, down to less than 3 minutes! But even just getting the raw data with the traditional query method still took 1+ hour, I believe its due to a lot of choice-columns and lookups in the list.

 

It would be interesting to create a function that determines the maximum ID and automatically get all the items. I think there is some room for further optimizing, regarding the overlapping queries this method produces. because the first 5k items have IDs ranging from 5k to 80k means that the first 16-ish queries will basically return the same data, but anyways. going down from hours to mere minutes is a giant leap I am very satisfied with 😄 Thank you again @mahoneypat

PS Updates to previous query: I realized did not clear out all the duplicates, so I added an additional duplication removal on IDs. I also put tennantId, sitename and list into parameters which made it a bit easier to configure:) 

Ok, this is promising! 

 

I successfully got the first 5000 item in a breeze, however, the way you suggest iterating / paging the query is somewhat unclear to me.
In my case, the lowest ID starts at 5495, item # 5000 has ID around 80k. The "density" of the id range varies because of creations and deletions over time, and that sharepoint doesn't immediately re-use ID's. I also noticed that sharepoint provides an odata.nextLink value for the next 5k items, maybe I can somehow  create an iteration that continues until this property is not appearing. Right now this appears to happen at about 120 000 (even though the item list contains just around 20k items).

 

I can probably use a list as you suggested and have the list stop at 200 000 in 5000 increments, but not exactly sure how to make a query that iterates around the this list. can you provide an example?

PS:
I had to add "/items?" in the uri so the current uri for web.contents() in my case is as follows, could you use these uris in your answer so that when I tag the post as a solution, other people would have a better understanding 🙂


= Json.Document(Web.Contents("https://<TennantShortName>.sharepoint.com/sites/<SiteShortName>/_api/Web/Lists/GetByTitle('<List Title>')/items?%24skiptoken=Paged%3dTRUE%26p_ID%3d<StartAtId>&%24top=5000" , [Headers=[Accept="application/json"]]))

or using list guid the url looks like this :
https://<TennantShortName>.sharepoint.com/sites/<SiteShortName>/_api/web/lists(guid'<item list guid>')/items?%24skiptoken=Paged%3dTRUE%26p_ID%3d<StartAtId>&%24top=5000

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.