Anonymous
Not applicable

Web.Contents Makes Too Many Calls To API, Buffering does not work

I am writing a custom connector to interface with an API, and I'm running into some incredibly frustrating issues with the way Web.Contents behaves. The connector works as follows:

  1. Supply a list of IDs (referencing server-stored objects).
  2. Supply a username and password (the authentication kind is UsernamePassword).
  3. The username and password are combined into a short-lived auth token, which is sent to a /login endpoint; the server returns a longer-lived auth token that is used later.
  4. Using the new auth token from the /login response, make a dynamic number of calls (dependent on other information returned by /login) to an endpoint that generates CSV files. Most of these CSVs are relatively small (~100 rows, 5 columns), with one large CSV at the end (~400,000 rows and ~10 columns).
  5. All of the CSVs are then processed (promote headers, change some column types, etc.) and placed into a navigation table, which is shown to the user.

The connector looks something like this:

shared Connector.Contents = (IDs as list) =>
let
    longLivedAuthToken =
        let
            credential = Extension.CurrentCredential(),
            shortLivedAuthToken = credential[Username] & credential[Password],
            response = Binary.Buffer(Web.Contents("/Login", [Headers = [Authorization = shortLivedAuthToken]])) // This does not get cached
        in
            Json.Document(response)[longLivedAuthToken],
    BigCSV =
        let
            // It looks like the next line doesn't use the *value* longLivedAuthToken,
            // but rather replaces the reference with the actual call (think a C macro)
            csv = Csv.Document(Binary.Buffer(Web.Contents("/BigCSVEndpoint", [Headers = [Authorization = longLivedAuthToken]]))), // Neither does this get cached
            promoted = Table.PromoteHeaders(csv),
            named = CustomRenameFunction(promoted, Table.First(promoted))
        in
            named,
    manySmallCSVs =
        let
            // Same macro-like behaviour with the token reference here ("id" is a placeholder query parameter)
            downloaded = List.Transform(IDs, each Csv.Document(Binary.Buffer(Web.Contents("/SmallCSVEndpoint", [Headers = [Authorization = longLivedAuthToken], Query = [id = _]])))), // Or this
            promoted = List.Transform(downloaded, each Table.PromoteHeaders(_)),
            renamed = List.Transform(promoted, each CustomRenameFunction(_, Table.First(_)))
        in
            renamed,
    navtable =
        let
            allCSVs = List.Combine({manySmallCSVs, {BigCSV}}),
            table = Table.GenerateNavigationTableFromList(allCSVs)
        in
            table
in
    navtable;

My issue is twofold:

Firstly, I can see in the server logs that 7-8 login attempts are made for each CSV in the import (in general there are 7-10 CSVs). This eventually causes the /login endpoint to return a 409 Conflict error (too many concurrent users of the endpoint), which makes the whole import fail. I have tried buffering the result of the login call by wrapping the Web.Contents call in Binary.Buffer(), but it does not seem to have any effect.

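For what it's worth, the pattern I expected to work looks roughly like this (a minimal sketch; the URL, the Authorization header scheme, and the longLivedAuthToken field name are placeholders from my connector):

GetToken = () =>
let
    credential = Extension.CurrentCredential(),
    shortLivedAuthToken = credential[Username] & credential[Password],
    // Binary.Buffer should force one full read of the response so that
    // repeated uses of the result don't re-issue the request
    raw = Binary.Buffer(
        Web.Contents("https://example.com/Login",
            [Headers = [Authorization = shortLivedAuthToken]])),
    token = Json.Document(raw)[longLivedAuthToken]
in
    token;

Even with this, my understanding is that a buffer only lives for the duration of a single query evaluation, which would explain why it doesn't stop the generated per-table queries shown further down from logging in again.
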
Secondly, the first issue also manifests with the CSV downloads. Hitting these endpoints from a browser generally takes ~2 seconds to download a file in a single request, but for some bizarre reason importing them through the connector takes minutes, and on the server side there are literally hundreds of requests being made. (It looks like Power BI is downloading them in small chunks and making subsequent requests to get the next chunk? This is not programmed behaviour on the API side, but most of the requests contain the header 'Expect: 100-continue', which seems to indicate this is what is happening.)

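The only client-side mitigation I can think of is to force the whole response to be materialized before the CSV parser touches it, on the assumption that streaming parsers can re-enumerate (and therefore re-request) their source. A sketch, with GetBufferedCsv as a hypothetical helper and the URL a placeholder:

GetBufferedCsv = (url as text, token as text) as table =>
    // Binary.Buffer pulls the whole response into memory in one go;
    // Table.Buffer then keeps the parsed rows so that later steps
    // (promote headers, renames, previews) don't re-enumerate the source
    Table.Buffer(
        Csv.Document(
            Binary.Buffer(
                Web.Contents(url, [Headers = [Authorization = token]]))));
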
I have found that disabling parallel loads avoids the 409 errors from the /login endpoint, but that is only because the calls happen in series rather than in parallel; the same total number of calls is still made, and the whole import takes much longer (it scales with the number of CSVs).

An interesting thing that I've noted, though, is that there seem to be three distinct phases to the import. The first phase occurs immediately after supplying the IDs and credentials, and results in ~3 calls to the /login endpoint and a similarly small number of calls to the CSV endpoints. This phase ends when the preview window opens with all the CSV files in the navigation table unchecked. The names of these tables are derived from specific cells inside the tables, so they must have been loaded into memory for the names to be pulled out (and the endpoints send the CSVs in full). At this point I would expect that all the necessary requests have been made and that no further API calls are needed (otherwise how could the tables be dynamically named based on data inside them?).

Now when I start checking the boxes for the tables, phase 2 starts, and calls start being made again for what I assume is preview data. Why does this have to happen? Shouldn't the data already be stored/cached?

Phase 2 ends and phase 3 starts when I click the Load button. Doing this causes all the endpoints to start being hit again, accounting for a further 4-5 calls per CSV, which finally overwhelms the /login endpoint and the load errors out.

I managed to eke out an error message from phase 3 that looks something like this:

Formulas:

section Section1;

shared _csv1 = let
    Source = Connector.Contents("471516961986314240"),
    _csv11 = Source{[Key="_csv1"]}[Data]
in
    _csv11;

shared _csv2 = let
    Source = Connector.Contents("471516961986314240"),
    _csv21 = Source{[Key="_csv2"]}[Data]
in
    _csv21;

shared _csv3 = let
    Source = Connector.Contents("471516961986314240"),
    _csv31 = Source{[Key="_csv3"]}[Data]
in
    _csv31;
...

This error message contains one shared _csvX = let... block for each checkbox that I checked. What it looks like it's doing internally is calling the connector (which fetches all the CSVs) once for every CSV, which again makes no sense, because the CSVs have already been imported. So if the import produces 7 output CSVs, this explicitly fetches 7*7 = 49 CSVs. This ~n^2 relationship coincides with the number of login attempts I am seeing. I'm assuming this is not intended behaviour, and that Source = Connector.Contents("...") should be cached from the initial import that generated the navtable I interacted with, or at least that only one more call to it should be made. It's impossible to find out what is and isn't being cached, and the lazy evaluation model means that even if I write Binary.Buffer(), which should keep an in-memory copy so that subsequent uses are served from that copy rather than by calling Web.Contents again, the call to Binary.Buffer itself seems to be deferred until the contents are actually needed. That means the buffering can happen outside the initial scope where caching the information would have been useful.

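Since each generated shared query apparently re-runs Connector.Contents from scratch, the only structural fix I can think of is to have the connector download nothing up front and defer each CSV behind its navigation row, so that selecting N tables costs N downloads rather than N*N. A rough sketch; GetOneCsv is a stand-in for my fetch helper, and Table.ToNavigationTable is the usual helper from the connector samples, not a built-in:

shared Connector.Contents = (IDs as list) =>
let
    // Build one row per ID. Record fields are lazy, so GetOneCsv(id)
    // only runs when that row's [Data] is actually expanded or loaded.
    rows = List.Transform(IDs, (id) =>
        [Key = id, Name = id, Data = GetOneCsv(id),
         ItemKind = "Table", ItemName = "Table", IsLeaf = true]),
    asTable = Table.FromRecords(rows),
    navTable = Table.ToNavigationTable(asTable, {"Key"}, "Name", "Data",
        "ItemKind", "ItemName", "IsLeaf")
in
    navTable;

This wouldn't fix the repeated logins, but it should at least turn the 7*7 downloads into 7.
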
I think that this is essentially what is going wrong:

let
    rand = List.Random(2),
    x = rand,
    y = rand,
    z = rand
in
    {x{0},y{0},z{0}}

In this example, you would expect that rand gets evaluated into a list containing two random numbers, and that x, y, and z would then reference that list. The output would then be a list containing three copies of the same number.

This is not the case. The evaluation of List.Random(2) is put off until it is absolutely needed, and the evaluation model generates something like this as the actual output: {(List.Random(2)){0}, (List.Random(2)){0}, (List.Random(2)){0}}. For a dynamic call like this you end up with three different numbers.

If you add a buffer and change the code to 

let
    rand = List.Buffer(List.Random(2)),//Buffer this now
    x = rand,
    y = rand,
    z = rand
in
    {x{0},y{0},z{0}}

Then the output generated is a list containing three identical numbers. 

It seems as if, in my connector, buffering Web.Contents does not have the same effect, and I'm left with a waterfall of API calls that should not need to be made.

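For comparison, this is what I expected Binary.Buffer to do for Web.Contents (a sketch; whether the buffered bytes actually survive across the generated per-table queries is exactly what I can't verify):

let
    // By analogy with List.Buffer above: one fetch, then reuse the bytes
    response = Binary.Buffer(Web.Contents("https://example.com/BigCSVEndpoint")),
    x = Csv.Document(response),
    y = Csv.Document(response),
    z = Csv.Document(response)
in
    // All three should parse the same in-memory copy, with exactly one
    // HTTP request made per evaluation of this query
    {x, y, z}
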
Any help would be greatly appreciated.

2 REPLIES
Anonymous
Not applicable

I actually have a similar problem, and the Table.Buffer function also didn't help. When my navigator appears, 3 calls have been made as well. I figured the only workaround would be to uncheck "Enable Parallel Loading of Tables" and/or load queries slowly, disabling load of the ones already in the query editor. I'm an intern who was given a custom connector as a first project, so I can't help you, but if you find any solution, I would appreciate it.

Kind Regards.

v-piga-msft
Resident Rockstar

Hi @Anonymous ,

If I understand your scenario correctly, you are having problems when you import data via your custom connector in Power BI?

If so, could you show the error message details in a screenshot?

In addition, you could refer to this blog first, which should be helpful.

Best Regards,

Cherry

Community Support Team _ Cherry Gao
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
