Multiple requests for a single Get Data and Refresh Data with Web Data Source
It seems that for a single Web Data Source query, Power BI Desktop issues several requests for the URL, and the number of requests per query increases with the number of records returned.
We have an in-house REST API server that returns a pipe-separated value dataset in response to a URL request. For testing purposes, we turned on logging in the REST API application so that it captures the GET requests it receives.
The following are the results of the test scenarios with different record counts and the timeout set to 30 minutes:
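For anyone who wants to reproduce the measurement, here is a minimal sketch of a test endpoint that counts the GET requests it receives per path. It is not the actual in-house server: the names `RequestCounter`, `make_handler` and `serve`, the port, and the sample payload are all hypothetical.

```python
import threading
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer


class RequestCounter:
    """Thread-safe tally of GET requests per URL path."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = Counter()

    def record(self, path):
        # Increment the tally for this path and return the new total.
        with self._lock:
            self._counts[path] += 1
            return self._counts[path]

    def count(self, path):
        with self._lock:
            return self._counts[path]


def make_handler(counter, payload):
    """Build a handler class that logs each GET and returns a fixed payload."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            n = counter.record(self.path)
            print(f"GET {self.path} (request #{n})")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, fmt, *args):
            pass  # suppress the default access log; do_GET prints its own line

    return Handler


def serve(port=8080):
    # Hypothetical pipe-separated sample data, standing in for the real dataset.
    data = b"id|name\n1|alpha\n2|beta\n"
    server = HTTPServer(("127.0.0.1", port), make_handler(RequestCounter(), data))
    server.serve_forever()
```

Calling `serve()` and pointing a Web Data Source query at `http://127.0.0.1:8080/` should print one line per request that Power BI Desktop issues.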
Web request that returns 9 records:
a. From invoking Get Data until the Load dialog is shown, the REST API server receives 3 requests.
b. Clicking Load in the dialog results in 2 additional requests.
c. Each click of Refresh results in 2 additional requests.
Web request that returns 500K records:
a. From invoking Get Data until the Load dialog is shown, the REST API server receives 11 requests.
b. Clicking Load in the dialog results in 4 additional requests.
c. Each click of Refresh results in 4 additional requests.
Web request that returns 1M records:
a. Get Data resulted in what seems like a never-ending loop of GET requests to the URL until the 30-minute timeout was reached.
If the above observations are correct, then for 500K records, which take the server about a minute to return, the repeated requests resulted in a load time of 20 minutes in Power BI Desktop. Loading an even larger record set would be practically impossible.
I suspect this is a bug in the Power BI Web Data Source, as the interval between the repeated requests is much shorter than the Command Timeout value specified in the Advanced options.
For the time being, I am testing a workaround by putting a Varnish Cache server in front of the REST API application. It does help: it reduces the load on the REST API application and responds to subsequent Power BI requests within seconds. The problem is that Varnish Cache does not support SSL, so I hope Power BI will have a fix for this.
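As a rough sanity check on the 500K-record figures, the request counts reported above can be multiplied by the server's response time. This sketch assumes the requests are served one after another and that each takes exactly one minute; neither assumption is confirmed by the logs.

```python
# Request counts as reported for the 500K-record test.
REQUESTS_UNTIL_LOAD_DIALOG = 11
REQUESTS_ON_LOAD = 4

# Assumption: "about a minute" per response, taken here as exactly 1.0.
MINUTES_PER_RESPONSE = 1.0


def sequential_load_minutes(requests, minutes_per_response):
    """Total wait if every request is served strictly one after another."""
    return requests * minutes_per_response


total_requests = REQUESTS_UNTIL_LOAD_DIALOG + REQUESTS_ON_LOAD
estimate = sequential_load_minutes(total_requests, MINUTES_PER_RESPONSE)
print(f"{total_requests} requests, roughly {estimate:.0f} minutes")
```

Under these assumptions the 15 requests account for about 15 minutes, which is the same order of magnitude as the 20 minutes observed. A single request would have taken about one minute.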
We're having the same problem. I have a single query table that is a Web Query to a third-party REST server with stringent traffic throttling. If PBI made only a single request for that one table, it would be fine. Instead, what I'm seeing is that each dependent table consisting only of List.Distinct(PrimaryTable[InterestingColumn]) also generates a full duplicate query of the master table. As a result, my provider gets slammed with 10 concurrent calls instead of 1 (exceeding its throttling limits), and on top of that, PBI keeps going back for more and more iterations of the same thing while it builds the connection models.
I ended up having to deploy a cache system that queries my provider, stores the data locally, and then answers the excessive PBI calls locally through the on-premises gateway. I needed the gateway anyway to wrap the requests, since PBI does not support OAuth and my provider requires it. Sample sets returned quickly enough from the provider that they didn't stall; it was the full production sets that stalled, which is when I started looking into the query traffic.
dpiret, did you get a resolution to this issue? I am also seeing multiple requests (three) every time I try to refresh a certain query, and then it times out but continues to send requests even after that. The multiple calls are overtaxing the server's resources.
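The core idea of such a cache can be sketched as follows. This is not the system described above, just a minimal in-memory illustration: concurrent requests for the same key are coalesced into one upstream fetch, and the result is reused until a time-to-live expires. The class name `CoalescingCache`, the `fetch` callback and the TTL value are all hypothetical.

```python
import threading
import time


class CoalescingCache:
    """In-memory TTL cache that lets only one caller per key hit the upstream."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch          # callable(key) -> value; the slow upstream call
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._entries = {}           # key -> (expires_at, value)
        self._key_locks = {}         # key -> lock serializing fetches for that key

    def _fresh(self, key):
        # Caller must hold self._lock. Returns the entry if still valid, else None.
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry
        return None

    def get(self, key):
        with self._lock:
            entry = self._fresh(key)
            if entry is not None:
                return entry[1]
            key_lock = self._key_locks.setdefault(key, threading.Lock())

        # Only one thread per key gets past this point at a time.
        with key_lock:
            with self._lock:
                # Another thread may have filled the cache while we waited.
                entry = self._fresh(key)
                if entry is not None:
                    return entry[1]
            value = self._fetch(key)
            with self._lock:
                self._entries[key] = (self._clock() + self._ttl, value)
            return value
```

With this in front of the provider, ten concurrent calls for the same URL would result in a single upstream request, and the other nine would wait and receive the cached response. A real deployment would also need persistence, size limits and error handling, which this sketch leaves out.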