I have a Power BI project whose source is Salesforce. One of my reports is really hard to refresh - I suspect it's because one Salesforce object is really big (more than one million rows and many columns).
In order to optimize/analyze the interface, I created a Dataflow. The first object to import is the big one. What this dataflow shows:
- when it doesn't work, I get a two-hour 'time out' error on my first object
- when it works, the total refresh time is about 30 minutes (Salesforce to Dataflow)
Another thing: I schedule 5 refreshes a day, and 90% of the time they fail. But if I refresh on demand (at 9 AM or 2 PM), my chances of success are about 80-90%.
Refresh history (almost only failures!):
Completed - logs:
Failed - logs:
My questions :
Do you understand why my chances of success are so much higher when I refresh on demand? Do you think the failures are linked to Salesforce or to Power BI? Can you suggest something to try?
Thanks for your time. I don't know if my question was a real question. Here is what I understand:
On the Salesforce side: as I'm blind regarding Salesforce limits, it's hard to know what happens there. I suppose some other Salesforce exports are running at the same time as my interface, which would explain why it only works sometimes.
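One way to stop being blind on the Salesforce side is the REST API's `/limits` endpoint, which reports each org limit with its `Max` and `Remaining` values (e.g. `DailyApiRequests`). A minimal sketch, assuming you already have an instance URL and OAuth access token (both placeholders here); the actual HTTP call is left commented out and the parsing runs against a sample payload shaped like the real response:

```python
# Sketch: check remaining Salesforce API capacity before a refresh window.
# The real call is GET https://<instance>/services/data/v58.0/limits with a
# Bearer token; it is commented out below, and the parsing logic runs
# against a sample payload instead.

def remaining_pct(limit_entry):
    """Percent of a Salesforce limit still available (e.g. DailyApiRequests)."""
    return 100.0 * limit_entry["Remaining"] / limit_entry["Max"]

# import requests
# resp = requests.get(
#     "https://YOUR_INSTANCE.my.salesforce.com/services/data/v58.0/limits",
#     headers={"Authorization": "Bearer YOUR_ACCESS_TOKEN"},
# )
# limits = resp.json()

# Sample payload shaped like the real /limits response:
limits = {
    "DailyApiRequests": {"Max": 100000, "Remaining": 12000},
    "DailyBulkApiBatches": {"Max": 15000, "Remaining": 14990},
}

for name, entry in limits.items():
    print(f"{name}: {remaining_pct(entry):.1f}% remaining")
```

If `DailyApiRequests` is nearly exhausted at your scheduled refresh times but not when you refresh by hand, that would point at contention on the Salesforce side rather than Power BI.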
On the Power BI side: I use a dataflow with a Pro license (possible, but limited). I was wondering whether a dataflow on a Pro license gets a lower priority in the Microsoft cloud, and whether running the dataflow on demand gets a better priority than running it on a schedule.
On the network side (firewall, bandwidth): I don't think that can be the reason. Other 'Salesforce => Power BI' interfaces run well, and sometimes this interface runs well too. Maybe, since it's a big interface, its priority is low, so when a lot of people are running interfaces, mine keeps waiting until it reaches the two-hour limit.
As you can see, I don't know much more about this point. Maybe the conclusion is that 'Salesforce to Power BI' interfaces only run well with small-to-medium-sized data, and if the data source is big, we must first export the data from Salesforce to a data warehouse before importing it into Power BI.
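If you do stage the big object outside Power BI first, the usual extraction pattern is to pull it in batches rather than one giant query: Salesforce's REST query endpoint returns records a batch at a time with a `nextRecordsUrl` pointer until `done` is true. A minimal sketch of that loop, with the fetch function stubbed by canned pages rather than a real authenticated HTTP client:

```python
# Sketch: page through a large Salesforce query result instead of pulling it
# in one shot. The REST query endpoint returns each batch with "done" and,
# while more remain, a "nextRecordsUrl". `fetch` is a hypothetical stand-in
# for an authenticated HTTP GET; here it is stubbed with canned pages.

def extract_all(fetch, first_url):
    """Follow nextRecordsUrl pointers until the result set is exhausted."""
    records = []
    url = first_url
    while url:
        page = fetch(url)                # one API call per batch
        records.extend(page["records"])  # accumulate this batch
        url = None if page["done"] else page["nextRecordsUrl"]
    return records

# Stubbed pages standing in for real API responses:
PAGES = {
    "/query?q=SELECT+Id+FROM+BigObject": {
        "records": [{"Id": "001"}, {"Id": "002"}],
        "done": False,
        "nextRecordsUrl": "/query/next-1",
    },
    "/query/next-1": {"records": [{"Id": "003"}], "done": True},
}

rows = extract_all(PAGES.get, "/query?q=SELECT+Id+FROM+BigObject")
print(len(rows))  # 3 records across two batches
```

Each batch lands in the warehouse as it arrives, so a single two-hour window never has to hold the whole million-row transfer.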
On the Power BI side, I do not believe there is any prioritization of resources based on license type, other than that a Premium license has dedicated capacity. And running something "on demand" versus on a schedule doesn't change its priority or how it runs: both drop into the queue and run when a slot is next available. (That is why, when you schedule something or hit "refresh now", it can take a while before it actually starts.)
When I run into these types of problems, I have found it to be a combination of items. For example, are there other processes running in your Salesforce tenant that are in conflict? Do you have enough "bandwidth" within your Salesforce tenant to handle the request? Are you using a data storage location that has its own bandwidth, space, or speed issues (Azure SQL, for example)? (Are you using a data storage location at all?) And then you have to look at your network itself: are all the appropriate firewalls, networks, etc. open (if needed, depending on where you're hitting for the data)?
The two-hour time limit comes from the Dataflow, so that is more about Power BI than about the data source. It would indicate to me that the data is flat-out not being "sent" to your dataflow.
Let us know about these items!