
Anonymous
Not applicable

Dataflow scheduled refresh not working consistently - urgent help needed

Hello Community,

 

I want to start this as a fresh discussion and seek help with managing scheduled refreshes for dataflows. I have created a dataflow that I know holds only 30 records; however, it takes more than 30 minutes to refresh. Am I doing something wrong in the process?

 

Somewhere I read that if you are on Premium capacity, you should run datasets/dataflows at different times. But I checked the schedules, and no more than 5 datasets are running concurrently. At this rate I am not getting good use out of dataflows, and I am unable to convey the advantages of dataflows to clients.

 

I am on Premium capacity, and I am sure the environment does not have any memory issues either. Could someone tell me how to get out of this situation?

 

Thanks,

G Venkatesh

36 REPLIES
Anonymous
Not applicable

Community,

 

On another note, I captured the refresh history shown in the screenshot below. It shows that as we approach business hours (EST), the dataflows start taking more time or do not execute at all. In off hours they run quickly. But I am not sure how to keep the data refreshed during business hours, when more users log in and look at the reports.

 

Kindly assist!

 

Capture.PNG
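One way to quantify this pattern beyond eyeballing the refresh-history screenshot is to bucket refresh durations by start hour; if the business-hours buckets are consistently slower, that points at capacity contention rather than the dataflow itself. A minimal sketch in Python, assuming you have exported the refresh history as records with ISO-8601 start/end times (the `startTime`/`endTime` field names here are illustrative — adjust to whatever your export actually uses):

```python
from collections import defaultdict
from datetime import datetime

def avg_duration_by_hour(transactions):
    """Average refresh duration in minutes, bucketed by the hour the refresh started.

    `transactions` is a list of dicts with ISO-8601 'startTime'/'endTime'
    strings (assumed field names).
    """
    buckets = defaultdict(list)
    for t in transactions:
        start = datetime.fromisoformat(t["startTime"])
        end = datetime.fromisoformat(t["endTime"])
        buckets[start.hour].append((end - start).total_seconds() / 60)
    # Average each hour's durations so slow hours stand out at a glance.
    return {hour: sum(mins) / len(mins) for hour, mins in sorted(buckets.items())}

history = [
    {"startTime": "2020-05-04T03:00:00", "endTime": "2020-05-04T03:02:00"},  # off hours
    {"startTime": "2020-05-04T10:00:00", "endTime": "2020-05-04T10:35:00"},  # business hours
    {"startTime": "2020-05-05T10:00:00", "endTime": "2020-05-05T10:25:00"},  # business hours
]
print(avg_duration_by_hour(history))  # {3: 2.0, 10: 30.0}
```

A jump like 2 minutes at 3 AM versus 30 minutes at 10 AM across several days would be strong evidence for the business-hours contention theory.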

Anonymous
Not applicable

Are any other dataset refreshes running at the same time as the dataflow refreshes during business hours?

 

Anonymous
Not applicable

Yes, we do (say, some 25 datasets). But given the separate memory allocation for dataflows (10 GB), we expect the dataflows to run fast. If not, dataflows can never be a solution for us.

 

Please assist!

 

G Venkatesh

Anonymous
Not applicable

I have attached one more screenshot. This dataflow contains a single entity that retrieves only 30 records. A short while ago it took only a few seconds to complete; however, the refresh I triggered just now is still running.

 

Capture.PNG

Anonymous
Not applicable

Some more updates regarding the issue:

- In the workspace where the dataflows reside, only 2 dataset refreshes are running while the dataflow is refreshing.

- There are no paginated reports running at all.

I hope this is sufficient information; please suggest next steps based on what I have provided.

 

Anonymous
Not applicable

Try running the dataflow refreshes on their own, without any other datasets that could cause contention on the gateway, and see if anything changes.
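For an isolated test run outside the scheduler, a dataflow can also be refreshed on demand through the Power BI REST API ("Dataflows - Refresh Dataflow"). A sketch that only builds the request — the group ID, dataflow ID, and AAD token are placeholders you would supply yourself:

```python
import json

API_ROOT = "https://api.powerbi.com/v1.0/myorg"

def build_refresh_request(group_id, dataflow_id, token):
    """Assemble the POST request for an on-demand dataflow refresh."""
    return {
        "url": f"{API_ROOT}/groups/{group_id}/dataflows/{dataflow_id}/refreshes",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        # Ask the service to e-mail the owner only if the refresh fails.
        "body": json.dumps({"notifyOption": "MailOnFailure"}),
    }

req = build_refresh_request("my-group-id", "my-dataflow-id", "<aad-token>")
print(req["url"])
```

You would then send it with any HTTP client, e.g. `requests.post(req["url"], headers=req["headers"], data=req["body"])`; triggering it this way at a quiet time gives a clean baseline duration to compare against the scheduled business-hours runs.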

Anonymous
Not applicable

I will try that, although I do not want to stop the refreshes mid-run.

Would creating a separate workspace for dataflows be a good idea?

Anonymous
Not applicable

It depends.

If the problem is caused by pressure on the gateway, or by a low-resource setup at the capacity level, a separate workspace won't solve your issue.

Anonymous
Not applicable

Could you please tell me how to check the resources that are set up at the capacity level? As far as I know, we are running on a P1, which holds plenty of capacity. I understand there are many other workspaces in my organisation that use its resources, but we thought that being on Premium would let us run any number of reports, with enough resources to manage the dataset refreshes and dataflows.

We have looked at memory usage and CPU utilisation. What else can we check to make sure we are not doing anything wrong on the environment side?

 

 

G Venkatesh 

Anonymous
Not applicable

What level of CPU and RAM consumption does the whole capacity reach while the dataflow refreshes are slow?

Maybe during your business hours user interactivity does not leave room for background operations like dataflow refreshes, because the resource demand is too high. Interactive operations are always prioritized over background operations.

 

Anonymous
Not applicable

Hello Buddy,

 

Well, the resources aren't hitting maximum utilization when the dataflows are running; some details are shown in the screenshot below.

I understand interactive operations should be given priority over back-end data operations. However, given the separate memory set aside for dataflows to run fast, I still don't see the claimed advantages of dataflows working here. The link below is a recently released YouTube video. Not to blame anyone, but the full advantages of dataflows are yet to come.

 

https://www.youtube.com/watch?v=jEuECrCRVdY

 

Thanks,

G Venkatesh

Capture.PNG

Anonymous
Not applicable

I think that could be the point. Too many resources are requested during business hours, and since a dataflow refresh is a background process with dynamically assigned memory, you could end up with a never-ending dataflow refresh.

I would consider moving this refresh out of business hours, if that suits your scenario.

Anonymous
Not applicable

I would definitely have done that to balance resources between the dataset and dataflow refreshes. However, as I said, looking at the benefits of dataflows, I persuaded my team to move reports that hold millions of records over to dataflows as their data source.

If I move the refreshes out of business hours, people looking at these reports will see old (stale) data during business hours, so we would not be showing the required information. I am stuck on what to do here.

 

G Venkatesh 

Anonymous
Not applicable

Keep in mind that datasets <> dataflows; they have completely different goals.

I usually use dataflows just to centralize and standardize the data-preparation phase, when I have the same table and M scripts repeated in many reports and they can be refreshed asynchronously from the datasets.

Obviously it depends on your scenario.

Anonymous
Not applicable

Sorry, I think I confused you a bit there. I completely understand the differences between datasets and dataflows.

Let me explain my scenario again. I work with service-management data (ITSM modules) such as incidents, work orders, changes, etc.

We use relational SQL databases as the data source (Get Data) in Power BI Desktop, design the reports, and publish them to the service. However, for a few reports that board members look at, we need to capture at least 12 to 18 months of data to do trend analysis of incidents and so on. Yet just loading 4 months of incident data into a report takes 20 to 30 minutes (data retrieval), and sometimes it times out. We are not happy with this.

Then I read about dataflows somewhere and felt they could help resolve this issue. I have seen the dataflows fetch 18 months of data in just a few minutes, which is a positive sign. But now I need to refresh the entities, and then the dataflows, to show accurate data in the reports (users always like to see real-time or near-real-time data), so I started using incremental refresh for the entities and scheduled refresh for the dataflows. This is the point where I am stuck.

I also read somewhere about Flows/streaming datasets, but I do not know whether they would be helpful for my scenario.

Sorry for the long notes, but I am really stuck. Please assist!
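As a mental model of what incremental refresh on an entity is doing (this is a simplification, not the service's actual implementation): the data is partitioned by date, the full 18 months are kept in storage, but each scheduled refresh only re-queries the most recent window. A sketch with assumed window sizes:

```python
from datetime import date, timedelta

def refresh_window(today, store_months=18, refresh_days=7):
    """Rolling incremental-refresh window: keep `store_months` of history,
    but only re-load the last `refresh_days` days on each refresh.
    (Illustrative month arithmetic; parameter values are assumptions.)"""
    # Start of the stored range: first day of the month `store_months` back.
    months_back = today.year * 12 + (today.month - 1) - store_months
    range_start = date(months_back // 12, months_back % 12 + 1, 1)
    # Only rows newer than this are re-queried on each scheduled refresh.
    incremental_start = today - timedelta(days=refresh_days)
    return range_start, incremental_start

print(refresh_window(date(2020, 5, 15)))
# (datetime.date(2018, 11, 1), datetime.date(2020, 5, 8))
```

This is why the first refresh is expensive but subsequent ones should touch only days' worth of rows — so if a 30-record refresh still takes 30 minutes, the bottleneck is likely queueing/contention on the capacity rather than data volume.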

Anonymous
Not applicable

Hello Community,

 

Any other leads on this? Has anyone else faced the same issue and sorted it out with best practices?

Any kind of suggestion is welcome. Kindly assist!

 

G Venkatesh
