dnaman
Helper I

Refreshing a large dataset using parameterization

Hi All,

 

Fairly new to Power BI, very new to any scripting in Power BI. 

 

We have a large dataset that we are connecting to, hosted in Azure SQL (about 9k rows by 500 columns wide). We plan to access this data through an Azure API Management gateway, to provide secure access for external vendors to consume. Refreshing this dataset as one refresh is very time-consuming.

 

We have thought about parameterizing the underlying stored procedure to call the dataset in 'chunks', in this case divided up by city. We have 10 cities, so that could be 10 separate parameterized calls (i.e., call the API URL and append a parameter to filter the dataset).
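For reference, the kind of per-city parameterized call described above could be sketched in Power Query M, with a Power BI query parameter supplying the city. All server, database, and procedure names here are hypothetical placeholders:

```
// Sketch only: a per-city pull driven by a Power BI parameter "CityName".
// Server, database, and stored-procedure names are hypothetical.
let
    Source = Sql.Database("myserver.database.windows.net", "SalesDb"),
    // Invoke the parameterized stored procedure for one city
    CityData = Value.NativeQuery(
        Source,
        "EXEC dbo.usp_GetLargeDataset @City = @city",
        [city = CityName]   // CityName is a Power BI query parameter
    )
in
    CityData
```

Each of the 10 city queries would be a copy of this with a different parameter value; note, though, that the Power BI service refreshes all such queries as part of one dataset refresh rather than looping through them sequentially on demand.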

 

In Power BI, how can we set it up so that on a data refresh, it cycles through each of the cities, refreshes that portion of the data model, and repeats until all cities' data are refreshed?

1 ACCEPTED SOLUTION
Watsky
Solution Sage

I'm not sure there is a way to use parameters to loop through a refresh. 

The way I see it you have a few options:

1.  Speak with the dataset owner about the possibility of creating smaller views based on the dataset.

2.  Write out 10 native SQL queries that filter the data by city (e.g., SELECT columns FROM Large_Database WHERE City = 'Boston').

3.  If one of the 500 columns happens to be a date field, such as create date or last updated, then you could set up incremental refresh. You would have to refresh the entire dataset once, but after that it would only update however many days of data you tell it to, which will significantly cut down on refresh time.
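Option 3 relies on Power BI's incremental refresh pattern, where the service supplies the reserved RangeStart/RangeEnd datetime parameters and folds the filter to the source. A minimal Power Query M sketch, assuming a hypothetical table and date column:

```
// Incremental refresh sketch: RangeStart/RangeEnd are the reserved
// Power BI parameters; "Large_Dataset" and "LastUpdated" are hypothetical.
let
    Source = Sql.Database("myserver.database.windows.net", "SalesDb"),
    Data = Source{[Schema = "dbo", Item = "Large_Dataset"]}[Data],
    // Filter to the refresh window: >= RangeStart and < RangeEnd,
    // so rows are never counted twice at partition boundaries
    Filtered = Table.SelectRows(
        Data,
        each [LastUpdated] >= RangeStart and [LastUpdated] < RangeEnd
    )
in
    Filtered
```

With this filter in place, the incremental refresh policy is then configured on the table in Power BI Desktop (how many periods to archive and how many to refresh).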

 

Out of curiosity though, are you actually using all 500 columns? Have you been able to refresh the dataset on the service or does it error out?


Did my answer(s) help you? Give it a kudos by clicking the Thumbs Up!
Did my post answer your question(s)? Mark my post as a solution. This will help others find the solution.


Proud to be a Super User!


2 REPLIES 2
Thanks Watsky!

 

1) Yep we did, see below

 

2) We parameterized the SP and created 10 queries in Power BI, each one calling a different province. It took some time (a few minutes) to create the connection and load the initial data into Power BI memory, but it did not error out. I also used the Append feature to recombine those 10 queries into a single table for the eventual end reporting.
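The append step described above can be expressed in Power Query M as a single combining query. The per-city query names below are hypothetical stand-ins for the 10 queries:

```
// Sketch only: recombine the per-city queries into one reporting table.
// "Boston", "Chicago", "Denver" etc. are hypothetical query names.
let
    Combined = Table.Combine({
        Boston,
        Chicago,
        Denver
        // ... remaining city queries ...
    })
in
    Combined
```

Table.Combine appends the rows of each table in the list; since every per-city query returns the same columns, the result is one table with the full dataset.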

 

Then I published it to the Power BI service as a single dataset. I set it on a scheduled refresh, 6x overnight; it's taking about 10-15 minutes to refresh, which is not bad given that the refresh frequency we are looking at is weekly.

 

3) We don't have a column like that.

 

 

Thanks for the help!
