Michaeldias
Helper I

Datamarts Preview: Appending data

Hello Forum,

I'm trying to use Datamarts as a master database for reporting purposes. 100GB is plenty.

This requires me to source data from a dataflow, which is fine.

 

However, I would like to be able to change the source data for the dataflow from month to month and have the complete set of data stored in the datamart. What I'm finding is that any refresh removes everything that was there.

 

Is this possible, or do I need to build an ELT solution separately on SQL Server and use that as a Power BI source?

 

 

Many thanks,

Michael

1 ACCEPTED SOLUTION
edhans
Super User

You need a separate ELT solution. No Power BI solution is a proper data warehouse, and none is meant to be; Azure SQL or other tools are meant for that. All Power BI tools (dataflows/datasets/datamarts) wipe all data on refresh. (Incremental refresh keeps some historical data, but that is still not meant to be a data warehouse for storing snapshots, and those historical partitions can be purged on a schema change, losing all data.)



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting


6 REPLIES
otravers
Community Champion

You can create a dataflow outside of your datamart, set it up with the Bring Your Own Storage option so that you have access to the underlying data lake, then set it to a monthly scheduled refresh. This will generate monthly snapshots:

https://docs.microsoft.com/en-us/power-bi/transform-model/dataflows/dataflows-azure-data-lake-storag...

 

In your datamart you can then ingest from the ADLS Gen 2 container where you saved the output of the initial dataflow, and add the files' timestamp as your snapshot date, as in the sketch below.
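
A minimal Power Query (M) sketch of that ingestion step, assuming a hypothetical storage account URL, folder name, and output column names (everything named here is an assumption, not your actual setup). Dataflow output written to your own ADLS Gen 2 storage lands as CSV files, and the connector exposes each file's timestamps:

let
    // List all files in the storage account's filesystem (URL is a placeholder)
    Source = AzureStorage.DataLake("https://mystorageaccount.dfs.core.windows.net/powerbi"),
    // Keep only the CSV snapshots written by the outside dataflow
    SnapshotFiles = Table.SelectRows(
        Source,
        each Text.Contains([Folder Path], "MonthlySnapshots") and [Extension] = ".csv"
    ),
    // Use each file's modified timestamp as the snapshot date for its rows
    WithSnapshotDate = Table.AddColumn(
        SnapshotFiles, "SnapshotDate", each DateTime.Date([Date modified]), type date
    ),
    // Parse each file and keep its rows alongside the snapshot date
    WithData = Table.AddColumn(WithSnapshotDate, "Data", each Csv.Document([Content])),
    // Column names are placeholders for whatever your dataflow actually outputs
    Expanded = Table.ExpandTableColumn(WithData, "Data", {"Column1", "Column2"})
in
    Expanded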

------------------------------------------------
1. How to get your question answered quickly - good questions get good answers!
2. Learning how to fish > being spoon-fed without active thinking.
3. Please accept as a solution posts that resolve your questions.
------------------------------------------------
BI Blog: Datamarts | RLS/OLS | Dev Tools | Languages | Aggregations | XMLA/APIs | Field Parameters | Custom Visuals

Michaeldias
Helper I

Thank you otravers.

In this model I think we'd have an outside dataflow reading a folder from a growing set of source files. Over time I'm concerned about managing those source files, and about the growing cost of reading all of them on every refresh unless the dataflow knows to read only what has changed since the last refresh.

otravers
Community Champion

Yes, you'll have to manage these snapshot files eventually. Cloud storage cost is really the least of your concerns these days if "100GB is plenty" for you.

 

You can set up incremental refresh against ADLS Storage Gen 2:

https://community.powerbi.com/t5/Community-Blog/Parquet-ADLS-Gen2-ETL-and-Incremental-Refresh-in-one...
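
A hedged sketch of what that looks like in M, reusing the same hypothetical storage URL as above. RangeStart and RangeEnd are the datetime parameters Power BI defines for incremental refresh; filtering on the file-level modified timestamp means only files that arrived in the current refresh window get downloaded and parsed:

let
    Source = AzureStorage.DataLake("https://mystorageaccount.dfs.core.windows.net/powerbi"),
    // Only pick up files whose modified timestamp falls inside the refresh
    // window; older partitions keep their previously loaded data
    InWindow = Table.SelectRows(
        Source,
        each [Date modified] >= RangeStart and [Date modified] < RangeEnd
    ),
    // Parse only the new files
    Parsed = Table.AddColumn(InWindow, "Data", each Csv.Document([Content]))
in
    Parsed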


Michaeldias
Helper I

That's really neat! I was concerned about Power BI having to read every file from the growing data set on each refresh, and the associated network costs. If it only does this incrementally based on what has changed, that does change things.

otravers
Community Champion

Microsoft won't charge you egress costs as long as the Azure data lake is in the same region as your Power BI tenant ("The Power BI workspace tenant region should be the same as the storage account region"):

https://docs.microsoft.com/en-us/power-bi/transform-model/dataflows/dataflows-azure-data-lake-storag...

 

The main concern here is ever-increasing refresh times that might eventually lead to timeouts if you were to always fully refresh from your growing collection of snapshots.

