henryaj
New Member

Do I need a Data Warehouse first?

I'm currently supporting a SQL-based system that comprises 700-plus SQL tables.

The database size is small, only 2 gig of data.

The database is highly normalised.

 

The existing reporting application is built in WinForms, and all reports are hand-crafted in C#, writing out to Excel.

If a new report or tweak is required, a developer has to make a code change and release a new version of the reporting application. This takes up a huge amount of the development resource.

 

Management now require dashboards and self-service reports.

 

I've looked at various tools in the market and decided on Power BI.

 

 

I then have a number of options.

 

1.  Run Power BI against the existing SQL database.

2.  Copy data to a Data Warehouse each night and run Power BI against that.  The DW would just be a copy of the OLTP database structure.

3.  Create a Data Warehouse with the data flattened out (i.e. transform the data into facts and dimensions using SSIS) and run Power BI against the new Data Warehouse structure.

 

Please could someone advise as to the best approach.

Thanks.

 

1 ACCEPTED SOLUTION
AltGr9
Helper I

Let's consider the pros and cons of each.

 

Point 1 - pointing Power BI to your normalised data source

If you pointed Power BI at the OLTP (normalised) data source, you would have a lot of relationships to set up in Power BI. This is a problem for a couple of reasons:

  • Setting them up in Power BI takes more time. I don't find it quite as easy as doing it in SQL. For example, Power BI doesn't allow you to join two tables on more than one column, so if you need a multi-column join, you have to create another column in Power BI that concatenates the other columns and then join on that (see the sketch after this list).
  • joins in DAX are costly on performance, so your reports may be slow, which could be a disaster
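
If you do hit that multi-column limitation, the combined key can also be built on the SQL side instead of in Power BI, so the model only ever relates on a single column. A rough sketch; every table and column name below is invented for illustration:

    -- Hypothetical view exposing one concatenated key column,
    -- so Power BI only has to relate on a single column.
    CREATE VIEW dbo.vw_OrderLines
    AS
    SELECT
        CAST(od.OrderID AS VARCHAR(20)) + '|' + CAST(od.LineNumber AS VARCHAR(20)) AS OrderLineKey,
        od.OrderID,
        od.LineNumber,
        od.ProductID,
        od.Quantity,
        od.UnitPrice
    FROM dbo.OrderDetails AS od;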

One thing you shouldn't need to worry about is using the Power BI model elsewhere. With SQL Server 2016, you should be able to import the Power BI model into SSAS Tabular.

 

Point 2 - pointing Power BI to a copy of your normalised data source

I would say this is a bad idea. Let's think about why we would want to do this: basically, the reason is to ensure we don't put too much strain on our live OLTP system. But in practice, implementing this solution is not going to help. Consider the following:

  • With Power BI, if your data source is a SQL Server database, it will import all the data into the Power BI model. So when users are running the PBI report, it won't be directly querying your OLTP system; it will be querying its own model. In other words, reporting in Power BI does not impact your OLTP system. The only time Power BI will impact the performance of your OLTP system is when it's importing data into the model, which it does on a schedule. You get to set the time of the schedule, so you could import data in the evening, or whenever people are not using the OLTP system. Note: DirectQuery mode is available for SSAS databases.

So, in other words, there is no point in this option. Extracting the data is always going to put a strain on your OLTP database, whether you're importing it into Power BI or into a SQL Server database on another server.

 

Point 3 - pointing Power BI to a de-normalised DW

As @MattAllington said, this is the best option of the ones you mentioned. It reduces the performance issues caused by joins in Power BI, and it means you can easily implement the relationships between tables in SQL.
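
To make "facts and dimensions" concrete, a minimal star-schema pair might look something like the sketch below, loaded each night by SSIS or a SQL Agent job. Every name here is an assumption for illustration, not something taken from your real system:

    -- Hypothetical dimension table
    CREATE TABLE dbo.DimCustomer (
        CustomerKey   INT IDENTITY(1, 1) PRIMARY KEY,
        CustomerID    INT           NOT NULL,  -- business key from the OLTP system
        CustomerName  NVARCHAR(200) NOT NULL,
        Region        NVARCHAR(100) NULL
    );

    -- Hypothetical fact table referencing the dimension
    CREATE TABLE dbo.FactSales (
        SalesKey     BIGINT IDENTITY(1, 1) PRIMARY KEY,
        CustomerKey  INT            NOT NULL REFERENCES dbo.DimCustomer (CustomerKey),
        OrderDate    DATE           NOT NULL,
        Quantity     INT            NOT NULL,
        SalesAmount  DECIMAL(18, 2) NOT NULL
    );

Power BI then relates FactSales to DimCustomer on the single surrogate key, which keeps the model simple and fast.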

 

But it does mean you have to provision another database elsewhere for your data warehouse, and potentially set up SSIS packages to import the data, or use linked servers to import it (which can be a security-auditing nightmare, and which also produce poor execution plans). So there is a setup cost to consider.

 

There is another option...

 

Alternative - pointing Power BI to a de-normalised view contained in the source system

Instead of having to set up all those SSIS packages, why not just implement a view (or better, a stored procedure, which will run faster due to cached SQL Server execution plans) to join your tables together in the SQL source, and point Power BI at that?
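
As a rough sketch of what such a view could look like (the tables and columns here are purely illustrative, not taken from your actual 700-table schema):

    -- Hypothetical denormalising view in the source database
    CREATE VIEW dbo.vw_SalesForReporting
    AS
    SELECT
        o.OrderID,
        o.OrderDate,
        c.CustomerName,
        c.Region,
        p.ProductName,
        od.Quantity,
        od.UnitPrice,
        od.Quantity * od.UnitPrice AS LineTotal
    FROM dbo.Orders       AS o
    JOIN dbo.OrderDetails AS od ON od.OrderID   = o.OrderID
    JOIN dbo.Customers    AS c  ON c.CustomerID = o.CustomerID
    JOIN dbo.Products     AS p  ON p.ProductID  = od.ProductID;

Power BI then imports dbo.vw_SalesForReporting as one flat table, with no relationships to configure in the model.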

 

This way you get all the performance advantages of Power BI, you don't need to implement all those SSIS packages to import your data, and you don't need to set up many relationships in Power BI (as they're handled in the view/SP). All of this adds up to being able to use Power BI as soon as possible.

 

One advantage of the DW (point 3) over this alternative is that it enables you to do other reporting directly from the denormalised data structure in the DW, such as SSRS reports, or simply SQL statements that business users may write. So it gives you more room for the future. But if you have all the joins already in your view/SP, you could use the view/SP for now, and later copy the SQL code into a fuller solution with a data warehouse.

 

Just a note - Kimball Star schemas are not always needed. You can use flat tables too.


4 REPLIES

Thanks. I do see SSRS coming into the equation to write pre-canned reports that I can embed directly into our existing WinForms and WPF applications, so I think option 3 may be the way to go. Lots for me to consider. Thanks for the advice.

TomMartens
Super User

Hey,

 

if you want to provide a solution that enables users to create reports in a "self-service" manner, you have to "simplify" the OLTP data model by creating a star schema (facts and dimensions).

 

Whether it is mandatory to build a data warehouse as a first step depends on the complexity of the OLTP model, the number of users who are willing and able to create their own reports (the more users, the more likely it becomes that a clutter of Power BI models will surface), and how those users gain access to the star schema.

 

As @MattAllington already mentioned, there is a little more to consider, but a good start might be to create Power BI queries that build denormalised dimension tables and some fact tables.




I would write queries to denormalise the source data into a reporting star schema and load that into Power BI Desktop. The need for an interim DW really depends on when you can refresh Power BI and whether you can do this without causing issues. You can schedule the refresh overnight, so that may work.
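
For instance, a product dimension could simply be the query Power BI Desktop runs at import time, along these lines (the table and column names are placeholders for whatever the real schema uses):

    -- Hypothetical query that denormalises product data into one dimension table
    SELECT DISTINCT
        p.ProductID,
        p.ProductName,
        sc.SubCategoryName,
        c.CategoryName
    FROM dbo.Products             AS p
    JOIN dbo.ProductSubCategories AS sc ON sc.SubCategoryID = p.SubCategoryID
    JOIN dbo.ProductCategories    AS c  ON c.CategoryID     = sc.CategoryID;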

 

Of course there is a lot more behind this. 



* Matt is an 8-time Microsoft MVP (Power BI) and author of the book Supercharge Power BI.
