Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Anonymous
Not applicable

Data Modeling - is it worth to create a star schema?

Hi guys,

 

I am working with a single source CSV file, size 11.2 MB, around 43k rows and 19 columns.

The information is about Incidents (tickets), so it has information about agents, SLAs, assignment groups, priority etc. 

 

The thing is, I'm trying to create a star schema, but I only have one source file, so I'm practically duplicating my fact table to create a dim table, removing columns, creating IDs and merging the dim table with the fact table so I can have only IDs in the latter. So far I have created 8 dim tables.

 

Whenever I load the model, I see it gets slower as I add dim tables, I'm guessing this is because I'm duplicating the source. Also when applying changes I see it gets up to 60.. 70.. 80 MB. 

 

OttmarV_0-1610072776235.png

 

Here's my final model:

OttmarV_1-1610072895322.png

 

The questions are:

Is this the right approach for a single source file with 11.2 MB size?

Should I lower the number of dim tables?

Should I keep all the information in one table and create calculated columns or measures?

How can I improve the performance? I was told to always look for a star schema while modeling in power bi.

 

Thanks guys

OV

 

 

 

 

 

 

8 REPLIES 8
Kucrapok
Helper I
Helper I

Can you access the same Data using SQL/Database? The best approach would be to have a table for each of your dimension.

The refreshing takes time because you are using the same 43K Datasource when generating each of the dimensions

If you are using a Star schema, the compression of your model will better and result in a smaller Pbix file. 
A smaller file will always imply a better performance when doing a scan on a table. Of course, for a model containing a Fact table of 43K rows, it might not have a big gain in performance. 

Always remember to lower the number of columns in your table (especially the Fact table). Remove columns with high cardinality (number of distinct values) if you don't need them.

Anonymous
Not applicable

Thanks for the tips!

I'm not creating a connection to a database, but a CSV file.

I managed to duplicate the source only twice, my performance is way much better now!!

I'm going to create a few dim tables using DAX instead of PQ (:

m3tr01d
Continued Contributor
Continued Contributor

@Anonymous  Ok, sounds good, 
Try to avoid SummarizeColumns when creating this table and be careful if you use VALUES Function, you might have Circular dependency error when creating the relationship.

Anonymous
Not applicable

@m3tr01d Oh! actually I'm using SUMMARIZE, not SUMMARIZECOLUMNS, I guess I'm safe... (?)

 

OttmarV_0-1610152474502.png

 

 

m3tr01d
Continued Contributor
Continued Contributor

Yes, with Summarize, you'll have no issue :P.

littlemojopuppy
Community Champion
Community Champion

Ideally you want a star/snowflake schema.  Fact table(s) in the center and dimensions radiating outward.

 

If you don't want to use PQ to create dimensions you can always use DAX to create those tables.  Don't listen to the bot.

Anonymous
Not applicable

Do the dimensions using DAX provide the same performance in the dashboard (processing, slicers, etc..) as the ones in PQ?

Is it advisable to create no dim tables at all and only use one table with the whole 19 columns?

amitchandak
Super User
Super User

@Anonymous , I know power Query is the suggested way. But in DAX you can get bit slower but less load of data.

New tables using Distinct, Summarize  etc

Distinct(Table[Col1])

Summarize(Table, Table[Col2], Table2[Col3])

 

Check which one performs better.

 

I think Star Schema is best, the cases when need all or need to ingore some filter , star schmea come bit handy.

Helpful resources

Announcements
PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.