Hi guys,
I am working with a single source CSV file, size 11.2 MB, around 43k rows and 19 columns.
The data is about incidents (tickets), so it includes agents, SLAs, assignment groups, priority, etc.
The thing is, I'm trying to create a star schema, but I only have one source file, so I'm practically duplicating my fact table for each dimension: removing columns, creating IDs, and then merging the dim table back into the fact table so the latter keeps only the IDs. So far I have created 8 dim tables.
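A minimal sketch of one such dimension as a DAX calculated table instead of a Power Query duplicate (the `Incidents` table and `AssignmentGroup` column names are assumptions, since the real schema isn't shown):

```dax
-- Hypothetical dimension built directly from the fact table.
-- Relating it back to Incidents on the natural key avoids the
-- separate surrogate-ID/merge step done in Power Query.
DimAssignmentGroup = DISTINCT ( Incidents[AssignmentGroup] )
```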
Whenever I load the model, it gets slower as I add dim tables; I'm guessing this is because I'm duplicating the source. Also, when applying changes, the file grows to 60, 70, even 80 MB.
Here's my final model:
The questions are:
Is this the right approach for a single source file with 11.2 MB size?
Should I lower the number of dim tables?
Should I keep all the information in one table and create calculated columns or measures?
How can I improve the performance? I was told to always aim for a star schema when modeling in Power BI.
Thanks guys
OV
Can you access the same data using SQL/a database? The best approach would be to have a table for each of your dimensions.
The refresh takes time because you are reading the same 43K-row source again for each of the dimensions.
If you use a star schema, the compression of your model will be better, resulting in a smaller PBIX file.
A smaller file generally means better performance when scanning a table. Of course, with a fact table of only 43K rows, the gain might not be big.
Always remember to lower the number of columns in your tables (especially the fact table), and remove columns with high cardinality (number of distinct values) if you don't need them.
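To see which columns are worth dropping, you can check cardinality with a quick measure (hypothetical names; `Incidents[Description]` is just an example of a likely high-cardinality column):

```dax
-- Hypothetical measure: a distinct count close to the row count (~43K)
-- flags a column that compresses poorly and bloats the model.
Description Cardinality = DISTINCTCOUNT ( Incidents[Description] )
```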
Thanks for the tips!
I'm not creating a connection to a database, but a CSV file.
I managed to duplicate the source only twice, and my performance is much better now!!
I'm going to create a few dim tables using DAX instead of PQ (:
@Anonymous Ok, sounds good,
Try to avoid SUMMARIZECOLUMNS when creating these tables, and be careful with the VALUES function: you might get a circular dependency error when creating the relationship.
Yes, with SUMMARIZE you'll have no issue :P.
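For illustration, either of these patterns avoids the circular-dependency issue that VALUES can trigger, because DISTINCT and SUMMARIZE over a column don't depend on the blank row a relationship adds (table and column names are assumptions):

```dax
-- Hypothetical calculated tables; either one works as a dimension.
DimPriority = DISTINCT ( Incidents[Priority] )
DimPriority = SUMMARIZE ( Incidents, Incidents[Priority] )
```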
Ideally you want a star/snowflake schema. Fact table(s) in the center and dimensions radiating outward.
If you don't want to use PQ to create dimensions you can always use DAX to create those tables. Don't listen to the bot.
Do dimensions created with DAX give the same performance in the dashboard (processing, slicers, etc.) as the ones created in PQ?
Or is it advisable to create no dim tables at all and only use one table with all 19 columns?
@Anonymous, I know Power Query is the suggested way, but with DAX the refresh can be a bit slower while loading less data.
You can create new tables using DISTINCT, SUMMARIZE, etc.:
DimCol1 = DISTINCT ( Table[Col1] )
DimCol2 = SUMMARIZE ( Table, Table[Col2], Table[Col3] )
Check which one performs better.
I think a star schema is best; in cases where you need all the rows or need to ignore some filter, a star schema comes in handy.