Anonymous
Not applicable

Help with best practice

Hi everyone,

 

I've been working with an O&G production dataset that includes a lot of information, such as well name or ID, company, the area where each well is located, and of course each well's monthly oil and gas production.

 

I have three main questions:

1) When I upload all these files to Power BI, the initial size is around 130 MB. Should I be OK with this size, or will the tool become too slow as soon as I start working with it?

2) To build a better report and avoid creating too many charts, I want to be able to filter the evolution of oil production by customer, by area name, or by any other field that is valuable for seeing how production evolves. To do this I always add an index to my original table, then duplicate it and unpivot the fields I want, but this pushes the size above 300 MB. Is there a better way to do this?

3) Finally, I want to create a custom column that records the first time a well appears in the database. How could I do this? I know how to do it with CALCULATE in a calculated column, but I want it in my source table so that I can unpivot it and keep this field.
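Something along these lines in Power Query M is the direction I'm imagining; this is only a sketch, and the query and column names (Produccion, id_pozo, fecha) are placeholders for whatever the real ones are:

```m
let
    // start from the existing production query
    Source = Produccion,
    // one row per well, with the earliest date it appears
    FirstSeen = Table.Group(
        Source,
        {"id_pozo"},
        {{"PrimeraAparicion", each List.Min([fecha]), type nullable date}}
    ),
    // bring that date back onto every production row
    Merged = Table.NestedJoin(Source, {"id_pozo"}, FirstSeen, {"id_pozo"}, "FS", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "FS", {"PrimeraAparicion"})
in
    Expanded
```

That way the column lives in the table itself, so it survives the unpivot.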

 

If anyone is interested, all the production files are public on this site ("Producción de Pozos de Gas y Petroleo - 2020", "-2019", etc.).

 

Thanks,

Greg_Deckler
Super User

@Anonymous - I looked at your CSV files; are you able to post your PBIX file? I don't see why you are unpivoting anything, though I may not fully understand what you are trying to achieve. At first glance, I don't know why you would unpivot rather than just use slicers, or build yourself a star schema with your fact table exactly as the CSV files are.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...
Anonymous
Not applicable

Thanks @Greg_Deckler for your answers. In the following LINK you will find my PBIX file.

 

1. You will see there are two pages, "Produccion General Unpivot" and "Producción General". The idea of the unpivot is to be able to see the evolution of oil production by "cuenca" or by "sub_tipo_recurso" in one graph, rather than, as on the other page, having to create a separate graph for each field.

 

2. The main problem with the 380 MB file is that the servers where I work don't allow files bigger than 250 MB, haha. So I need to keep it under 250 MB.

 

Thanks,

 

Hi, @Anonymous 

I uploaded the report you shared to a workspace with Dedicated capacity. Although the report upload is slow, everything runs normally. I'm not clear on what you mean by the 250 MB size limitation.

If it is really caused by the large imported dataset, have you considered storing part of your data sources in a database and then building reports in DirectQuery mode rather than Import mode?

 

[Screenshot: the uploaded report running in the workspace]

 

Best Regards,
Community Support Team _ Eason

swise001
Continued Contributor

@Anonymous 


Let's suppose you want to keep both tables (unpivoted and pivoted). These are the two largest contributors to your file size, largely because of the index used to link them. Aside from removing one of these two tables outright, you aren't likely to drop the file below 250 MB.

 

A few best practices to consider:

1. Filter your data before saving the file to your server. After you upload to the service (which can accept up to a 1 GB file on Pro capacity), filter your underlying PBIX down by amending the queries. You will have to remember to undo this before each re-publish, but it's a quick way to reduce your file size.
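For example, a temporary filter step amended onto the end of the fact-table query might look like this (step and column names here are placeholders, and you'd remove the step before the full publish):

```m
// keep only recent rows while developing locally; delete this step before republishing
FilteredRows = Table.SelectRows(#"Previous Step", each [fecha] >= #date(2019, 1, 1))
```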

 

2. Set up incremental refresh on your archive tables. This reduces how much data is actually imported into your PBIX file (with the bulk of the import happening in the service). Once enabled, you'll want to be careful that you don't 'overwrite' your dataset in the cloud each time you republish report changes.
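In M terms, incremental refresh needs the date column filtered between the two reserved DateTime parameters, RangeStart and RangeEnd, which the service then slides per partition. A minimal sketch, assuming a datetime column called fecha (a placeholder name):

```m
// RangeStart and RangeEnd must be defined as DateTime parameters in the PBIX;
// the service substitutes the partition boundaries at refresh time
FilteredByRange = Table.SelectRows(
    #"Changed Type",
    each [fecha] >= RangeStart and [fecha] < RangeEnd
)
```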

 

Greg_Deckler
Super User

@Anonymous - For 1, 130 MB isn't all that big in Power BI terms. Performance is largely going to rest on how efficient your data model and measures are. I'll have to take a closer look for the rest.


