Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Reducing data size - remove columns or merge with 2nd table size

Hi,

 

I have data table A that seems to be too large for powerbi service to load (when I refresh it just loads and loads). It's over 10gb. 

 

However I'm only interested in a sub set of Table A, and only some of the columns. In Desktop I have merged Table A with Table B (inner) to get only the rows I'm interested in, I have also removed columns I don't need. It seems to still be too large to load. 

 

Performance wise, what should I be doing first, removing columns to reduce data width first, or merging Table A with Table B first to reduce the number of rows. 

 

I use premium. 

9 REPLIES 9
Anonymous
Not applicable

Hi @Anonymous ,

 

Can you provide more details like below which might help to narrow down the bottleneck and resolve the issue -

Are you doing Incremental refresh and still you have this much data and performance is slow?

Any Network related issues during your refresh which might be causing the delay?

 

Best Regards,
- Pavan M.

 

Anonymous
Not applicable

@Anonymous 

 

I'm not using incremental refresh. 

 

It's a recently created report. I created it in desktop with a weeks worth of data, now I've published it to service for 3 months worth of data and the file isn't loading (the circle spinning icon just goes for an hour+). I'm trying to use the manual refresh icon in Service "dataset + dataflows" tab.

 

Table A is a several .dat files on sharepoint appended in M query. 

 

I've used this advice to create Table A, then use it as an internal source before merging with Table B in a seperate query

 

https://www.excelguru.ca/blog/2015/03/11/power-query-errors-please-rebuild-this-data-combination/

Anonymous
Not applicable

No network issues I'm aware of. It wasn't loading Friday when i first tried, and still isn't today. 

Anonymous
Not applicable

The dataset is currently connected to 3 .dat files (in csv format). each .dat file is a month worth of data and around 5-5.5gb each.

Anonymous
Not applicable

@Anonymous 

 

So I've had a look in the detail and it is actually through up an error. 

 

Table C is the result of merging Table A and Table B (Table A has "enable load" disabled is the PowerBI Desktop Query mode).

 

Data source error:{"error":{"code":"ModelRefresh_ShortMessage_ProcessingError","pbi.error":{"code":"ModelRefresh_ShortMessage_ProcessingError","parameters":{},"details":[{"code":"Message","detail":{"type":1,"value":"Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host."}}],"exceptionCulprit":1}}} Table: TABLE_C.
Cluster URI:WABI-NORTH-EUROPE-redirect.analysis.windows.net
Activity ID:9784f625-0fc4-4b01-ade0-530045abb358
Request ID:b151ec94-bd95-bf75-42ce-047262dca8de
Time:2020-05-22 18:42:26Z

Hi @Anonymous,

 

with importing files this big you might run into query timeouts.

An important question to me is: What is the source of your .dat files? Is it a database? Are you able to connect to that source directly, that would be much better. Depending on the source, you could even use the concept of Query Folding, which can potentially push the transformations back to the source system and give you much better performance and hence no issues with refreshing.



Did I answer your question? Mark my post as a solution!

Blog: nickyvv.com | @NickyvV


Anonymous
Not applicable

HI @nickyvv 

 

Direct connection to the source data isn't possible at this time unfortunately. Would reducing the .dat files from monthly to weekly files change anything?

Hi @Anonymous,

that would certainly help. You'll have to join them back together in Power BI, but that could solve the problem of timeouts.


Did I answer your question? Mark my post as a solution!

Blog: nickyvv.com | @NickyvV


Anonymous
Not applicable

@nickyvv 

 

I currently join them together in M query using the import sharepoint folder option so it's appending all files into one big one before doing data transformations on them. I'll try splitting the data down and report back.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors