Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

Reply
mahenkj2
Solution Sage
Solution Sage

Help on optimizing refresh - much more rows than actual rows

I am developing a solution to create match/mismatch list with two set of excel files and it is working perfectly. I do see that while refresh, rows count goes very high from actual data and I am afraid that when excel file increases, refresh might become a concers.

 

I share below mechanism how the queries are designed, I think it would be much difficult to exactly represent the data, but I still need to ask what are the key terms to look for the solution.

 

There are two folders. Each folder have multiple excel files. I make two data connection with each folder and merge corresponding files in one query for each folder.

 

So in power query I get two queries, one for each folder. I need to find exact match, partial match and mismatches in both these queries.

 

Query 1 -  folder1 (1500 rows, so it is total of rows in all excel file in that folder)

Query 2 – folder2  (600 rows, so it is total of rows in all excel file in that folder)

 

Query 3 – merge query 1 and query 2 with inner join to find matches (203 rows)

Query 4 - merge query 1 and query 2 with outer join to find mismatches (1350 rows)

 

Query 5 – merge Query 4 with Query 2 to find mismatch in one column (7 rows)

Query 6 – merge Query 4 with Query 2 to find mismatch in one column   (4 rows)

Query 7 – merge Query 4 with Query 2 to find mismatch in one column   (4 rows)

Query 8 – merge Query 4 with Query 2 to find mismatch in one column (1 rows)

 

Query 9 – Append Query 3, 5,6,7,8 to make query for match or partial match

 

Query 10 – Merge Query 9 with Query 1 to find records not in Query 2

Query 11 – Merge Query 9 with Query 2 to find records not in Query 1

 

Query 12 (final result) – Append Query 9,10,11 (1960 rows)

 

At the time refresh, it takes couple of minutes (3 excel file in each folder) and several steps seen during the refresh (you may understand what I am talking about):

mahenkj2_0-1708159665913.png

 

I need to find ways to optimize the refresh or need to find alternate ways to handle the purpose?

Hope I was able to explain the concern.

 

 

3 REPLIES 3
some_bih
Super User
Super User

Hi @mahenkj2 latter you will probably looking for some tool for identife columns not used or used in model

Check two links: first for power pivot utilities (Excel and DAX) and second with power bi with DAX Query view part

It these helps, accept as solution so other member could see it.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!






some_bih
Super User
Super User

Hi @mahenkj2  do not import all the columns which you do need for your model / queries.

Analyse which columns you need and only import them.

Backup your working and pbi  files  and play a bit with decreased number of columns to figure out what columns are not needed.

Maybe the best would be to create some query out of pbi and then import the Excel file which you think will grow / be needed in future.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!






Hi @some_bih 

 

I do have couple of additional column for future usage possibitlity, these have just 1 or 2 unique values in each column. I do see little improvement in performance. I would expect if there are some alternate means to reduce total rows processed during query processing or some alternate ways to handle such application.

Secon solution you suggest is not possible in my case.

 

 

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel1

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.