Hi all,
I have two tables: one with 20+ million records and an Excel file with 70 records.
I need to merge the two tables because I have to apply logic based on both of them. I merged them with a left outer join, but refresh performance is really slow. Does anyone know a good workaround or alternative?
Thanks!
I would try Remove Duplicates on the Excel table (even if you know there are none) so that Power Query knows there are none. If your key columns are definitely sorted, you can use Table.Join (not a nested join) and pass JoinAlgorithm.SortMerge as the last argument.
--Nate
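To make the suggestion above concrete, here is a minimal M sketch. The table names (BigTable, ExcelLookup) and key column names (ID, LookupID) are placeholders for your own queries; note that Table.Join keeps all columns from both sides, so the key columns should have different names (or be renamed first) to avoid a name collision:

```
let
    // Tell Power Query the small lookup has no duplicate keys
    SmallDistinct = Table.Distinct(ExcelLookup, {"LookupID"}),
    // SortMerge is only safe if both inputs already arrive sorted by key
    Merged = Table.Join(
        BigTable, "ID",
        SmallDistinct, "LookupID",
        JoinKind.LeftOuter,
        JoinAlgorithm.SortMerge
    )
in
    Merged
```

SortMerge streams both inputs instead of buffering one side, which is why it can be much faster here, but it silently produces wrong results if the inputs are not actually sorted by the key.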
Very awesome suggestion to try, @watkinnc .
I reviewed this case and please allow me to offer some additional thoughts:
Incremental Refresh: If you're refreshing in Power BI Service, consider implementing an incremental refresh policy for your large table. This approach limits the amount of data processed and refreshed to only what's new or changed, significantly reducing refresh times. For more details on setting this up, see Configure incremental refresh.
Use DirectQuery Mode: If applicable, using DirectQuery mode for your large dataset can improve performance by executing queries directly on the source data without the need to load it into Power BI. This can be particularly effective for large datasets, but it's important to understand the trade-offs, such as dependency on the source system's performance. More on DirectQuery can be found here: Use DirectQuery in Power BI Desktop.
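For the incremental refresh option, the large table's query needs to be filtered by the RangeStart and RangeEnd datetime parameters that the policy fills in at refresh time. A minimal sketch, assuming a SQL source and a ModifiedDate column (both placeholders for your own source and column):

```
let
    Source = Sql.Database("server", "db"),            // placeholder source
    BigTable = Source{[Schema = "dbo", Item = "Fact"]}[Data],
    // Use >= on one boundary and < on the other so partitions don't overlap
    Filtered = Table.SelectRows(
        BigTable,
        each [ModifiedDate] >= RangeStart and [ModifiedDate] < RangeEnd
    )
in
    Filtered
```

RangeStart and RangeEnd must be defined as DateTime parameters in Power Query Desktop before the incremental refresh policy can be configured on the table.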
Hope the above helps.
Best Regards,
Stephen Tao
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.