Hello,
I have a very large amount of data, so I need to split it into multiple queries to keep the data source from crashing.
But then I have to merge those queries back together to have a single query and transform the fields.
However, when the queries are referenced again, they are all loaded from the source at the same time, which crashes the data source.
Since a reference also reloads the data source, what alternative is there to referencing?
Hi @PWRBI,
Theoretically, switching off "Parallel table loading" in the settings may help avoid concurrent access to the data source. But if you are having these problems, I suspect the data source does not support anything like pagination or query folding. In that case, splitting the query into separate queries may do the opposite of what you intend: each query may load the entire data source and only then apply its filters. Without knowing exactly what you are doing it is hard to tell, but this is quite a plausible scenario. Maybe rethink your strategy and test once more whether you can load the entire dataset from the source in one go. Alternatively, can you split the data source itself into smaller tables or files? Splitting makes sense if each query requests a separate file or table; in that case, also try disabling parallel data loading in the settings.
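As an aside, if your source does support query folding, keeping a simple filter in the query lets the engine push it down to the source, so only the matching rows are transferred. A minimal sketch, assuming a hypothetical SQL Server source (`MyServer`, `MyDb`) with a table `dbo.Sales` that has a `Year` column:

```
let
    // Hypothetical connection details; replace with your own
    Source = Sql.Database("MyServer", "MyDb"),
    Sales = Source{[Schema = "dbo", Item = "Sales"]}[Data],
    // A plain row filter like this can usually be folded back
    // to the source as a WHERE clause
    Filtered = Table.SelectRows(Sales, each [Year] = 2023)
in
    Filtered
```

You can check whether folding happened by right-clicking the step and looking at "View Native Query" in the Query Editor.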
Cheers,
John
Thank you for your feedback.
So the individual queries work great and I have already deactivated parallel queries.
I have the following example:
Query 1
Query 2
Query 3
Query 4
Query 5
These 5 queries run one after the other and the database has no problems.
In the master query I refer to queries 1 - 5, and in this query all five queries are started at the same time, so the database crashes.
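For context, the master query presumably looks something like the sketch below (the query names are assumed); it is this single combine step that makes the engine evaluate all five source queries at once on refresh:

```
let
    // Referencing the five queries here triggers all of them
    // to load simultaneously when this query refreshes
    Combined = Table.Combine({Query1, Query2, Query3, Query4, Query5})
in
    Combined
```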
Hi @PWRBI,
I see your point. Yes, you are right: since this is a single query referring to several other queries, it triggers them all to run at once regardless of the settings.
Do you want to try playing along these lines:
let
    Source_1 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlSK1YlWMgKTxkqxsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    Source_2 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlSK1YlWMgKTxkqxsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    Source_3 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlSK1YlWMgKTxkqxsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    // Delay each additional source before combining it
    Combine_1 = Table.Combine({Source_1, Function.InvokeAfter(() => Source_2, #duration(0, 0, 0, 10))}),
    Combine_2 = Table.Combine({Combine_1, Function.InvokeAfter(() => Source_3, #duration(0, 0, 0, 10))})
in
    Combine_2
The idea behind it is to manually force a delay before getting the data from each table (using Function.InvokeAfter). You will need to measure roughly how long each query takes to load and then add about 20% on top for the duration parameter. I think this may work, but the performance will of course be quite suboptimal.
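In isolation, Function.InvokeAfter simply waits for the given duration and then evaluates a zero-argument function; a minimal illustration:

```
let
    // Waits roughly five seconds, then evaluates the function;
    // Delayed ends up as 2
    Delayed = Function.InvokeAfter(() => 1 + 1, #duration(0, 0, 0, 5))
in
    Delayed
```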
Kind regards,
John