Hi,
I have a query like this that computes an intermediate metric:
Success :=
COUNTROWS (
    FILTER (
        SUMMARIZE (
            Table,
            Table[id],
            "TotalRows", SUM ( Table[nRows] ),
            "SuccessfulRows", SUM ( Table[Successes] )
        ),
        [TotalRows] = [SuccessfulRows]
    )
)
Essentially, for a given id, I want to mark it as successful if the number of rows for that id equals the number of successful rows for that id.
Since I'm using SUMMARIZE, the query scans the entire table for each id to compute the metric. My table has more than 18M rows, and this single computation takes more than 30 seconds to execute.
Is there a way to optimize this? Thanks for the help.
Hi @raghun,
Maybe you can consider adding two columns to your table to store the summarized results, then write a formula that compares those fields.
This should speed up the measure calculation, but it will also increase the memory usage of the data model.
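For example, a sketch of this idea (assuming the table and column names from your original measure; `TotalRowsForId` and `SuccessfulRowsForId` are hypothetical names):

```dax
-- Two calculated columns that precompute the per-id totals once, at refresh time.
-- ALLEXCEPT keeps only the Table[id] filter after the context transition,
-- so each row stores the totals for its own id.
TotalRowsForId =
CALCULATE ( SUM ( Table[nRows] ), ALLEXCEPT ( Table, Table[id] ) )

SuccessfulRowsForId =
CALCULATE ( SUM ( Table[Successes] ), ALLEXCEPT ( Table, Table[id] ) )

-- The measure then only has to compare the two stored columns,
-- instead of re-aggregating 18M rows at query time.
Success :=
CALCULATE (
    DISTINCTCOUNT ( Table[id] ),
    FILTER ( Table, Table[TotalRowsForId] = Table[SuccessfulRowsForId] )
)
```

The trade-off is exactly as described above: faster queries at the cost of two extra stored columns.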
By the way, you can also try using the GROUPBY function instead of SUMMARIZE:
Nested grouping using GROUPBY vs SUMMARIZE
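A sketch of the same measure rewritten with GROUPBY (assuming your original table and column names; note that GROUPBY requires its aggregations to iterate CURRENTGROUP()):

```dax
Success :=
COUNTROWS (
    FILTER (
        GROUPBY (
            Table,
            Table[id],
            "TotalRows", SUMX ( CURRENTGROUP (), Table[nRows] ),
            "SuccessfulRows", SUMX ( CURRENTGROUP (), Table[Successes] )
        ),
        [TotalRows] = [SuccessfulRows]
    )
)
```

Whether this is faster depends on your model, so it is worth comparing both versions in DAX Studio.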
Regards,
Xiaoxin Sheng
@raghun -
Maybe you could use DAX Studio to help identify your bottlenecks: https://daxstudio.org/documentation/videos/.
Proud to be a Super User!
I already did that; this is what I see, but I couldn't figure out where to optimize.
Line Records Physical Query Plan
1 AddColumns: IterPhyOp LogOp=AddColumns IterCols(0)(''[SuccessfulCalls])
2 SingletonTable: IterPhyOp LogOp=AddColumns
3 1 SpoolLookup: LookupPhyOp LogOp=CountRows Integer #Records=1 #KeyCols=0 #ValueCols=1 DominantValue=BLANK
4 1 AggregationSpool<Count>: SpoolPhyOp #Records=1
5 Filter: IterPhyOp LogOp=Filter IterCols(0, 1, 2)('Table'[Id], ''[TotalRows], ''[SuccessfulRows])
6 Extend_Lookup: IterPhyOp LogOp=EqualTo IterCols(1, 2)(''[TotalRows], ''[SuccessfulRows])
7 AddColumns: IterPhyOp LogOp=AddColumns IterCols(0, 1, 2)('Table'[Id], ''[TotalRows], ''[SuccessfulRows])
8 21183022 Spool_Iterator<SpoolIterator>: IterPhyOp LogOp=GroupBy_Vertipaq IterCols(0)('Table'[Id]) #Records=21183022 #KeyCols=569 #ValueCols=0
9 21183022 ProjectionSpool<ProjectFusion<>>: SpoolPhyOp #Records=21183022
10 Cache: IterPhyOp #FieldCols=1 #ValueCols=0
11 21183022 SpoolLookup: LookupPhyOp LogOp=Sum_Vertipaq LookupCols(0)('Table'[Id]) Integer #Records=21183022 #KeyCols=569 #ValueCols=2 DominantValue=BLANK
12 21183022 ProjectionSpool<ProjectFusion<Copy, Copy>>: SpoolPhyOp #Records=21183022
13 Cache: IterPhyOp #FieldCols=1 #ValueCols=2
14 21183022 SpoolLookup: LookupPhyOp LogOp=Sum_Vertipaq LookupCols(0)('Table'[Id]) Integer #Records=21183022 #KeyCols=569 #ValueCols=2 DominantValue=BLANK
15 21183022 ProjectionSpool<ProjectFusion<Copy, Copy>>: SpoolPhyOp #Records=21183022
16 Cache: IterPhyOp #FieldCols=1 #ValueCols=2
17 EqualTo: LookupPhyOp LogOp=EqualTo LookupCols(1, 2)(''[TotalRows], ''[SuccessfulRows]) Boolean
18 ColValue<''[TotalRows]>: LookupPhyOp LogOp=ColValue<''[TotalRows]>''[TotalRows] LookupCols(1)(''[TotalRows]) Integer
19 ColValue<''[SuccessfulRows]>: LookupPhyOp LogOp=ColValue<''[SuccessfulRows]>''[SuccessfulRows] LookupCols(2)(''[SuccessfulRows]) Integer
@raghun -
To be honest, I don't know much about improving performance, but a resource I rely on is the sqlbi.com team.
https://www.sqlbi.com/articles/best-practices-using-summarize-and-addcolumns/ presents some best practices. Have you tried to apply those techniques in your project?
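Following that article's main recommendation (group with SUMMARIZE but compute the aggregations in ADDCOLUMNS rather than inside SUMMARIZE itself), the original measure could be sketched like this, assuming the table and column names from the first post:

```dax
Success :=
COUNTROWS (
    FILTER (
        ADDCOLUMNS (
            -- SUMMARIZE is used only to produce the distinct list of ids
            SUMMARIZE ( Table, Table[id] ),
            -- CALCULATE triggers context transition, so each sum is
            -- evaluated in the filter context of the current id
            "TotalRows", CALCULATE ( SUM ( Table[nRows] ) ),
            "SuccessfulRows", CALCULATE ( SUM ( Table[Successes] ) )
        ),
        [TotalRows] = [SuccessfulRows]
    )
)
```

This avoids the extension-column behavior of SUMMARIZE that the article warns about, and often lets the storage engine resolve the per-id sums in a single scan.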