Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
AaronApexPeak
Frequent Visitor

Very slow lookup using Table.SelectRows to find event IDs for each row based on date

I am having a problem with performance on table with 1.5million rows of time series data where I am using the timestamp to look up an event ID from a table of events. The secondary tables have an event ID column with a column for start time and a column for end time. Note that not all rows have an event ID. With the event ID columns added to the time series table, a relationship is created back the event tables so that we can filter charts of the time series data by clicking on the events.

It takes 2 minutes to load all the data and set the types on the time series table, but when I added the custom columns and drop errors (bolded rows below) it takes hours to load.

Is there a better way to do this? I tried creating these columns as a Calculated Column (which is much faster), but it does not allow a relationship to be created between the calculated column and the original table because of circular dependency.


Note:  The TimestampDT column exist because all these datetimes are in Date/Time/Zone format and I was not able to directly do a <= >= comparison in the power query without first converting to a Date/Time columns.

 

let
Source = TimeSeriesDataSource,
#"Changed Type" = Table.TransformColumnTypes(Source,List.Transform(List.Skip(Table.ColumnNames(Source),1), each {_, Number.Type})),
#"Changed Type2" = Table.TransformColumnTypes(#"Changed Type",{{"Timestamp", type datetimezone}}),
#"Duplicated Column" = Table.DuplicateColumn(#"Changed Type2", "Timestamp", "TimestampDT"),
#"Changed Type1" = Table.TransformColumnTypes(#"Duplicated Column",{{"TimestampDT", type datetime}}),
#"Added Custom" = Table.AddColumn(#"Changed Type1", "EventType1ID", each (let rowTime = [TimestampDT] in Table.SelectRows(Event1Table, each [StartDateTimeDT] <= rowTime and [EndDateTimeDT] >= rowTime)){0}[BusinessKey]),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "EventType2ID", each (let rowTime = [TimestampDT] in Table.SelectRows(Event2Table, each [StartDateTimeDT] <= rowTime and [EndDateTimeDT] >= rowTime)){0}[BusinessKey]),
#"Replaced Errors1" = Table.ReplaceErrorValues(#"Added Custom1", {{"EventType1ID", ""}, {"EventType2ID", ""}})
in
#"Replaced Errors1"

 

 

Thanks in advance for the help.

1 ACCEPTED SOLUTION

The solution ended up being Custom Channels using the following code to move the lookup out of the Power Query.

To avoid the circular references, I created new duplicates of the event tables and I do the lookup from there. Then it is possible create a relationship between the Time Series EventID column (sourced from the lookup copy of the event table) to the original event table. 

 

EventType1ID =
CALCULATE(
    SELECTEDVALUE(EventType1Lookup[BusinessKey]),
     FILTER(EventType1Lookup,
    EventType1Lookup[StartDateTimeDT] <= TimeSeriesData[TimestampDT]
      && EventType1Lookup[EndDateTimeDT] >= TimeSeriesData[TimestampDT]
    )
)

View solution in original post

3 REPLIES 3
AaronApexPeak
Frequent Visitor

I'm not sure if this will help but I ran SQL Server Profiler and this is the item that is taking all the time (7+ hours):

AaronApexPeak_0-1663678521651.png

This is the TextData from that row:

let __AS_Query__ = TimeSeriesData,
__AS_Table__ = Table.FromValue(__AS_Query__),
__AS_Compact__ = Table.RemoveColumns(__AS_Table__, Table.ColumnsOfType(__AS_Table__, { type table, type record, type list })),
__AS_Effective__ = Table.TransformColumnNames(__AS_Compact__, Text.Clean)
in __AS_Effective__

v-rzhou-msft
Community Support
Community Support

Hi @AaronApexPeak ,

 

I think your issue should be caused by large size of data or complex query. Here I suggest you to optmize your data model or update your query.

For reference: Optimization guide for Power BI - Power BI | Microsoft Learn

 

Best Regards,
Rico Zhou

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

The solution ended up being Custom Channels using the following code to move the lookup out of the Power Query.

To avoid the circular references, I created new duplicates of the event tables and I do the lookup from there. Then it is possible create a relationship between the Time Series EventID column (sourced from the lookup copy of the event table) to the original event table. 

 

EventType1ID =
CALCULATE(
    SELECTEDVALUE(EventType1Lookup[BusinessKey]),
     FILTER(EventType1Lookup,
    EventType1Lookup[StartDateTimeDT] <= TimeSeriesData[TimestampDT]
      && EventType1Lookup[EndDateTimeDT] >= TimeSeriesData[TimestampDT]
    )
)

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.