Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
rudisoft
Frequent Visitor

List.Contains() is slow 2.0

Hi all,

 

I'm having a power query on a medium sized table (15k rows, ~30 columns) where I need to iterate through  (working on atlassian jira rest api) like so:

 

result = Table.ReplaceValue(buffered_table, // yes, this is buffered
             each [Epic Link],
             each if [Epic Link]<>null
                     then
                             if List.Contains(buffered_list,[Epic Link])
                             then [Epic Link]
                             else ""
                     else null,
              Replacer.ReplaceText,{"Epic Link"}
         ),

The buffered table has internal self-references (epics are issues that are referenced by other issues) that are problematic on the DAX side.

 

I've read through https://community.powerbi.com/t5/Desktop/List-Contains-is-slow/m-p/950128#M455289, read through  @ImkeF performance blogposts and checked @Greg_Deckler's and @Anonymous  sites as well and following (I believe) most of the hints I could find. 

 

I'm however getting read rates (During "Loading data into model ...") of about 10 rows/sec with the above code inside the power query. Without the code above (monitoring using fiddler) it's at least 100 times faster. I'd love to perform the code in power query since the dataset's consumers (as well as myself) have trouble filtering for this using DAX.

 

Any further hints to improve performance would be highly appreceated. Maybe there's a way to have List.Contains use an index?

2 ACCEPTED SOLUTIONS
AlexisOlson
Super User
Super User

Instead of using Table.Replace, try using Table.AddColumn to define the same thing as a custom column (and then delete the original [Epic Link] column and rename the new custom column to [Epic Link]).

View solution in original post

Both of those seem intuitive to me. Adding a column is often faster than modifying an existing one.

 

For your second point, buffered_table[issue key] is already in memory as part of buffered_table, so buffering it into memory separately as a list probably(?) doesn't help that much. Since you are using buffered_list in List.Contains for each row, you want it to be as fast as possible and using a distinct version is more efficient. Having the distinct version already computed and explicitly buffered in that form is even better.

View solution in original post

6 REPLIES 6
rudisoft
Frequent Visitor

Thanks for the help @AlexisOlson and @watkinnc , the hints combined got the job done:

 

// this is for performance boosting the following list.contains function call
buffered_table = Table.Buffer(renamed),
buffered_list = List.Buffer(List.Distinct(buffered_table[issue key])),

result_a = Table.ReplaceValue(buffered_table,
each
[Epic Link],
each
if [Epic Link]<>null
then
if List.Contains(buffered_list,[Epic Link])
then [Epic Link]
else ""
else null,
Replacer.ReplaceText,{"Epic Link"}
),

result_b = Table.AddColumn(buffered_table,
"Epic Link (unlinked)",
each if [Epic Link]<>null
then
if List.Contains(buffered_list, [Epic Link])
then [Epic Link]
else ""
else null)

 

Result_b is about 75 times faster than result_a, but I'm puzzled that:

- adding a column and putting (new) values in there is faster than replacing a value in another column

- List.Buffer(List.Distinct(buffered_table[issue key])) should be faster than List.Distinct(List.Buffer(buffered_table[issue key]))

Both of those seem intuitive to me. Adding a column is often faster than modifying an existing one.

 

For your second point, buffered_table[issue key] is already in memory as part of buffered_table, so buffering it into memory separately as a list probably(?) doesn't help that much. Since you are using buffered_list in List.Contains for each row, you want it to be as fast as possible and using a distinct version is more efficient. Having the distinct version already computed and explicitly buffered in that form is even better.

Got it solved as described above. The column was distinct already.

watkinnc
Super User
Super User

You could also try marking the list as distinct, which almost always makes a difference:

 

 if List.Contains(List.Distinct(buffered_list)...

 

--Nate


I’m usually answering from my phone, which means the results are visualized only in my mind. You’ll need to use my answer to know that it works—but it will work!!

Probably even better to use List.Distinct before List.Buffer if possible.

AlexisOlson
Super User
Super User

Instead of using Table.Replace, try using Table.AddColumn to define the same thing as a custom column (and then delete the original [Epic Link] column and rename the new custom column to [Epic Link]).

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors