Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
campelliann
Post Patron
Post Patron

When to use Table.Buffer

Hi there, 

I would like to know in each steps or before each steps it is recommended to use Table.Buffer to improve performance.

Many thanks

1 ACCEPTED SOLUTION
BA_Pete
Super User
Super User

Hi @campelliann ,

 

This is an enormous question with many, many variables. However, very broadly, once you understand what Table.Buffer and List.Buffer do, this will often inform you of when is a good time to use them.

At their core, these functions load the entire target into memory. Therefore, one key use is to use them when Power Query would otherwise make multiple scans of the target.

For example:

If you want to do a conditional join in PQ (in this example to merge values from an SCD table), you may find yourself creating a column something like this to achieve it:

= Table.AddColumn(
  previousStep,
  "columnName",
  (OT) => Table.SelectRows(
    Table2,  // The 'Target'
    each OT[Field] = [Field]
      and OT[dateField] >= [table2StartDate]
      and OT[matchField2] <= [table2EndDate]
    ){0}[valueToMerge],
  type number
)

 

In this instance, PQ will need to keep rescanning Table2 for the conditions on each row. Therefore, if we buffer Table2 in the Table1 query (T2Buffer = Table.Buffer(Table2) ), then swap the target from Table2 to T2Buffer, PQ can just hit the memory for what it needs each time.

 

List.Buffer example:

If you want to filter a table by the values in another table, you may find yourself righting code like this:

Table.SelectRows(
  TableToFilter,
  each List.Contains(
    FilterTable[ColumnWithFilterValues]),  // The 'Target'
    TableToFilter[ColumnName]
  )
)

 

In this instance, PQ will scan FilterTable in order to check the values against each row value in TableToFilter.

Therefore, if you buffer the FilterTable values first, PQ can just hit the memory for these values. You can do this inline (we created a separate step for this previously), like this:

Table.SelectRows(
  TableToFilter,
  each List.Contains(
    List.Buffer(FilterTable[ColumnWithFilterValues])),  // The 'Target'
    TableToFilter[ColumnName]
  )
)

 

This example is particularly important when applying this technique on foldable sources, as doing it this way means you can stream non-foldable lists to foldable queries and maintain folding 👍

 

Hope this helps?

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




View solution in original post

3 REPLIES 3
campelliann
Post Patron
Post Patron

@BA_Pete your posts are always insightful :). 

But for instance creating a Table.Buffer on an excel Sharepoint File, after the "call" to sharepoint. Or let's say I retrieve data from an API and in the step imediatly after the call, could that improve performance (which is affected by many variables as you say).

 

Generally, I'd say no, it wouldn't improve performance significantly, if at all. But, again, there's many variables.

If you bring in a 200 million row dataset with 50 columns you could potentially hurt performance by trying to buffer it as you may get close to, or exceed, your memory limit. This could slow down all of the mashup container processes, cause the query to fail/crash, or may even limit the functioning of the processing machine itself.

Furthermore, I don't think buffering prevents PQ making calls to the source if it needs to, it just helps to limit/prevent multiple table scans internally.

As a rule of thumb, I'd keep buffering limited to relatively small utility tables and lists, such as in my examples. As above, used too generously or incorrectly, buffering data has the potential to cause more harm than good.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




BA_Pete
Super User
Super User

Hi @campelliann ,

 

This is an enormous question with many, many variables. However, very broadly, once you understand what Table.Buffer and List.Buffer do, this will often inform you of when is a good time to use them.

At their core, these functions load the entire target into memory. Therefore, one key use is to use them when Power Query would otherwise make multiple scans of the target.

For example:

If you want to do a conditional join in PQ (in this example to merge values from an SCD table), you may find yourself creating a column something like this to achieve it:

= Table.AddColumn(
  previousStep,
  "columnName",
  (OT) => Table.SelectRows(
    Table2,  // The 'Target'
    each OT[Field] = [Field]
      and OT[dateField] >= [table2StartDate]
      and OT[matchField2] <= [table2EndDate]
    ){0}[valueToMerge],
  type number
)

 

In this instance, PQ will need to keep rescanning Table2 for the conditions on each row. Therefore, if we buffer Table2 in the Table1 query (T2Buffer = Table.Buffer(Table2) ), then swap the target from Table2 to T2Buffer, PQ can just hit the memory for what it needs each time.

 

List.Buffer example:

If you want to filter a table by the values in another table, you may find yourself righting code like this:

Table.SelectRows(
  TableToFilter,
  each List.Contains(
    FilterTable[ColumnWithFilterValues]),  // The 'Target'
    TableToFilter[ColumnName]
  )
)

 

In this instance, PQ will scan FilterTable in order to check the values against each row value in TableToFilter.

Therefore, if you buffer the FilterTable values first, PQ can just hit the memory for these values. You can do this inline (we created a separate step for this previously), like this:

Table.SelectRows(
  TableToFilter,
  each List.Contains(
    List.Buffer(FilterTable[ColumnWithFilterValues])),  // The 'Target'
    TableToFilter[ColumnName]
  )
)

 

This example is particularly important when applying this technique on foldable sources, as doing it this way means you can stream non-foldable lists to foldable queries and maintain folding 👍

 

Hope this helps?

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors