Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Compare Any 2 non-consecutive Rows (PowerQuery)

Hi,

 

I need to compare the Count values by Sector at every Time interval (e.g. row at Index 2 with row at Index 7) and output the result in Count.1 column.

- These Rows may be at different Index distances at every run (depending on sector's data received).

- The Sector values itself may vary.

- After the first 1-2 rows (always 0 Count value) the Time interval is constant every 10s.

I cannot play with custom Index columns given the requirements mentioned above, so I came up with the below solution (Count.1 values in the Table below are the correct output):

 

#"Added Count.1" = Table.AddColumn(#"Previous step", "Count.1", (a) =>

  let   

    b = Table.SelectRows(#"Previous step", (x) => x[Sector] = a[Sector] and x[Time] = a[Time]+10)  

  in

    b[#"Count"]{0} 

  )

 

But this is veeery slow, and I'm using test data (300 rows), when the report will be live the number of rows will grow exponentially. Any idea how to improve my code to a more efficient/fast way?

 

Sample data:

 

IndexTimeSectorCountCount.1
01593018439   20000
1159301844020000
215930184504198208
315930184505176214
415930184506176180
515930184503176180
615930184501178216
715930184604208208
815930184605214220
915930184606180180
1015930184603180180
1115930184601216220
1215930184701220220
1315930184706180180
1415930184703180180
1515930184704208208
1615930184705220220
1715930184804208208
1815930184805220220
1915930184806180180
2015930184803180180
2115930184801220220

 

Thanks

2 ACCEPTED SOLUTIONS
Anonymous
Not applicable

try making a partition of the source table with the table.group function on the sector field.
Then apply a function to each sub-table that does the calculation you need.

PS

how many rows you table has?

how many sector do you have?

 

View solution in original post

edhans
Super User
Super User

@Anonymous Power Query is not optimized for this type of analysis as you've discovered. @ImkeF has written an excellent article on this here, which essentially says to first group the data in some way first, so maybe 10,000 records becomes 500 groups. then within each group, do your lookup code, and now each lookup would be processing 200 rows at a time, not 10,000.

 

But it will eventually get too large.

 

This is one of those times where a Calculated Column in DAX may be more efficient. A measure is even better - MUCH BETTER - if you can do that. Or, do this at the source system. In SQL Server for example, if possible. But within the Tablular model of Power BI, either columns or as a measure, it is designed to rapidly scan and filter millions of rows for these operations.

 

In general, try to avoid calculated columns. There are times to use them, but it is rare. Getting data out of the source system, creating columns in Power Query, or DAX Measures are usually preferred to calculated columns. See these references:
Calculated Columns vs Measures in DAX
Calculated Columns and Measures in DAX
Storage differences between calculated columns and calculated tables



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

View solution in original post

6 REPLIES 6
edhans
Super User
Super User

@Anonymous Power Query is not optimized for this type of analysis as you've discovered. @ImkeF has written an excellent article on this here, which essentially says to first group the data in some way first, so maybe 10,000 records becomes 500 groups. then within each group, do your lookup code, and now each lookup would be processing 200 rows at a time, not 10,000.

 

But it will eventually get too large.

 

This is one of those times where a Calculated Column in DAX may be more efficient. A measure is even better - MUCH BETTER - if you can do that. Or, do this at the source system. In SQL Server for example, if possible. But within the Tablular model of Power BI, either columns or as a measure, it is designed to rapidly scan and filter millions of rows for these operations.

 

In general, try to avoid calculated columns. There are times to use them, but it is rare. Getting data out of the source system, creating columns in Power Query, or DAX Measures are usually preferred to calculated columns. See these references:
Calculated Columns vs Measures in DAX
Calculated Columns and Measures in DAX
Storage differences between calculated columns and calculated tables



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
Anonymous
Not applicable

Thanks @edhans , I will explore the Table.Group as mentioned in the article. Otherwise I will look into some DAX options as well

Great @Anonymous - post back in the DAX area if you decide to go with a DAX solution and need help. 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
Greg_Deckler
Super User
Super User

@ImkeF or @edhans  or another Power Query guru can probably assist.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...
Anonymous
Not applicable

try making a partition of the source table with the table.group function on the sector field.
Then apply a function to each sub-table that does the calculation you need.

PS

how many rows you table has?

how many sector do you have?

 

Anonymous
Not applicable

Thanks @Anonymous  I will explore the Table.Group option. The rows number will keep growing at every run, and the amount of sectors and the Sector values received may vary as well

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors