cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Kashuo Helper I
Helper I

Joins are creating MILLIONS of rows to process...why?

Hi,

 

The way I'm joining and merging my data is creating millions of rows to process and it takes hours to compute anything. There must be a better way!

 

My data is from the real estate development industry.

 

  • I have 3 tables: Property, Floor, and Zone.
  • Properties have multiple Floors, and Floors have multiple Zones. (In other words: Each Zone row has a Floor ID field of the Floor it belongs to, and each Floor row has a Property ID field of the Property it belongs to).
  • The three tables are joined on their respective IDs.
  • Each Zone has a field called Opening Date, which is the date we finish construction on that Zone (a portion of a Floor).
  • It's easy to filter a table visualization of Zones by Zone Opening Date.
  • In order to filter a table visualization of Floors by Earliest Floor Opening Date, I created a separate table (merge as new) that merged Zones and Floors (on Floor ID) and then aggregated the Opening Date column of the Zones table to show only "Min" of Opening Date. Then I merged that table with the Floors table so that each row of the Floors table has the "Earliest Floor Opening Date” for that floor.
  • With this data structure in place, I'm able to add a range slicer to the page for Earliest Floor Opening Date and it will filter out Floors from a table visualization of the floors table have have an Earliest Floor Opening Date outside my range slicer. The same slicer works for the areas.
  • However, this filter will not filter out the Properties. Therefore, I have done the same process to find the Earliest Property Opening Date for each row of the Property Table.
  • I created a separate table (merge as new) that merged Floors and Properties this time. But when I try to then aggregate the Earliest Floor Opening Date column to create an "Earliest Property Opening Date" column, I run into trouble.
  • This technically works: the area date slicer filters the areas, the floor date slicer filters floors and areas, and the property date slicer filters all three.
  • The problem is that when I refresh my data source or apply any changes from the query editor, the update dialogue box takes FOREVER as it updates the more than TWELVE MILLION rows of the table of Property Opening Dates.
  • Is there a better way for me to do this without it taking so long?

 2017-06-14 11_31_39-Greenshot image editor.png

 

1 ACCEPTED SOLUTION

Accepted Solutions
Microsoft v-jiascu-msft
Microsoft

Re: Joins are creating MILLIONS of rows to process...why?

Hi @Kashuo,

 

Any changes in one query (table) will be applied to the related merged query. That’s why it takes so much time.  Adding a calculated column of the earliest date to the table respectively would be a good practice.

      1. In order to simply the model, it keeps the essential columns.

      2. The relationships are clear.

      3. Add a column in table “Floor” with this formula.

Earlist_Opening_Date_Each_Floor = calculate(min('Zone'[opening_date]))

      4. Add a column in table “Property” with this formula.

Earliest_Date_Each_Porperty = CALCULATE(min('Floor'[Earlist_Opening_Date_Each_Floor]))

 Filtering by date takes many hours to compute .jpg

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Best Regards!
Dale

 

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

6 REPLIES 6
Kashuo Helper I
Helper I

Filtering by date takes many hours to compute

Hi,

 

My data is from the real estate development industry.

 

  • I have 3 tables: Property, Floor, and Zone.
  • Properties have multiple Floors, and Floors have multiple Zones. (In other words: Each Zone row has a Floor ID field of the Floor it belongs to, and each Floor row has a Property ID field of the Property it belongs to).
  • The three tables are joined on their respective IDs.
  • Each Zone has a field called Opening Date, which is the date we finish construction on that Zone (a portion of a Floor).
  • It's easy to filter a table visualization of Zones by Zone Opening Date.
  • In order to filter a table visualization of Floors by Earliest Floor Opening Date, I created a separate table (merge as new) that merged Zones and Floors (on Floor ID) and then aggregated the Opening Date column of the Zones table to show only "Min" of Opening Date. Then I merged that table with the Floors table so that each row of the Floors table has the "Earliest Floor Opening Date” for that floor.
  • With this data structure in place, I'm able to add a range slicer to the page for Earliest Floor Opening Date and it will filter out Floors from a table visualization of the floors table have have an Earliest Floor Opening Date outside my range slicer. The same slicer works for the areas.
  • However, this filter will not filter out the Properties. Therefore, I have done the same process to find the Earliest Property Opening Date for each row of the Property Table.
  • I created a separate table (merge as new) that merged Floors and Properties this time. But when I try to then aggregate the Earliest Floor Opening Date column to create an "Earliest Property Opening Date" column, I run into trouble.
  • This technically works: the area date slicer filters the areas, the floor date slicer filters floors and areas, and the property date slicer filters all three.
  • The problem is that when I refresh my data source or apply any changes from the query editor, the update dialogue box takes FOREVER as it updates the more than TWELVE MILLION rows of the table of Property Opening Dates. (None of the tables have anywhere near this many rows, bnut that's what the dialogue box says it's processing through).
  • Is there a better way for me to do this without it taking so long?

2017-06-14 11_31_39-Greenshot image editor.png

Microsoft v-jiascu-msft
Microsoft

Re: Filtering by date takes many hours to compute

Hi @Kashuo

 

Any changes in one query (table) will be applied to the related merged query. That’s why it takes so much time.  Adding a calculated column of the earliest date to the table respectively would be a good practice.

      1. In order to simply the model, it keeps the essential columns.

      2. The relationships are clear.

      3. Add a column in table “Floor” with this formula.

Earlist_Opening_Date_Each_Floor = calculate(min('Zone'[opening_date]))

      4. Add a column in table “Property” with this formula.

Earliest_Date_Each_Porperty = CALCULATE(min('Floor'[Earlist_Opening_Date_Each_Floor]))

Filtering by date takes many hours to compute .jpg

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Best Regards!
Dale

 

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Microsoft v-jiascu-msft
Microsoft

Re: Joins are creating MILLIONS of rows to process...why?

Hi @Kashuo,

 

Any changes in one query (table) will be applied to the related merged query. That’s why it takes so much time.  Adding a calculated column of the earliest date to the table respectively would be a good practice.

      1. In order to simply the model, it keeps the essential columns.

      2. The relationships are clear.

      3. Add a column in table “Floor” with this formula.

Earlist_Opening_Date_Each_Floor = calculate(min('Zone'[opening_date]))

      4. Add a column in table “Property” with this formula.

Earliest_Date_Each_Porperty = CALCULATE(min('Floor'[Earlist_Opening_Date_Each_Floor]))

 Filtering by date takes many hours to compute .jpg

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Best Regards!
Dale

 

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

Kashuo Helper I
Helper I

Re: Joins are creating MILLIONS of rows to process...why?

This worked perfectly, thank you so much!

Kashuo Helper I
Helper I

Re: Joins are creating MILLIONS of rows to process...why?

Hello again,

 

This worked for a while, but no longer.

 

Now I'm getting Circular Dependency Errors when I use Calulate. If I take out the "calculate" then the circular dependency error goes away, but then the min gives me an absolute minumum (the first date ever), rather than the earliest date per floor.

 

If I use a measure, then the tabular view gives me the earliest date per floor, but then I can't use Measures in the date slicers at the bottom of my sheet.

 

Any idea how to remove the Circular Dependency? Other than using a measure like this thread suggests?

Microsoft v-jiascu-msft
Microsoft

Re: Joins are creating MILLIONS of rows to process...why?

Hi @Kashuo,

 

Could you please open a new thread? Because the new issue isn't related to this topic, which would be hard for others to search solutions in this community.

 

Could you please provide more information in the new thread?

1. A little sample;

2. The formula you already tried to use;

3. The result you want to get from this formula. (maybe we can create a new one.)

 

Best Regards!

Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
New Ranks Launched March 24th!

New Ranks Launched March 24th!

The time has come: We are finally able to share more details on the brand-new ranks coming to the Power BI Community!

‘Better Together’ Contest Finalists Announced!

‘Better Together’ Contest Finalists Announced!

Congrats to the finalists of our ‘Better Together’-themed T-shirt design contest! Click for the top entries.

Arun 'Triple A' Event Video, Q&A, and Slides

Arun 'Triple A' Event Video, Q&A, and Slides

Missed the Arun 'Triple A' event or want to revisit it? We've got you covered! Check out the video, Q&A, and slides now.

Join THE global Microsoft Power Platform event series.

Join THE global Power Platform event series.

Attend for two days of expert-led learning and innovation on topics like AI and Analytics, powered by Dynamic Communities.

Community Summit North America

Community Summit North America

Innovate, Collaborate, Grow. The top training and networking event across the globe for Microsoft Business Applications

Top Solution Authors