Solved: Filtering by date takes many hours to compute

Kashuo · ‎06-14-2017

Hi,

The way I'm joining and merging my data is creating millions of rows to process and it takes hours to compute anything. There must be a better way!

My data is from the real estate development industry.

I have 3 tables: Property, Floor, and Zone.
Properties have multiple Floors, and Floors have multiple Zones. (In other words: Each Zone row has a Floor ID field of the Floor it belongs to, and each Floor row has a Property ID field of the Property it belongs to).
The three tables are joined on their respective IDs.
Each Zone has a field called Opening Date, which is the date we finish construction on that Zone (a portion of a Floor).
It's easy to filter a table visualization of Zones by Zone Opening Date.
In order to filter a table visualization of Floors by Earliest Floor Opening Date, I created a separate table (merge as new) that merged Zones and Floors (on Floor ID) and then aggregated the Opening Date column of the Zones table to show only "Min" of Opening Date. Then I merged that table with the Floors table so that each row of the Floors table has the "Earliest Floor Opening Date” for that floor.
With this data structure in place, I'm able to add a range slicer to the page for Earliest Floor Opening Date and it will filter out Floors from a table visualization of the floors table have have an Earliest Floor Opening Date outside my range slicer. The same slicer works for the areas.
However, this filter will not filter out the Properties. Therefore, I have done the same process to find the Earliest Property Opening Date for each row of the Property Table.
I created a separate table (merge as new) that merged Floors and Properties this time. But when I try to then aggregate the Earliest Floor Opening Date column to create an "Earliest Property Opening Date" column, I run into trouble.
This technically works: the area date slicer filters the areas, the floor date slicer filters floors and areas, and the property date slicer filters all three.
The problem is that when I refresh my data source or apply any changes from the query editor, the update dialogue box takes FOREVER as it updates the more than TWELVE MILLION rows of the table of Property Opening Dates.
Is there a better way for me to do this without it taking so long?

v-jiascu-msft · ‎06-16-2017

Hi @Kashuo,

Any changes in one query (table) will be applied to the related merged query. That’s why it takes so much time. Adding a calculated column of the earliest date to the table respectively would be a good practice.

1. In order to simply the model, it keeps the essential columns.

2. The relationships are clear.

3. Add a column in table “Floor” with this formula.

Earlist_Opening_Date_Each_Floor = calculate(min('Zone'[opening_date]))

4. Add a column in table “Property” with this formula.

Earliest_Date_Each_Porperty = CALCULATE(min('Floor'[Earlist_Opening_Date_Each_Floor]))

Best Regards!
Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

v-jiascu-msft · ‎06-16-2017

Hi @Kashuo,

Any changes in one query (table) will be applied to the related merged query. That’s why it takes so much time. Adding a calculated column of the earliest date to the table respectively would be a good practice.

1. In order to simply the model, it keeps the essential columns.

2. The relationships are clear.

3. Add a column in table “Floor” with this formula.

Earlist_Opening_Date_Each_Floor = calculate(min('Zone'[opening_date]))

4. Add a column in table “Property” with this formula.

Earliest_Date_Each_Porperty = CALCULATE(min('Floor'[Earlist_Opening_Date_Each_Floor]))

Best Regards!
Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Kashuo · ‎07-18-2017

Hello again,

This worked for a while, but no longer.

Now I'm getting Circular Dependency Errors when I use Calulate. If I take out the "calculate" then the circular dependency error goes away, but then the min gives me an absolute minumum (the first date ever), rather than the earliest date per floor.

If I use a measure, then the tabular view gives me the earliest date per floor, but then I can't use Measures in the date slicers at the bottom of my sheet.

Any idea how to remove the Circular Dependency? Other than using a measure like this thread suggests?

v-jiascu-msft · ‎07-18-2017

Hi @Kashuo,

Could you please open a new thread? Because the new issue isn't related to this topic, which would be hard for others to search solutions in this community.

Could you please provide more information in the new thread?

1. A little sample;

2. The formula you already tried to use;

3. The result you want to get from this formula. (maybe we can create a new one.)

Best Regards!

Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Kashuo · ‎06-16-2017

This worked perfectly, thank you so much!

Kashuo · ‎06-14-2017

Hi,

My data is from the real estate development industry.

I have 3 tables: Property, Floor, and Zone.
Properties have multiple Floors, and Floors have multiple Zones. (In other words: Each Zone row has a Floor ID field of the Floor it belongs to, and each Floor row has a Property ID field of the Property it belongs to).
The three tables are joined on their respective IDs.
Each Zone has a field called Opening Date, which is the date we finish construction on that Zone (a portion of a Floor).
It's easy to filter a table visualization of Zones by Zone Opening Date.
In order to filter a table visualization of Floors by Earliest Floor Opening Date, I created a separate table (merge as new) that merged Zones and Floors (on Floor ID) and then aggregated the Opening Date column of the Zones table to show only "Min" of Opening Date. Then I merged that table with the Floors table so that each row of the Floors table has the "Earliest Floor Opening Date” for that floor.
With this data structure in place, I'm able to add a range slicer to the page for Earliest Floor Opening Date and it will filter out Floors from a table visualization of the floors table have have an Earliest Floor Opening Date outside my range slicer. The same slicer works for the areas.
However, this filter will not filter out the Properties. Therefore, I have done the same process to find the Earliest Property Opening Date for each row of the Property Table.
I created a separate table (merge as new) that merged Floors and Properties this time. But when I try to then aggregate the Earliest Floor Opening Date column to create an "Earliest Property Opening Date" column, I run into trouble.
This technically works: the area date slicer filters the areas, the floor date slicer filters floors and areas, and the property date slicer filters all three.
The problem is that when I refresh my data source or apply any changes from the query editor, the update dialogue box takes FOREVER as it updates the more than TWELVE MILLION rows of the table of Property Opening Dates. (None of the tables have anywhere near this many rows, bnut that's what the dialogue box says it's processing through).
Is there a better way for me to do this without it taking so long?

2017-06-14 11_31_39-Greenshot image editor.png

v-jiascu-msft · ‎06-16-2017

Hi @Kashuo

Any changes in one query (table) will be applied to the related merged query. That’s why it takes so much time. Adding a calculated column of the earliest date to the table respectively would be a good practice.

1. In order to simply the model, it keeps the essential columns.

2. The relationships are clear.

3. Add a column in table “Floor” with this formula.

Earlist_Opening_Date_Each_Floor = calculate(min('Zone'[opening_date]))

4. Add a column in table “Property” with this formula.

Earliest_Date_Each_Porperty = CALCULATE(min('Floor'[Earlist_Opening_Date_Each_Floor]))

Best Regards!
Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Joins are creating MILLIONS of rows to process...why?

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024

How to Get Your Question Answered Quickly