Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
nova_3013
Frequent Visitor

Merging Queries based on Date Ranges to get a full table

Hi

 

I have two tables that need to be combined. One table contains basic information about deals and another table information about the associated products.Each row consists of a state a deal was in from Start Date Deal to End Date Deal. If one of the attributes changes, I have a new row with the new information and the date range for which this information held true. The same applies to the products table. 

In order to work with the data, I need a table that contains the information from both tables. The problem is that I have Date Ranges and this approach should be robust to this. 

 

 

Here's an example of my deals-table with one single deal (but I have many in my data):

Deal IDStart Date DealEnd Date DealDeal attributes
11.1.20211.2.2021status 1
12.2.20213.3.2021status 2
14.3.202115.6.2021status 3

 

And this is my table with the products:

Deal IDProduct IDStart Date ProductEnd Date ProductProduct attributes
123415.1.20212.2.2021Price 1
12343.2.202115.6.2021Price 2
145615.1.20214.2.2021Price 5
14565.2.202110.6.2021Price 6

 

And the desired input would look like this:

Deal IDProduct IDStart Date PhaseEnd Date PhaseDeal attributesProduct attributes
12341.1.202114.1.2021status 1(None)
123415.1.20211.2.2021status 1Price 1
12342.2.20212.2.2021status 2Price1
12343.2.20213.3.2021status 2Price 2
12344.3.202115.6.2021status 3Price 2
24561.1.202114.1.2021status 1(None)
245615.1.20211.2.2021status 1Price 5
24562.2.20214.2.2021status 2Price 5
24565.2.20213.3.2021status 2Price 6
24564.3.202110.6.2021status 3Price 6
245611.6.202115.6.2021status 3(None)

 

The challenge for me is that I have to create rows for each phase, irrespective of whether something changed on the deal side or the product side. 

 

I would really appreciated your help. 

Thank in advance

1 ACCEPTED SOLUTION
artemus
Employee
Employee

I would expand these tables to have 1 deal/product row per day.

 

You can do this with by adding a custom column that lists all the dates between the start and end period.

= List.Generate(() => [Start Date], each _ <= [End Date], each _ + #duration(1, 0, 0, 0))

After doing that just click the new column and choose expand list value. From this point you should be able to merge the tables, and add some logic for the Product attributes. 

 

If you going to create some kind of interactive power bi report with this, I would leave the data in this format. If your loading it for quick viewing, you will need to do a grouping operator with the min and max of the date column to get back the starting and ending date.

View solution in original post

8 REPLIES 8
nova_3013
Frequent Visitor

thank you @artemus and @Anonymous . 
In the end I resorted to create expanded tables and merge those. The performance so far seems ok.

Anonymous
Not applicable

"Thank you very much for your suggestion. That was actually what I wanted to do at first. However, this is obviously just a small example and this table will grow huge as I have thousands of deals and products. That is why I wanted to look for another solution and try to avoid this."

 

 

"In the end I resorted to create expanded tables and merge those. The performance so far seems ok."

 

To make the discussion of general use, could you explain which solution in the end is the one that solved the problem?
Could you post the script you used?

Anonymous
Not applicable

following the scheme suggested by @artemus

artemus
Employee
Employee

I would expand these tables to have 1 deal/product row per day.

 

You can do this with by adding a custom column that lists all the dates between the start and end period.

= List.Generate(() => [Start Date], each _ <= [End Date], each _ + #duration(1, 0, 0, 0))

After doing that just click the new column and choose expand list value. From this point you should be able to merge the tables, and add some logic for the Product attributes. 

 

If you going to create some kind of interactive power bi report with this, I would leave the data in this format. If your loading it for quick viewing, you will need to do a grouping operator with the min and max of the date column to get back the starting and ending date.

Thank you very much for your suggestion. That was actually what I wanted to do at first. However, this is obviously just a small example and this table will grow huge as I have thousands of deals and products. That is why I wanted to look for another solution and try to avoid this.

Anonymous
Not applicable

could you give the dimensions of your tables in terms of rows and columns?
the solution I proposed does not use list.generate and could be sufficiently performing.
Otherwise you should look for other tools.
DataFrames.jl in the julia data ecosystem, for example.

These two tables each have about 8 columns and approximately 3000 rows. With all the timestamps and changes I already get to more than 300'000 rows each for these two tables. 

I'm a bit afraid that the data will grow really fast and will soon reach 1+ millions of rows so that performance could be an issue.

Anonymous
Not applicable

give a try, starting from a subset of db (f.i from 500 or 1k rows) and let's know the execution time.

 

Could you provide a full db (CSV or XLSX) with tokenized sensible data?

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors