Hi,
I have Table A, where the majority of the data is stored, and another table, Table B, where I store a record of Product ID, Product Price, and Product Price Valid From and To dates.
I am using a merge to merge Table A and Table B using Product ID and then I am using a Custom Column to do a range look up, Table.SelectRows, to pull back the correct corresponding record based on the date in Table A, matching it to Date From and Date To in Table B.
Previously, my queries took around 1-2 minutes to refresh, but since adding this step the refresh takes far too long (I haven't yet sat through one to completion) and, while loading, it appears to be processing multiple gigabytes of data.
Please can someone advise on best practice here?
Thanks
Hi @threw001,
I think the problem you have is in the code (or, more precisely, in how PQ deals with the code).
What is most likely happening is that PQ loads all 2k rows of Table B for each line in Table A and, what is worse, it recalculates Table B (including any source import from data files, if any) for each row in Table A when the code expands the JoinToPriceMatrix column. Of course, this is quite slow and will get slower and slower as the data grows.
Generally speaking, there are several strategies to address it.
1. The simplest could be buffering both tables (or at least Table B, the one which joins on the "right") using Table.Buffer immediately prior to merging (i.e. it would be the last step in the PriceMatrix table/query). 2k rows does not look too bad in this case and should not cause memory issues/paging.
2. Alternatively, you can expand the PriceMatrix column before applying the date filter. This pushes the calculation one step forward, so the filtering deals with a single set of data rather than with multiple tables fetched for each row. This also comes at a cost in memory (potentially higher than #1, though it may also be a bit quicker), but that should not be too significant in your case.
3. I, personally, would expand Table B to create an exhaustive set of dates for each product/price and then merge it on product and date rather than just on product. I think this may be the most efficient way to resolve it. Kind of:
Table.TransformColumnTypes(
Table.ExpandListColumn(
Table.AddColumn(#"Changed Type", "Custom", each {Number.From([Price Valid From])..Number.From([Price Valid To])})
, "Custom")
,{{"Custom", type date}})
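For option 1, here is a minimal sketch of how the buffering step might slot in as the new final step of the PriceMatrix query (the step name #"Changed Type" is an assumption; adjust it to whatever your query's current last step is called):

```
let
    // ...existing steps of the PriceMatrix query, ending in e.g.
    // #"Changed Type" = Table.TransformColumnTypes(...),

    // New final step: materialise the table in memory once, so the merge
    // in Table A does not re-evaluate the whole PriceMatrix source per row
    Buffered = Table.Buffer(#"Changed Type")
in
    Buffered
```

The merge in the main query then references the PriceMatrix query as before; only the PriceMatrix query itself changes.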
Cheers,
John
@threw001 Can you share sample data and sample output in table format? Or a sample pbix after removing sensitive data.
Hi @Mahesh0016 ,
Thanks for your help
I have recreated a very simplified version of the table I have in Power BI; the data is very much reduced (both columns and rows).
Table A is my main table
Table A

Product ID | Sale Date  | Sales Person
1          | 01/01/2022 | Joe
1          | 16/07/2022 | Alex
2          | 15/04/2022 | Alex
3          | 12/01/2023 | Amanda
3          | 06/11/2022 | Leah
5          | 25/06/2022 | George
5          | 05/03/2022 | Alex
4          | 05/01/2023 | Sam
Table B is essentially my price look up table
Table B

Product ID | Price Valid From | Price Valid To | Price
1          | 01/01/2022       | 30/04/2022     | 100
1          | 01/05/2022       | 01/01/2099     | 150
2          | 01/01/2022       | 30/04/2022     | 500
2          | 01/05/2022       | 01/01/2099     | 450
3          | 01/01/2022       | 30/04/2022     | 650
3          | 01/05/2022       | 01/01/2099     | 600
4          | 01/01/2022       | 30/04/2022     | 750
4          | 01/05/2022       | 01/01/2099     | 800
5          | 01/01/2022       | 30/04/2022     | 800
5          | 01/05/2022       | 01/01/2099     | 1000
6          | 01/01/2022       | 30/04/2022     | 6000
6          | 01/05/2022       | 01/01/2099     | 7800
7          | 01/01/2022       | 30/04/2022     | 9000
7          | 01/05/2022       | 01/01/2099     | 10000
Output Table - This is Table A where I have pulled in Price, Price Valid From and Price Valid To from Table B.
Output Table

Product ID | Sale Date  | Sales Person | Price | Price Valid From | Price Valid To
1          | 01/01/2022 | Joe          | 100   | 01/01/2022       | 30/04/2022
1          | 16/07/2022 | Alex         | 150   | 01/05/2022       | 01/01/2099
2          | 15/04/2022 | Alex         | 500   | 01/01/2022       | 30/04/2022
3          | 12/01/2023 | Amanda       | 600   | 01/05/2022       | 01/01/2099
3          | 06/11/2022 | Leah         | 600   | 01/05/2022       | 01/01/2099
5          | 25/06/2022 | George       | 1000  | 01/05/2022       | 01/01/2099
5          | 05/03/2022 | Alex         | 800   | 01/01/2022       | 30/04/2022
4          | 05/01/2023 | Sam          | 800   | 01/05/2022       | 01/01/2099
For scale, my Table A has approx. 305k rows of data across 30-40 columns.
Table B has approx. 2k rows; however, the Date Valid From and To ranges span from 01/01/1900 to 31/12/2023 (broken up across 3 sequences, based on when each price change happened).
Below is the additional code I have added to do this:
#"Merged Queries3" = Table.NestedJoin(#"Expanded ClientBase", {"Service Code Name"}, PriceMatrix, {"Service Code Name"}, "PriceMatrix", JoinKind.LeftOuter),
#"Added Custom7" = Table.AddColumn(#"Merged Queries3", "JoinToPriceMatrix", each let visitdate= [Visit Start Date] in
Table.SelectRows([PriceMatrix], each [Date From] <= visitdate and [Date To] >= visitdate)),
#"Expanded JoinToPriceMatrix" = Table.ExpandTableColumn(#"Added Custom7", "JoinToPriceMatrix", {"Service Code Name", "Funnel", "Service Code Is Disabled (Yes / No)", "FinalArticle", "Price Per Hour", "Visit Count", "Date From", "Date To", "Sequence"}, {"JoinToPriceMatrix.Service Code Name", "JoinToPriceMatrix.Funnel", "JoinToPriceMatrix.Service Code Is Disabled (Yes / No)", "JoinToPriceMatrix.FinalArticle", "JoinToPriceMatrix.Price Per Hour", "JoinToPriceMatrix.Visit Count", "JoinToPriceMatrix.Date From", "JoinToPriceMatrix.Date To", "JoinToPriceMatrix.Sequence"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded JoinToPriceMatrix",{"PriceMatrix"})
in
#"Removed Columns"
Thanks!
Hi @jbwtp
Sorry for the repeat response - I just did a bit of research online and found how to add/create the exhaustive date list.
I followed your 3rd option and it's worked great - thanks for the tip re: how PQ calculates, this is super useful!
Thanks
Hi @jbwtp
Thank you for your help!
Your preferred solution sounds great to me! I am just struggling to create this element, even with the code you provided (apologies, I am still super new to PQ!).
The source is an excel spreadsheet that I am feeding data into the PQ from.
let
Source = Excel.Workbook(File.Contents(X), null, true), // "X" stands in for the redacted file path
PriceMatrix_Sheet = Source{[Item="PriceMatrix",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(PriceMatrix_Sheet, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Date From", type date}, {"Date To", type date}, {"Service Code Name", type text}})
in
#"Changed Type"
Can you advise / explain a bit more on how I can create this exhaustive set of dates? Currently my date ranges are:
Sequence | Date From  | Date To
1        | 01/01/1900 | 11/09/2022
2        | 12/09/2022 | 25/09/2022
3        | 26/09/2022 | 31/12/2023
Many thanks,
John
Hi @threw001,
Happy it helped. Just in case, this is how to convert From Date - To Date to a list:
1. First we add a column that contains a list of dates decoded from the From and To columns (the "Create FromToList" step below). Notice that we convert the dates to numbers, as there is no {From..To} list template for dates.
2. Then we expand the list and remove the redundant columns.
3. Then we convert the numbers back to dates. All done. :)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VYzBCQAwCAN3yVuoRlroLOL+a1Q/pYV87g4SAYNAbdRsqxZYwR5UEikBtuJVAs6ve6v1dK8nNjgyDw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Sequence = _t, #"Date From" = _t, #"Date To" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Sequence", Int64.Type}, {"Date From", type date}, {"Date To", type date}}),
#"Create FromToList" = Table.AddColumn(#"Changed Type", "FromToList", each {Number.From([Date From])..Number.From([Date To])}),
// We converted the dates to numbers above so that the range can be built as a list and expanded. Unfortunately, there is no simple {From..To} template for dates, only for chars and numbers
#"Expanded FromToList" = Table.ExpandListColumn(#"Create FromToList", "FromToList"),
#"Removed Redundant Columns" = Table.RemoveColumns(#"Expanded FromToList",{"Date From", "Date To"}),
#"Convert To Dates" = Table.TransformColumnTypes(#"Removed Redundant Columns",{{"FromToList", type date}})
in
#"Convert To Dates"
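Once PriceMatrix has been expanded to one row per product per date as above, the merge in the main query can match on both keys at once, which removes the need for the per-row Table.SelectRows step entirely. A sketch, reusing the step names from your earlier query and assuming the expanded date column is named FromToList (adjust both to your actual names):

```
// Two-key merge replaces the single-key merge plus the date-filter column:
#"Merged Queries3" = Table.NestedJoin(
    #"Expanded ClientBase", {"Service Code Name", "Visit Start Date"},
    PriceMatrix, {"Service Code Name", "FromToList"},
    "PriceMatrix", JoinKind.LeftOuter)
// ...then expand the PriceMatrix column as before. Each row of Table A now
// matches at most one price row, so no date filtering is needed afterwards.
```

Note that "Visit Start Date" and "FromToList" must both be typed as date for the keys to match.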
Cheers,
John