Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

Reply
rraja2k2
Regular Visitor

Merge Vs List function for my scenario where I need to map a value from another query

Hi Experts,

I have 2 table queries in Power BI editor.

Query1 has many columns including below columns (data refreshed from SAP HANA DB)

Owner_idComp_codeCreation_date
KANR100102/03/2023
TANJI150004/05/2023
KANR100306/06/2023
MPAVI104707/07/2023
KANR100231/03/2020
KANR100217/04/2023

 

 

Query2 has below columns only with mapping value ( Data sourced from excel sheet in Sharepoint) hardly 20 rows only.

 

Owner_idcomp_codecreation_dateBU
KANR100202/02/2020BABL
KANR100201/04/2023CALB
KANRnull (no values)null (no values)BABL
TANJInullnullDART
MPAVInullnullRANM

 

I need to a new column (BU) in my Query1 and map it from query2. I need to implement most specific to least with some conditions. 

First condition is match owner_id, comp_code and creation_date in Query2 should be less than or equal to creation_date in query1.

If there is no match for this then select based on owner_id match only where comp_code = null , creation_date = null. 

Based on these 2 conditions the result table would be as below

Owner_idComp_codeCreation_dateBU
KANR100102/03/2023BABL (derived from row-3)
TANJI150004/05/2023DART (derived from row-4)
KANR100306/06/2023BABL (derived from row-3)
MPAVI104707/07/2023RANM (derived from row-5)
KANR100231/03/2020

BABL (derived from row-1)

creation_date in query2 is less  than creation_date from query1. but only matching entry so it should take this.

KANR100217/04/2023CALB (derived from row-2) creation_date in query2 is less  than creation_date from query1. but there are 2 records  for this condition but system should consider the latest date record

 

But we have millions of record in Query1 and hence try to avoid merge queries instead want to use list functions. But would like to hear from the experts which option is more efficient from performance perspective. Please provide solution for both options.

 

1 ACCEPTED SOLUTION
rubayatyasmin
Super User
Super User

Hi, @rraja2k2 

The goal of your problem is to assign a value from Query2's "BU" column to Query1 based on certain conditions. You have correctly identified two approaches:

  1. Using Power Query's built-in merge queries functionality
  2. Using Power Query M functions, specifically list functions

In terms of performance, neither approach is necessarily optimal for very large datasets, as both involve row-by-row operations. Power Query is best for transforming medium-sized datasets, while a database-level operation would be more efficient for large datasets. It may be worth considering moving this operation to the source system (SAP HANA DB) if possible, as databases are designed for handling this kind of large-scale operations more efficiently than Power BI.

If you find performance degrades with either approach, it may be helpful to simplify the model or reduce the amount of data being processed.

 

rubayatyasmin_0-1689517080227.png


Did I answer your question? Mark my post as a solution!super-user-logo

Proud to be a Super User!


View solution in original post

8 REPLIES 8
rraja2k2
Regular Visitor

I am finally able to arrive the result but the performance is very poor ...so I am planning to DAX query.

the code I used as below

let
exactMatch=Table.SelectRows(VIM_BUMAP, (IT) => IT[CO_OWNER_ID] = [CO_OWNER_ID] and IT[Company_Code] = [Company Code] and IT[Document_Create_Date] <= [Document Create Date] ),
partialMatch=Table.SelectRows(VIM_BUMAP, (IT) => IT[CO_OWNER_ID] = [CO_OWNER_ID] and IT[Company_Code] = null and IT[Document_Create_Date] = null )

in
if Table.RowCount(exactMatch) > 0 then List.Last(exactMatch[BU])
else if Table.RowCount(partialMatch) > 0 then partialMatch{0}[BU] else null

rraja2k2
Regular Visitor

Thanks for your input.

Due to some reason , I want to do in power query only. Can you please help how to do as I tried many options in the M query but not succeeded ? or merge queries option also ?

for power query:

 

let
currentRow = [Owner_id = _[Owner_id], Comp_code = _[Comp_code], Creation_date <= _[Creation_date]],
matches = Table.SelectRows(Query2, each Record.HasFields(currentRow, _)),
result = if List.IsEmpty(matches) then
List.FirstN(Table.SelectRows(Query2, each [Owner_id] = _[Owner_id] and [Comp_code] = null and [creation_date] = null), 1)
else
List.FirstN(matches, 1)
in
result

 

and for merge: 

 

  1. On the Home tab, click Merge Queries.
  2. Choose the first table from the first drop-down list, and the second table from the second drop-down list.
  3. Select the columns you want to join on (in your case, Owner_id, Comp_code and Creation_date).
  4. Choose the join type. In your case, you would use a Left Outer join to keep all from first and matching from second.

 

rubayatyasmin_0-1689517080227.png


Did I answer your question? Mark my post as a solution!super-user-logo

Proud to be a Super User!


I am getting error on the M query it shows error against date field.  

Also the merge queries option how to add the additional logic .

 

let
currentRow = [CO_OWNER_ID = _[CO_OWNER_ID], Company Code = _[Company Code],Document_Create_Date <= [Document Create Date]],
matches = Table.SelectRows(VIM_BUMAP, each Record.HasFields(currentRow, _)),
result = if List.IsEmpty(matches) then
List.FirstN(Table.SelectRows({VIM_BUMAP}, each [CO_OWNER_ID] = _[CO_OWNER_ID] and [Company Code] = null and [Document Create Date] = null), "1")
else
List.FirstN(matches,"1")
in
result

 

 

If removes date then it is no syntax error but the result column is error with below.

Expression.Error: We cannot convert a value of type Table to type List.
Details:
Value=[Table]
Type=[Type]

modified M

let
    Source = Query1,
    AddCustom = Table.AddColumn(Source, "BU", each 
        let 
            currentRow = [Owner_id = [Owner_id], Comp_code = [Comp_code], Creation_date <= DateTime.From([Creation_date])],
            matches = Table.SelectRows(Query2, each Record.HasFields(currentRow, _)),
            result = 
                if List.IsEmpty(matches) then
                    List.FirstN(Table.SelectRows(Query2, each [Owner_id] = [Owner_id] and [Comp_code] = null and [creation_date] = null)[BU], 1)
                else
                    List.FirstN(matches[BU], 1)
        in
            result
    )
in
    AddCustom

 

in merger query, 

to add the additional logic, use conditional column. 

Learn more about conditional column.

Add a conditional column - Power Query | Microsoft Learn

 

rubayatyasmin_0-1689517080227.png


Did I answer your question? Mark my post as a solution!super-user-logo

Proud to be a Super User!


I am creating custom column in the query1 but the current code shows me the columns from Query1 instead of query2. Still   the date is showing as invalid identifier.

you are creating custom column in query 1 so it is obvious that it will show columns from query1. nit query 2. And power Query uses  #date and date funstions to identify date. 

 

#date - PowerQuery M | Microsoft Learn

Date functions - PowerQuery M | Microsoft Learn

 

rubayatyasmin_0-1689517080227.png

 

 


Did I answer your question? Mark my post as a solution!super-user-logo

Proud to be a Super User!


rubayatyasmin
Super User
Super User

Hi, @rraja2k2 

The goal of your problem is to assign a value from Query2's "BU" column to Query1 based on certain conditions. You have correctly identified two approaches:

  1. Using Power Query's built-in merge queries functionality
  2. Using Power Query M functions, specifically list functions

In terms of performance, neither approach is necessarily optimal for very large datasets, as both involve row-by-row operations. Power Query is best for transforming medium-sized datasets, while a database-level operation would be more efficient for large datasets. It may be worth considering moving this operation to the source system (SAP HANA DB) if possible, as databases are designed for handling this kind of large-scale operations more efficiently than Power BI.

If you find performance degrades with either approach, it may be helpful to simplify the model or reduce the amount of data being processed.

 

rubayatyasmin_0-1689517080227.png


Did I answer your question? Mark my post as a solution!super-user-logo

Proud to be a Super User!


Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel1

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.

Top Solution Authors
Top Kudoed Authors