Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
JoCurry_Aus
Frequent Visitor

How do I Manage False Duplicates?

Hey there

 

I posted this issue last week https://community.powerbi.com/t5/Service/On-Premise-Data-Gateway-Refresh-Success-but-Some-Tables-Ret... which I still haven't been able to solve - I've raised a ticket with Mictosoft but have heard nothing - so I decided to go with a new strategy. This issue was about how when I source data from a data warehouse using ODBC then transform it using Power BI functionality the tables return blank when using the service/gateway to refresh them. 

 

Instead of using Power BI functionality to transform my customer and product tables I have written all instructions into a SQL query. I access the data via an ODBC connection. The customers table is great - but the products table is not ok. I am getting the error; there is a duplicate value in the primary key field, which in this case is the product code. I know this isn't true. I can say confidently that the source data (being our data warehouse) has no duplicate values. If I look at the data in Power BI Desktop in table view there are 8,227 products in the table and the Product Code field has 8,227 distinct values. I have extracted to Excel and pivoted the data and no duplicates. I have used exactly the same style of SQL statement to extract the customers which is working correctly - as well as the transactional table. These 3 tables in Power BI all come from the same data warehouse.

 

If I manually refresh the tables in desktop and publish to the service everything is great so long as I have added the 'remove duplicates' step in Power BI edit query functionality - Power BI Service displays all reports correctly as all tables are updated.

BUT:

  • Subsequent to this change if I use Power BI Service to 'refresh now' using the gateway the products table does not update. 
  • If I have not added the step within Power BI to 'remove duplicates' the 'refresh now' function in Power BI Service will not work due to the duplicate error.

Any assistance is much appreciated!

 

SQL Statement - BTW I have edited the statement and used inner joins - same outcome
SELECT
IPROD as "Product Code"
IDESC as "Product"
IUMS as "Item UoM"
ICLAS as "Product Class"
IREF01 as "Product Type Code"
IREF02 as "Product Sub Category Code"
IREF05 as "P&L Group Code"
IREF04 as "Product Category Code"
C.CCDESC as "Product Category Name"
G.CCDESC as "P&L Group"
S.CCDESC as "Product Sub Category"
T.CCDESC as "Product Type"
FROM DB NAME P
LEFT JOIN TABLE 1 C
ON P.IREF04 = C.CCCODE
LEFT JOIN TABLE 2 G
ON P.IREF05 = G.CCCODE
LEFT JOIN TABLE 3 S
ON P.IREF02 = S.CCCODE
LEFT JOIN TABLE 4 T
ON P.IREF01 = T.CCCODE

 

3 REPLIES 3
v-shex-msft
Community Support
Community Support

Hi @JoCurry_Aus ,

Can you please share provide some detail information about refresh operations to help us clarify your scenario?(e.g. refresh history, detail error message, gateway log)

How to Get Your Question Answered Quickly

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Hi @v-shex-msft 

 

Apologies for the late reply - I am waiting to get my admin rights fixed so I can view the logs.

 

I've done some further testing today by creating a new .pbix file and writing a query that returns very few fields from the table I have been having trouble with - and there are no joins. This stand alone table with no joins in the query and no relationships within Power BI does not return any duplicate record errors when I publish it to the service -  and in this test I did not have to use Power BI functionality to 'remove duplicates'. 

 

When I use the gateway and use 'refresh now' to refresh the table I get the 'last refresh succeeded' notation when viewing the dataset, but my visual is blank. I have exported the pbix file and find that of the 8 fields I have extracted via the query, 2 are blank - product code (primary key) and product description. 

 

It seems now the issue is between the data warehouse and the gateway/service. We are using an ODBC connection to query the data warehouse. I've also reached out to our external dba who built the data warehouse to see if he has experienced this before.

 

Has anyone else had a similar problem? I'm not getting any error messages, as far as refresh history goes, manually refreshing the pbix and publishing the data works as expected but using the gateway does not as some fields are entirely blank. I don't have access to logs at this point but since the gateway refresh was successful - would they help?

  

Thanks!

 

Hi @JoCurry_Aus ,

It seems like non error message appears, I think it may related to odbc driver compatibility between power bi desktop and service side. What version/bit of odbc driver you use?
BTW, you can also open a support ticket to get further support.(it is free for pro and above licenses)

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors