Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
jonoCX
Frequent Visitor

How to investigate Python script error

Hi

 

I am trying to write a simple script in Power Query Editor in PowerBI, and keep getting an error below:

 

DataSource.Error: ADO.NET: Python script error.
<pi>pandas.errors.MergeError: Data columns not unique: Index(['Date', 'Date'], dtype='object')
</pi>
Details:
DataSourceKind=Python
DataSourcePath=Python
Message=Python script error.
<pi>pandas.errors.MergeError: Data columns not unique: Index(['Date', 'Date'], dtype='object')
</pi>
ErrorCode=-2147467259

The python scripts works in Spyder, and I also created a separate PowerBI file to test the scripts on a more simple makeup dataset and it works! So my guess is that the actual data is the problem. How does one start to investigate where the potential data issues could cause above problem?

 

Below is the script if that helps.. with Excel import tables: [dataset=#"Replaced Errors", AcctList=AccountHierachy]

 

 

import pandas as pd
aList = AcctList.loc[AcctList["Type"]=="Account"]
volTbl = pd.DataFrame(columns=["Date","Port_12M_vol", "Bmk_12M_vol", "Port_12M_TE","Portfolio"])
for x in aList["PortfolioName"]:
    data = dataset.loc[dataset["PortfolioName"] == x,["MonthEnd","PortfolioReturn","Benchmark Returns", "Excess"] ]
    vol12M = data.rolling(12, min_periods = 12).std()*(12**0.5)
    vol12M["Date"] = data["MonthEnd"]
    vol12M=vol12M.dropna()
    vol12M.rename(columns = {"PortfolioReturn":"Port_12M_vol", "Benchmark Returns":"Bmk_12M_vol", "Excess":"Port_12M_TE","MonthEnd":"Date"}, inplace=True)
    vol12M["Portfolio"]=x
    vol12M = vol12M[["Date","Port_12M_vol", "Bmk_12M_vol", "Port_12M_TE","Portfolio"]]
    #print(x, vol12M.shape)
    vol36M = data.rolling(36, min_periods = 36).std()*(12**0.5)
    vol36M["Date"] = data["MonthEnd"]
    vol36M=vol36M.dropna()
    vol36M.rename(columns = {"PortfolioReturn":"Port_36M_vol", "Benchmark Returns":"Bmk_36M_vol", "Excess":"Port_36M_TE","MonthEnd":"Date"}, inplace=True)
    volM = pd.merge(vol12M,vol36M, how = 'left', sort=True)
    vol60M = data.rolling(60, min_periods = 60).std()*(12**0.5)
    vol60M["Date"] = data["MonthEnd"]
    vol60M=vol60M.dropna()
    vol60M.rename(columns = {"PortfolioReturn":"Port_60M_vol", "Benchmark Returns":"Bmk_60M_vol", "Excess":"Port_60M_TE","MonthEnd":"Date"}, inplace=True)
    volM = pd.merge(volM,vol60M, how = 'left', sort=True)
    volTbl = volTbl.append(volM, sort=True)
    volTbl = volTbl[["Date","Portfolio","Port_12M_vol", "Bmk_12M_vol", "Port_12M_TE","Port_36M_vol", "Bmk_36M_vol", "Port_36M_TE","Port_60M_vol", "Bmk_60M_vol", "Port_60M_TE"]]
print(volTbl)

 

 

I have been pulling my hair out on this one, so any help would be appreciated!!

 

Many thanks

Jonathan

3 REPLIES 3
jonoCX
Frequent Visitor

Hi

 

Thanks for your help and inputs.. I have no choice at the end, but go thru my data segment and segments and see which part of the data that may be causing the issue.

 

So the problem at the end is that is that a few of the "PortfolioName" (x in the loop) in aList does not match the dataset table "PortfolioName", hence producing some empty dataframe which hence cannot be merged in PowerBI. This name mismatch problem did not cause a problem in Python based application however, it must somehow ignore the empty dataframe, and continue the loop. 

 

Anyway, woud be nice to have a way to debug Python codes in PowerBI.. as it seems to behave slightly differently in PowerBI, and find it hard how to identify the issue - if is a data issue, how powerbi execute the python codes or something else....

 

Many thanks

 

Jonathan

v-yingjl
Community Support
Community Support

Hi @jonoCX ,

As the error message mentioned,

pandas.errors.MergeError: Data columns not unique: 


Please check whether there are multiple duplicated columns names 'Date' in your data source table when you create it by the python script.

 

Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

YukiK
Impactful Individual
Impactful Individual

You may want to look into this documentation taking about how to use an external python IDE with power bi. You might be able to debug there, which I'm not entirely sure is possible. Use an external Python IDE with Power BI - Power BI | Microsoft Docs 

 

But it looks like it's more of an error that resides in your code regarding pandas merge operation. Since it's complaining about "Data columns not unique", I'm guessing there are more than 1 column in the table where the column name is "Date". So maybe consolidating that would help.

 

Hope that helps!

 

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors