cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
jonoCX
Frequent Visitor

How to investigate Python script error

Hi

 

I am trying to write a simple script in Power Query Editor in PowerBI, and keep getting an error below:

 

DataSource.Error: ADO.NET: Python script error.
<pi>pandas.errors.MergeError: Data columns not unique: Index(['Date', 'Date'], dtype='object')
</pi>
Details:
DataSourceKind=Python
DataSourcePath=Python
Message=Python script error.
<pi>pandas.errors.MergeError: Data columns not unique: Index(['Date', 'Date'], dtype='object')
</pi>
ErrorCode=-2147467259

The python scripts works in Spyder, and I also created a separate PowerBI file to test the scripts on a more simple makeup dataset and it works! So my guess is that the actual data is the problem. How does one start to investigate where the potential data issues could cause above problem?

 

Below is the script if that helps.. with Excel import tables: [dataset=#"Replaced Errors", AcctList=AccountHierachy]

 

 

import pandas as pd
aList = AcctList.loc[AcctList["Type"]=="Account"]
volTbl = pd.DataFrame(columns=["Date","Port_12M_vol", "Bmk_12M_vol", "Port_12M_TE","Portfolio"])
for x in aList["PortfolioName"]:
    data = dataset.loc[dataset["PortfolioName"] == x,["MonthEnd","PortfolioReturn","Benchmark Returns", "Excess"] ]
    vol12M = data.rolling(12, min_periods = 12).std()*(12**0.5)
    vol12M["Date"] = data["MonthEnd"]
    vol12M=vol12M.dropna()
    vol12M.rename(columns = {"PortfolioReturn":"Port_12M_vol", "Benchmark Returns":"Bmk_12M_vol", "Excess":"Port_12M_TE","MonthEnd":"Date"}, inplace=True)
    vol12M["Portfolio"]=x
    vol12M = vol12M[["Date","Port_12M_vol", "Bmk_12M_vol", "Port_12M_TE","Portfolio"]]
    #print(x, vol12M.shape)
    vol36M = data.rolling(36, min_periods = 36).std()*(12**0.5)
    vol36M["Date"] = data["MonthEnd"]
    vol36M=vol36M.dropna()
    vol36M.rename(columns = {"PortfolioReturn":"Port_36M_vol", "Benchmark Returns":"Bmk_36M_vol", "Excess":"Port_36M_TE","MonthEnd":"Date"}, inplace=True)
    volM = pd.merge(vol12M,vol36M, how = 'left', sort=True)
    vol60M = data.rolling(60, min_periods = 60).std()*(12**0.5)
    vol60M["Date"] = data["MonthEnd"]
    vol60M=vol60M.dropna()
    vol60M.rename(columns = {"PortfolioReturn":"Port_60M_vol", "Benchmark Returns":"Bmk_60M_vol", "Excess":"Port_60M_TE","MonthEnd":"Date"}, inplace=True)
    volM = pd.merge(volM,vol60M, how = 'left', sort=True)
    volTbl = volTbl.append(volM, sort=True)
    volTbl = volTbl[["Date","Portfolio","Port_12M_vol", "Bmk_12M_vol", "Port_12M_TE","Port_36M_vol", "Bmk_36M_vol", "Port_36M_TE","Port_60M_vol", "Bmk_60M_vol", "Port_60M_TE"]]
print(volTbl)

 

 

I have been pulling my hair out on this one, so any help would be appreciated!!

 

Many thanks

Jonathan

3 REPLIES 3
jonoCX
Frequent Visitor

Hi

 

Thanks for your help and inputs.. I have no choice at the end, but go thru my data segment and segments and see which part of the data that may be causing the issue.

 

So the problem at the end is that is that a few of the "PortfolioName" (x in the loop) in aList does not match the dataset table "PortfolioName", hence producing some empty dataframe which hence cannot be merged in PowerBI. This name mismatch problem did not cause a problem in Python based application however, it must somehow ignore the empty dataframe, and continue the loop. 

 

Anyway, woud be nice to have a way to debug Python codes in PowerBI.. as it seems to behave slightly differently in PowerBI, and find it hard how to identify the issue - if is a data issue, how powerbi execute the python codes or something else....

 

Many thanks

 

Jonathan

v-yingjl
Community Support
Community Support

Hi @jonoCX ,

As the error message mentioned,

pandas.errors.MergeError: Data columns not unique: 


Please check whether there are multiple duplicated columns names 'Date' in your data source table when you create it by the python script.

 

Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

YukiK
Responsive Resident
Responsive Resident

You may want to look into this documentation taking about how to use an external python IDE with power bi. You might be able to debug there, which I'm not entirely sure is possible. Use an external Python IDE with Power BI - Power BI | Microsoft Docs 

 

But it looks like it's more of an error that resides in your code regarding pandas merge operation. Since it's complaining about "Data columns not unique", I'm guessing there are more than 1 column in the table where the column name is "Date". So maybe consolidating that would help.

 

Hope that helps!

 

 

Helpful resources

Announcements
2022 Release Wave 1 760x460.png

2022 Release Wave 1 Plan

Power Platform release plan for the 2022 release wave 1 describes all new features releasing from April 2022 through September 2022.

Power BI December 2021 Update_carousel 768x460.jpg

Check it Out!

Click here to read more about the December 2021 Updates!

User Group Leader Meeting January 768x460.png

Calling all User Group Leaders!

Don't miss the User Group Leader meetings on January, 24th & 25th, 2022.

Jan 2022 Dev Camp 768x460 copy.png

Power BI Dev Camp- January 27th, 2022

Mark your calendars and join us for our next Power BI Dev Camp!

Top Solution Authors