Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
valcat27
Helper III
Helper III

IndexError: index 1323169739 is out of bounds for axis 0 with size 1322985896

Hello all, 

 

I'm pretty new to Power BI and I'm facing some difficulties in running python script.

 

In fact, I'm getting an error in Power Query when running python script, for which I couldn't find a solution that works.

 

I have already worked with the latest version of python (3.9), but at this moment I'm working with version 3.6 because I suspect it may be more compatible with Power BI. Moreover, the packages used in the code have also been installed and imported.

 

My code: 

import pandas as pd

pd.crosstab(index=dataset[‘clientID’], columns=dataset[‘productID’])

 

The error:

DataSource.Error: ADO.NET: Python script error.

Traceback (most recent call last):

  File "PythonScriptWrapper.PY", line 15, in <module>

    pd.crosstab(index=dataset['clientID'], columns=dataset['productID'])

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\pivot.py", line 577, in crosstab

    **kwargs

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\frame.py", line 6089, in pivot_table

    observed=observed,

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\pivot.py", line 127, in pivot_table

    table = agged.unstack(to_unstack)

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\frame.py", line 6393, in unstack

    return unstack(self, level, fill_value)

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 412, in unstack

    return _unstack_frame(obj, level, fill_value=fill_value)

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 442, in _unstack_frame

    constructor=obj._constructor,

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 142, in __init__

    self._make_selectors()

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 177, in _make_selectors

    mask.put(selector, True)

IndexError: index 1323169739 is out of bounds for axis 0 with size 1322985896

 

Details:

    DataSourceKind=Python

    DataSourcePath=Python

    Message=Python script error.

Traceback (most recent call last):

  File "PythonScriptWrapper.PY", line 15, in <module>

    pd.crosstab(index=dataset['clientID'], columns=dataset['productID'])

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\pivot.py", line 577, in crosstab

    **kwargs

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\frame.py", line 6089, in pivot_table

    observed=observed,

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\pivot.py", line 127, in pivot_table

    table = agged.unstack(to_unstack)

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\frame.py", line 6393, in unstack

    return unstack(self, level, fill_value)

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 412, in un...

    ErrorCode=-2147467259

    ExceptionType=Microsoft.PowerBI.Scripting.Python.Exceptions.PythonScriptRuntimeException

 

 

I would appreciate if someone could help me.

4 REPLIES 4
valcat27
Helper III
Helper III

Thank you for your answer. 

If pandas crosstab has a limit, I cannot find it... 

Both columns have about 1 263 000 rows. ClienteID has almost 40 000 unique values and ProductID has more than 460 000 unique values. 

So you should expect a crosstab with 18.4 billion cells.  A bit rich.

I have already tried with a sample of the dataset and I think it works. However, when running python script, a decimal place is added to "ProductID". For example, before I had "12345" and now "12345.0". I tried to round it in python but it says it is a string, so I cannot understand where it came from. I think it happens before applying crosstab function since in the input dataset (created after running the script), this change is already there.
Also, when applying the power query changes it returns the error: Fail to save modifications on the server. Error returned: 'The SUM function only accepts column reference as the argument number 1. The '215965.0' column does not exist in the rowset. ' .

lbendlin
Super User
Super User

What is the cardinality of your clientID and ProductID columns?  Looks like you are hitting a pandas crosstab limit.  Might want to try with a smaller dataset.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors