dfredp
Regular Visitor

Python heatmap visual displaying data in wrong days (x-axis)

Hello everyone.

I created a Python heatmap in VS Code, and it displayed the data correctly (figure below).

[Image: dfredp_0-1713782792415.png]

When I used the Python visual in Power BI and put my code into it, all the data ended up bunched together in the first month (years 2022 and 2023, figures below).

[Image: dfredp_1-1713782883520.png]

[Image: dfredp_2-1713782929682.png]

I don't know whether this has to do with Power BI or with my Python script itself, but as I said, the data was displayed correctly in VS Code. I looked for similar problems but couldn't find any in other posts.

I would really appreciate your support 🙂

Thanks

3 REPLIES
v-cgao-msft
Community Support

Hi @dfredp ,

The Python visual in Power BI has this limitation:
The data the Python visual uses for plotting is limited to 150,000 rows. If more than 150,000 rows are selected, only the top 150,000 rows are used, and a message appears on the image. The input data also has a limit of 250 MB.

Please check if this limit has been triggered.
Create Power BI visuals using Python in Power BI Desktop - Limitations
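
A quick way to check the limits quoted above is to measure the DataFrame that the script receives. The following is only a minimal sketch: `check_visual_limits` is a hypothetical helper, `sample` stands in for the `dataset` DataFrame that Power BI passes to the script, and the in-memory size reported by pandas is only a rough proxy for the 250 MB input limit.

```python
import pandas as pd

ROW_LIMIT = 150_000    # row cap quoted in the limitation above
SIZE_LIMIT_MB = 250    # input size cap quoted in the limitation above


def check_visual_limits(df: pd.DataFrame) -> dict:
    """Compare row count and approximate in-memory size with the quoted limits."""
    rows = len(df)
    # memory_usage(deep=True) also counts the contents of object (string) columns
    size_mb = float(df.memory_usage(deep=True).sum()) / (1024 ** 2)
    return {
        "rows": rows,
        "size_mb": round(size_mb, 2),
        # If Power BI truncated the input, the script sees exactly ROW_LIMIT rows
        "rows_at_limit": rows >= ROW_LIMIT,
        "size_over_limit": bool(size_mb > SIZE_LIMIT_MB),
    }


# Stand-in for the `dataset` DataFrame (assumption: about 10k rows)
sample = pd.DataFrame({"Day": range(1, 10_001), "Depth": 1.0, "Temperature": 20.0})
report = check_visual_limits(sample)
print(report)
```

If `rows_at_limit` is true inside the visual, the input was probably truncated and the plot is based on only part of the data.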

Best Regards,
Gao

Community Support Team


Thanks for your help. However, my dataset only has 10k rows, so I don't believe this is the problem. In case it helps, here is my Python script:

    # Paste or type your script code here:
    import pandas as pd
    import numpy as np
    from scipy.interpolate import Rbf  # Import Rbf for radial basis function interpolation
    import matplotlib.pyplot as plt
    import seaborn as sns
    from matplotlib.ticker import FuncFormatter

    # Drop rows where any of the data is missing or infinite
    dataset = dataset.replace([np.inf, -np.inf], np.nan).dropna(subset=['Day', 'Depth', 'Temperature'])

    # Remove outlier temperature values
    dataset.loc[dataset['Temperature'] > 35, 'Temperature'] = np.nan

    # Initialize variables for all days in the range of selected years
    year_min = dataset['Year'].min()
    year_max = dataset['Year'].max()
    all_days = pd.date_range(start=f"{year_min}-01-01", end=f"{year_max}-12-31").date
    day_range = np.arange(1, len(all_days) + 1)
    depth_range = np.arange(np.floor(dataset['Depth'].min()), np.ceil(dataset['Depth'].max()) + 1)

    # Initialize the grid with NaNs
    grid_z = np.full((len(depth_range), len(day_range)), np.nan)

    # Perform RBF interpolation daily for each depth
    for day in day_range:
        day_data = dataset[dataset['Day'] == day]
        if not day_data.empty:
            # Filter out rows with NaN values in any key columns
            day_data = day_data.dropna(subset=['Depth', 'Temperature'])
            points_y = day_data['Depth']
            values = day_data['Temperature']
            if len(points_y) > 2:  # Ensure there are enough points to interpolate
                try:
                    rbf_interpolator = Rbf(points_y, values, function='linear', smooth=0.2)
                    valid_depths = np.intersect1d(depth_range, np.round(points_y))  # Interpolate only for available depths
                    grid_z[valid_depths.astype(int) - int(depth_range[0]), day - 1] = rbf_interpolator(valid_depths)
                except ValueError as e:
                    print(f"Skipping interpolation for day {day} due to insufficient data: {e}")

    # Create the dataframe for the heatmap
    heatmap_data = pd.DataFrame(grid_z, index=depth_range, columns=all_days)

    # Define a function to perform linear interpolation for small gaps in the data
    def interpolate_small_gaps(data, max_gap=3):
        for depth_idx in range(data.shape[0]):  # Iterate over each depth
            series = data.iloc[depth_idx]
            is_na = series.isna()
            filled = series.copy()
            for start, group in is_na.groupby((is_na != is_na.shift()).cumsum()):
                if group.all():  # If the entire group is True (NaNs)
                    if len(group) <= max_gap:
                        # Get the integer indices of the start and end of the gap
                        start_idx = series.index.get_loc(group.index[0])
                        end_idx = series.index.get_loc(group.index[-1])
                        # Only interpolate if we have data on both sides of the gap
                        if start_idx > 0 and end_idx < len(series) - 1:
                            interp_range = slice(series.index[start_idx - 1], series.index[end_idx + 1])
                            filled[interp_range] = series[interp_range].interpolate()
            data.iloc[depth_idx] = filled
        return data

    # Apply the interpolation function to fill small gaps
    heatmap_data = interpolate_small_gaps(heatmap_data)

    # Custom function to format the x-axis labels to show only months in Portuguese
    def custom_formatter(x, pos):
        index = int(x)  # Convert to int to use as index
        if index < len(all_days):
            date = all_days[index - 1]  # Adjust for zero-based index
            if date.strftime('%d') == '01':  # First day of the month
                # Mapping English month names to Portuguese
                month_map = {
                    'Jan': 'Jan', 'Feb': 'Fev', 'Mar': 'Mar', 'Apr': 'Abr',
                    'May': 'Mai', 'Jun': 'Jun', 'Jul': 'Jul', 'Aug': 'Ago',
                    'Sep': 'Set', 'Oct': 'Out', 'Nov': 'Nov', 'Dec': 'Dez'
                }
                return month_map[date.strftime('%b')]
        return ''

    # Custom function to format the y-axis labels as integers
    def int_formatter(x, pos):
        return '%d' % x

    # Plotting the heatmap
    plt.figure(figsize=(20, 8))
    heatmap = sns.heatmap(heatmap_data, cmap='jet', cbar_kws={'label': 'Temperatura (°C)'}, vmin=0, vmax=30)

    # Use the custom_formatter function when setting x-tick labels
    heatmap.set_xticks(np.arange(len(all_days)))
    heatmap.set_xticklabels([custom_formatter(x, None) for x in range(1, len(all_days) + 1)], rotation=90)

    # Add the vertical dotted lines and adjust their style based on the day
    for day, date in enumerate(all_days, start=1):
        if date.day in [5, 10, 15, 20, 25]:
            plt.axvline(x=day - 1, color='grey', linestyle=':', linewidth=1)  # Thicker line for specified days
        if date.day == 1:
            plt.axvline(x=day - 1, color='black', linestyle=':', linewidth=1)  # Thicker line for the first day of the month
        else:
            plt.axvline(x=day - 1, color='grey', linestyle=':', linewidth=0.5)

    plt.xlabel('Dia e Mês', fontsize=15)
    plt.ylabel('Profundidade (m)', fontsize=15)

    # Apply the custom formatter to the y-axis
    plt.gca().yaxis.set_major_formatter(FuncFormatter(int_formatter))
    plt.tight_layout()
    plt.show()
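
One thing worth noting about the script: the loop compares `dataset['Day']` with a running index that spans every day from the first to the last year, without filtering on `Year`. This only works if `Day` is already a running index across years. If `Day` arrives in Power BI as something else (for example day of year or day of month), rows from different years or months would land in the same early columns. The sketch below builds an explicit running index from `Year` and `Day`. It assumes `Day` holds the day of the year (1 to 366), which the thread does not confirm; `add_running_day` and the `RunningDay` column are hypothetical names.

```python
import pandas as pd


def add_running_day(df: pd.DataFrame, year_col: str = "Year", doy_col: str = "Day") -> pd.DataFrame:
    """Return a copy with 'RunningDay': 1 on 1 January of the earliest year, counting on across years.

    Assumption: `doy_col` holds the day of the year (1-366).
    """
    out = df.copy()
    # Calendar date = 1 January of the row's year + (day of year - 1) days
    jan_first = pd.to_datetime(out[year_col].astype(int).astype(str) + "-01-01")
    dates = jan_first + pd.to_timedelta(out[doy_col].astype(int) - 1, unit="D")
    start = pd.Timestamp(year=int(out[year_col].min()), month=1, day=1)
    out["RunningDay"] = (dates - start).dt.days + 1
    return out


# Tiny stand-in for the `dataset` DataFrame
sample = pd.DataFrame({
    "Year": [2022, 2022, 2023],
    "Day": [1, 365, 1],
    "Depth": [1, 1, 1],
    "Temperature": [20.0, 21.0, 19.5],
})
result = add_running_day(sample)
print(result["RunningDay"].tolist())  # [1, 365, 366]
```

With such a column, the loop could filter on `dataset['RunningDay'] == day` instead of `dataset['Day'] == day`, so each row maps to exactly one column of the grid regardless of how `Day` is defined.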

 

johnbasha33
Solution Sage

@dfredp 

It seems like the issue you're encountering might be related to how the data is being processed or visualized within the Python script when integrated into Power BI.

Here are a few potential reasons why you might be seeing different results:

1. **Data Processing**: Check if the data processing steps in your Python script are the same when integrated into Power BI as they are in VSCode. Ensure that the data is being read and processed correctly, including any transformations or aggregations that may affect the heatmap.

2. **Data Structure**: Verify that the structure of the data being passed to the Python script in Power BI is the same as the data you used in VSCode. Differences in data structure or formatting could lead to unexpected results.

3. **Visualization Configuration**: Review the configuration of the heatmap visualization in your Python script. Check if there are any settings or parameters that may need adjustment for it to display the data correctly within Power BI.

4. **Environment Differences**: Consider any differences in the environment between VSCode and Power BI that may impact the execution of your Python script. This could include differences in Python versions, package dependencies, or other environmental factors.

5. **Error Handling**: Add error handling to your Python script to catch any potential issues that may arise during execution within Power BI. This can help identify and troubleshoot any errors that may be occurring.

By thoroughly reviewing these aspects and comparing the behavior of your Python script in VSCode versus Power BI, you should be able to identify and address the root cause of the discrepancy in the heatmap visualization.
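
The data structure check from the list above can be sketched as a small summary that is run once in VS Code and once inside the Power BI visual, then compared. This is an illustration only: `summarize_input` is a hypothetical helper and the column names are taken from the script posted earlier in the thread. Note that the preamble shown in the Power BI script editor removes duplicate rows before the script runs, so the row count can differ from the source file for that reason alone.

```python
import pandas as pd


def summarize_input(df: pd.DataFrame, key_cols=("Year", "Day", "Depth", "Temperature")) -> dict:
    """Collect basic facts about the input so two environments can be compared."""
    present = [c for c in key_cols if c in df.columns]
    return {
        "rows": len(df),
        "duplicate_rows": int(df.duplicated().sum()),
        "missing_columns": [c for c in key_cols if c not in df.columns],
        "dtypes": {c: str(df[c].dtype) for c in present},
        # A surprising range (e.g. Day only going up to 31) hints at a different column meaning
        "ranges": {c: (df[c].min(), df[c].max()) for c in present},
    }


# Stand-in data: one duplicated row and no 'Year' column
sample = pd.DataFrame({
    "Day": [1, 1, 2],
    "Depth": [1.0, 1.0, 2.0],
    "Temperature": [20.0, 20.0, 21.5],
})
summary = summarize_input(sample)
for key, value in summary.items():
    print(key, value)
```

Differences in `ranges` or `dtypes` between the two environments would point to the input data rather than to the plotting code.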

