I have simple Python code that I run in the Jupyter editor:
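import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
# Four observations: two per category of x
data = [["a", 4], ["a", 5], ["b", 8], ["b", 9]]
df = pd.DataFrame(data, columns=["x", "y"])
# One violin per category, showing the distribution of y
sns.violinplot(x=df["x"], y=df["y"])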
which correctly produces the following visual:
In Power BI, I create the same table:
Insert a Python visual with the code:
but the result is not the same:
Does anyone know what I am doing wrong here? Why is it not the same graph as in the Jupyter notebook? The Power BI graph seems to be summing the data rather than showing the distribution.
Hi @sm_accordion,
Power BI will typically group data by unique values. This can be mitigated by specifying Don't Summarize for your fields in the visual, e.g.:
This will produce a dataset like the following:
But, even this will consolidate rows if you have the same value repeated multiple times, e.g. x=a, y=4, and this will mess with your distribution. I've added a duplicate row to the above dataset and added a count to the table to show what happens:
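For anyone who wants to see the two effects outside Power BI, here is a rough pandas analogy. It is only a sketch: the `groupby`/`sum` and `drop_duplicates` calls are my stand-ins for what Power BI does before handing data to the visual, and the extra (a, 4) row is made-up sample data, not taken from your table.

```python
import pandas as pd

# Hypothetical sample: the question's four rows plus one repeated (x="a", y=4) row.
rows = pd.DataFrame(
    [["a", 4], ["a", 4], ["a", 5], ["b", 8], ["b", 9]],
    columns=["x", "y"],
)

# Analogy for a summarized field: y is aggregated (here, summed) per x,
# so the plot receives one value per category and there is no distribution left.
summarized = rows.groupby("x", as_index=False)["y"].sum()

# Analogy for Don't Summarize: rows are kept as they are,
# but identical rows still collapse into one.
not_summarized = rows.drop_duplicates().reset_index(drop=True)

print(summarized)      # one row per category
print(not_summarized)  # four rows: the repeated (a, 4) appears only once
```

With a single value per category, a violin plot has nothing to estimate a density from, which would match the flat result in the question.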
If you truly want individual rows passed into your visual, you'll need to add a unique value to each row of your dataset (such as an auto-incrementing int) and include that field in the visual to prevent grouping, so it looks something like this:
index | x | y |
---|---|---|
1 | a | 4 |
2 | a | 5 |
3 | b | 8 |
4 | b | 9 |
... | ... | ... |
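The same idea can be sketched in pandas. This assumes the Python visual receives its fields as a DataFrame named `dataset`; the `source` frame, the repeated (a, 4) row, and the `prepare_for_plot` helper are hypothetical names and data for illustration only, and `drop_duplicates` again stands in for the de-duplication step.

```python
import pandas as pd


def prepare_for_plot(dataset: pd.DataFrame, key: str = "index") -> pd.DataFrame:
    """Drop the helper key column (hypothetical name) once the rows have arrived intact."""
    return dataset.drop(columns=[key])


# Stand-in for the table behind the visual, including one repeated (a, 4) row.
source = pd.DataFrame(
    [["a", 4], ["a", 4], ["a", 5], ["b", 8], ["b", 9]],
    columns=["x", "y"],
)
# Auto-incrementing key: every row is now unique.
source.insert(0, "index", range(1, len(source) + 1))

# With a unique key, removing duplicate rows no longer discards anything.
dataset = source.drop_duplicates()
plot_data = prepare_for_plot(dataset)

print(len(dataset))  # all five rows survive
# In the visual, the plot call from the question would follow, e.g.
# sns.violinplot(x=plot_data["x"], y=plot_data["y"]) and then plt.show()
```

The key column only needs to be present in the visual's fields; the plot itself can ignore it.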
Here's the table representation now (note that count = 1):
If you want further details, I've produced a violin plot custom visual, and have written an article on how grouping/sampling needs to work in these situations. Hopefully this will be useful for you.
Good luck!
Daniel
Thanks a ton. This solved the problem. I appreciate the detailed response and for taking the time to read through the post.