It would be really useful if we had a PySpark forum.
I'm SQL through and through, and learning PySpark is a NIGHTMARE.
I have the following code that finds all the contestants with more than one record in a list that should be unique.
Hi @DebbieE ,
Thanks for using Fabric Community.
As I understand it, you want to list the values that appear more than once in a column that should be unique.
Spark SQL Code:
df = spark.sql("SELECT Min(CustomerID), CompanyName, Count(*) FROM gopi_lake_house.customer_table1 group by CompanyName having count(*)>1")
display(df)
PySpark code:
Please try the code below:
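# Import under an alias so Spark's min/count do not shadow Python's built-ins
from pyspark.sql import functions as F
# Group by company, aggregate, then keep groups with more than one record (the HAVING step)
# F.count('*') counts every row in the group, matching COUNT(*) in the SQL version
result = dfcont.groupBy('CompanyName')\
    .agg(F.min('CustomerID').alias('minCustomerID'), F.count('*').alias('TotalRecords'))\
    .filter(F.col('TotalRecords') > 1)
# show() returns None, so call it separately rather than assigning its result
result.show(1000)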
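If it helps when coming from SQL, here is a rough plain-Python sketch of what the group, aggregate, and filter steps compute. The sample rows and the `find_duplicates` helper are made up for illustration and are not part of PySpark or of the real table. Spark runs the same logic distributed across the cluster and does not guarantee row order; the sort below is only there to make the printed result stable.

```python
from collections import defaultdict

# Hypothetical sample rows as (CustomerID, CompanyName); not taken from the real table
rows = [
    (1, "Contoso"),
    (2, "Fabrikam"),
    (3, "Contoso"),
    (4, "Northwind"),
    (5, "Fabrikam"),
    (6, "Contoso"),
]


def find_duplicates(rows):
    """Return (CompanyName, minCustomerID, TotalRecords) for names seen more than once."""
    groups = defaultdict(list)
    # GROUP BY CompanyName: collect the IDs that belong to each company
    for customer_id, company_name in rows:
        groups[company_name].append(customer_id)
    # Aggregate step: MIN(CustomerID) and COUNT(*) per group
    summary = [(name, min(ids), len(ids)) for name, ids in groups.items()]
    # HAVING COUNT(*) > 1: keep only groups with more than one record
    return sorted(row for row in summary if row[2] > 1)


duplicates = find_duplicates(rows)
print(duplicates)
```

The point of the sketch is that `.filter(...)` placed after `.agg(...)` plays the role of HAVING, because it runs on the aggregated rows rather than on the original ones.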
Hope this is helpful. Please let me know in case of further queries.
Yay, it worked. Thank you so much. The exercise is to try to do everything I usually do, but with PySpark, so having these extra examples is gold too.
Hi @DebbieE ,
Glad to know that your query got resolved. Please continue using Fabric Community for any further queries.