Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I am looking to calculate the unique customer count from the groupby function.
Python code would be :
df = temp_df.groupby(['product','od_pairs']).agg(total_bookings=('booking_no_id', 'count'),
total_teus=('total_teus','sum'),
total_bl=('bl_amount', 'sum'),
total_cost=('cost', 'sum'),
total_customer=('control_party_name', 'nunique'),
)
How do i incorporate distinct count in the groupby function?
Solved! Go to Solution.
Hi @jeanL
I don't believe there's any way to calculate a distinct count using GROUPBY.
Depending where you need to use this table expression, I would recommend rewriting with SUMMARIZECOLUMNS (if you are materializing this table or running as a query) otherwise with SUMMARIZE:
(hopefully I interpreted your intended result correctly):
SUMMARIZECOLUMNS version
summary_table =
SUMMARIZECOLUMNS (
export_data[customer_segment],
export_data[product],
export_data[main_service],
export_data[vessel_direction],
export_data[od_pairs],
"booking_count", COUNT ( export_data[booking_no_id] ),
"total_teus", SUM ( export_data[total_teus] ),
"total_bl", SUM ( export_data[bl_amount] ),
"total_cost", SUM ( export_data[cost] ),
"total_cust", DISTINCTCOUNT ( export_data[control_party_name] )
)
ADDCOLUMNS/SUMMARIZE version
summary_table =
ADDCOLUMNS (
SUMMARIZE (
export_data,
export_data[customer_segment],
export_data[product],
export_data[main_service],
export_data[vessel_direction],
export_data[od_pairs]
),
"booking_count", CALCULATE ( COUNT ( export_data[booking_no_id] ) ),
"total_teus", CALCULATE ( SUM ( export_data[total_teus] ) ),
"total_bl", CALCULATE ( SUM ( export_data[bl_amount] ) ),
"total_cost", CALCULATE ( SUM ( export_data[cost] ) ),
"total_cust", CALCULATE ( DISTINCTCOUNT ( export_data[control_party_name] ) )
)
Regards,
Owen
Hi @jeanL
I don't believe there's any way to calculate a distinct count using GROUPBY.
Depending where you need to use this table expression, I would recommend rewriting with SUMMARIZECOLUMNS (if you are materializing this table or running as a query) otherwise with SUMMARIZE:
(hopefully I interpreted your intended result correctly):
SUMMARIZECOLUMNS version
summary_table =
SUMMARIZECOLUMNS (
export_data[customer_segment],
export_data[product],
export_data[main_service],
export_data[vessel_direction],
export_data[od_pairs],
"booking_count", COUNT ( export_data[booking_no_id] ),
"total_teus", SUM ( export_data[total_teus] ),
"total_bl", SUM ( export_data[bl_amount] ),
"total_cost", SUM ( export_data[cost] ),
"total_cust", DISTINCTCOUNT ( export_data[control_party_name] )
)
ADDCOLUMNS/SUMMARIZE version
summary_table =
ADDCOLUMNS (
SUMMARIZE (
export_data,
export_data[customer_segment],
export_data[product],
export_data[main_service],
export_data[vessel_direction],
export_data[od_pairs]
),
"booking_count", CALCULATE ( COUNT ( export_data[booking_no_id] ) ),
"total_teus", CALCULATE ( SUM ( export_data[total_teus] ) ),
"total_bl", CALCULATE ( SUM ( export_data[bl_amount] ) ),
"total_cost", CALCULATE ( SUM ( export_data[cost] ) ),
"total_cust", CALCULATE ( DISTINCTCOUNT ( export_data[control_party_name] ) )
)
Regards,
Owen
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
115 | |
100 | |
88 | |
68 | |
61 |
User | Count |
---|---|
150 | |
120 | |
100 | |
87 | |
68 |