I am looking to calculate the unique customer count from the groupby function.
Python code would be :
df = temp_df.groupby(['product','od_pairs']).agg(total_bookings=('booking_no_id', 'count'),
total_teus=('total_teus','sum'),
total_bl=('bl_amount', 'sum'),
total_cost=('cost', 'sum'),
total_customer=('control_party_name', 'nunique'),
)
How do i incorporate distinct count in the groupby function?
Solved! Go to Solution.
Hi @jeanL
I don't believe there's any way to calculate a distinct count using GROUPBY.
Depending where you need to use this table expression, I would recommend rewriting with SUMMARIZECOLUMNS (if you are materializing this table or running as a query) otherwise with SUMMARIZE:
(hopefully I interpreted your intended result correctly):
SUMMARIZECOLUMNS version
summary_table =
SUMMARIZECOLUMNS (
export_data[customer_segment],
export_data[product],
export_data[main_service],
export_data[vessel_direction],
export_data[od_pairs],
"booking_count", COUNT ( export_data[booking_no_id] ),
"total_teus", SUM ( export_data[total_teus] ),
"total_bl", SUM ( export_data[bl_amount] ),
"total_cost", SUM ( export_data[cost] ),
"total_cust", DISTINCTCOUNT ( export_data[control_party_name] )
)
ADDCOLUMNS/SUMMARIZE version
summary_table =
ADDCOLUMNS (
SUMMARIZE (
export_data,
export_data[customer_segment],
export_data[product],
export_data[main_service],
export_data[vessel_direction],
export_data[od_pairs]
),
"booking_count", CALCULATE ( COUNT ( export_data[booking_no_id] ) ),
"total_teus", CALCULATE ( SUM ( export_data[total_teus] ) ),
"total_bl", CALCULATE ( SUM ( export_data[bl_amount] ) ),
"total_cost", CALCULATE ( SUM ( export_data[cost] ) ),
"total_cust", CALCULATE ( DISTINCTCOUNT ( export_data[control_party_name] ) )
)
Regards,
Owen
Hi @jeanL
I don't believe there's any way to calculate a distinct count using GROUPBY.
Depending where you need to use this table expression, I would recommend rewriting with SUMMARIZECOLUMNS (if you are materializing this table or running as a query) otherwise with SUMMARIZE:
(hopefully I interpreted your intended result correctly):
SUMMARIZECOLUMNS version
summary_table =
SUMMARIZECOLUMNS (
export_data[customer_segment],
export_data[product],
export_data[main_service],
export_data[vessel_direction],
export_data[od_pairs],
"booking_count", COUNT ( export_data[booking_no_id] ),
"total_teus", SUM ( export_data[total_teus] ),
"total_bl", SUM ( export_data[bl_amount] ),
"total_cost", SUM ( export_data[cost] ),
"total_cust", DISTINCTCOUNT ( export_data[control_party_name] )
)
ADDCOLUMNS/SUMMARIZE version
summary_table =
ADDCOLUMNS (
SUMMARIZE (
export_data,
export_data[customer_segment],
export_data[product],
export_data[main_service],
export_data[vessel_direction],
export_data[od_pairs]
),
"booking_count", CALCULATE ( COUNT ( export_data[booking_no_id] ) ),
"total_teus", CALCULATE ( SUM ( export_data[total_teus] ) ),
"total_bl", CALCULATE ( SUM ( export_data[bl_amount] ) ),
"total_cost", CALCULATE ( SUM ( export_data[cost] ) ),
"total_cust", CALCULATE ( DISTINCTCOUNT ( export_data[control_party_name] ) )
)
Regards,
Owen
Welcome to the Power BI Community Show! Jeroen ter Heerdt talks about the importance of Data Modeling.
Put your data visualization and design skills to the test! This exciting challenge is happening now through May 31st!
User | Count |
---|---|
368 | |
100 | |
67 | |
56 | |
48 |
User | Count |
---|---|
328 | |
112 | |
86 | |
75 | |
61 |