Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
LHENRIQUESILVA
Regular Visitor

Group by and count - How to translate this from python

Hi everyone, im new to powerbi DAX and im with a problem to solve...

 

i have a table like this:

COD_EVENTDEVICE_IDCITYDATECAUSE

3002

134

CAMPINAS01/02/2020ELECTRICAL DISCHARGE

3005

147

CAMPINAS02/10/2020ANIMALS

4008

128

SAO PAULO03/11/2020WIND
4009159SAO PAULO01/02/2021

ANIMALS

4013128SAO PAULO02/05/2021ELECTRICAL DISCHARGE
7892178CAMPINAS01/11/2019DEVICE FAIL
... .........

 

i need to count from a date(m/y), for example dez/2020,  the events in last 12 months, after this, i need groupby DEVICE, and show the quantity of events in devices that had more than one in last 12 months e the quantity of events in devices that had only one event in last 12 months.

 

An example of what i expect to get:

 

DATECITYRECURRENTUNIQUE
DEZ/2020CAMPINAS31
JAN/2020CAMPINAS32
FEV/2020CAMPINAS41

DEZ/2020

SAO PAULO23
JAN/2020SAO PAULO24
FEV/2020SAO PAULO15
............

 

Where RECURRENT calculates the number of events in devices that had more than one event in the last 12 months since the date described in DATE, for the city described in CITY. And UNIQUE calculate de amount of events in devices that had only 1 event in last 12 months for that city.

 

I make this work in python, and i did something like this:

 

 

city = events_df['CITY'].unique() #Extract all cities present in dataframe
tab_fin = pandas.DataFrame() #Create a df to append each iteration

for i in range(0,len(city)):
    
    events_by_city = events_df[events_df['CITY'] == city[i]] 
    #Create a table inside the loop that contains only the events of a especific city

    date_ini = pandas.to_datetime('2020-12-31') #first month of interest
    last_date = date.today().replace(day=1) - timedelta(days=1) #last month of interest
    num_months = (last_date.year - date_ini.year) * 12 + (last_date.month - date_ini.month)
    #number of dates between date_ini e last_date

        for j in range (0,num_months+1):
            events_12months = events_by_city[(events_by_city['DATE'] > date_ini + relativedelta(months=-12)) & (ocorr_conj['DT_INICIO'] <= data_ini)]
            #Extract the events in last 12 months
            
            events_grouped = events_12months.groupby('DEVICE_ID')['COD_EVENT'].count().reset_index()
            
            recurrent = sum(events_grouped[events_grouped['COD_EVENT']>1]['COD_EVENT']) #events in devices that had more than 1
            unique = len(events_grouped[events_grouped['COD_EVENT']==1]) #events in devices that had only 1
            tab_aux = pandas.DataFrame({'CITY':city[i],
                                        'DATE':str(date_ini.month) +'/'+ str(date_ini.year),
                                        'RECURRENT':recurrent,
                                        'UNIQUE':unique}, index=[0])
            tab_fin = tab_finappend(tab_aux) #Append in tab_fin the info of the i-city and j-month


            date_ini =  date_ini + relativedelta(months=+1) #add 1 month to date

 

 

 

 

1 REPLY 1
v-yetao1-msft
Community Support
Community Support

Hi @LHENRIQUESILVA 

I'm not very good at python . From the two tables you provided, I have no way to find out the pattern. Can you create a simple sample, and then describe your calculation logic according to this sample, and give the correct result?

 

Best Regards,
Community Support Team _ Ailsa Tao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors