I'm new to object-oriented programming in Python and am trying to analyze text message data from iMessage over the past few years. I'm running Python 3.8.
I've created a dataframe (called messages) with contact names, month, year, text message string, etc. Now I'm trying to create a new dataframe (called monthly_counts) that holds the number of texts per month for each contact.
Below is the code I've written to try and do this:
Y = [2016, 2017, 2018, 2019, 2020]
M = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
all_dates = []
for year in range(len(Y)):
    for month in range(len(M)):
        date = f"{int(M[month])}/{int(Y[year])}"
        all_dates.append(date)
# dataframe to be built
monthly_counts = pd.DataFrame(index=[all_dates], columns=[contacts.Name])
total = []
values = []
for year in range(len(Y)):
    for month in range(len(M)):
        date = f"{int(M[month])}/{int(Y[year])}"
        monthly_total = 0
        for name in contacts['Display Name'].to_list():
            data = messages[messages.year == Y[year]]
            data = data[data.month == M[month]]
            data = data[data.Name == name]
            values.append(len(data))  # number of texts /year/month/contact
            monthly_total += len(data)
        monthly_counts.loc[date] = pd.Series(values).T
        total.append(monthly_total)
monthly_counts['total'] = total
Right now, it doesn't throw any errors, but every element of monthly_counts is still NaN at the end.
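A minimal sketch of one way this symptom can arise, using toy data rather than the real messages. When a pandas Series is assigned to a row through .loc, its index labels are aligned with the dataframe's column labels, so a Series with the default 0, 1, ... index matches no contact name and the row stays NaN. Whether this is the only cause in the code above is an assumption, and the names and counts below are made up.

```python
import pandas as pd

# Toy frame: dates as rows, contact names as columns (hypothetical data).
df = pd.DataFrame(index=["1/2016", "2/2016"], columns=["Alice", "Bob"])

# The Series index is 0, 1, which matches neither "Alice" nor "Bob",
# so label alignment leaves the row as NaN.
df.loc["1/2016"] = pd.Series([3, 5])
row_is_nan = df.loc["1/2016"].isna().all()

# A plain list carries no labels, so the values are assigned by position.
df.loc["2/2016"] = [3, 5]
row_by_position = df.loc["2/2016"].tolist()
```

Passing a plain list (or the Series' underlying values) sidesteps the alignment step, because there are no labels to match against the columns.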
Is there a better way to do this? Would it be better to build the dataframe one row at a time rather than filling in prebuilt rows?
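One possible alternative, sketched under the assumption that messages has columns named Name, year and month as used in the code above: let pandas count the rows per group and pivot the contact names into columns, instead of looping. The toy data below is invented for illustration and stands in for the real dataframe.

```python
import pandas as pd

# Toy stand-in for the real `messages` dataframe (hypothetical data).
messages = pd.DataFrame({
    "Name": ["Alice", "Alice", "Bob", "Alice", "Bob", "Bob"],
    "year": [2016, 2016, 2016, 2017, 2017, 2017],
    "month": [1, 1, 1, 3, 3, 3],
    "text": ["hi", "hello", "yo", "ok", "sure", "bye"],
})

# Count rows per (year, month, Name), then move names into columns.
# Contacts with no texts in a given month get 0 instead of NaN.
monthly_counts = (
    messages.groupby(["year", "month", "Name"])
    .size()
    .unstack("Name", fill_value=0)
)

# Row-wise sum across contacts gives the monthly total.
monthly_counts["total"] = monthly_counts.sum(axis=1)
print(monthly_counts)
```

The result is indexed by (year, month) pairs with one column per contact. Months in which nobody sent a message are simply absent from the result, so they would need to be added separately if a complete calendar is wanted.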