74

I have an array of objects of this class:

class CancerDataEntity(Model):

    age = columns.Text(primary_key=True)
    gender = columns.Text(primary_key=True)
    cancer = columns.Text(primary_key=True)
    deaths = columns.Integer()
    ...

When printed, the array looks like this:

[CancerDataEntity(age=u'80-85+', gender=u'Female', cancer=u'All cancers (C00-97,B21)', deaths=15306), CancerDataEntity(...

I want to convert this to a data frame so I can work with it more conveniently: aggregate, count, sum, and so on. I would like the data frame to look something like this:

     age     gender     cancer     deaths
0    80-85+  Female     ...        15306
1    ...

Is there a way to achieve this using numpy/pandas easily, without manually processing the input array?

ezamur

5 Answers

86

A much cleaner way to do this is to define a to_dict method on your class and then use pandas.DataFrame.from_records:

class Signal(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def to_dict(self):
        return {
            'x': self.x,
            'y': self.y,
        }

e.g.

In [87]: signals = [Signal(3, 9), Signal(4, 16)]

In [88]: pandas.DataFrame.from_records([s.to_dict() for s in signals])
Out[88]:
   x   y
0  3   9
1  4  16
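
Applied to the question's data, the same pattern might look like the sketch below. `CancerRecord` is a hypothetical plain-Python stand-in for the question's `CancerDataEntity` (so the example runs without a database), and only the first row comes from the question; the other rows are made-up values for illustration.

```python
import pandas as pd


class CancerRecord:
    """Hypothetical plain-Python stand-in for the question's CancerDataEntity."""

    def __init__(self, age, gender, cancer, deaths):
        self.age = age
        self.gender = gender
        self.cancer = cancer
        self.deaths = deaths

    def to_dict(self):
        # One key per column we want in the data frame.
        return {
            "age": self.age,
            "gender": self.gender,
            "cancer": self.cancer,
            "deaths": self.deaths,
        }


# Only the first row is taken from the question; the rest are invented.
records = [
    CancerRecord("80-85+", "Female", "All cancers (C00-97,B21)", 15306),
    CancerRecord("80-85+", "Male", "All cancers (C00-97,B21)", 100),
    CancerRecord("75-79", "Female", "All cancers (C00-97,B21)", 200),
]

df = pd.DataFrame.from_records([r.to_dict() for r in records])

# Once the data is in a data frame, aggregation is a one-liner.
deaths_by_gender = df.groupby("gender")["deaths"].sum()
```

Here `deaths_by_gender` is a Series indexed by gender, which covers the "aggregate, count, sum" use case from the question.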
OregonTrail
  • 2
    Great answer! Note, however, that I get the same results without using `from_records`: `pandas.DataFrame([s.to_dict() for s in signals])` – ChaimG Mar 17 '17 at 05:37
  • 21
    For simple classes without any `__dict__` trickery, this can be simplified to `pandas.DataFrame([vars(s) for s in signals])` without implementing a custom `to_dict` function. – Jim Hunziker Mar 09 '18 at 15:59
43

Just use:

pd.DataFrame([o.__dict__ for o in my_objs])

Full example:

import pandas as pd

# define some class
class SomeThing:
    def __init__(self, x, y):
        self.x, self.y = x, y

# make an array of the class objects
things = [SomeThing(1,2), SomeThing(3,4), SomeThing(4,5)]

# fill dataframe with one row per object, one attribute per column
df = pd.DataFrame([t.__dict__ for t in things ])

print(df)

This prints:

   x  y
0  1  2
1  3  4
2  4  5
Shital Shah
  • This works great except it seems it doesn't work exactly well with inherited classes. I tried to build a collection of objects that have an inherited base class, and the only attributes returned in the data frame are those from the parent class, not the child class, even though all members of the collection are from the child class. – Mark Jan 05 '21 at 13:17
  • Perfect, it even works fine if there is another object inside SomeThing – Levin Sep 18 '21 at 08:28
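
The inheritance comment above includes no code, so the cause of that behaviour is unknown. One possibility (an assumption, not a diagnosis) is that the missing attributes were declared at class level rather than assigned on the instance: `__dict__` holds only attributes stored on the instance itself. A minimal sketch with hypothetical `Base` and `Child` classes:

```python
import pandas as pd


class Base:
    def __init__(self, x):
        self.x = x  # instance attribute: stored in the instance __dict__


class Child(Base):
    z = 0  # class-level attribute: NOT stored in the instance __dict__

    def __init__(self, x, y):
        super().__init__(x)
        self.y = y  # instance attribute set by the child: stored in __dict__


children = [Child(1, 2), Child(3, 4)]

# Columns come from the instance attributes only, so 'x' and 'y' but not 'z'.
df = pd.DataFrame([c.__dict__ for c in children])
```

In this sketch the child's instance attribute `y` does appear as a column, while the class-level `z` does not.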
25

Code that leads to the desired result:

variables = arr[0].keys()
df = pd.DataFrame([[getattr(i,j) for j in variables] for i in arr], columns = variables)

Thanks to @Serbitar for pointing me in the right direction.
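
The snippet assumes each object exposes a `keys()` method listing its column names, as the model objects in the question evidently do. To try it without a database, here is a sketch using a hypothetical stand-in class, `FakeEntity`, that mimics only that one method; the second row is a made-up value.

```python
import pandas as pd


class FakeEntity:
    """Hypothetical stand-in for a model object that exposes keys()."""

    def __init__(self, **fields):
        self._field_names = list(fields)
        for name, value in fields.items():
            setattr(self, name, value)

    def keys(self):
        # Return the column names in the order they were given.
        return list(self._field_names)


arr = [
    FakeEntity(age="80-85+", gender="Female",
               cancer="All cancers (C00-97,B21)", deaths=15306),
    # Made-up row, for illustration only.
    FakeEntity(age="80-85+", gender="Male",
               cancer="All cancers (C00-97,B21)", deaths=100),
]

# Take the column names from the first object, then read each one via getattr.
variables = list(arr[0].keys())
df = pd.DataFrame(
    [[getattr(obj, name) for name in variables] for obj in arr],
    columns=variables,
)
```

Wrapping `keys()` in `list(...)` keeps the column names reusable even if `keys()` returns a one-shot iterable or a view.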

ezamur
21

I would like to emphasize Jim Hunziker's comment.

pandas.DataFrame([vars(s) for s in signals])

It is far easier to write, less error-prone and you don't have to change the to_dict() function every time you add a new attribute.

If you want the freedom to choose which attributes to keep, the columns parameter could be used.

pandas.DataFrame([vars(s) for s in signals], columns=['x', 'y'])

The downside is that it won't work for complex attributes, though that should rarely be the case.
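
The answer does not say what counts as a "complex attribute". One reading (an assumption) is a value that is computed rather than stored, such as a property: `vars()` returns only what is kept in the instance's `__dict__`. A sketch, with a hypothetical `total` property added to the `Signal` class:

```python
import pandas as pd


class Signal:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @property
    def total(self):
        # Computed on access, so it is not stored in the instance __dict__.
        return self.x + self.y


signals = [Signal(3, 9), Signal(4, 16)]

# vars() sees only the stored instance attributes: 'x' and 'y', not 'total'.
df = pd.DataFrame([vars(s) for s in signals])

# A computed value can be added explicitly if it is needed as a column.
df_with_total = df.assign(total=[s.total for s in signals])
```

For cases like this, an explicit `to_dict` method (as in the top answer) gives full control over which values become columns.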

typhon04
  • You are the man. This is the absolute best one-liner solution searching many threads for a solution! – Andrej Aug 09 '20 at 16:49
13

Try:

variables = list(array[0].keys())
dataframe = pandas.DataFrame([[getattr(i,j) for j in variables] for i in array], columns = variables)
Serbitar
  • 2
    http://meta.stackoverflow.com/questions/262695/new-answer-deletion-option-code-only-answer – ivan_pozdeev Jan 25 '16 at 17:42
  • I guess I should not accept the answer since I had to tweak it to make it work, but I am upvoting it since it pointed me in the right direction. – ezamur Jan 25 '16 at 20:52