34

I have read the docs of DataFrame.apply

DataFrame.apply(func, axis=0, broadcast=False, raw=False, reduce=None, args=(), **kwds)
Applies function along input axis of DataFrame.

So, how can I apply a function to a specific column?

In [1]: import pandas as pd
In [2]: data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
In [3]: df = pd.DataFrame(data)
In [4]: df
Out[4]: 
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
In [5]: def addOne(v):
...:        v += 1
...:        return v
...: 
In [6]: df.apply(addOne, axis=1)
Out[6]: 
   A  B   C
0  2  5   8
1  3  6   9
2  4  7  10

I want to apply addOne to every value in df['A'] only, not to all columns. How can I do that with DataFrame.apply?

Thanks for the help!

GoingMyWay
  • 1
    Avoid using `apply` as much as possible. If you're not sure you need to use it, you probably don't. I recommend taking a look at [When should I ever want to use pandas apply() in my code?](https://stackoverflow.com/q/54432583/4909087). – cs95 Jan 30 '19 at 10:22
  • 1
    @coldspeed That is nice, good question and answers in depth. – GoingMyWay Jan 30 '19 at 11:34

4 Answers

49

The simplest answer is to map the function over that one column:

df['A'] = df['A'].map(addOne)

It is also worth learning the difference between map, applymap, and apply.

If you insist on using apply, you could have the function modify only column A of each row, like below.

def addOne(v):
    v['A'] += 1
    return v

df.apply(addOne, axis=1)
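
To make the difference mentioned above a bit more concrete, here is a minimal sketch. The name `add_one` and the variable names are just for illustration. As far as I know, `Series.map` and `Series.apply` both call a plain Python function once per value, while `DataFrame.apply` hands the function a whole column (or a whole row with axis=1). `applymap` is the element-wise version for an entire DataFrame (recent pandas releases call it `DataFrame.map`), so it is not called here.

```python
import pandas as pd

def add_one(v):
    # Works for a single number and for a whole Series alike
    return v + 1

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Series.map: the function is called once per value of column A
col_mapped = df['A'].map(add_one)

# Series.apply: for a plain function this is also element-wise
col_applied = df['A'].apply(add_one)

# DataFrame.apply: the function receives each column as a Series,
# so every column is changed, which is what the question wants to avoid
all_cols = df.apply(add_one)

# assign returns a new frame where only column A is replaced
result = df.assign(A=col_mapped)
```

Only `result` has the shape the question asks for: A is incremented, B and C are untouched, and the original `df` is left as it was.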
su79eu7k
14

One simple way would be:

df['A'] = df['A'].apply(lambda x: x+1)
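
For simple arithmetic like this, the comment under the question about avoiding apply is relevant: the same result can probably be had without any Python-level function call. A small sketch, assuming the column is numeric (variable names are mine):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Element-wise call of a Python lambda, as in the line above
via_apply = df['A'].apply(lambda x: x + 1)

# Vectorized arithmetic on the whole column at once
via_vectorized = df['A'] + 1

# Both give the same values; write the result back to column A
df['A'] = via_vectorized
```

apply is still the tool to reach for when the function cannot be expressed as column arithmetic.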
gustafbstrom
felix_as
  • I did your suggestion by doing: df['A'] = df['A'].apply(lambda x: datetime.fromtimestamp(float(x)/1000.)) and I got: "A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead. " Any suggestions? – Catarina Nogueira Apr 10 '20 at 11:42
  • 1
    @Catarina Nogueira Try adding .copy() at the very end e.g. apply(...).copy() – Nosey Jul 02 '20 at 14:40
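
A guess at what is going on in the comment above: that warning usually shows up when df was itself produced by slicing or filtering another DataFrame. Under that assumption, taking an explicit copy at the point where the subset is created tends to be the cleaner fix. A minimal sketch, where the `raw` frame, the `keep` column, and the timestamps are made up for illustration:

```python
from datetime import datetime

import pandas as pd

# Hypothetical source frame with millisecond timestamps in column A
raw = pd.DataFrame({
    'A': [1586518920000, 1593700800000],
    'keep': [True, True],
})

# Filtering returns a new object derived from `raw`; the explicit
# .copy() makes `subset` an independent DataFrame
subset = raw[raw['keep']].copy()

# Assigning to a column of the independent copy leaves `raw` untouched
subset['A'] = subset['A'].apply(
    lambda x: datetime.fromtimestamp(float(x) / 1000.0)
)
```

If the frame was not created from a slice, the cause is likely something else, and this sketch does not apply.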
1

You can use .apply() with a lambda function to solve this kind of problem.

Suppose your dataframe looks like this:

A | B | C
----------
1 | 4 | 7
2 | 5 | 8
3 | 6 | 9

The function which you want to apply:

def addOne(v):
    v += 1
    return v

So if you write your code like this,

df['A'] = df.apply(lambda x: addOne(x.A), axis=1)

You will get:

A | B | C
----------
2 | 4 | 7
3 | 5 | 8
4 | 6 | 9
Tejas Shah
1

For anyone else looking for a solution that allows for piping (add_one and add_two below stand for any element-wise functions, like addOne in the question):

identity = lambda x: x

def transform_columns(df, mapper):
    return df.transform(
        {
            **{
                column: identity
                for column in df.columns
            },
            **mapper
        }
    )

# you can monkey-patch it on the pandas DataFrame (but don't have to, see below)
pd.DataFrame.transform_columns = transform_columns

(
    pd.DataFrame(data)
    .rename(columns={'A': 'A1'})   # just to demonstrate the motivation
    .transform_columns({'A1': add_one})
)

This also allows transforming several columns at once:

pd.DataFrame(data).transform_columns({
    'A': add_one,
    'B': add_two,
})

And if you do not want to monkey-patch DataFrame, you can always use it with pipe:

pd.DataFrame(data).pipe(transform_columns, {'A': add_one})

It would be great if this were natively supported by pandas though.

The snippets above are CC0.
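
For completeness, here is a self-contained sketch of the pipe variant that should run as-is. `add_one` and `add_two` are not defined in the snippets above, so the definitions here are my assumption of what they do. The last statement shows `DataFrame.assign` with callables, which is built into pandas and also chains, as a possible alternative to the helper.

```python
import pandas as pd

def add_one(v):
    # Assumed definition; the snippets above do not define it
    return v + 1

def add_two(v):
    # Assumed definition, same as above
    return v + 2

def transform_columns(df, mapper):
    # Start with an identity function for every column, then
    # override the columns named in `mapper`
    funcs = {column: (lambda x: x) for column in df.columns}
    funcs.update(mapper)
    return df.transform(funcs)

data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}

# Helper used through pipe, without monkey-patching DataFrame
piped = pd.DataFrame(data).pipe(
    transform_columns, {'A': add_one, 'B': add_two}
)

# Built-in alternative: assign accepts callables that receive the frame
assigned = pd.DataFrame(data).assign(
    A=lambda d: d['A'].map(add_one),
    B=lambda d: d['B'].map(add_two),
)
```

The helper keeps the column-to-function mapping as plain data, which is convenient when the mapping is built dynamically. assign is shorter when the columns are known up front.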

krassowski