
I understand the usage of map for a pd.Series and apply for a pd.DataFrame, but what is the difference between using map and apply on a pd.Series? It seems to me that they essentially do the same thing:

>>> df['title'].map(  lambda value: str(value) + 'x')
>>> df['title'].apply(lambda value: str(value) + 'x')

It seems both just pass each value to a function/mapping. Is there an actual difference between the two, and if so, what would be an example showing it? Or are they interchangeable when applied to a pd.Series?


For reference, the examples in the docs use a dict for map and a function for apply, but really, they seem the same? Both can take a function.

David542

1 Answer


The "See also" section of Series.map describes Series.apply as being "for applying more complex functions on a Series".

Series.map is for a one-to-one relation, which can be represented by a dictionary or by a function that takes one parameter and returns one value.
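As a minimal sketch of the two forms of one-to-one mapping described above (the sample values and the names `titles` and `lookup` are made up for illustration, not taken from the question):

```python
import pandas as pd

# Hypothetical sample data, not from the question.
titles = pd.Series(["a", "b", "c"])

# map with a one-parameter function: one value in, one value out.
by_func = titles.map(lambda value: str(value) + "x")

# map with a dict: each value in the Series is looked up as a key.
lookup = {"a": 1, "b": 2, "c": 3}
by_dict = titles.map(lookup)

print(by_func.tolist())  # ['ax', 'bx', 'cx']
print(by_dict.tolist())  # [1, 2, 3]
```

In both cases the result is a Series with the same index as the input.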

Series.apply can also take functions that return more than a single value (in fact, a whole Series). In that case, the result of Series.apply is a DataFrame.
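A small sketch of that case, assuming a pandas version in which Series.apply expands Series-returning functions into a DataFrame (the helper `describe` and the sample values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data, not from the question.
titles = pd.Series(["ab", "cde"])

def describe(value):
    # Return several values at once, packed into a Series.
    # The labels of this Series become the column names of the result.
    return pd.Series({"upper": value.upper(), "length": len(value)})

result = titles.apply(describe)

print(type(result).__name__)
print(result)
```

Here each element of `titles` produces one row, and each label of the returned Series produces one column.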

Said differently, you can generally use apply wherever you use map. If you pass a dict (say d) to map, you can pass a trivial lambda to apply instead: lambda x: d[x] (assuming every value in the Series is a key of d). But if you use apply to transform a Series into a DataFrame, then map cannot be used.
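One caveat with the dict substitution is worth a quick sketch: map fills in NaN for values that are not keys of the dict, while a plain d[x] lookup inside apply raises KeyError. The data below is made up for illustration:

```python
import pandas as pd

# Hypothetical dict and Series; "z" is deliberately not a key of d.
d = {"a": 1, "b": 2}
s = pd.Series(["a", "b", "z"])

# map: the missing key becomes NaN instead of raising.
mapped = s.map(d)

# apply with a bare d[x]: the missing key raises KeyError.
try:
    s.apply(lambda x: d[x])
    raised = False
except KeyError:
    raised = True

# apply with an explicit default behaves like map for this input.
applied = s.apply(lambda x: d.get(x, float("nan")))

print(mapped.isna().tolist())
print(raised)
print(applied.isna().tolist())
```

So the lambda is only a drop-in replacement when the dict covers every value, or when the lambda supplies its own default.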

As a result, map is likely to be more optimized than apply for one-to-one transformations, and should be preferred over apply wherever possible.

marc_s
Serge Ballesta