Say I want to compute the relative complement df2 - df1 between two MultiIndex dataframes. Assuming that they have the same indexing schema, based on what I saw in this answer from Andy Hayden, I could do the following:

diff_indices = df2.index.difference(df1.index)

And then either:

  1. df2 = df2.reindex(diff_indices)

    or

  2. df2 = df2.loc[diff_indices]

What would be the difference between 1. and 2. above? What is the difference between df.reindex and df.loc?

Amelio Vazquez-Reina

1 Answer

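Both approaches return a new Series/DataFrame rather than modifying df2 in place, and when every requested label exists in the index, as it does here, they produce the same result.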
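As a minimal sketch of that equivalence, the snippet below builds two small MultiIndex frames and selects the rows of df2 that are absent from df1 both ways. The frames, the level names key and num, and the value column are made up for illustration. It uses Index.difference for the set difference, on the assumption of a recent pandas where the minus operator on an Index is no longer a set operation.

```python
import pandas as pd

# Hypothetical example data: df1's index is a subset of df2's index.
idx1 = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2)], names=["key", "num"]
)
idx2 = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1), ("b", 2)], names=["key", "num"]
)
df1 = pd.DataFrame({"value": [10, 20]}, index=idx1)
df2 = pd.DataFrame({"value": [10, 20, 30, 40]}, index=idx2)

# Labels that are in df2 but not in df1.
diff_indices = df2.index.difference(df1.index)

# Both calls build a new DataFrame; df2 itself is left untouched.
via_reindex = df2.reindex(diff_indices)
via_loc = df2.loc[diff_indices]

same_result = via_reindex.equals(via_loc)
df2_unchanged = len(df2) == 4

print(via_loc)
print("same result:", same_result, "| df2 unchanged:", df2_unchanged)
```

To keep only the difference, assign the result back, for example df2 = df2.loc[diff_indices].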

The reason for the seeming redundancy is that loc is syntactically limited (you can only pass a single argument to __getitem__), whereas reindex is a method and so accepts various optional parameters. (docs)
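To illustrate what those optional parameters buy you, here is a small sketch using two of them, fill_value and method. The Series and labels are invented for the example. It also shows one practical consequence, assuming a recent pandas (1.0 or later, as far as I know): loc raises a KeyError when a list contains a label that is not in the index, whereas reindex inserts a row for it.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])

# reindex accepts labels that are not in the index;
# fill_value decides what those new rows contain (NaN by default).
filled = s.reindex(["a", "c", "z"], fill_value=0.0)

# method="ffill" fills new labels from the previous existing label
# (this requires a monotonic index).
t = pd.Series([10, 20], index=[0, 5])
forward = t.reindex([0, 1, 5, 6], method="ffill")

# loc takes only the key, so there is nowhere to pass such options.
# In recent pandas a list with a missing label raises KeyError.
try:
    s.loc[["a", "c", "z"]]
    loc_raised = False
except KeyError:
    loc_raised = True

print(filled)
print(forward)
print("loc raised KeyError:", loc_raised)
```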

David Nehme
shx2