162

I have a dataframe df:

20060930  10.103       NaN     10.103   7.981
20061231  15.915       NaN     15.915  12.686
20070331   3.196       NaN      3.196   2.710
20070630   7.907       NaN      7.907   6.459

I want to select the rows at certain positions given in a list, for example [1,3], which should leave:

20061231  15.915       NaN     15.915  12.686
20070630   7.907       NaN      7.907   6.459

How can I do that, and which function should I use?

xxx
user2806761

7 Answers

200
ind_list = [1, 3]
df.ix[ind_list]

should do the trick! When I index data frames I always use the .ix indexer. It's so much easier and more flexible...

UPDATE: This is no longer the accepted method for indexing. The ix indexer is deprecated (and was removed in pandas 1.0). Use .iloc for integer-based indexing and .loc for label-based indexing. See the example below:

ind_list = [1, 3]
df.iloc[ind_list]
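To make the .iloc versus .loc distinction concrete, here is a minimal sketch. The frame, its column names (`a`, `b`) and the date-like integer labels are assumptions chosen to resemble the data in the question, not the asker's actual frame.

```python
import pandas as pd

# Hypothetical frame: date-like integers as index labels.
df = pd.DataFrame(
    {"a": [10.103, 15.915, 3.196, 7.907], "b": [7.981, 12.686, 2.710, 6.459]},
    index=[20060930, 20061231, 20070331, 20070630],
)

# .iloc selects by integer position: the 2nd and 4th rows.
by_position = df.iloc[[1, 3]]

# .loc selects by index label: the same rows, addressed by their labels.
by_label = df.loc[[20061231, 20070630]]

print(by_position.equals(by_label))
```

Both selections return the same two rows here; they only differ in how the rows are addressed.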
tpk
Woody Pride
135

You can also use iloc:

df.iloc[[1,3],:]

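Note that iloc selects by position. It will not give the rows labelled 1 and 3 if the index labels of your dataframe no longer correspond to the order of the rows, for example after prior computations such as sorting or filtering. In that case select by label with a boolean mask:

df[df.index.isin([1,3])]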

... as suggested in other responses.
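A small sketch of the difference, assuming a hypothetical frame whose integer labels are in reverse order relative to the row positions:

```python
import pandas as pd

# Hypothetical frame: labels 3, 2, 1, 0 sit at positions 0, 1, 2, 3.
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=[3, 2, 1, 0])

# Positional selection: the 2nd and 4th rows, whatever their labels are.
positional = df.iloc[[1, 3]]

# Label-based selection: rows whose label is 1 or 3, kept in frame order.
by_label = df[df.index.isin([1, 3])]

print(positional)
print(by_label)
```

Here `positional` holds the rows labelled 2 and 0, while `by_label` holds the rows labelled 3 and 1, so the two approaches are only interchangeable when labels and positions coincide.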

Community
yemu
94

Another way, although the code is longer, is faster than the approaches above. You can check this with the %timeit magic:

df[df.index.isin([1,3])]

PS: Figuring out the reason is left to you.
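Speed aside, the mask-based selection also behaves differently from passing a list of labels to .loc, which is worth knowing before swapping one for the other. The sketch below uses a hypothetical frame with a default integer index; the KeyError noted in the comment assumes pandas 1.0 or later.

```python
import pandas as pd

# Hypothetical frame with the default index 0..3.
df = pd.DataFrame({"value": [10, 20, 30, 40]})

# isin builds a boolean mask, so rows come back in frame order,
# regardless of the order of the requested labels.
masked = df[df.index.isin([3, 1])]

# .loc with a list returns rows in the order the labels were requested.
ordered = df.loc[[3, 1]]

# Labels that do not exist are silently ignored by the mask.
# (Assumption: on pandas >= 1.0, df.loc[[1, 99]] raises KeyError instead.)
tolerant = df[df.index.isin([1, 99])]

print(list(masked.index), list(ordered.index), list(tolerant.index))
```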

Community
Amruth Lakkavaram
15

If index_list contains the positions of your desired rows, you can get the dataframe with the desired rows by doing

index_list = [1,2,3,4,5,6]
df.loc[df.index[index_list]]

This is based on the latest documentation as of March 2021.
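As a rough illustration of this pattern, the sketch below converts positions to labels with `df.index[...]` and then uses .loc for row selection, column selection and assignment. The frame, its labels and the column names are hypothetical, and the approach assumes the index labels are unique (with duplicate labels, .loc would return every matching row).

```python
import pandas as pd

# Hypothetical frame with string labels, so positions and labels clearly differ.
df = pd.DataFrame(
    {"my_column": ["a", "b", "c", "d"], "other": [1, 2, 3, 4]},
    index=["w", "x", "y", "z"],
)

index_list = [1, 3]
labels = df.index[index_list]          # positions 1 and 3 -> labels 'x' and 'z'

rows = df.loc[labels]                  # whole rows
col = df.loc[labels, "my_column"]      # a single column of those rows

# Assignment through .loc writes into the original frame.
df.loc[labels, "my_column"] = "my_value"

print(df)
```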

user42
    This is a great answer. The advantage of this method is that you can use the full power of df.loc. For example you can select the column you want with df.loc[df.index[index_list], "my_column"] and even set values with df.loc[df.index[index_list], "my_column"] = "my_value" – Gabriel Apr 05 '21 at 14:19
5

For large datasets, it is memory efficient to read only selected rows via the skiprows parameter.

Example

pred = lambda x: x not in [1, 3]
pd.read_csv("data.csv", skiprows=pred, index_col=0, names=...)

This will now return a DataFrame from a file that skips all rows except 1 and 3.


Details

From the docs:

skiprows : list-like or integer or callable, default None

...

If callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise. An example of a valid callable argument would be lambda x: x in [0, 2]

This feature is available in pandas 0.20.0+. See also the corresponding issue and a related post.
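A self-contained sketch of the callable form, reading from an in-memory string instead of a file. The CSV content, the column names (`date`, `a`, `b`) and the `keep_rows` set are assumptions for illustration. Note that the callable receives file line numbers starting at 0, so a header line, if present, would count as row 0; this example has no header.

```python
import io

import pandas as pd

# Hypothetical headerless CSV with four data rows (file rows 0..3).
csv_text = (
    "20060930,10.103,7.981\n"
    "20061231,15.915,12.686\n"
    "20070331,3.196,2.710\n"
    "20070630,7.907,6.459\n"
)

keep_rows = {1, 3}

# The callable returns True for rows to skip, so only rows 1 and 3 are parsed.
df = pd.read_csv(
    io.StringIO(csv_text),
    skiprows=lambda i: i not in keep_rows,
    header=None,
    names=["date", "a", "b"],
    index_col=0,
)

print(df)
```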

Community
pylang
2

There are many ways of solving this problem, and the ones listed above are the most commonly used ways of achieving the solution. I want to add two more ways, just in case someone is looking for an alternative.

index_list = [1,3]

# positional selection
df.take(index_list)

# or, label-based selection
df.query('index in @index_list')
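The two alternatives are not equivalent in general: `take` works on positions, while `query` with `index in ...` matches index labels. A minimal sketch, assuming a hypothetical frame whose integer labels differ from the row positions:

```python
import pandas as pd

# Hypothetical frame: labels 10, 20, 30, 40 at positions 0..3.
df = pd.DataFrame({"value": [1, 2, 3, 4]}, index=[10, 20, 30, 40])

positions = [1, 3]
wanted_labels = [20, 40]

# take: rows at the given integer positions.
taken = df.take(positions)

# query: rows whose index label is in the list (@ refers to a local variable).
queried = df.query("index in @wanted_labels")

print(taken.equals(queried))
```

With a default 0..n-1 index the same list can serve both calls, which is why the two look interchangeable in simple cases.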
Loochie
  • this is the correct answer if you have say a named index like: `pd.DataFrame({'num_legs': [2, 4, 8, 0, 6, 10], 'num_wings': [2, 0, 0, 0, 4, 0], 'num_specimen_seen': [10, 2, 1, 8, 3, 0], 'do_I_like_it': [0, 1, 1, 1, 0, 0]}, index=['falcon', 'dog', 'spider', 'fish', 'dragonfly', 'limulus'])` – user27221 Mar 11 '21 at 16:15
  • @user27221 could you please take your DataFrame, transpose it, and then explain to me how to select `num_legs` based on `num_wings == 0` and `do_I_like_it == 1`? – vault Sep 08 '21 at 11:16
0

What you are trying to do is to filter your dataframe by index. The best way to do that in pandas at the moment is the following:

Single Index

desired_index_list = [1,3]
df[df.index.isin(desired_index_list)]

Multiindex

desired_index_list = [1,3]
index_level_to_filter = 0
df[df.index.get_level_values(index_level_to_filter).isin(desired_index_list)]
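For the MultiIndex case, a short sketch may help. The level names (`num`, `letter`) and the data are hypothetical; `get_level_values` accepts either the level position or the level name.

```python
import pandas as pd

# Hypothetical two-level index.
index = pd.MultiIndex.from_tuples(
    [(1, "a"), (2, "a"), (3, "b"), (4, "b")],
    names=["num", "letter"],
)
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=index)

# Filter on the first level by position...
by_first_level = df[df.index.get_level_values(0).isin([1, 3])]

# ...or on another level by name.
by_letter = df[df.index.get_level_values("letter").isin(["b"])]

print(by_first_level)
print(by_letter)
```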
Julio