Remove last N rows in data frame with the arbitrary number of rows

Question

I have a data frame and I want to remove the last N rows from it. To remove 5 rows, I currently use the following command, which in my opinion is rather convoluted:

df<- df[-seq(nrow(df),nrow(df)-4),]

How would you accomplish this task? Is there a convenient function that I can use in R?

In unix, I would use:

tac file | sed '1,5d' | tac

In unix, I would use: `head -n -5 file` – zx8754 Feb 18 '16 at 11:27

Simon O'Hanlon · Accepted Answer · 2014-01-15T21:32:19.873

97

`head` with a negative `n` is convenient for this:

df <- data.frame( a = 1:10 )
head(df,-5)
#  a
#1 1
#2 2
#3 3
#4 4
#5 5

P.S. Your `seq()` example may be written slightly less(?) awkwardly using the named arguments `by` and `length.out` (shortened to `len`), like this: `-seq(nrow(df), by = -1, len = 5)`.

edited Jan 15 '14 at 21:32

answered Jan 15 '14 at 21:25

Simon O'Hanlon

There's an edge case! `head(df, -0) == head(df,0) != df` – peer Nov 23 '18 at 13:41
@peer sorry, I don't think I understand your comment. Can you illustrate the edge case more fully? – Simon O'Hanlon Nov 23 '18 at 15:41
I'm switching from `df[0:(nrow(df)-n),]` to `head`. In my case the user moves a slider to indicate `n` last rows are to be removed. But there's a catch! When the user sets `n=0` we would expect no rows to be removed. But with `head(df, -n)` all rows are removed because negative zero is resolved to positive zero -> take the first 0 rows. So I want to warn others who set `n` dynamically and allow `n=0`: You'll need `if (n > 0) df=head(df, -n)` – peer Nov 23 '18 at 17:33
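
A minimal sketch of the guard described in the comment above. The helper name `drop_last` is hypothetical (it is not a base R function); it only wraps `head()` so that `n = 0` leaves the data frame untouched.

```r
# Hypothetical helper, not part of base R: drop the last n rows of a data frame.
drop_last <- function(df, n) {
  stopifnot(is.numeric(n), length(n) == 1, n >= 0)
  # head(df, -0) is the same call as head(df, 0) and returns zero rows,
  # so n == 0 has to be handled before calling head().
  if (n == 0) {
    return(df)
  }
  head(df, -n)
}

df <- data.frame(a = 1:10)
nrow(drop_last(df, 0))  # all 10 rows kept
nrow(drop_last(df, 5))  # first 5 rows kept
```

Asking for more rows than exist (for example `drop_last(df, 20)`) gives an empty data frame rather than an error, because `head()` never goes below zero rows.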

score 26 · Answer 2 · answered Jan 15 '14 at 22:54

This one takes one more line, but is far more readable:

n<-dim(df)[1]
df<-df[1:(n-5),]

Of course, you can do it in one line by putting the `dim` call directly into the re-assignment statement. I assume this is part of a reproducible script, so you can retrace your steps. Otherwise, I strongly recommend saving the result to a different variable (e.g., `df2`) and removing the redundant copy only after you're sure you got what you wanted.

Assaf

While the `head` solution is probably preferable, you could also use `nrow(df)` instead of `dim(df)[1]`. – thelatemail Jan 16 '14 at 00:31
intuitive one-liner based on your suggestion: ``d – PatrickT Oct 31 '17 at 15:52
This solution actually removes the rownames of the data frame for me, while the accepted answer (using `head()`) doesn't, so I would not recommend this option. – Brunox13 May 17 '21 at 18:30
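
The indexing approach in this answer can be made a little more defensive. The sketch below assumes a single-column data frame, which is one case where plain `df[rows, ]` returns a bare vector and so loses the row names (possibly what the last comment ran into). It also uses `seq_len()` so that removing every row yields an empty data frame, whereas `1:0` would select row 1. The names `n_drop`, `n_keep`, and `df2` are illustrative only.

```r
# Single-column data frame with explicit row names (illustrative data).
df <- data.frame(a = 1:10, row.names = letters[1:10])

n_drop <- 5
# Never ask for a negative number of rows.
n_keep <- max(nrow(df) - n_drop, 0)

# drop = FALSE keeps the result a data frame even with one column;
# seq_len(0) is an empty index, so n_keep == 0 gives zero rows.
df2 <- df[seq_len(n_keep), , drop = FALSE]

rownames(df2)
```

Saving to `df2` follows the answer's own advice to keep the original until the result has been checked.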

score 22 · Answer 3 · answered Oct 15 '18 at 23:33

Adding a dplyr answer for completeness:

library(dplyr)

# tibble() replaces the deprecated data_frame()
test_df <- tibble(a = c(1,2,3,4,5,6,7,8,9,10),
                  b = c("a","b","c","d","e","f","g","h","i","j"))
slice(test_df, 1:(n()-5))

## A tibble: 5 x 2
#      a b    
#  <dbl> <chr>
#1     1 a    
#2     2 b    
#3     3 c    
#4     4 d    
#5     5 e

score 19 · Answer 4 · answered Oct 14 '20 at 10:37

Another dplyr answer which is even more readable:

df %>% filter(row_number() <= n()-5)

Edgar
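
A short sketch of how the `filter()` approach behaves when the number of rows to drop is zero, plus an alternative using `slice_head()` with a negative `n`. This assumes dplyr 1.0.0 or later (where `slice_head()` is available); the variable names `n_drop`, `kept_all`, and `trimmed` are illustrative.

```r
library(dplyr)

df <- tibble(a = 1:10)

# With n_drop = 0 the condition is row_number() <= n(), so every row is kept.
n_drop <- 0
kept_all <- df %>% filter(row_number() <= n() - n_drop)
nrow(kept_all)

# A negative n in slice_head() is subtracted from the number of rows,
# so n = -5 keeps the first 10 - 5 = 5 rows.
trimmed <- df %>% slice_head(n = -5)
nrow(trimmed)
```

Because the `filter()` condition is computed from the row count, it needs no special case for `n_drop = 0`.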
