
Let's assume I have the following data frame:

xx2xx30x4xx <- rep(5,30)
yyyy3yy50y5yyy <- rep(4,30)
zz12zzzz70z8zz <- rep(7,30)
df <- data.frame(xx2xx30x4xx,yyyy3yy50y5yyy,zz12zzzz70z8zz)

I would like to rename the columns so that each name consists only of the largest number it contains. I thought of doing it with gsub/grep and a loop. For example, this returns the column names:

grep(pattern = "[50-100]", x = colnames(df), value= T )

Now I want each column name to be equal to the part that matched the pattern, that is, the number between 50 and 100 and not the smaller numbers. Is this possible? If not, do you know another generic way to rename the columns as described? Thanks in advance.

Yaahtzeck

1 Answer

xxxxxx30xxxx <- rep(5,30)
yyyyyyy50yyyyy <- rep(4,30)
zzzzzzz70zzzz <- rep(7,30)
df <- data.frame(zzzzzzz70zzzz,yyyyyyy50yyyyy,xxxxxx30xxxx)

# "[0-9]" matches any digit; "[0-100]" is a character class that only matches "0" or "1".
grep(pattern = "[0-9]", x = colnames(df), value = TRUE)

new_colnames <- gsub("\\D", "", colnames(df))
colnames(df) <- new_colnames

I hope I understood you correctly. The gsub command erases everything that is not a digit from the column names, so you're left with the numbers in between.
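
To see what this does when a name contains more than one number, here is a small sketch using the column names from the question. The variable names `nms` and `stripped` are only illustrative.

```r
# Column names taken from the question; each contains several numbers.
nms <- c("xx2xx30x4xx", "yyyy3yy50y5yyy", "zz12zzzz70z8zz")

# Removing every non-digit glues the remaining digits together,
# so the separate numbers can no longer be told apart.
stripped <- gsub("\\D", "", nms)
print(stripped)
# [1] "2304"  "3505"  "12708"
```

So the gsub approach is only suitable when each name holds a single number.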

EDIT:

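This code matches the first two-digit number between 30 and 70 in each column name and extracts it. Note that regmatches() drops names without a match, so the result can be shorter than the number of columns if a name contains no such number.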

xxxxxx30xxxx <- rep(5,30)
yyyyyyy50yyyyy <- rep(4,30)
zzzzzzz70zzzz <- rep(7,30)
df <- data.frame(zzzzzzz70zzzz,yyyyyyy50yyyyy,xxxxxx30xxxx)

# "[0-9]" matches any digit; "[0-100]" is a character class that only matches "0" or "1".
grep(pattern = "[0-9]", x = colnames(df), value = TRUE)

# new_colnames <- gsub("\\D", "", colnames(df))

# regexpr() locates the first match of 30-69 or 70 in each name; regmatches() extracts it.
new_colnames <- regmatches(colnames(df), regexpr("[3-6][0-9]|70", colnames(df)))

colnames(df) <- new_colnames
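
If the goal is simply the largest number in each name, as the question puts it, one possible alternative is to extract every run of digits and compare them numerically. This is only a sketch: `largest_number` is a hypothetical helper, not a base R function, and it assumes the numbers are non-negative integers.

```r
# Hypothetical helper: return the largest number embedded in each string,
# or NA when a string contains no digits at all.
largest_number <- function(x) {
  # gregexpr() finds every run of digits; regmatches() then returns one
  # character vector of matches per input string.
  runs <- regmatches(x, gregexpr("[0-9]+", x))
  vapply(runs, function(r) {
    if (length(r) == 0) NA_character_ else as.character(max(as.numeric(r)))
  }, character(1))
}

# Data frame from the question.
df <- data.frame(xx2xx30x4xx = rep(5, 30),
                 yyyy3yy50y5yyy = rep(4, 30),
                 zz12zzzz70z8zz = rep(7, 30))

colnames(df) <- largest_number(colnames(df))
print(colnames(df))
# [1] "30" "50" "70"
```

Because the result always has one element per input name, the assignment to colnames(df) keeps its length even when a name has no digits.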

Here's some information on regular expressions and string operations:

https://stat.ethz.ch/R-manual/R-devel/library/base/html/regex.html

https://www.regular-expressions.info/rlanguage.html
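
One point from those references is worth illustrating: square brackets define a character class, not a numeric range, so a pattern like "[50-100]" matches a single "5", "0" or "1". For a numeric range such as 50 to 100, it may be simpler to extract the numbers and compare them as numbers. The sketch below assumes that approach; `number_in_range` and its `lo`/`hi` parameters are hypothetical names chosen for illustration.

```r
# "[50-100]" is a character class: it matches one character that is "5", "0" or "1".
in_class <- grepl("[50-100]", c("5", "0", "1", "7", "99"))
print(in_class)
# [1]  TRUE  TRUE  TRUE FALSE FALSE

# Hypothetical helper: return the first embedded number that lies in [lo, hi],
# comparing numerically instead of by pattern; NA when there is none.
number_in_range <- function(x, lo = 50, hi = 100) {
  runs <- regmatches(x, gregexpr("[0-9]+", x))
  vapply(runs, function(r) {
    n <- as.numeric(r)
    hit <- n[n >= lo & n <= hi]
    if (length(hit) == 0) NA_character_ else as.character(hit[1])
  }, character(1))
}

picked <- number_in_range(c("zz2z3z70zzz5z", "yyyy3yy50y5yyy", "xx2xx30x4xx"))
print(picked)
# [1] "70" "50" NA
```

Names with no number in the range come back as NA, so they can be inspected or filtered before renaming.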

brettljausn
  • Yes, this works (almost) fine! What if a column name contains several numbers, for example zz2z3z70zzz5z, and I only want the number that falls in a certain range, let's say from 50 to 100? In that case it would also have to eliminate the 2, 3 and 5. Thanks! – Yaahtzeck Oct 02 '17 at 14:33
  • Check out my edited answer :) – brettljausn Oct 03 '17 at 06:31