5

I have the string myFunction(arg1=\"hop\",arg2=TRUE) and I want to extract the part between the quotes (\"hop\" in this example).

So far I have tried this, with no success:

gsub(pattern="(myFunction)(\\({1}))(.*)(\\\"{1}.*\\\"{1})(.*)(\\){1})",replacement="//4",x="myFunction(arg1=\"hop\",arg2=TRUE)")

Any help by a regex guru would be welcome!

RockScience

4 Answers

10

Try

 sub('[^\"]+\"([^\"]+).*', '\\1', x)
 #[1] "hop"

Or

 sub('[^\"]+(\"[^\"]+.).*', '\\1', x)
 #[1] "\"hop\""

The \" escape is not needed inside a single-quoted R string; a plain " works too:

 sub('[^"]*("[^"]*.).*', '\\1', x)
 #[1] "\"hop\""

If there are multiple matches, as @AvinashRaj mentioned in his post, sub is less useful, since patterns like the ones above return only the first quoted value. An option using stringi would be

 library(stringi)
 stri_extract_all_regex(x1, '"[^"]*"')[[1]]
 #[1] "\"hop\""  "\"hop2\""

data

 x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
 x1 <- "myFunction(arg1=\"hop\",arg2=TRUE arg3=\"hop2\", arg4=TRUE)"
akrun
  • 1
    many thanks, this works great. Could you explain the rationale for the first solution? – RockScience Apr 08 '15 at 08:09
  • 1
    @RockScience The first solution matches all characters that are not `\"`, i.e. `[^\"]+`, followed by a `\"`, then uses a capture group (within parentheses) to get the following characters that are not `\"`, and uses `\\1` to extract the capture group. – akrun Apr 08 '15 at 08:11
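To make the explanation in the comment above concrete, here is a minimal annotated sketch of the first pattern, written with a plain " instead of \". The variable names pattern, inner and no_match are only illustrative. It also shows that sub returns the input unchanged when nothing matches, which is worth keeping in mind before relying on the result.

```r
x <- "myFunction(arg1=\"hop\",arg2=TRUE)"

# [^"]+    one or more non-quote characters: the prefix  myFunction(arg1=
# "        the opening quote
# ([^"]+)  capture group 1: the non-quote characters that follow (hop)
# .*       everything else, starting at the closing quote
pattern <- '[^"]+"([^"]+).*'

# The whole string matches, so it is replaced by capture group 1 alone.
inner <- sub(pattern, "\\1", x)

# With no quotes in the input the pattern cannot match, and sub
# hands back the original string untouched.
no_match <- sub(pattern, "\\1", "myFunction(arg2=TRUE)")
```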
8

You could also use the regmatches function. sub or gsub only works for a particular input; for the general case you should extract the matches instead of removing everything around them.

> x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
> regmatches(x, gregexpr('"[^"]*"', x))[[1]]
[1] "\"hop\""

To get only the text inside the quotes, pass the result of the call above to gsub to remove the quotes.

> x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
> gsub('"', '', regmatches(x, gregexpr('"([^"]*)"', x))[[1]])
[1] "hop"
> x <- "myFunction(arg1=\"hop\",arg2=\"TRUE\")"
> gsub('"', '', regmatches(x, gregexpr('"([^"]*)"', x))[[1]])
[1] "hop"  "TRUE"
Avinash Raj
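The regmatches approach above could be wrapped in a small helper so that it works on a whole character vector. This is only a sketch: extract_quoted and its keep_quotes argument are hypothetical names, not part of base R, and it assumes the quoted values contain no escaped quotes.

```r
# Hypothetical helper (not from the answers above): return the quoted
# substrings of each element of x, with or without the quotes themselves.
extract_quoted <- function(x, keep_quotes = FALSE) {
  # gregexpr locates every "..." run; regmatches pulls them out,
  # giving one list element per input string.
  hits <- regmatches(x, gregexpr('"[^"]*"', x))
  if (keep_quotes) {
    return(hits)
  }
  # Remove the literal quote characters from each match.
  lapply(hits, function(m) gsub('"', "", m, fixed = TRUE))
}

x <- c("myFunction(arg1=\"hop\",arg2=\"TRUE\")", "noQuotes(arg=1)")
res <- extract_quoted(x)
with_quotes <- extract_quoted(x[1], keep_quotes = TRUE)
```

An input without any quoted text yields an empty character vector for that element, so the caller can tell "no match" apart from a matched empty string.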
3

You can try:

str='myFunction(arg1=\"hop\",arg2=TRUE)'

gsub('.*(\\".*\\").*','\\1',str)
#[1] "\"hop\""
Colonel Beauvel
2
x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
unlist(strsplit(x,'"'))[2]
# [1] "hop"
pogibas
  • 1
    with `paste0("\"",unlist(strsplit(x,'\"',perl=T))[2],"\"")` to get the desired result... (check the comments after the OP's question) – Cath Apr 08 '15 at 08:21
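The strsplit idea can be extended to several quoted values. A rough sketch, assuming the quotes are balanced and the values contain no escaped quotes: after splitting on ", the pieces at even positions are the ones that sat between quotes. The names parts, inside and quoted are illustrative only.

```r
x1 <- "myFunction(arg1=\"hop\",arg2=TRUE arg3=\"hop2\", arg4=TRUE)"

# Split on the literal quote character; fixed = TRUE skips regex parsing.
parts <- strsplit(x1, '"', fixed = TRUE)[[1]]

# Pieces alternate outside/inside/outside/..., so even positions are
# the quoted contents. Logical indexing is also safe when there is
# only one piece (no quotes at all).
inside <- parts[seq_along(parts) %% 2 == 0]

# Put the quotes back, as suggested in the comment above.
quoted <- paste0('"', inside, '"')
```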