2

Similar questions have been asked here and here. However, I can't seem to get them to work for me.

If I have a character vector like:

myString <- c("5", "10", "100\abc\nx1\n1")

I want to remove everything after (and including) the first backslash. For example, my expected result would be:

>myString
"5" "10" "100"

I have tried using sub, gsub, and strsplit but I just can't seem to get it to work. Things I've tried:

gsub("\\\\*", "", myString)
sub("\\\\.*", "", myString)
gsub('\\"', "", myString, fixed = TRUE)
gsub("\\.*","", myString)

I'm not great with regex, so I'm almost certainly not using these functions correctly. Any advice on how to fix this?

Electrino
  • Your `myString` does not contain a `\ `. If it should contain a `\ ` you can use e.g.: `myString – GKi Sep 01 '21 at 15:45
  • Explanation of why this is happening: https://stackoverflow.com/questions/25424382/replace-single-backslash-in-r – Skaqqs Sep 01 '21 at 15:45
  • Ah ok... So by my understanding, what I really have in `myString[3]` is separate characters like "100", "\abc", "\nx1", and "\n1". The issue is, I'm extracting a giant string from a model output and trying to separate it, so the single backslashes are part of the string I extract. – Electrino Sep 01 '21 at 15:57
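To make the point in the comments concrete, here is a minimal sketch in base R. It assumes the string was typed as a literal exactly as in the question; `realString` is a hypothetical variant with doubled backslashes, standing in for data that really does contain backslash characters.

```r
# The literal from the question: "\a" is the bell character and "\n" is a newline,
# so no backslash character is stored in the string.
myString <- c("5", "10", "100\abc\nx1\n1")

# Check each element for a literal backslash
has_backslash <- grepl("\\", myString, fixed = TRUE)
print(has_backslash)

# Inspect the code points of the third element (7 is bell, 10 is newline)
codes <- utf8ToInt(myString[3])
print(codes)

# Hypothetical input with real backslashes (doubled in the literal):
# the second attempt from the question then works as intended.
realString <- c("5", "10", "100\\abc\\nx1\\n1")
cleaned <- sub("\\\\.*", "", realString)
print(cleaned)
```

If the string is read from a file or model output rather than typed as a literal, the backslashes are stored as real characters, and the `sub("\\\\.*", "", ...)` call should apply directly.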

3 Answers

2

Here is another way you could try:

gsub("(^\\d+)([\a-zA-Z0-9]*)", "\\1", myString)

[1] "5"   "10"  "100"
Anoushiravan R
1

The information from @Skaqqs led me to something helpful by @bartektartanus. It's not base R, unfortunately, but I think this should work: use the stringi package to escape the special characters, which turns them into literal backslash sequences that the regex can match.

library(stringi)
myString <- c("5", "10", "100\abc\nx1\n1")
gsub("\\\\.*", "", stri_escape_unicode(myString))

result:

[1] "5"   "10"  "100"
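If a base R route is preferred, one possible sketch is to keep only the leading run of digits instead of deleting from the backslash onward. This is a different technique from the stringi approach above, and it assumes every element starts with at least one digit (elements without a match would be dropped by `regmatches`).

```r
# Same literal as in the question; the third element holds bell and newline characters
myString <- c("5", "10", "100\abc\nx1\n1")

# regexpr() locates the leading digits in each element,
# regmatches() extracts exactly that matched part
leading <- regmatches(myString, regexpr("^[0-9]+", myString))
print(leading)
```

Because this never looks at what follows the digits, it does not matter whether the rest of the string holds escape characters or real backslashes.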
Silentdevildoll
1

We could use parse_number from the readr package. Note that it returns a numeric vector rather than a character vector:

readr::parse_number(myString)
[1]   5  10 100
akrun