
I am struggling with regex.

I have this character vector below:

   texts <- c('I-have-text-2-and-text-8','I-have-text-1-and-text-2','I-have-text-7-and-text-8','I-have-text-2-and-text-1','I-have-text-4-and-text-5','I-have-text-11-and-text-12','I-have-text-13-and-text-32','I-have-text-8-and-text-6')

Two words are important to me: text-1 and text-2. I need both of them to be present, in any order.

I want to extract the elements that contain both of them.

The output should be: [1] "I-have-text-1-and-text-2" "I-have-text-2-and-text-1"

I've been using str_subset from stringr, but I don't know the regular expression for this.

str_subset(texts, 'regex')

Any help would be appreciated.

Laura

3 Answers

With str_subset, the regex can specify text-1 followed by any characters (.*) and then text-2, or (|) the same two words in the reverse order:

library(stringr)
str_subset(texts, 'text-1.*text-2|text-2.*text-1')
[1] "I-have-text-1-and-text-2" "I-have-text-2-and-text-1"
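
If the alternation becomes unwieldy, one possible variant is a pair of lookaheads, each asserting from the start of the string that one of the words occurs somewhere. This is only a sketch using base R's grep with perl = TRUE so that it runs without extra packages; stringr's regex engine also supports lookaheads, so the same pattern should work in str_subset. The names pattern and both are illustrative.

```r
texts <- c('I-have-text-2-and-text-8', 'I-have-text-1-and-text-2',
           'I-have-text-7-and-text-8', 'I-have-text-2-and-text-1',
           'I-have-text-4-and-text-5', 'I-have-text-11-and-text-12',
           'I-have-text-13-and-text-32', 'I-have-text-8-and-text-6')

# Each (?=...) is a zero-width lookahead anchored at the start of the string:
# the first requires "text-1" somewhere, the second requires "text-2" somewhere.
pattern <- "^(?=.*text-1)(?=.*text-2)"

# perl = TRUE is needed because lookaheads are a PCRE feature in base R
both <- grep(pattern, texts, value = TRUE, perl = TRUE)
print(both)
```

Adding a third required word would then mean adding one more lookahead rather than enumerating every ordering.
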
akrun

"Both patterns in any order" sounds complicated for a single regex pattern, but it is trivial to do with two separate patterns:

texts[str_detect(texts, "text-1") & str_detect(texts, "text-2")]
# [1] "I-have-text-1-and-text-2" "I-have-text-2-and-text-1"
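
The same idea could be extended to any number of required words by building one logical vector per word and combining them with &. A minimal sketch, assuming the words are plain strings rather than regexes (hence fixed = TRUE); it uses base R's grepl in place of str_detect so it runs without stringr, and the names words, hits, and keep are illustrative.

```r
texts <- c('I-have-text-2-and-text-8', 'I-have-text-1-and-text-2',
           'I-have-text-7-and-text-8', 'I-have-text-2-and-text-1',
           'I-have-text-4-and-text-5', 'I-have-text-11-and-text-12',
           'I-have-text-13-and-text-32', 'I-have-text-8-and-text-6')

# Words that must all be present (illustrative name)
words <- c("text-1", "text-2")

# One logical vector per word: TRUE where the word occurs as a literal substring
hits <- lapply(words, function(w) grepl(w, texts, fixed = TRUE))

# Element-wise AND across all the logical vectors
keep <- Reduce(`&`, hits)

result <- texts[keep]
print(result)
```
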
Gregor Thomas

You can use an alternation pattern with | to alternate between text-1 followed by text-2 and vice versa:

grep("text-1.*text-2|text-2.*text-1", texts, value = TRUE)
[1] "I-have-text-1-and-text-2" "I-have-text-2-and-text-1"

The stringr equivalent would be:

str_subset(texts, "text-1.*text-2|text-2.*text-1")
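
One caveat worth sketching: these patterns match substrings, so text-1 is also found inside text-12. That does not affect the vector in the question, but it could matter for other data. A possible guard is a word boundary (\\b) after each number. The vector tricky below is made up for illustration and is not part of the question.

```r
# Hypothetical input: the first element contains "text-12", not "text-1"
tricky <- c("I-have-text-12-and-text-2", "I-have-text-1-and-text-2")

loose  <- "text-1.*text-2|text-2.*text-1"
strict <- "text-1\\b.*text-2\\b|text-2\\b.*text-1\\b"

# The loose pattern accepts both elements, because "text-1" matches inside "text-12"
loose_hits <- grep(loose, tricky, value = TRUE, perl = TRUE)

# The strict pattern requires the number to end at a word boundary,
# so only the element with a genuine "text-1" is kept
strict_hits <- grep(strict, tricky, value = TRUE, perl = TRUE)

print(loose_hits)
print(strict_hits)
```
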
Chris Ruehlemann