I am trying to construct a regex pattern that lets me check whether a specific combination of words appears within a single sentence.

Text Example

In the body of your question, start by expanding on the summary you put in the title. Explain how you encountered the problem you're trying to solve, and any difficulties that have prevented you from solving it yourself. The first paragraph in your question is the second thing most readers will see, so make it as engaging and informative as possible.

Now I am trying to create a pattern that will tell me whether any sentence in this text contains all the words of a given combination, in any order.

Example combination:
summary, question

Example code:

        // Requires: using System; using System.Text.RegularExpressions;
        Regex regex = new Regex(@"(summary|question).*\w\.");
        Match match = regex.Match("In the body of your question, start by expanding on the summary you put in the title. Explain how you encountered the problem you're trying to solve, and any difficulties that have prevented you from solving it yourself. The first paragraph in your question is the second thing most readers will see, so make it as engaging and informative as possible.");
        if (match.Success)
        {
            Console.WriteLine("Success");
        }
        else
        {
            Console.WriteLine("Fail");
        }

Output:

Success

Example code:

        // Same pattern, but the first sentence (the only one containing "summary") is removed.
        Regex regex = new Regex(@"(summary|question).*\w\.");
        Match match = regex.Match("Explain how you encountered the problem you're trying to solve, and any difficulties that have prevented you from solving it yourself. The first paragraph in your question is the second thing most readers will see, so make it as engaging and informative as possible.");
        if (match.Success)
        {
            Console.WriteLine("Success");
        }
        else
        {
            Console.WriteLine("Fail");
        }

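Desired output (the pattern above actually prints Success, because the alternation is satisfied by "question" alone):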

Fail

My ultimate goal is to read any number of words from the user (1..n), build them into a regex pattern string, and use that pattern to check any text.

e.g. (please ignore the faulty patterns; I am only using them as a visual representation)

Words: question, summary    
pattern: (question|summary).*\w  
Words: user, new, start    
pattern: (user|new|start).*\w
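The word-list-to-pattern step could be sketched as follows. This is a minimal sketch, not a tested solution: the class and method names (`WordPattern`, `BuildAllWordsPattern`) are hypothetical, and it assumes each word should be matched as a whole word via `\b`, which only behaves as expected when the word starts and ends with a word character. It uses one lookahead per word so that all words must be present in any order, but it checks the whole input rather than one sentence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordPattern
{
    // Builds one lookahead per word so that every word must appear, in any order.
    // Regex.Escape protects against user input containing regex metacharacters.
    // NOTE: this checks the whole input string, not an individual sentence.
    public static string BuildAllWordsPattern(IEnumerable<string> words)
    {
        var lookaheads = words
            .Select(w => w.Trim())
            .Where(w => w.Length > 0)
            .Select(w => @"(?=.*\b" + Regex.Escape(w) + @"\b)");
        return "^" + string.Concat(lookaheads);
    }

    public static void Main()
    {
        string pattern = BuildAllWordsPattern(new[] { "question", "summary" });
        Console.WriteLine(pattern); // ^(?=.*\bquestion\b)(?=.*\bsummary\b)

        // Singleline lets '.' cross line breaks; IgnoreCase mirrors the /i flag.
        var regex = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
        Console.WriteLine(regex.IsMatch("Expand on the summary in your question.")); // True
        Console.WriteLine(regex.IsMatch("Only the question is mentioned here."));    // False
    }
}
```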

I really hope this makes sense. I am relearning regex (I haven't used it in over a decade).

EDIT 1 (REOPEN JUSTIFICATION):

I have reviewed some earlier answers and am a little bit closer.

My new pattern is as follows:

/^(?=.*Reference)(?=.*Cheatsheet)(?=.*Help).*[\..]/gmi

But as the example at https://regex101.com/r/m2HSVq/1 shows, it doesn't fully work. It looks for the word combination within the whole paragraph rather than within a single sentence.

As stated in the original text, I only want a match when all the words occur within one sentence (delimited by a full stop or the end of the text).
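One way the sentence restriction might be expressed is to replace `.*` in each lookahead with `[^.]*`, so a lookahead cannot cross a full stop, and to anchor the attempt at a sentence start. The sketch below is an assumption-laden illustration, not a confirmed fix: `SentenceScopedPattern` and `Build` are hypothetical names, and it treats every `.` as a sentence boundary, so abbreviations such as `Mr.` will split sentences incorrectly.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SentenceScopedPattern
{
    // (?<![^.])  : position is at the start of the text or right after a full stop.
    // (?=[^.]*…) : each word must occur before the next full stop, i.e. in this sentence.
    // ASSUMPTION: every '.' ends a sentence (abbreviations like "Mr." are not handled).
    public static string Build(params string[] words)
    {
        var lookaheads = words.Select(w => @"(?=[^.]*\b" + Regex.Escape(w) + @"\b)");
        return @"(?<![^.])" + string.Concat(lookaheads) + @"[^.]*";
    }

    public static void Main()
    {
        var regex = new Regex(Build("summary", "question"), RegexOptions.IgnoreCase);

        // Both words in the same sentence.
        Console.WriteLine(regex.IsMatch("Expand on the summary in your question. Then stop.")); // True

        // Both words present, but in different sentences.
        Console.WriteLine(regex.IsMatch("The summary is short. The question is long."));        // False
    }
}
```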

My fallback option, if I can't find a single-pattern solution, is to split the string at full stops and then match each sentence individually.
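That fallback might look roughly like the sketch below. It is only an illustration under stated assumptions: `SentenceSplitMatcher` and `SentencesContainingAll` are hypothetical names, and splitting on `.` is a naive stand-in for real sentence segmentation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SentenceSplitMatcher
{
    // Splits the text at full stops, then keeps the sentences that contain every word.
    // ASSUMPTION: '.' is the only sentence delimiter; abbreviations are not handled.
    public static IEnumerable<string> SentencesContainingAll(string text, IEnumerable<string> words)
    {
        // One whole-word, case-insensitive regex per user-supplied word.
        var wordRegexes = words
            .Select(w => new Regex(@"\b" + Regex.Escape(w) + @"\b", RegexOptions.IgnoreCase))
            .ToList();

        return text
            .Split(new[] { '.' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(s => s.Trim())
            .Where(s => s.Length > 0 && wordRegexes.All(r => r.IsMatch(s)));
    }

    public static void Main()
    {
        string text = "In the body of your question, start by expanding on the summary you put in the title. "
                    + "The first paragraph in your question is the second thing most readers will see.";

        foreach (string sentence in SentencesContainingAll(text, new[] { "summary", "question" }))
        {
            Console.WriteLine("Match: " + sentence);
        }
    }
}
```

Keeping one small regex per word avoids building a single large pattern, at the cost of scanning each sentence once per word.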

Aeseir
  • @wiktor-stribiżew thanks for this. I saw that once and couldn't find it again. That will get me 50% of the way there. Thanks – Aeseir Oct 28 '18 at 02:37
  • You should not use regex for arbitrary sentence splitting. I suggest using NLP tools to do that and then checking for the words you need. Regex cannot tell `Mr. John` from `I like this Mr.`, and all regex ways of doing that are error-prone. – Wiktor Stribiżew Oct 28 '18 at 11:38
  • @WiktorStribiżew thanks appreciate the feedback. I'll do that. – Aeseir Oct 29 '18 at 00:27
  • See also [this thread](https://stackoverflow.com/questions/1936388/what-is-a-regular-expression-for-parsing-out-individual-sentences) on how to split text into sentences. – Wiktor Stribiżew Oct 29 '18 at 08:32
  • @WiktorStribiżew thanks mate, I just ran across that seconds ago. I've now set up a program that reads in the text, splits it using the split function and then does the matching. I think the regex way is better. – Aeseir Oct 29 '18 at 11:02
  • While https://regex101.com/r/m2HSVq/3 will work, https://regex101.com/r/m2HSVq/4 won't. You can't use a single regex anyway. – Wiktor Stribiżew Oct 29 '18 at 11:06
  • @WiktorStribiżew I agree. That said, I think a multi-regex pattern search is the way to go. – Aeseir Oct 29 '18 at 11:08
