1

I am parsing a feed and need to exclude fields whose string contains the word "bad" as a whole word, in any combination of case.

For example, "bad", "Bad id", or "user has bAd id" would not pass the regular expression test, but "xxx Badlands ddd" or "aaabad" would pass.

user2864740
Brian
  • 2
    probably useful: http://stackoverflow.com/questions/406230/regular-expression-to-match-string-not-containing-a-word?rq=1 – Marc B Aug 22 '14 at 18:00
  • What delimits the word "bad" ? You'll probably see a lot of `\bbad\b` but that's not really correct. A regex 'word' is not really a language word. –  Aug 22 '14 at 18:01
  • 1
    This question is somewhat interesting on two grounds (although I'm not sure it warrants being a non-duplicate of the linked question): 1) It asks for a *negative/inverted* result on the test ("would not pass"), 2) It asks only for finding "bad" as a whole/distinct word (such that "bad" and "badlands" yield different results). – user2864740 Aug 22 '14 at 18:12
  • 1
    In what programming language are you using regular expressions? What have you tried already? Please elaborate in your question. – Ruud Helderman Aug 22 '14 at 18:15
  • @Brian Regular expressions are generally better (easier to understand) when written in the "would pass/match" case - is it absolutely vital that the negated logic is *inside* the regular expression, or could `!match(re)` (to invert the result) simply be used in the programming language? – user2864740 Aug 22 '14 at 18:17
  • I am voting to close as a duplicate. I changed the example in the answer to the accepted answer of the linked question to `^((?!\bbad\b).)*$` (where `\bbad\b` is trivially "word-boundary,b,a,d,word-boundary") and it works per the rules in the question. Make sure to use the `/i` flag. – user2864740 Aug 22 '14 at 18:48
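The negative-lookahead pattern from the last comment can be tried out with a minimal sketch like the one below. The variable names and the sample array are assumptions made for illustration; the strings themselves come from the question. Note that `.` does not match a newline, so a multi-line field would fail this test regardless of its content.

```javascript
// Matches only strings that do NOT contain "bad" as a whole word.
// The lookahead (?!\bbad\b) is checked before each character is consumed.
const noBadWord = /^((?!\bbad\b).)*$/i;

// Sample fields taken from the question's examples.
const sampleFields = ["bad", "Bad id", "user has bAd id", "xxx Badlands ddd", "aaabad"];

// Keep only the fields that pass the regular expression test.
const passing = sampleFields.filter((field) => noBadWord.test(field));
console.log(passing); // [ 'xxx Badlands ddd', 'aaabad' ]
```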

4 Answers

1

Exclude anything that matches `/\bbad\b/i`.

The `\b` matches word boundaries and the `i` modifier makes it case insensitive.
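One way to read "exclude anything that matches" is to negate the test in code rather than inside the pattern. This is a sketch in JavaScript, which is an assumption because the question does not name a language. The array and variable names are hypothetical.

```javascript
// Positive pattern: "bad" as a whole word, case-insensitive.
const badWord = /\bbad\b/i;

// Sample fields taken from the question's examples.
const fields = ["bad", "Bad id", "user has bAd id", "xxx Badlands ddd", "aaabad"];

// Keep a field only when the pattern does NOT match it.
const kept = fields.filter((field) => !badWord.test(field));
console.log(kept); // [ 'xxx Badlands ddd', 'aaabad' ]
```

Keeping the pattern positive and inverting the result leaves the regex short and easy to read.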

user3942918
  • This would match when it *is* true. – user2864740 Aug 22 '14 at 18:09
  • I don't disagree. However, such information (when it is pertinent) should be incorporated into replies. The poster - for whatever reason, of which I have asked for clarification in a comment - stated the requirement is to find the inverted "would not pass" case. Honor this request by at least acknowledging it in the answer. – user2864740 Aug 22 '14 at 18:42
  • @user2864740: I think derp recognized a potential [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) here. +1 from me. – Ruud Helderman Aug 22 '14 at 19:02
  • @Ruud I have no issue with XY problems being answered for Y. I do have an issue with answers *ignoring* the X. How easy would it be to simply add *one sentence* to the (mutable) answer? Anyway, this answer is still wrong (wrt X) and coffee seems more exciting at the moment.. – user2864740 Aug 22 '14 at 19:03
0

For JavaScript, you can put your word in the regex and test for a match. `\b` stands for a word boundary, which means no word character is directly attached to the word:

    /\bbad\b/i.test("Badkamer") // false; the i flag makes the match case-insensitive
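As a comment on the question points out, a regex "word" is not quite a language word: `\b` sits between a word character (`[A-Za-z0-9_]`) and a non-word character. The sketch below, using sample strings chosen purely for illustration, shows how that plays out with hyphens, underscores, and digits.

```javascript
const wholeWordBad = /\bbad\b/i;

// Illustrative inputs (not from the original question).
const samples = ["bad-id", "not bad.", "bad_id", "bad2", "Badkamer"];

// A hyphen or period ends the word; an underscore or digit extends it.
const results = samples.map((s) => wholeWordBad.test(s));
samples.forEach((s, i) => console.log(s, "->", results[i]));
// bad-id -> true, not bad. -> true, bad_id -> false, bad2 -> false, Badkamer -> false
```

Whether "bad_id" should count as containing the word "bad" depends on the feed, so the delimiter rules may need adjusting.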
Niels
0

You may try this regex:

^(.*?(\bbad\b)[^$]*)$
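A quick way to see how this pattern behaves is sketched below in JavaScript (an assumption, since the question does not name a language). Two things are worth checking: the pattern as written has no case-insensitive flag, and `[^$]` is a character class that excludes a literal dollar sign rather than an end-of-line anchor. The variable names are hypothetical.

```javascript
// The pattern exactly as posted, without any flags.
const posted = /^(.*?(\bbad\b)[^$]*)$/;
// The same pattern with the i flag added.
const postedIgnoreCase = new RegExp(posted.source, "i");

console.log(posted.test("user has bad id"));            // true
console.log(posted.test("Bad id"));                     // false: no i flag
console.log(postedIgnoreCase.test("Bad id"));           // true
console.log(postedIgnoreCase.test("xxx Badlands ddd")); // false
console.log(postedIgnoreCase.test("bad $5"));           // false: [^$] cannot consume "$"
```

This pattern matches the strings that should be excluded, so the result still has to be inverted to get the "would pass" set.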

Rahul Tripathi
0

I think the easiest way to do this would be to split the string into words, then check each word for a match. It could be done with a function like this:

    private bool containsWord(string searchstring, string matchstring)
    {
        bool hasWord = false;

        // Split on single spaces; punctuation stays attached to each word.
        string[] words = searchstring.Split(new Char[] { ' ' });
        foreach (string word in words)
        {
            // Compare case-insensitively by lowercasing both sides.
            if (word.ToLower() == matchstring.ToLower())
                hasWord = true;
        }
        return hasWord;
    }

The code converts everything to lowercase to ignore any case mismatches. I think you can also use RegEx for this:

    // Requires: using System.Text.RegularExpressions;
    static bool ExactMatch(string input, string match)
    {
        return Regex.IsMatch(input.ToLower(), string.Format(@"\b{0}\b", Regex.Escape(match.ToLower())));
    }

`\b` is a word-boundary anchor, as I understand it. These examples are in C#, since you didn't specify the language.
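For comparison with the JavaScript answer above, the split-and-compare idea might look roughly like this. It is a sketch, not a port: the helper name mirrors the C# one, and it splits on any run of whitespace instead of single spaces, which is an assumption.

```javascript
// Hypothetical helper mirroring the split-and-compare idea.
function containsWord(searchString, matchString) {
  const target = matchString.toLowerCase();
  // Split on runs of whitespace; punctuation stays attached to each word.
  return searchString
    .split(/\s+/)
    .some((word) => word.toLowerCase() === target);
}

console.log(containsWord("user has bAd id", "bad"));  // true
console.log(containsWord("xxx Badlands ddd", "bad")); // false
console.log(containsWord("that is bad.", "bad"));     // false: "bad." keeps its period
```

The last call shows where splitting differs from the `\b` approach: trailing punctuation hides the word unless it is stripped first.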

PAUL DUFRESNE