0

The URLs I'm trying to pull are all in the format www.domain.com. I want to extract them from text documents with a simple regex. It only needs to match www.domain.com, not other URL variations.

What is the simplest regex to use with preg_match_all()?

T. Brian Jones
  • 1
    Check out this post http://stackoverflow.com/questions/399250/going-where-php-parse-url-doesnt-parsing-only-the-domain/399316#399316 – Sean Barlow Nov 29 '11 at 05:33

3 Answers

2
/w{3}\.\w{2,}\.\w{3}/

This matches "www.", followed by two or more word characters (letters, digits, or underscore), a dot, and then three word characters.

To also allow hyphens in the domain and to match case-insensitively (e.g. "WWW."):

/w{3}\.[\w\-]{2,}\.\w{3}/i
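As a minimal sketch of how the second pattern might be used with preg_match_all() (the sample text in $text is made up for illustration and is not from the question):

```php
<?php
// Hypothetical sample text, assumed for illustration only.
$text = "Visit www.example.com or WWW.my-site.org, but not example.net.";

// Second pattern from above: "www.", 2+ word characters or hyphens,
// a dot, then 3 word characters; the i flag makes it case-insensitive.
preg_match_all('/w{3}\.[\w\-]{2,}\.\w{3}/i', $text, $matches);

// $matches[0] holds the full matches in order of appearance.
print_r($matches[0]);
```

Note that the final \w{3} is not anchored, so a longer ending is cut after three characters (www.example.info would be matched as www.example.inf), and two-letter endings such as .uk are not matched at all.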
Teneff
1

I don't do a whole lot with PHP, but the regex would be something like:

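w{3}\.([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?

This matches everything that starts with "www." up to the next character outside the class (such as whitespace), so the match includes any path or query string that follows the domain. The protocol part of the URL (e.g. http://) is not included in the match.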
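A rough sketch of that pattern in PHP, assuming "/" as the delimiter and with the dot after w{3} escaped so that it only matches a literal dot. The sample text is hypothetical, and a nowdoc is used only to avoid extra string escaping:

```php
<?php
// Hypothetical sample text, assumed for illustration only.
$text = "See http://www.example.com/path?x=1 and www.test.org today.";

// Nowdoc keeps the pattern verbatim, so no PHP-level escaping is needed.
$pattern = <<<'RE'
/w{3}\.([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?/
RE;

preg_match_all($pattern, $text, $matches);

// Each full match starts at "www." and runs to the next whitespace here,
// so the first one includes the path and query string.
print_r($matches[0]);
```

Because "." and "," are in the character class, punctuation that directly follows a URL in running text (for example a sentence-ending period) would be captured as part of the match.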

James Khoury
Greg
0
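// Matches mailto: links and news://, http(s):// and ftp(s):// URLs.
// A scheme is required, so a bare www.domain.com is not matched.
preg_match_all('%((mailto\\:|(news|(ht|f)tp(s?))\\://){1}\\S+)%m', $subject, $result, PREG_PATTERN_ORDER);
foreach ($result[0] as $url) {
    // $url is one full match
}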

You can also use a class that I wrote, https://github.com/homer6/altumo/blob/master/source/php/String/Url.php, if you want to easily pull out parts of a URL. See the unit test in the same directory for usage.

If you're looking for a good program to tweak your regex patterns, I highly recommend RegexBuddy.

Hope that helps...

Homer6