Example domain: http://мппм.рф

After converting the domain to ASCII, it becomes: Xn--l1aaia.xn--p1ai

But my existing PHP function for validating domains returns false.

Existing function to validate the domain:

function ValidateDomain($domain)
{
    if(!preg_match("/^([-a-z0-9]{2,100})\.([a-z\.]{2,24})$/i", $domain))
    {
        return false;
    }
    return $domain;
}

I have also tried the following to validate the domain:

if( !preg_match('/^(http|https|ftp):\/\/([A-Z0-9][A-Z0-9_-]*(?:\.[A-Z0-9][A-Z0-9_-]*)+):?(\d+)?\/?/i', $domain) )
Hassaan
  • Closing a question with one that has been closed itself and has weird answers is not very nice. I apologize for Lawrence Cherone's behavior. But he has earned the right to shut your question down. – KIKO Software Oct 05 '17 at 08:01

1 Answer

I think that мппм.рф is already in UTF-8, so forcing a conversion will not help. Your regular expression is quite simple, and can be replaced by something like this:

function validateDomain($domain)
{
  $parts     = explode('.',$domain);
  $name      = array_shift($parts);
  $extension = implode('.',$parts);
  if ((strlen($name) >= 2) && (strlen($name) <= 100) && 
      (strlen($extension) >= 2) && (strlen($extension) <= 24)) return $domain;
  else return FALSE;      
}

It applies the same length limits but no longer restricts which characters are allowed, so it also accepts non-a-z characters, and it is easier to understand than a regular expression. You can make it slightly more compact and efficient by doing this:

function validateDomain($domain)
{
  $parts   = explode('.',$domain);
  $nameLen = strlen(array_shift($parts));
  $extLen  = strlen(implode('.',$parts));
  if( ($nameLen >= 2) && ($nameLen <= 100) && 
      ($extLen >= 2) && ($extLen <= 24) ) return $domain;
  else return FALSE;      
}
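
One caveat with the two versions above: `strlen()` counts bytes, not characters, so in UTF-8 each Cyrillic letter counts as 2. A minimal sketch of the difference, assuming the string is UTF-8 encoded and the mbstring extension is loaded (the variable names are only illustrative):

```php
<?php
// Illustrative only: compare byte length and character length of a UTF-8 label.
$name = 'мппм';

$byteLen = strlen($name);              // counts bytes
$charLen = mb_strlen($name, 'UTF-8');  // counts characters; encoding passed explicitly

echo $byteLen, ' bytes, ', $charLen, " characters\n";
```

For this four-letter Cyrillic name the byte length is 8 while the character length is 4, so with `strlen()` the 2 to 100 limit is effectively halved for such names.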

You could also use the multibyte string functions like this:

function validateDomain($domain)
{
  $point   = mb_strpos($domain,'.');
  $nameLen = mb_strlen(mb_substr($domain,0,$point));
  $extLen  = mb_strlen(mb_substr($domain,$point+1));
  if( ($nameLen >= 2) && ($nameLen <= 100) && 
      ($extLen >= 2) && ($extLen <= 24) ) return $domain;
  else return FALSE;      
}
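
Since the question is about the ASCII (`xn--`) form, another option is to validate that form label by label. The original pattern rejects it because the extension part `[a-z\.]` allows neither digits nor hyphens, and `xn--p1ai` contains both. A rough sketch, assuming the usual hostname limits (labels of 1 to 63 letters, digits, or hyphens, not starting or ending with a hyphen, at most 253 characters in total). `validateAsciiDomain` is a hypothetical name, and the conversion from Unicode (for example with `idn_to_ascii()` from the intl extension) is assumed to have happened beforehand:

```php
<?php
// Hypothetical helper: checks the ASCII (Punycode) form of a domain name.
// Assumes the input has already been converted to ASCII.
function validateAsciiDomain($domain)
{
    // Overall length limit assumed for a hostname.
    if (strlen($domain) > 253) {
        return false;
    }
    // Require at least two labels (name + extension).
    $labels = explode('.', $domain);
    if (count($labels) < 2) {
        return false;
    }
    foreach ($labels as $label) {
        // Each label: 1-63 chars, letters/digits/hyphens, no leading or trailing hyphen.
        if (!preg_match('/\A[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\z/i', $label)) {
            return false;
        }
    }
    return $domain;
}

var_dump(validateAsciiDomain('Xn--l1aaia.xn--p1ai')); // the domain from the question
var_dump(validateAsciiDomain('мппм.рф'));             // not ASCII, so rejected
```

Unlike the length-only versions, this one does restrict the character set, which is closer to what the original regular expression was trying to do.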
KIKO Software
  • Sorry I mean converting into ASCII – Hassaan Oct 05 '17 at 07:57
  • Thanks but solution does not work. Have you tested? – Hassaan Oct 05 '17 at 08:01
  • I think that you're fully capable of understanding the intent, and correct any mistake I might have made. Anyway, "does not work" is always a bad description of a problem and should not be used on SO. I see I have one `<=` pointing the wrong way. I corrected that. – KIKO Software Oct 05 '17 at 08:03
  • Sure, I will keep this in mind. – Hassaan Oct 05 '17 at 08:06
  • As an alternative to using array functions you could use `mb_strpos()` to find the first point and then `mb_substr()` to extract the name and extension. That will be more efficient. – KIKO Software Oct 05 '17 at 08:26
  • Why don't you add the update to your answer? I think that would make your answer more readable ;) – Hassaan Oct 05 '17 at 08:40
  • Added the multibyte string function version. I don't think it is more readable, but I'm sure it will execute quicker. – KIKO Software Oct 05 '17 at 09:16
  • I wish I could give your answer one more +1 ;) – Hassaan Oct 05 '17 at 09:39
  • Thank you. With multibyte functions it might be prudent to check the internal encoding setting. It should be UTF-8, of course. See: http://php.net/manual/en/function.mb-internal-encoding.php – KIKO Software Oct 05 '17 at 10:22