110

I'm building a validator for UUIDs generated by the client's browser. I use the UUID to identify a certain type of data that the user sends, and I would like to verify that the UUID the client sends is in fact a valid Version 4 UUID.

I found this: PHP preg_match UUID v4. It's close, but not exactly what I'm looking for. I'd like to know whether something exists similar to is_empty() or strtodate(), where the function returns FALSE if the string is not valid.

I could do it with a regular expression, but I would prefer something more native to test it.

Any ideas?

11/23/2019 EDIT: About the duplicate tag: while the moderator is technically correct, this question was asked with the goal of finding an alternative to regex, if one existed. In addition, this question has become a reference for Python and PHP developers, and it has different answers and approaches to the problem, which are generally better explained. This is why I think this question should be preserved.

Rafael
  • 2
    The question is for PHP, but for those of you wanting to do this in Python, the second answer below is very nice. – jsbueno Jan 15 '16 at 02:00

5 Answers

148

Version 4 UUIDs have the form xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx where x is any hexadecimal digit and y is one of 8, 9, A, or B.

^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$

To allow lowercase letters, use the i modifier:

$UUIDv4 = '/^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$/i';
preg_match($UUIDv4, $value) or die('Not valid UUID');
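
For readers following the Python answers further down, the same pattern can be sketched in Python. This is only an illustrative port, not part of the original PHP answer; the names `UUID_V4_RE` and `is_uuid_v4` are made up for the example.

```python
import re

# Same pattern as the PHP regex above; re.IGNORECASE plays the role of the /i modifier.
UUID_V4_RE = re.compile(
    r'[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}',
    re.IGNORECASE,
)


def is_uuid_v4(value):
    """Return True if value is a string in canonical, dashed UUID v4 form."""
    # fullmatch anchors both ends, so ^ and $ are not needed
    # and a trailing newline is rejected.
    return isinstance(value, str) and UUID_V4_RE.fullmatch(value) is not None
```

Using `fullmatch` instead of `^...$` is a deliberate choice here: in Python, `$` also matches just before a trailing newline, which is usually not wanted in a validator.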
Ωmega
  • 4
    Before performing the regex, I might add a quick simple `IF()` to test if the length of the string is 32 or 36 characters long. If not, it's not a UUID hex string. – Basil Bourque Nov 15 '13 at 08:21
  • The proper format of a v4 UUID **should contain dashes**, as stated in the answer. https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_.28random.29 – Ωmega Jul 07 '16 at 01:03
  • 2
    If you can't use the i modifier, you can use `^[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-4[0-9A-Fa-f]{3}-[89AB][0-9A-Fa-f]{3}-[0-9A-Fa-f]{12}$` – Horsty Feb 18 '20 at 15:25
  • 1
    If you can't use the i modifier, you can use ^[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-4[0-9A-Fa-f]{3}-[89ABab][0-9A-Fa-f]{3}-[0-9A-Fa-f]{12}$ – Swaru Aug 04 '20 at 08:49
141

I found this question while I was looking for a Python answer. To help people in the same situation, I've added the Python solution.

You can use the uuid module:

#!/usr/bin/env python

from uuid import UUID

def is_valid_uuid(uuid_to_test, version=4):
    """
    Check if uuid_to_test is a valid UUID.

    Parameters
    ----------
    uuid_to_test : str
    version : {1, 2, 3, 4}

    Returns
    -------
    `True` if uuid_to_test is a valid UUID, otherwise `False`.

    Examples
    --------
    >>> is_valid_uuid('c9bf9e57-1685-4c89-bafb-ff5af830be8a')
    True
    >>> is_valid_uuid('c9bf9e58')
    False
    """

    try:
        uuid_obj = UUID(uuid_to_test, version=version)
    except ValueError:
        return False
    return str(uuid_obj) == uuid_to_test


if __name__ == '__main__':
    import doctest
    doctest.testmod()
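
The comments below point out two subtleties: passing `version=4` makes `UUID` silently overwrite the version bits, and the final string comparison is case- and dash-sensitive. A possible variant that sidesteps both is sketched here. It assumes you want to accept only the canonical dashed form, in either letter case; the name `is_canonical_uuid4` is hypothetical.

```python
from uuid import UUID


def is_canonical_uuid4(text):
    """Return True only for a dashed, RFC 4122 version 4 UUID string."""
    try:
        # No version argument, so the parsed value is never rewritten.
        parsed = UUID(text)
    except (ValueError, AttributeError, TypeError):
        # ValueError: malformed string; the others: non-string input.
        return False
    # .version is None unless the variant bits are RFC 4122.
    # Comparing with the lowercased input rejects braces, missing or extra dashes.
    return parsed.version == 4 and str(parsed) == text.lower()
```

For example, `is_canonical_uuid4('6a3f6f5c-8df8-5b4e-8ec3-a8b2df62a40b')` is rejected because its version field is 5, rather than being "fixed" to version 4.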
Rafael
Martin Thoma
  • 5
    It should be clear that what you are showing is not PHP which is what this question is asking about. So it isn't exactly trivial to drop this code in and have it work, some care will be needed to use the Python UUID module from PHP. – Garbee Oct 28 '15 at 19:25
  • 29
    @Garbee Thank you! I didn't notice that, because I came to this question via Google. Obviously, I didn't read the question carefully. I still think people who come to this question might be interested in my answer. I made it clear that my answer is a Python solution. – Martin Thoma Oct 28 '15 at 20:52
  • And, once the comments are removed, it is an order of magnitude more readable and nicer than the regexp one. Thanks! – jsbueno Jan 15 '16 at 01:59
  • 1
    @jsbueno You're welcome. I really like those comments as they add additional information and - just as you said - they can be ignored in case you don't need them :-) – Martin Thoma Jan 15 '16 at 09:11
  • 2
    I am in no way asking you to remove them. My intent was to remind anyone using this that, without those, this is a 4-liner. But while we are at it: is the comparison on the last line needed? – jsbueno Jan 15 '16 at 12:17
  • 3
    @jsbueno I'm actually not too sure about it. When I wrote this, I thought it was necessary. If I remember correctly, it was related to UUID "fixing" some of the input mistakes. But I would have to look the UUID code up again. – Martin Thoma Jan 15 '16 at 16:42
  • This answer is wrong. Python won't always raise a ValueError. It will sometimes silently fix the input: `uuid.UUID('6a3f6f5c-8df8-5b4e-8ec3-a8b2df62a40b', version=4)` returns `UUID('6a3f6f5c-8df8-4b4e-8ec3-a8b2df62a40b')` (mind the `5b4e` block in the middle!) – Bastian Venthur May 04 '20 at 07:27
  • Be aware that `str(uuid_obj) == uuid_to_test` will cause valid UUIDs without dashes to be identified as invalid, because `UUID(uuid_to_test)` will add the dashes even if they were not provided at first. – Salma Hassan May 15 '22 at 12:55
  • Why do you think leaving out the dashes is valid? Is there a standard which defines them as optional? Can arbitrarily many dashes be added? – Martin Thoma May 15 '22 at 20:39
  • 1
    @MartinThoma I'm not sure about the standards honestly but `uuid.UUID` itself considers it [valid](https://github.com/python/cpython/blob/730902c0ad997462d2567e48def5352fe75c0e2c/Lib/uuid.py#L153) and it actually removes any dashes before constructing the uuid object as you can see [here](https://github.com/python/cpython/blob/730902c0ad997462d2567e48def5352fe75c0e2c/Lib/uuid.py#L176) – Salma Hassan May 16 '22 at 09:37
36

All the existing answers use regex. If you're using Python and don't want to use regex, you might consider a try/except instead (a bit shorter than the answer above).

Our validator would then be:

import uuid

def is_valid_uuid(val):
    try:
        uuid.UUID(str(val))
        return True
    except ValueError:
        return False

>>> is_valid_uuid(1)
False
>>> is_valid_uuid("123-UUID-wannabe")
False
>>> is_valid_uuid({"A":"b"})
False
>>> is_valid_uuid([1, 2, 3])
False
>>> is_valid_uuid(uuid.uuid4())
True
>>> is_valid_uuid(str(uuid.uuid4()))
True
>>> is_valid_uuid(uuid.uuid4().hex)
True
>>> is_valid_uuid(uuid.uuid3(uuid.NAMESPACE_DNS, 'example.net'))
True
>>> is_valid_uuid(uuid.uuid5(uuid.NAMESPACE_DNS, 'example.net'))
True
>>> is_valid_uuid("{20f5484b-88ae-49b0-8af0-3a389b4917dd}")
True
>>> is_valid_uuid("20f5484b88ae49b08af03a389b4917dd")
True
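
Since `uuid.UUID()` accepts several spellings of the same value (braces, no dashes, uppercase), one option is to normalize the input to the canonical form while validating it. The following is a minimal sketch under that assumption; `normalize_uuid` is a hypothetical helper, not part of the `uuid` module.

```python
import uuid


def normalize_uuid(val):
    """Return the canonical lowercase, dashed form of val, or None if it is not a UUID."""
    try:
        return str(uuid.UUID(str(val)))
    except ValueError:
        return None
```

Comparing the result with the original input then shows whether the caller sent the canonical form or one of the lenient variants.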
Pēteris Caune
slajma
  • 1
    Watch out for UUID syntax variants that uuid.UUID() accepts: "{20f5484b-88ae-49b0-8af0-3a389b4917dd}", "20f5484b88ae49b08af03a389b4917dd" – Pēteris Caune Feb 20 '20 at 08:30
  • Thanks @PēterisCaune nice addition - the case with `.hex` essentially covers the last one you added but it's worth adding regardless. Wasn't aware of the "{uuid}" case. – slajma Mar 31 '20 at 05:07
  • 1
    Also this is true. `is_valid_uuid('-2b1eb780-8a03-4031-b1e5-2f7674c60df3')` >> True – Vikram Ray Aug 25 '21 at 05:30
6
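import re

# Matches the canonical 8-4-4-4-12 hex layout of any UUID version, in either case
UUID_PATTERN = re.compile(r'^[\da-f]{8}-([\da-f]{4}-){3}[\da-f]{12}$', re.IGNORECASE)

def is_valid_uuid(value):
    # match() returns a match object or None; bool() turns that into True/False
    return bool(UUID_PATTERN.match(value))

print(is_valid_uuid('20f5484b-88ae-49b0-8af0-3a389b4917dd'))  # True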
Andrey Shipilov
5

If you only need it for security (for example, if you need to print it inside JavaScript code and want to avoid XSS), the position of the dashes doesn't really matter, so it's just:

 /^[a-f0-9\-]{36}$/i

https://regex101.com/r/MDqB2Z/11


(It's not specific to v4, but a well-written application usually stores UUIDs as BINARY(16) after dropping the dashes, so if something is wrong, the lookup will simply not find the object and return a 404; over-validation may not be needed.)
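
As a rough illustration of the BINARY(16) idea, here is how the 16-byte value could be derived in Python. This is a sketch only; the function name `uuid_to_binary16` is an assumption made for the example, and the database lookup itself is left out.

```python
import uuid


def uuid_to_binary16(text):
    """Return the 16 raw bytes for a UUID string, or None if it cannot be parsed."""
    try:
        # UUID() drops the dashes itself before parsing the 32 hex digits.
        return uuid.UUID(text).bytes
    except ValueError:
        return None
```

A string that passes the loose regex above but is not really a UUID (for example, 36 dashes) yields `None` here, so the lookup can be skipped and treated as "not found".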

the_nuts