103

I am wondering if there is a way to always generate the same UUID from a given String. I tried with UUID, but it does not seem to provide this feature.

Raedwald
Adam Lee

4 Answers

191

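You can use UUID this way to always get the same UUID for a given input String:

 String aString = "JUST_A_TEST_STRING";
 // Pass the charset explicitly (java.nio.charset.StandardCharsets); getBytes() without arguments uses the platform default
 String result = UUID.nameUUIDFromBytes(aString.getBytes(StandardCharsets.UTF_8)).toString();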
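
To make the "always the same UUID" claim concrete, here is a minimal sketch that calls the method twice on the same input and once on a different input. The class name `NameUuidDemo` and the helper `fromName` are hypothetical names chosen for illustration; only `UUID.nameUUIDFromBytes` comes from the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

class NameUuidDemo {

    // Hypothetical helper: wraps UUID.nameUUIDFromBytes with an explicit charset
    // so the result does not depend on the platform default encoding.
    public static UUID fromName(String name) {
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        UUID first = fromName("JUST_A_TEST_STRING");
        UUID second = fromName("JUST_A_TEST_STRING");
        UUID other = fromName("ANOTHER_STRING");

        System.out.println(first.equals(second)); // same input, same UUID
        System.out.println(first.equals(other));  // different input, different UUID (barring an MD5 collision)
        System.out.println(first.version());      // 3, i.e. name-based with MD5
    }
}
```

Note that `UUID.fromString` is not a substitute here: it only parses an existing UUID representation and does not derive a UUID from arbitrary text.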
uraimo
  • any js equivalent? – Abhijeet Feb 01 '17 at 05:10
  • any PHP equivalent? What is the algorithm behind this? – mika Jul 31 '17 at 07:46
  • @mika [This PHP UUID library](https://github.com/ramsey/uuid) is somewhat equivalent. You can generate the same UUID for the given namespace + string. You can do something like: `Uuid::uuid3(Uuid::NAMESPACE_DNS, 'TEST STRING')->toString();` It uses md5 hashing in this example. [Additional info on UUID namespaces](https://stackoverflow.com/a/28776880/1514049) – segFault Oct 20 '17 at 15:13
  • is there any way I can decode this UUID to original String ? – Mayur Aug 01 '18 at 07:59
  • If the original string is part of a known set of strings (stored in your db for example), you can generate the UUID for each string and compare with the UUID you want to decode. Otherwise, it is not "technically" possible – user108828 Feb 07 '19 at 13:38
  • what are the chances that the generated UUID from a given string will clash with a UUID generated from another string? – Groppe Mar 20 '19 at 01:02
  • @Groppe very small, similar to the chances that an MD5 (UUIDv3) or SHA1 (UUIDv5) hash clash for a given string – dtech Oct 09 '19 at 12:00
  • I know that we can create UUID from a string but I want to know if we can create String reverse back from UUID ? – VManoj Apr 07 '20 at 08:55
  • Why wouldn't you just do UUID.fromString? – opticyclic Apr 21 '21 at 05:59
  • @opticyclic UUID.fromString does not generate a new UUID based on the input, but expects an existing valid UUID string representation as input. – hp58 Aug 03 '21 at 06:56
  • `UUID.nameUUIDFromBytes` generates MD5 UUIDs. This does not work if the input String was generated by a SHA1 function. – kerner1000 Mar 08 '22 at 07:58
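
The "known set of strings" approach mentioned in the comments can be sketched as a lookup table built ahead of time. This is only an illustration, assuming the candidate strings are available in memory; the class `UuidReverseLookup` and its methods are hypothetical names, not part of any library. The hash itself is one-way, so a UUID whose source string is not in the set cannot be decoded.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

class UuidReverseLookup {

    // Maps each precomputed UUID back to the string it was derived from
    private final Map<UUID, String> index = new HashMap<>();

    UuidReverseLookup(List<String> knownStrings) {
        for (String s : knownStrings) {
            index.put(toUuid(s), s);
        }
    }

    // Same derivation as in the answer above, with an explicit charset
    static UUID toUuid(String s) {
        return UUID.nameUUIDFromBytes(s.getBytes(StandardCharsets.UTF_8));
    }

    // Returns the original string only if it was part of the known set
    Optional<String> decode(UUID uuid) {
        return Optional.ofNullable(index.get(uuid));
    }

    public static void main(String[] args) {
        UuidReverseLookup lookup = new UuidReverseLookup(Arrays.asList("alpha", "beta", "gamma"));
        System.out.println(lookup.decode(toUuid("beta")));  // Optional[beta]
        System.out.println(lookup.decode(toUuid("delta"))); // Optional.empty
    }
}
```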
8

The UUID.nameUUIDFromBytes() method generates MD5 UUIDs. SHA1 is preferred over MD5, if backward compatibility is not an issue.

This is a utility class that generates MD5 and SHA1 UUIDs. It also supports namespaces, which the UUID.nameUUIDFromBytes() method does not support, although required by RFC-4122. Feel free to use and share.

package com.example;

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

/**
 * Utility class that creates UUIDv3 (MD5) and UUIDv5 (SHA1).
 *
 * It is fully compliant with RFC-4122.
 */
public class HashUuidCreator {

    // Domain Name System
    public static final UUID NAMESPACE_DNS = new UUID(0x6ba7b8109dad11d1L, 0x80b400c04fd430c8L);
    // Uniform Resource Locator
    public static final UUID NAMESPACE_URL = new UUID(0x6ba7b8119dad11d1L, 0x80b400c04fd430c8L);
    // ISO Object ID
    public static final UUID NAMESPACE_ISO_OID = new UUID(0x6ba7b8129dad11d1L, 0x80b400c04fd430c8L);
    // X.500 Distinguished Name
    public static final UUID NAMESPACE_X500_DN = new UUID(0x6ba7b8149dad11d1L, 0x80b400c04fd430c8L);

    private static final int VERSION_3 = 3; // UUIDv3 MD5
    private static final int VERSION_5 = 5; // UUIDv5 SHA1

    private static final String MESSAGE_DIGEST_MD5 = "MD5"; // UUIDv3
    private static final String MESSAGE_DIGEST_SHA1 = "SHA-1"; // UUIDv5

    private static UUID getHashUuid(UUID namespace, String name, String algorithm, int version) {

        final byte[] hash;
        final MessageDigest hasher;

        try {
            // Instantiate a message digest for the chosen algorithm
            hasher = MessageDigest.getInstance(algorithm);

            // Insert name space if NOT NULL
            if (namespace != null) {
                hasher.update(toBytes(namespace.getMostSignificantBits()));
                hasher.update(toBytes(namespace.getLeastSignificantBits()));
            }

            // Generate the hash
            hash = hasher.digest(name.getBytes(StandardCharsets.UTF_8));

            // Split the hash into two parts: MSB and LSB
            long msb = toNumber(hash, 0, 8); // first 8 bytes for MSB
            long lsb = toNumber(hash, 8, 16); // last 8 bytes for LSB

            // Apply version and variant bits (required for RFC-4122 compliance)
            msb = (msb & 0xffffffffffff0fffL) | (version & 0x0f) << 12; // apply version bits
            lsb = (lsb & 0x3fffffffffffffffL) | 0x8000000000000000L; // apply variant bits

            // Return the UUID
            return new UUID(msb, lsb);

        } catch (NoSuchAlgorithmException e) {
            // Keep the original exception as the cause so the stack trace is not lost
            throw new RuntimeException("Message digest algorithm not supported.", e);
        }
    }

    public static UUID getMd5Uuid(String string) {
        return getHashUuid(null, string, MESSAGE_DIGEST_MD5, VERSION_3);
    }

    public static UUID getSha1Uuid(String string) {
        return getHashUuid(null, string, MESSAGE_DIGEST_SHA1, VERSION_5);
    }

    public static UUID getMd5Uuid(UUID namespace, String string) {
        return getHashUuid(namespace, string, MESSAGE_DIGEST_MD5, VERSION_3);
    }

    public static UUID getSha1Uuid(UUID namespace, String string) {
        return getHashUuid(namespace, string, MESSAGE_DIGEST_SHA1, VERSION_5);
    }

    private static byte[] toBytes(final long number) {
        return new byte[] { (byte) (number >>> 56), (byte) (number >>> 48), (byte) (number >>> 40),
                (byte) (number >>> 32), (byte) (number >>> 24), (byte) (number >>> 16), (byte) (number >>> 8),
                (byte) (number) };
    }

    // Reads bytes[start..end) as a big-endian number; 'end' is an exclusive index, not a length
    private static long toNumber(final byte[] bytes, final int start, final int end) {
        long result = 0;
        for (int i = start; i < end; i++) {
            result = (result << 8) | (bytes[i] & 0xff);
        }
        return result;
    }

    /**
     * For tests!
     */
    public static void main(String[] args) {

        String string = "JUST_A_TEST_STRING";
        UUID namespace = UUID.randomUUID(); // A custom name space

        System.out.println("Java's generator");
        System.out.println("UUID.nameUUIDFromBytes():      '" + UUID.nameUUIDFromBytes(string.getBytes(StandardCharsets.UTF_8)) + "'");
        System.out.println();
        System.out.println("This generator");
        System.out.println("HashUuidCreator.getMd5Uuid():  '" + HashUuidCreator.getMd5Uuid(string) + "'");
        System.out.println("HashUuidCreator.getSha1Uuid(): '" + HashUuidCreator.getSha1Uuid(string) + "'");
        System.out.println();
        System.out.println("This generator WITH name space");
        System.out.println("HashUuidCreator.getMd5Uuid():  '" + HashUuidCreator.getMd5Uuid(namespace, string) + "'");
        System.out.println("HashUuidCreator.getSha1Uuid(): '" + HashUuidCreator.getSha1Uuid(namespace, string) + "'");
    }
}

This is the output:

// Java's generator
UUID.nameUUIDFromBytes():      '9e120341-627f-32be-8393-58b5d655b751'

// This generator
HashUuidCreator.getMd5Uuid():  '9e120341-627f-32be-8393-58b5d655b751'
HashUuidCreator.getSha1Uuid(): 'e4586bed-032a-5ae6-9883-331cd94c4ffa'

// This generator WITH name space (as the standard requires)
HashUuidCreator.getMd5Uuid():  '2b098683-03c9-3ed8-9426-cf5c81ab1f9f'
HashUuidCreator.getSha1Uuid(): '1ef568c7-726b-58cc-a72a-7df173463bbb'

You can also use the uuid-creator library. See this example:

// Create a name based UUID (SHA1)
String name = "JUST_A_TEST_STRING";
UUID uuid = UuidCreator.getNameBasedSha1(name);

Project page: https://github.com/f4b6a3/uuid-creator

fabiolimace
  • Why do you think that SHA1 should be preferred over MD5 when generating a UUID? – Zhro Feb 15 '21 at 20:27
  • I don't think it should always be preferred. It depends on the case. RFC-4122, in its section 4.3, says that If backward compatibility is not an issue, SHA-1 is preferred. I'll fix my comment. Thanks. – fabiolimace Feb 15 '21 at 21:03
2

If you are looking for a JavaScript alternative, look at uuid-by-string, which also gives the option to use SHA-1 or MD5 hash functions.

vhtc
2

You should use UUID v5.

Version-3 and version-5 UUIDs are generated by hashing a namespace identifier and name. Version 3 uses MD5 as the hashing algorithm, and version 5 uses SHA-1. (Source: Wikipedia)

UUID v5 requires a namespace, which is itself a UUID. You can use one of the namespaces predefined by RFC 4122 (DNS, URL, ISO OID, X.500 DN) or generate your own once, for example a UUID v4, and keep it constant. For a given namespace and input, the output will always be the same.

A possible implementation of UUID v5 can be found here:

<!-- https://search.maven.org/artifact/com.github.f4b6a3/uuid-creator -->
<dependency>
  <groupId>com.github.f4b6a3</groupId>
  <artifactId>uuid-creator</artifactId>
  <version>3.6.0</version>
</dependency>

It can be used as follows:

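UUID namespace = UUID.fromString("6ba7b810-9dad-11d1-80b4-00c04fd430c8"); // example only: the RFC 4122 DNS namespace; substitute your own constant UUID (e.g. a v4 generated once)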
String input = "input";
UUID uuid = UuidCreator.getNameBasedSha1(namespace, input);

(In a way, the namespace acts like the seed of a random number generator. Unlike a seed, which is typically random, the namespace is a constant, and that forces the generator to always produce the same value for a given input.)
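
If adding a dependency is not an option, the role of the namespace can also be illustrated with the JDK alone. The following is a minimal sketch of a version-5 generator, not the library's implementation; the class name `NamespacedUuidV5` and the method `v5` are hypothetical. It hashes the 16 namespace bytes followed by the UTF-8 bytes of the name with SHA-1, then sets the version and variant bits. The two namespace constants are the DNS and URL values predefined by RFC 4122.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

class NamespacedUuidV5 {

    // Namespaces predefined by RFC 4122
    static final UUID NAMESPACE_DNS = UUID.fromString("6ba7b810-9dad-11d1-80b4-00c04fd430c8");
    static final UUID NAMESPACE_URL = UUID.fromString("6ba7b811-9dad-11d1-80b4-00c04fd430c8");

    static UUID v5(UUID namespace, String name) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");

            // Feed the namespace as 16 big-endian bytes, then the name
            sha1.update(ByteBuffer.allocate(16)
                    .putLong(namespace.getMostSignificantBits())
                    .putLong(namespace.getLeastSignificantBits())
                    .array());
            byte[] hash = sha1.digest(name.getBytes(StandardCharsets.UTF_8));

            // Use the first 16 of the 20 SHA-1 bytes
            ByteBuffer buf = ByteBuffer.wrap(hash, 0, 16);
            long msb = buf.getLong();
            long lsb = buf.getLong();

            msb = (msb & 0xffffffffffff0fffL) | 0x5000L;             // set version 5
            lsb = (lsb & 0x3fffffffffffffffL) | 0x8000000000000000L; // set RFC 4122 variant
            return new UUID(msb, lsb);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    public static void main(String[] args) {
        UUID a = v5(NAMESPACE_DNS, "input");
        UUID b = v5(NAMESPACE_DNS, "input");
        UUID c = v5(NAMESPACE_URL, "input");

        System.out.println(a.equals(b)); // true: same namespace and input
        System.out.println(a.equals(c)); // false: same input, different namespace
        System.out.println(a.version()); // 5
    }
}
```

Changing the namespace changes every generated UUID, which is why it has to stay constant for the lifetime of the data.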

bvdb