-3

I'm building an app. Assume it is a messaging app and becomes as popular as whatsapp.

A GUID will be given for every message sent in the world.

There will be a problem if any 2 GUIDs of in the world are equal.

As of today 30billion!(official) whatsapp messages are sent in the world in a day.

I'm using C# (Xamarin)'s System.Guid.NewGuid method to generate GUIDs.

What is the chance for a 'problem' occur because the random numbers are not truly random?

(This question is different from others because it describes a situation where millions of people get billions of new GUIDs combined each day.)

Aron
  • 15,362
  • 3
  • 29
  • 62
SadeepDarshana
  • 989
  • 16
  • 32
  • 1
    Possible duplicate of [What are the chances to get a Guid.NewGuid () duplicate?](http://stackoverflow.com/questions/8642858/what-are-the-chances-to-get-a-guid-newguid-duplicate) – MichaelMao Jul 23 '16 at 18:10
  • Please do not put in random tags like crypto. Also, please try to Google your subject matter before posting. Pretty much every definition of GUID on the internet would give you your answer. – Aron Jul 23 '16 at 18:31
  • 1
    PS 30billion is not a big number. GUIDs are sufficient to randomly assign to each atom in the universe uniquely. Please come back when you find a way to store a message in each atom in the universe... – Aron Jul 23 '16 at 18:35
  • @Aron I know 2^128 is a big number comparing to my requirement but my problem was whether it will work fine when we consider the fact that C# does't generate true random numbers. By the way 2^128 < 10^78. – SadeepDarshana Jul 23 '16 at 19:05
  • @SadeepDarshana no true GUID implementation uses the full 128 bit of entropy. But all (ignoring bugs, which do get patched) will cause you problems. Deep dive into .net's implementation, you get over 100bits of entropy in the latest version. The previous version used your Mac address as part of it. But even if you had two computers using different version, they wouldn't collide. – Aron Jul 23 '16 at 19:39
  • 1
    An implementation of GUID which can collide is an embarrassing bug that is of the highest importance. – Aron Jul 23 '16 at 19:40

1 Answers1

0

I like this passage from Wikipedia:

They may or may not be generated from random (or pseudo-random) numbers. GUIDs generated from random numbers normally contain 6 fixed bits (these indicate that the GUID is random) and 122 random bits; the total number of unique such GUIDs is 2122 (approximately 5.3×1036). This number is so large that the probability of the same number being generated randomly twice is negligible;[citation needed] however other GUID versions have different uniqueness properties and probabilities, ranging from guaranteed uniqueness to likely duplicates. Assuming uniform probability for simplicity, the probability of one duplicate would be about 50% if every person on earth as of 2014 owned 600 million GUIDs.

https://en.wikipedia.org/wiki/Globally_unique_identifier

If you are truly concerned you always have the option and ability to create a collision detection method. Such as if you detect that a GUID is already in use then just assign a new random GUID and iterate this until no duplicate is detected. Reminds me of hashtables so to speak. There are performance penalties for this, but just know there is a solution for your obstacle!

Update

I can understand your concern about randomness, but if you take into account it is a structured algorithm it will almost assign in a sequential chronological matter (kind of). There is a minimal concern pertaining to this, I would just be more concerned with the performance degradation surrounding using a 128-bit value for a primary key resolvement.

Also from Wikipedia:

GUIDs are commonly used as the primary key of database tables, and with that, often the table has a clustered index on that attribute. This presents a performance issue when inserting records because a fully random GUID means the record may need to be inserted anywhere within the table rather than merely appended near the end of it.

BolletuH
  • 124
  • 4
  • Your post seems to completely miss the design of a GUID. GUIDs are used when collision detection is not possible. There are no GUID algorithms that are monotonically increasing, because that isn't a GUID that is Timestamp. Timestamps can't be used in a system which supports AP in the CAP model, since they require C in the CAP model. Also, to put into perspective the problem at hand, YouTube is a distributed storage platform of video messages. Each video is given a random 64bit id. – Aron Jul 23 '16 at 18:29