What is UUID - GUID? Advantages and Disadvantages
1- Introduction
In this article, we will talk about the definition of UUID, its features, and its restrictions.
2 - What is UUID?
UUID stands for Universally Unique Identifier. GUID (Globally Unique Identifier) is the same thing as UUID; the name comes from Microsoft technologies.
3 - UUID Format
A UUID is a 128-bit (16-byte) value. It is usually written as a string of 32 hexadecimal characters, split into 5 groups by four dashes "-" in the format 8-4-4-4-12, for a total of 36 characters.
UUID examples:
4a81519d-805e-4fbe-b6ee-23ee6a56192e
882f2a2d-835f-48e9-8c15-0f9c924c8a5d
cc1f4935-efbe-4d5e-8531-cd113955140b
Note: Although UUIDs can differ in version and variant, they all share the same format.
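To make the format concrete, here is a minimal sketch using Python's standard `uuid` module (Python is our choice for illustration; the article itself does not depend on any language). It generates a random version-4 UUID and checks the lengths described above.

```python
import uuid

# Generate a random (version 4) UUID with the standard library.
u = uuid.uuid4()
text = str(u)

# 36 characters in total: 32 hexadecimal digits plus 4 dashes.
print(text, len(text))

# The five groups have lengths 8-4-4-4-12.
group_lengths = [len(group) for group in text.split("-")]
print(group_lengths)

# The same value as raw bytes: 16 bytes = 128 bits.
print(len(u.bytes))

# The version and variant are encoded inside the value itself.
print(u.version, u.variant)
```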
4- Why is UUID preferable? What are the advantages?
4.1 - Difficult to Predict
- A UUID is harder to guess than a sequentially incrementing integer id, because sequential values are easy to predict.
- For example, let's assume that the ID values of our data in the database are integers. It would not be difficult for someone who has access to our services to fetch the user with an id of "5" or the post with an id of "5323". They could even retrieve all the data by requesting ids sequentially (5, 6, 7...) in a loop.
- If the ID value is a UUID (especially a random version-4 UUID), there is no sequential relationship between two UUID values, so the next id is much more difficult to predict. This makes it harder to reach data by guessing ids.
- Let "/api/user/{id}" be our service, which fetches the user with the specified id value:
If the {id} value is an integer, then:
- In the "/api/user/5353" request, it is easy to guess the integer id "5353" and fetch that user from the service.
If the {id} value is a UUID, then:
- In the "/api/user/e92daafd-4158-4c29-b284-b2863641941d" request, it is much more difficult to guess the UUID value "e92daafd-4158-4c29-b284-b2863641941d".
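The difference can be sketched with a small simulation. The in-memory dictionaries and the helper names `enumerate_int_ids` and `guess_random_uuids` below are hypothetical and only stand in for a real service and database.

```python
import uuid

# Hypothetical in-memory "tables": the same users keyed two different ways.
names = ["alice", "bob", "carol"]
users_by_int = {i: name for i, name in enumerate(names, start=1)}
users_by_uuid = {uuid.uuid4(): name for name in names}


def enumerate_int_ids(table, upper_bound):
    """Try ids 1..upper_bound in a loop, as someone guessing ids might."""
    return [table[i] for i in range(1, upper_bound + 1) if i in table]


def guess_random_uuids(table, attempts):
    """Try random UUIDs; hitting an existing one is extremely unlikely."""
    guesses = (uuid.uuid4() for _ in range(attempts))
    return [table[g] for g in guesses if g in table]


found_by_int = enumerate_int_ids(users_by_int, 10)
found_by_uuid = guess_random_uuids(users_by_uuid, 10_000)

print(found_by_int)   # every user is found by counting upwards
print(found_by_uuid)  # almost certainly an empty list
```

This only illustrates how guessable each kind of id is. It assumes version-4 (random) UUIDs; time-based versions are easier to predict.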
4.2 - UUID Uniqueness and Data Merges
- A UUID is unique across all databases and tables, whereas an integer id is only unique within its own table. Thus, there will be no conflicts between ID values when data from different databases or tables is combined. As a result, merging or adding data is much easier.
- For example: Let's say we're going to merge users, and the relational tables that reference them, from two databases.
- If id values are kept as integers in these two databases, we will have to assign new id values to this data during merging.
- This process requires more effort when there are relational tables, because every reference to a changed id must be updated as well.
- When we consider merging whole databases, not just users, this process takes even more time if there are tens or hundreds of tables.
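The merge problem can be sketched with two small dictionaries standing in for the user tables of two databases. The names `db_a_int`, `db_b_int`, `db_a_uuid`, and `db_b_uuid` are made up for this illustration.

```python
import uuid

# Two hypothetical databases that each started numbering users at 1.
db_a_int = {1: "alice", 2: "bob"}
db_b_int = {1: "carol", 2: "dave"}

# The same users keyed by independently generated UUIDs.
db_a_uuid = {uuid.uuid4(): "alice", uuid.uuid4(): "bob"}
db_b_uuid = {uuid.uuid4(): "carol", uuid.uuid4(): "dave"}

# Integer keys overlap, so a naive merge would overwrite rows unless the ids
# (and every foreign key pointing at them) are reassigned first.
int_conflicts = db_a_int.keys() & db_b_int.keys()
print(sorted(int_conflicts))  # [1, 2]

# UUID keys do not overlap, so the merge keeps every row as-is.
uuid_conflicts = db_a_uuid.keys() & db_b_uuid.keys()
merged = {**db_a_uuid, **db_b_uuid}
print(len(uuid_conflicts), len(merged))
```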
4.3 - UUID In Distributed Systems
- In distributed systems, generating unique ID values is a real challenge, especially when many IDs must be generated in a short period of time.
- For example: At time T, multiple application instances may each need to generate a large number of IDs, and all generated IDs have to be unique. Without an effective strategy for generating unique IDs, ID collisions may occur, or ID generation may become a bottleneck under high-frequency insertions.
- Since UUID values can be generated independently, this problem can be solved easily.
- Each running instance can generate UUID values independently of the other application instances. So, at time T, an ID collision is very unlikely even when UUID values are generated on many machines. (The probabilities are discussed later in this article.)
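As a rough sketch of independent generation, the snippet below uses threads as a stand-in for separate application instances. This is an assumption made to keep the example in one file; real instances would run on different machines. No shared counter, lock, or coordination is used.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

INSTANCES = 8
IDS_PER_INSTANCE = 10_000


def generate_ids(count):
    """One simulated application instance generating ids on its own."""
    return [uuid.uuid4() for _ in range(count)]


# Each worker generates its batch without talking to the others.
with ThreadPoolExecutor(max_workers=INSTANCES) as pool:
    batches = list(pool.map(generate_ids, [IDS_PER_INSTANCE] * INSTANCES))

all_ids = [u for batch in batches for u in batch]

# If no collision occurred, both counts are equal.
print(len(all_ids), len(set(all_ids)))
```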
4.4 - Other Advantages:
- A UUID can be generated regardless of application or platform.
- Inserting data into related tables is easier, because the id can be generated before the insert instead of being read back from the database.
5- Disadvantages of UUID Usage
5.1- Storage
- A UUID needs much more storage than an integer ID.
- A UUID takes up 16 bytes, while an integer id takes up 4 bytes and a big-integer id takes up 8 bytes.
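These sizes can be checked with a short sketch. Here `struct.pack` is used only to show the width of a 4-byte and an 8-byte integer; actual on-disk sizes depend on the database and the column type.

```python
import struct
import uuid

u = uuid.uuid4()

# Binary sizes: 4-byte integer, 8-byte big integer, 16-byte UUID.
int_size = len(struct.pack("<i", 5353))
bigint_size = len(struct.pack("<q", 5353))
uuid_size = len(u.bytes)
print(int_size, bigint_size, uuid_size)  # 4 8 16

# Stored as text, a UUID takes 36 characters (32 without the dashes).
print(len(str(u)), len(u.hex))  # 36 32
```

If the UUID is stored as a string rather than as 16 raw bytes, the gap to integer ids becomes even larger.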
5.2- Speed
- Generating UUIDs is generally slower than generating incremental integer IDs.
- This is because generating a UUID is more complex and takes more time than incrementing a counter. This complexity can lead to a slight performance loss during ID generation/insertion compared to integer IDs.
- Integer values are faster to scan and index than UUID strings.
- Indexes on UUID columns also take up more memory.
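The generation cost can be measured with a rough timing sketch. It compares an in-process counter with `uuid.uuid4()`. The counter is only a stand-in for a database sequence, and the numbers will vary by machine, so treat the result as an indication rather than a benchmark.

```python
import itertools
import timeit
import uuid

N = 100_000
counter = itertools.count(1)

# Time N id generations for each strategy.
int_seconds = timeit.timeit(lambda: next(counter), number=N)
uuid_seconds = timeit.timeit(uuid.uuid4, number=N)

print(f"incrementing integer: {int_seconds:.4f} s for {N} ids")
print(f"uuid4:                {uuid_seconds:.4f} s for {N} ids")
```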
5.3 - Other Disadvantages:
- UUIDs make queries and debugging somewhat harder.
- They are difficult for humans to read, and they may not be sequential.
- They appear long in URLs.
6 - Is UUID really unique?
Although the word "unique" is part of the UUID (GUID) definition, it is mathematically possible for the algorithm to generate the same UUID value twice. However, the probability is negligible.
The probabilities vary according to the UUID version. For version 4, quoting from Wikipedia:
The number of random version-4 UUIDs which need to be generated in order to have a 50% probability of at least one collision is 2.71 quintillion
This number is equivalent to generating 1 billion UUIDs per second for about 86 years. A file containing this many UUIDs, at 16 bytes per UUID, would be about 43 exabytes.
Thus, the probability to find a duplicate within 103 trillion version-4 UUIDs is one in a billion.
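The collision figures can be reproduced with the birthday approximation, p ≈ 1 - exp(-n² / (2 · 2^122)), where 122 is the number of random bits in a version-4 UUID. The function names below are our own, and the formula is the standard approximation rather than an exact calculation.

```python
import math

RANDOM_BITS = 122          # a version-4 UUID has 122 random bits
SPACE = 2 ** RANDOM_BITS   # number of possible version-4 UUIDs


def collision_probability(n):
    """Birthday approximation: p = 1 - exp(-n^2 / (2 * SPACE))."""
    return -math.expm1(-(n * n) / (2 * SPACE))


def count_for_probability(p):
    """Inverse of the approximation: n = sqrt(2 * SPACE * ln(1 / (1 - p)))."""
    return math.sqrt(2 * SPACE * math.log(1 / (1 - p)))


# Number of UUIDs needed for a 50% chance of at least one collision.
n_half = count_for_probability(0.5)
print(f"{n_half:.2e}")  # about 2.71e18, i.e. 2.71 quintillion

# Probability of a duplicate among 103 trillion UUIDs.
p_103_trillion = collision_probability(103e12)
print(f"{p_103_trillion:.1e}")  # about 1e-9, i.e. one in a billion
```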
Conclusion
In this article, we talked about the definition of the UUID, its advantages and disadvantages.
More About UUID
If you want to know more about the UUID versions and variants ->
References
https://en.wikipedia.org/wiki/Universally_unique_identifier
https://datatracker.ietf.org/doc/html/rfc4122
https://stackoverflow.com/questions/45399/advantages-and-disadvantages-of-guid-uuid-database-keys