When possible, are clustered indexes that are declared as unique more performant than clustered indexes that aren't declared as unique?

Question

When applicable, will using the UNIQUE constraint for a clustered index help improve performance in any way under any scenario?

Paul White · Answer 1 · 2019-11-26T09:27:16.873

Adding to Tibor's answer, from the optimizer's perspective, key information can be very useful. This sometimes manifests in obvious ways like avoiding a Stream or Hash Aggregate operator to perform a redundant grouping operation.
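As a minimal sketch of the redundant-grouping case: the tables `dbo.GrpU` and `dbo.GrpN` and their columns below are hypothetical names made up for illustration, not taken from the answer. With the unique index, the optimizer knows `id` is already distinct and can normally drop the duplicate-removal step. Without it, expect an aggregate or similar operator in the plan. Compare the plans on your own version to confirm.

```sql
-- Hypothetical tables: identical except for the UNIQUE keyword on the clustered index.
CREATE TABLE dbo.GrpU (id int NOT NULL, val int NOT NULL);
CREATE TABLE dbo.GrpN (id int NOT NULL, val int NOT NULL);

CREATE UNIQUE CLUSTERED INDEX cx_GrpU ON dbo.GrpU (id);
CREATE CLUSTERED INDEX cx_GrpN ON dbo.GrpN (id);

-- Uniqueness is declared: the DISTINCT is redundant, so no aggregate is needed.
SELECT DISTINCT id FROM dbo.GrpU;

-- Uniqueness is not declared: the plan needs an operator to remove possible duplicates.
SELECT DISTINCT id FROM dbo.GrpN;
```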

Less obviously, the presence of a declared and enforced key can allow the optimizer to explore logical alternatives that are only valid when a key or functional dependency is known to the optimizer. The optimizer does not report when it could not explore a particular option due to the lack of a unique index.

From an execution performance point of view, one example of uniqueness helping performance is when a b-tree seek becomes a singleton lookup rather than a seek plus range scan:

For a unique key or constraint, the execution engine knows it will either find a row at the b-tree seek point, or it will not. When the index is not unique, the engine must scan forward or backward from the seek point to find any further matches. There is an overhead to doing that as I demonstrate in my article Enforcing Uniqueness for Performance.
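A rough way to set up that comparison, assuming two hypothetical tables (`dbo.SeekU` and `dbo.SeekN` are illustrative names, not from the article) loaded with the same data. Both queries will typically show a Clustered Index Seek in the plan. The difference lies in what the engine does at the seek point, so it tends to show up only as CPU time over many executions.

```sql
-- Hypothetical tables: same columns, same data, only the index declaration differs.
CREATE TABLE dbo.SeekU (id int NOT NULL, val int NOT NULL);
CREATE TABLE dbo.SeekN (id int NOT NULL, val int NOT NULL);

CREATE UNIQUE CLUSTERED INDEX cx_SeekU ON dbo.SeekU (id);
CREATE CLUSTERED INDEX cx_SeekN ON dbo.SeekN (id);

-- Unique index: an equality seek can stop at the seek point (zero or one row).
SELECT val FROM dbo.SeekU WHERE id = 42;

-- Non-unique index: the engine must also look past the seek point for further matches.
SELECT val FROM dbo.SeekN WHERE id = 42;
```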

In general, you are much better off enforcing a key than not. Doing so protects data integrity, in addition to providing the benefits mentioned above.

Related: Should I mark a composite index as unique if it contains the primary key?

Tibor Karaszi · Accepted Answer · 2019-11-23T14:44:44.500

We have to assume that we are talking about the same column(s) in the index and the same data. (Otherwise we would need to know the two different options.) So, assuming that the only difference is the word UNIQUE in the CREATE INDEX command, with everything else equal:

What is stored won't differ. Yes, SQL Server adds a hidden 4-byte uniqueifier to a non-unique clustered index, but it isn't populated until you get duplicates. And since we assume the same data, there will be no difference at the storage level.

However, the optimizer knows that there cannot be any duplicates, and that can help it pick better plans. Examples include a higher likelihood of getting merge joins and knowledge of selectivity without depending on statistics (histogram or density). How much, you ask? Impossible to say, since it all depends on how much the resulting plans end up differing.
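To make the selectivity point concrete, here is a small sketch. The table names `dbo.EstU` and `dbo.EstN` are hypothetical, and both tables are assumed to hold the same data. With a local variable the optimizer cannot use the histogram, so for the non-unique index it falls back on density, while for the unique index it can assume at most one row.

```sql
-- Hypothetical tables: the only difference is the word UNIQUE.
CREATE TABLE dbo.EstU (id int NOT NULL, val int NOT NULL);
CREATE TABLE dbo.EstN (id int NOT NULL, val int NOT NULL);

CREATE UNIQUE CLUSTERED INDEX cx_EstU ON dbo.EstU (id);
CREATE CLUSTERED INDEX cx_EstN ON dbo.EstN (id);

-- Load the same rows into both tables before comparing estimated row counts.
DECLARE @id int = 42;

-- Unique index: the estimate is one row, regardless of statistics.
SELECT val FROM dbo.EstU WHERE id = @id;

-- Non-unique index: the estimate is derived from statistics (density).
SELECT val FROM dbo.EstN WHERE id = @id;
```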

Piotr Palka · Answer 3 · 2019-11-23T18:03:41.123

Yes. There are many cases. One interesting case I had noticed before (it was some time ago, on SQL Server 2008 R2):

Finding rows in a dimension scenario: having the correct unique index declared on the dimension table improved query performance by about 1.5 times in my case, because SQL Server was able to predict that it would find exactly zero or one rows. Looking at the memory usage, I assume that it loaded the raw fact data into memory only once and that the dimension lookups filled only preallocated memory space. When the index was not unique, it looked like the whole row was copied in memory after every lookup to allow for duplicates.
