97

Saw the code snippet like

Set<Record> instances = new HashSet<Record>();

I am wondering if Hashset is a special kind of set. Any difference between them?

user2864740
  • 57,407
  • 13
  • 129
  • 202
user496949
  • 79,431
  • 144
  • 301
  • 419
  • 11
    You might want to check out the concept of [interfaces](http://download.oracle.com/javase/tutorial/java/concepts/interface.html) – Nikita Rybak Feb 28 '11 at 08:40

8 Answers8

119

A Set represents a generic "set of values". A TreeSet is a set where the elements are sorted (and thus ordered), a HashSet is a set where the elements are not sorted or ordered.

A HashSet is typically a lot faster than a TreeSet.

A TreeSet is typically implemented as a red-black tree (See http://en.wikipedia.org/wiki/Red-black_tree - I've not validated the actual implementation of sun/oracle's TreeSet), whereas a HashSet uses Object.hashCode() to create an index in an array. Access time for a red-black tree is O(log(n)) whereas access time for a HashSet ranges from constant-time to the worst case (every item has the same hashCode) where you can have a linear search time O(n).

JasonMArcher
  • 13,296
  • 21
  • 55
  • 51
Erik
  • 84,860
  • 12
  • 192
  • 185
  • Additionally, there are these general-purpose-implementations: LinkedHashSet (a variant of HashSet which preserves some order for the Iterator), ConcurrentSkipListSet (a threadsave SortedSet implementation), CopyOnWriteArraySet (a thread-safe variant optimized for "lots of reads, very seldom writes"), EnumSet (which works only on enum types for the elements, but then is even faster than HashSet). – Paŭlo Ebermann Feb 28 '11 at 22:36
  • 8
    @Erik: I request to edit your answer. TreeSet is sorted not ordered. HashSet = Unordered, TreeSet = sorted, LinkedHashSet = ordered. Please modify your answer accordingly – Rais Alam Jan 04 '13 at 05:40
  • Hashset can be slower if the implementation of hashCode is bad (e.g. always return the same hashcode) – Romain Hautefeuille Mar 30 '17 at 01:52
39

The HashSet is an implementation of a Set.

Joachim Sauer
  • 291,719
  • 55
  • 540
  • 600
vaugham
  • 1,763
  • 1
  • 17
  • 36
  • 34
    I don't understand this comment. The question is "what's the difference" and not "what's the relationship between". – jambox Apr 02 '16 at 14:50
  • 8
    He explained the difference, Set is the interface, HashSet is the implementation of that interface. Therefore they aren't different implementations, simply HashSet is one of the implementations of Set (the other implementation being TreeSet). – AggieDev Sep 09 '16 at 19:49
  • sounds like a valid answer to me – Romain Hautefeuille Mar 30 '17 at 01:52
  • 6
    Left you a downvote because you didn't answer the question at all. In the future I recommend you add some documentation, examples, and comparison. Just writing a single sentence, and most of the content is just links to elsewhere is *NOT* how you answer questions on Stack Overflow. – Urda Sep 21 '17 at 19:50
  • 3
    This question has been answered, 6 years ago, (see above) but thanks you. – vaugham Mar 01 '18 at 16:13
  • Time isn't a metric on something like this, so be it 6 seconds or 6 years it doesn't matter. *But thanks.* – Urda Aug 03 '18 at 18:18
  • 4
    I disagree, the post date is a relevant piece of information in this case : As a complete answer was already given at that time (years ago) ; there was nothing more to add. It would have been a doublon and it's against SO's etiquette. Please open a new anwser and contribute if you feel that something is missing. – vaugham Aug 04 '18 at 20:05
21

Set is a collection that contains no duplicate elements. Set is an interface.

HashSet implements the Set interface, backed by a hash table (actually a HashMap instance).

Since HashSet is one of the specific implementations of Set interface.

ASet can be any of following since it was implemented by below classes

ConcurrentSkipListSet : A scalable concurrent NavigableSet implementation based on a ConcurrentSkipListMap. The elements of the set are kept sorted according to their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used.

CopyOnWriteArraySet : A Set that uses an internal CopyOnWriteArrayList for all of its operations.

EnumSet : A specialized Set implementation for use with enum types. All of the elements in an enum set must come from a single enum type that is specified, explicitly or implicitly, when the set is created.

TreeSet :A NavigableSet implementation based on a TreeMap. The elements are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used.

LinkedHashSet: ash table and linked list implementation of the Set interface, with predictable iteration order. This implementation differs from HashSet in that it maintains a doubly-linked list running through all of its entries.

But HashSet can be only LinkedHashSet since LinkedHashSet subclasses HashSet

Ravindra babu
  • 45,953
  • 8
  • 231
  • 206
18

The question has been answered, but I haven't seen the answer to why the code mentions both types in the same code.

Typically, you want to code against interfaces which in this case is Set. Why? Because if you reference your object through interfaces always (except the new HashSet()) then it is trivial to change the implementation of the object later if you find it would be better to do so because you've only mentioned it once in your code base (where you did new HashSet()).

Tony L.
  • 15,516
  • 8
  • 65
  • 65
MeBigFatGuy
  • 27,804
  • 7
  • 59
  • 64
9

Set is the general interface to a set-like collection, while HashSet is a specific implementation of the Set interface (which uses hash codes, hence the name).

gmw
  • 407
  • 3
  • 13
3

Set is a parent interface of all set classes like TreeSet, LinkedHashSet etc.

HashSet is a class implementing Set interface.

Umesh K
  • 12,846
  • 22
  • 85
  • 126
0

HashSet is a class derived from Set interface. As a derived class of Set, the HashSet attains the properties of Set. Important and the most frequently used derived classes of Set are HashSet and TreeSet.

Hemlata Gehlot
  • 349
  • 2
  • 12
-1

**

  • Set:

** It is an interface which is a subtype of Collection interface, just like LIST and QUEUE.

Set has below 3 subclasses, it is used to store multiple objects without duplicates.

  1. HashSet
  2. LinkedHashSet
  3. TreeSet(which implements SortedSet interface)

**

  • HashSet:

**

Can use one NULL value(as Duplicate is not allowed), data is stored randomly as it does not maintain sequence.