Does anyone know if there is a good equivalent to Java's Set collection in C#? I know that you can somewhat mimic a set using a Dictionary or a HashTable by populating but ignoring the values, but that's not a very elegant way.
- 52,899
- 67
- 179
- 237
7 Answers
If you're using .NET 3.5, you can use HashSet<T>. It's true that .NET doesn't cater for sets as well as Java does though.
The Wintellect PowerCollections may help too.
-
3does anyone know why it's called HashSet and not just Set? – Wouter Jun 24 '09 at 07:57
-
17I suspect that Set is a keyword in some languages, which could cause issues. – Jon Skeet Jun 24 '09 at 08:10
-
1`set` is also a keyword in C# – Manish Sinha Apr 17 '10 at 05:42
-
5@Manish: No it's not. See section 2.4.3 of the C# 3 spec. It only has special meaning for properties. – Jon Skeet Apr 17 '10 at 06:40
-
1@Jon Thanks for the enlightenment. I always thought it was a keyword. Downloaded the C# 3 spec. – Manish Sinha Apr 17 '10 at 07:03
-
32The reason for calling it HashSet, instead of just Set, is the same as in Java- "Set" describes an interface, whereas "HashSet" describes an implementation- specifically, this is a Set backed by a Hash Map. In this way, we know (or should strongly expect) that insert and access should take O(1) access time, vs a "LinkedListSet" which would lead us to expect insert and access to take O(n) time. – David Souther Jul 16 '10 at 16:16
-
6what do you mean ".NET doesn't cater for sets as well as Java does though."? Is this Set somehow imperfect compared to Java's ? – Louis Rhys Feb 28 '11 at 03:13
-
38@Louis: Which Set are you talking about? Java has *lots* of different implementations of Set for various situations. .NET had one in .NET 3.5 (HashSet) and two in .NET 4 (HashSet and SortedSet). The fact that we had to wait until .NET 3.5 to start with is pretty surprising. – Jon Skeet Feb 28 '11 at 08:20
Try HashSet:
The HashSet(Of T) class provides high-performance set operations. A set is a collection that contains no duplicate elements, and whose elements are in no particular order...
The capacity of a HashSet(Of T) object is the number of elements that the object can hold. A HashSet(Of T) object's capacity automatically increases as elements are added to the object.
The HashSet(Of T) class is based on the model of mathematical sets and provides high-performance set operations similar to accessing the keys of the Dictionary(Of TKey, TValue) or Hashtable collections. In simple terms, the HashSet(Of T) class can be thought of as a Dictionary(Of TKey, TValue) collection without values.
A HashSet(Of T) collection is not sorted and cannot contain duplicate elements...
- 6,205
- 103
- 52
- 72
- 2,761
- 2
- 19
- 17
-
9Unfortunately, HashSets weren't added until just recently. If you're working in an older version of the framework, you're going to have to stick with your munged Dictionary<> or Hashtable. – Greg D Oct 08 '08 at 16:36
If you're using .NET 4.0 or later:
In the case where you need sorting then use SortedSet<T>. Otherwise if you don't, then use HashSet<T> since it's O(1) for search and manipulate operations. Whereas SortedSet<T> is O(log n) for search and manipulate operations.
- 9,368
- 5
- 53
- 66
I use Iesi.Collections http://www.codeproject.com/KB/recipes/sets.aspx
It's used in lot of OSS projects, I first came across it in NHibernate
- 4,808
- 9
- 33
- 45
I use a wrapper around a Dictionary<T, object>, storing nulls in the values. This gives O(1) add, lookup and remove on the keys, and to all intents and purposes acts like a set.
- 44,124
- 18
- 127
- 185
-
2You must mean it's roughly equivalent to std::unordered_set. std::set is ordered. For example, you can quickly find the start and end point of a range and iterate from the start to the end, visiting items in key-order. SortedDictionary *is* roughly equivalent to std::set. – doug65536 Feb 04 '13 at 07:43
Have a look at PowerCollections over at CodePlex. Apart from Set and OrderedSet it has a few other usefull collection types such as Deque, MultiDictionary, Bag, OrderedBag, OrderedDictionary and OrderedMultiDictionary.
For more collections, there is also the C5 Generic Collection Library.
- 5,332
- 2
- 25
- 28
I know this is an old thread, but I was running into the same problem and found HashSet to be very unreliable because given the same seed, GetHashCode() returned different codes. So, I thought, why not just use a List and hide the add method like this
public class UniqueList<T> : List<T>
{
public new void Add(T obj)
{
if(!Contains(obj))
{
base.Add(obj);
}
}
}
Because List uses the Equals method solely to determine equality, you can define the Equals method on your T type to make sure you get the desired results.
- 15
- 2
-
13The reason you wouldn't want to use this is because `List.Contains` is of `O(n)` complexity which means that your `Add` method now becomes `O(n)` complexity as well. Assuming the inner collection doesn't need to be resized, `Add` for both `List` and `HashMap` should be of `O(1)` complexity. TLDR: This will work, but it's hacky and less efficient. – Richard Marskell - Drackir Feb 24 '12 at 20:28
-
6Sure, if your objects don't return an appropriate value for GetHashCode, you should not put them into a hash-based container. It would be better to fix GetHashCode than to use a less efficient container. – bmm6o Mar 05 '12 at 17:40