I was about to suggest using the UniqueBy operator from the MoreLinq library, but paradoxically this operator is missing. It seems that even when you have more Linq, you can never have enough.
Uniqueness is not the same as distinctness. Both DistinctBy and UniqueBy return a sequence in which no key is repeated, but they differ in how they treat items whose keys occur more than once. DistinctBy keeps the first of the duplicates, while UniqueBy eliminates them altogether.
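To make the difference concrete, here is a small sketch that emulates both behaviors with the standard GroupBy operator. The class name, method names, and sample words are made up for illustration and are not part of MoreLinq or any other library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctVsUniqueDemo
{
    // DistinctBy-like behavior: keep the first item of every key group.
    public static List<string> DistinctByFirstLetter(IEnumerable<string> words) =>
        words.GroupBy(w => w[0]).Select(g => g.First()).ToList();

    // UniqueBy-like behavior: keep only the items whose key occurs exactly once.
    public static List<string> UniqueByFirstLetter(IEnumerable<string> words) =>
        words.GroupBy(w => w[0]).Where(g => g.Count() == 1).Select(g => g.First()).ToList();

    public static void Main()
    {
        var words = new[] { "apple", "avocado", "banana", "cherry", "cranberry" };
        Console.WriteLine(string.Join(", ", DistinctByFirstLetter(words))); // apple, banana, cherry
        Console.WriteLine(string.Join(", ", UniqueByFirstLetter(words)));   // banana
    }
}
```

Only "banana" survives the second query, because it is the only word whose first letter is not shared with another word.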
Below is my implementation of the UniqueBy extension method. It should be more efficient than a GroupBy-based approach, because it doesn't create IGrouping objects during the enumeration of the source, and it stores at most one item per key instead of buffering every element of every group.
/// <summary>
/// Returns all unique elements of the given source, where uniqueness is
/// determined according to a given key selector. Duplicate elements are removed.
/// </summary>
public static IEnumerable<TSource> UniqueBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    IEqualityComparer<TKey> keyComparer = default)
{
    // Arguments validation omitted
    keyComparer = keyComparer ?? EqualityComparer<TKey>.Default;
    // Maps each key to the first item seen with that key, plus a flag
    // that stays true only while the key has been seen exactly once.
    var dictionary = new Dictionary<TKey, (TSource Item, bool Unique)>(keyComparer);
    foreach (TSource item in source)
    {
        var key = keySelector(item);
        if (!dictionary.TryGetValue(key, out var entry))
        {
            // First occurrence of this key.
            dictionary[key] = (item, true);
        }
        else
        {
            // Repeated key: mark it as non-unique and release the stored item.
            if (entry.Unique) dictionary[key] = (default, false);
        }
    }
    // Emit only the items whose key was never repeated.
    foreach (var (item, unique) in dictionary.Values)
    {
        if (unique) yield return item;
    }
}
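One caveat: the documentation of Dictionary<TKey, TValue> states that the order in which its items are enumerated is undefined, so the method above makes no promise about the order of the results. If the order of the source must be preserved, a possible variant is sketched below. The name UniqueByOrdered is hypothetical, and the sketch trades memory for ordering, because it buffers every item instead of one item per key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UniqueByOrderedExtensions
{
    // Hypothetical order-preserving variant: yields the items whose key occurs
    // exactly once, in the order in which they appear in the source.
    public static IEnumerable<TSource> UniqueByOrdered<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> keyComparer = default)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
        return Iterator();

        IEnumerable<TSource> Iterator()
        {
            // A null comparer makes the dictionary use the default comparer.
            var counts = new Dictionary<TKey, int>(keyComparer);
            var buffer = new List<(TSource Item, TKey Key)>();
            foreach (var item in source)
            {
                var key = keySelector(item);
                counts.TryGetValue(key, out var seen); // seen is 0 for a new key
                counts[key] = seen + 1;
                buffer.Add((item, key));
            }
            // Second pass over the buffered items, in source order.
            foreach (var (item, key) in buffer)
            {
                if (counts[key] == 1) yield return item;
            }
        }
    }
}
```

For example, calling UniqueByOrdered(w => w[0]) on the words "apple", "avocado", "banana", "cherry", "cranberry", "date" yields "banana" followed by "date".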