Starting from .NET 6, a new DistinctBy LINQ operator is available:
    public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector);
Returns distinct elements from a sequence according to a specified key selector function.
Usage example:
    List<Item> distinctList = listWithDuplicates
        .DistinctBy(i => i.Id)
        .ToList();
There is also an overload that has an IEqualityComparer<TKey> parameter.
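As a minimal sketch of that overload, here is a case-insensitive de-duplication by a string key. The Product record and the DistinctBySku helper are hypothetical names that I made up for illustration; only DistinctBy and StringComparer.OrdinalIgnoreCase come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type, used only for illustration.
public record Product(string Sku, string Name);

public static class DistinctByComparerDemo
{
    // Treats SKUs that differ only by letter case as the same key.
    public static List<Product> DistinctBySku(IEnumerable<Product> products) =>
        products
            .DistinctBy(p => p.Sku, StringComparer.OrdinalIgnoreCase)
            .ToList();

    public static void Main()
    {
        var products = new List<Product>
        {
            new("ab-1", "Widget"),
            new("AB-1", "Widget (duplicate SKU, different case)"),
            new("cd-2", "Gadget"),
        };

        List<Product> distinct = DistinctBySku(products);
        Console.WriteLine(distinct.Count); // 2
    }
}
```

Without the comparer, "ab-1" and "AB-1" would count as different keys and all three items would be kept.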
Alternative: In case creating a new List<T> is not desirable, here is a RemoveDuplicates extension method for the List<T> class:
    /// <summary>
    /// Removes all the elements that are duplicates of previous elements,
    /// according to a specified key selector function.
    /// </summary>
    /// <returns>
    /// The number of elements removed.
    /// </returns>
    public static int RemoveDuplicates<TSource, TKey>(
        this List<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> keyComparer = null)
    {
        var hashSet = new HashSet<TKey>(keyComparer);
        return source.RemoveAll(item => !hashSet.Add(keySelector(item)));
    }
This method is efficient (O(n)) but somewhat dangerous: if the keySelector lambda throws for some item, the contents of the List<T> can be left corrupted. The built-in RemoveAll method has the same problem¹. So if the keySelector lambda is not fail-proof, invoke RemoveDuplicates inside a try block, and discard the potentially corrupted list in the catch block.
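A rough sketch of that guarded invocation follows. The DeduplicateOrDiscard helper is a hypothetical name, and returning null is just one possible way to signal that the list was discarded; the RemoveDuplicates method is repeated so that the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;

public static class ListExtensions
{
    // Same method as above, repeated so the sketch is self-contained.
    public static int RemoveDuplicates<TSource, TKey>(
        this List<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> keyComparer = null)
    {
        var hashSet = new HashSet<TKey>(keyComparer);
        return source.RemoveAll(item => !hashSet.Add(keySelector(item)));
    }
}

public static class GuardedRemoveDuplicatesDemo
{
    // Hypothetical helper: returns the de-duplicated list, or null when
    // keySelector threw and the list can no longer be trusted.
    public static List<string> DeduplicateOrDiscard(
        List<string> items, Func<string, int> keySelector)
    {
        try
        {
            items.RemoveDuplicates(keySelector);
            return items;
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine($"Discarding list: {ex.Message}");
            return null; // the caller must not keep using 'items'
        }
    }

    public static void Main()
    {
        // Keys are the string lengths 1, 2, 2, 3, so "cc" is removed.
        var ok = DeduplicateOrDiscard(
            new List<string> { "a", "bb", "cc", "ddd" }, s => s.Length);
        Console.WriteLine(ok.Count); // 3

        // The null element makes s.Length throw a NullReferenceException.
        var bad = DeduplicateOrDiscard(
            new List<string> { "a", "bb", null, "cc" }, s => s.Length);
        Console.WriteLine(bad is null); // True
    }
}
```

Whether to log, rethrow, or rebuild the list from its original source in the catch block depends on the application; the only important point is that the mutated instance is not used afterwards.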
¹ The List<T> class is backed by an internal _items array. The RemoveAll method invokes the Predicate<T> match for each item in the list, moving the values stored in _items along the way (source code). If the predicate throws, RemoveAll exits immediately and leaves _items in a corrupted state. I've posted an issue on GitHub regarding the corruptive behavior of this method, and the feedback I received was that the implementation should not be fixed and the behavior should not be documented.
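To observe the behavior described in the footnote, something like the following small experiment can be used. It is only a sketch: the predicate deliberately throws on one element, and the class and method names are made up. The exact contents printed after the failure depend on the runtime's RemoveAll implementation, so no particular output is shown here.

```csharp
using System;
using System.Collections.Generic;

public static class RemoveAllFailureDemo
{
    // Runs RemoveAll with a predicate that matches 2 and throws on 5.
    // Returns true if the exception propagated out of RemoveAll.
    public static bool RunFailingRemoveAll(List<int> list)
    {
        try
        {
            list.RemoveAll(x =>
            {
                if (x == 5)
                    throw new InvalidOperationException("Simulated predicate failure.");
                return x == 2;
            });
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var list = new List<int> { 1, 2, 3, 4, 5 };
        bool threw = RunFailingRemoveAll(list);

        Console.WriteLine($"Predicate threw: {threw}");
        // Inspect the state that RemoveAll left behind.
        Console.WriteLine($"Count: {list.Count}, contents: {string.Join(", ", list)}");
    }
}
```

Comparing the printed contents with the original 1, 2, 3, 4, 5 shows whether elements were already shifted before the exception interrupted the operation.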