Remove objects with a duplicate property from List

Question

I have a List of objects in C#. All of the objects contain a property ID. There are several objects that have the same ID property.

How can I trim the List (or make a new List) where there is only one object per ID property?

[Any additional duplicates are dropped out of the List]

score 199 · Accepted Answer · edited Sep 29 '20 at 09:39


If you want to avoid using a third-party library, you could do something like:

var bar = fooArray.GroupBy(x => x.Id).Select(x => x.First()).ToList();

That groups the array by the Id property, then selects the first entry in each group.
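As a small self-contained sketch of the same idea, the sample below uses named tuples in place of the real objects. The tuple shape and the sample values are assumptions made only for illustration; only the GroupBy/Select/First/ToList chain comes from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: two entries share Id 1.
var fooArray = new[]
{
    (Id: 1, Name: "first"),
    (Id: 2, Name: "second"),
    (Id: 1, Name: "third"),
};

// Group by Id, then keep the first element of each group.
var bar = fooArray.GroupBy(x => x.Id).Select(g => g.First()).ToList();

foreach (var item in bar)
{
    Console.WriteLine($"{item.Id}: {item.Name}");
}
// prints "1: first" then "2: second"
```

With LINQ to Objects, groups come out in the order their keys first appear, so the survivor for each Id is the earliest occurrence in the source.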

answered Apr 03 '12 at 12:25 by Daniel Mann; edited Sep 29 '20 at 09:39 by Daniel Lord

This worked perfectly. Here is my implementation: List uniqueRows = inputRows.GroupBy(x => x.Id).Select(x => x.First()).ToList(); – Baxter Apr 03 '12 at 13:24

Glad to help! One note: the explicit type argument on your `ToList()` is redundant. You should be able to just do `.ToList()` – Daniel Mann Apr 03 '12 at 13:43

You are right, it works with just ToList() and no explicit type argument. – Baxter Apr 03 '12 at 16:20

A good alternative to trying to figure out why Distinct with IEquatable is not working. – Ariwibawa Nov 19 '20 at 04:55

score 30 · Answer 2 · edited Nov 30 '18 at 14:26


MoreLINQ's DistinctBy() will do the job; it lets you use an object property to determine distinctness. Unfortunately, the built-in LINQ Distinct() is not flexible enough.

var uniqueItems = allItems.DistinctBy(i => i.Id);

DistinctBy()

Returns all distinct elements of the given source, where "distinctness" is determined via a projection and the default equality comparer for the projected type.

PS: Credits to Jon Skeet for sharing this library with the community.
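For readers who want the behaviour without the dependency, here is a rough sketch of how a key-based distinct can be written by hand. It is not MoreLINQ's actual source: the helper name DistinctByKey and the tuple sample data are hypothetical, and it uses only the default equality comparer for the key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: two entries share Id 1.
var allItems = new List<(int Id, string Name)> { (1, "a"), (2, "b"), (1, "c") };

var uniqueItems = DistinctByKey(allItems, i => i.Id).ToList();

// Hypothetical helper: yields an item only the first time its key is seen.
static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
    IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    var seen = new HashSet<TKey>();
    foreach (var item in source)
    {
        // HashSet<T>.Add returns false when the key is already present.
        if (seen.Add(keySelector(item)))
        {
            yield return item;
        }
    }
}
```

The HashSet makes this a single pass over the source, and because it is an iterator the work is deferred until the result is enumerated.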

answered Apr 03 '12 at 12:26 by sll; edited Nov 30 '18 at 14:26 by Kolappan N

I think this is a great solution but am trying to avoid using a 3rd party library for this. Thank You. – Baxter Apr 03 '12 at 13:26

Fortunately you can see how it is implemented – sll Apr 03 '12 at 13:40

score 7 · Answer 3 · edited Nov 30 '18 at 10:22

var list = GetListFromSomeWhere();
var list2 = GetListFromSomeWhere();
list.AddRange(list2);

....
...
var distinctedList = list.DistinctBy(x => x.ID).ToList();

DistinctBy here comes from MoreLINQ, which is available on GitHub.

Or, if you don't want to use external DLLs for some reason, you can use this Distinct overload:

public static IEnumerable<TSource> Distinct<TSource>(
    this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)

Usage:

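public class FooComparer : IEqualityComparer<Foo>
{
    // Two Foo instances are considered equal if their IDs are equal.
    public bool Equals(Foo x, Foo y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;

        // Check whether either of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        return x.ID == y.ID;
    }

    // Required by IEqualityComparer<Foo>; must agree with Equals,
    // so it hashes only the ID (assumes ID itself is never null).
    public int GetHashCode(Foo obj)
    {
        return Object.ReferenceEquals(obj, null) ? 0 : obj.ID.GetHashCode();
    }
}

list.Distinct(new FooComparer());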

Theodor Zoulias · Answer 4 · 2022-04-20T14:20:55.623

Starting from .NET 6, a new DistinctBy LINQ operator is available:

public static IEnumerable<TSource> DistinctBy<TSource,TKey> (
    this IEnumerable<TSource> source,
    Func<TSource,TKey> keySelector);

Returns distinct elements from a sequence according to a specified key selector function.

Usage example:

List<Item> distinctList = listWithDuplicates
    .DistinctBy(i => i.Id)
    .ToList();

There is also an overload that has an IEqualityComparer<TKey> parameter.

Alternative: In case creating a new List<T> is not desirable, here is a RemoveDuplicates extension method for the List<T> class:

/// <summary>
/// Removes all the elements that are duplicates of previous elements,
/// according to a specified key selector function.
/// </summary>
/// <returns>
/// The number of elements removed.
/// </returns>
public static int RemoveDuplicates<TSource, TKey>(
    this List<TSource> source,
    Func<TSource, TKey> keySelector,
    IEqualityComparer<TKey> keyComparer = null)
{
    var hashSet = new HashSet<TKey>(keyComparer);
    return source.RemoveAll(item => !hashSet.Add(keySelector(item)));
}

This method is efficient (O(n)) but a bit dangerous, because it has the potential to corrupt the contents of the List<T> in case the keySelector lambda fails for some item. The same problem exists with the built-in RemoveAll method¹. So in case the keySelector lambda is not fail-proof, the RemoveDuplicates method should be invoked in a try block that has a catch block where the potentially corrupted list is discarded.

¹ The List<T> class is backed by an internal _items array. The RemoveAll method invokes the Predicate<T> match for each item in the list, moving values stored in the _items along the way (source code). In case of an exception the RemoveAll just exits immediately, leaving the _items in a corrupted state. I've posted an issue on GitHub regarding the corruptive behavior of this method, and the feedback that I've got was that neither the implementation should be fixed, nor the behavior should be documented.
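One way to follow that advice is sketched below, with the RemoveAll call inlined rather than going through the extension method. The tuple sample data, the deliberately failing key selector, and the idea of keeping a snapshot to fall back on are all assumptions for illustration; the answer itself only says the possibly corrupted list should be discarded.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data: the second entry duplicates Id 1.
var items = new List<(int Id, string Name)>
{
    (1, "a"), (1, "b"), (2, "c"), (3, "boom"),
};

// Hypothetical key selector that fails for one particular item.
Func<(int Id, string Name), int> keySelector = item =>
    item.Name == "boom" ? throw new InvalidOperationException("bad item") : item.Id;

// Keep a copy so the original contents can be restored on failure.
var snapshot = new List<(int Id, string Name)>(items);
var seen = new HashSet<int>();

try
{
    // Same predicate as RemoveDuplicates: remove items whose key was already seen.
    items.RemoveAll(item => !seen.Add(keySelector(item)));
}
catch (InvalidOperationException)
{
    // The list may be left in an inconsistent state, so discard it
    // and fall back to the untouched snapshot.
    items = snapshot;
}
```

The snapshot costs an extra O(n) copy, so it only makes sense when the key selector can realistically throw.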

score 3 · Answer 5 · edited Jul 28 '20 at 15:29

Not sure if anyone is still looking for any additional ways to do this. But I've used this code to remove duplicates from a list of User objects based on matching ID numbers.

private ArrayList RemoveSearchDuplicates(ArrayList SearchResults)
{
    ArrayList TempList = new ArrayList();

    foreach (User u1 in SearchResults)
    {
        bool duplicatefound = false;
        foreach (User u2 in TempList)
            if (u1.ID == u2.ID)
                duplicatefound = true;

        if (!duplicatefound)
            TempList.Add(u1);
    }
    return TempList;
}

Call: SearchResults = RemoveSearchDuplicates(SearchResults);

This is pointlessly O(n ^2) when regular GroupBy is just O(n)... — Alexei Levenkov, Nov 17 '20 at 21:42
