23

I have a collection of person objects (IEnumerable) and each person has an age property.

I want to generate stats on the collection, such as Max, Min, Average, Median, etc., for this age property.

What is the most elegant way of doing this using LINQ?

Edward Karak
leora
  • 1
    Be careful if the data is coming from a database, as LINQ may read the data more than once. In your case, however, LINQ should be a good tool, as you seem to have your collection in RAM. – Ian Ringrose Nov 09 '10 at 14:14

4 Answers

44

Here is a complete, generic implementation of Median that properly handles empty collections and nullable types. It is LINQ-friendly in the style of Enumerable.Average, for example:

    double? medianAge = people.Median(p => p.Age);

This implementation returns null when there are no non-null values in the collection, but if you don't like the nullable return type, you could easily change it to throw an exception instead.

public static double? Median<TColl, TValue>(
    this IEnumerable<TColl> source,
    Func<TColl, TValue>     selector)
{
    return source.Select<TColl, TValue>(selector).Median();
}

public static double? Median<T>(
    this IEnumerable<T> source)
{
    if(Nullable.GetUnderlyingType(typeof(T)) != null)
        source = source.Where(x => x != null);

    int count = source.Count();
    if(count == 0)
        return null;

    source = source.OrderBy(n => n);

    int midpoint = count / 2;
    if(count % 2 == 0)
        return (Convert.ToDouble(source.ElementAt(midpoint - 1)) + Convert.ToDouble(source.ElementAt(midpoint))) / 2.0;
    else
        return Convert.ToDouble(source.ElementAt(midpoint));
}
Rand Scullard
  • This, AFAIK, enumerates the source 2 or 3 times: first when `Count()` is called, second (and possibly third time) - when ElementAt is called. – DarkWanderer May 24 '14 at 09:51
  • 8
    You're absolutely right. And -- as always with LINQ -- the impact of this will range from trivial to prohibitive, depending on the nature of your collection. Remember the Rules of Optimization: Rule 1: Don't do it. Rule 2 (for experts only): Don't do it yet. (See http://blog.codinghorror.com/why-arent-my-optimizations-optimizing/) – Rand Scullard May 24 '14 at 14:22
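If the repeated enumeration raised in the comments above matters for your collection, one option is to copy the values into a list and sort it once. This is a minimal sketch, not the answer's own code: the class name `MedianSketch` and the method name `MedianOnce` are hypothetical, and it assumes the selected values are non-nullable and convertible to `double`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MedianSketch
{
    // Hypothetical helper: enumerates the source exactly once.
    public static double? MedianOnce<T>(
        this IEnumerable<T> source,
        Func<T, double> selector)
    {
        // ToList() walks the source a single time; everything after
        // this line works on the in-memory copy.
        List<double> values = source.Select(selector).ToList();
        if (values.Count == 0)
            return null;

        values.Sort();

        int midpoint = values.Count / 2;
        return values.Count % 2 == 0
            ? (values[midpoint - 1] + values[midpoint]) / 2.0  // even: mean of the two middle values
            : values[midpoint];                                // odd: the middle value
    }
}
```

It would be called the same way as the original, for example `double? medianAge = people.MedianOnce(p => p.Age);`. The trade-off is the extra memory for the copied list.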
31
var max = persons.Max(p => p.age);
var min = persons.Min(p => p.age);
var average = persons.Average(p => p.age);

Median, corrected for the case of an even number of elements:

int count = persons.Count();
var orderedPersons = persons.OrderBy(p => p.age);
float median = orderedPersons.ElementAt(count/2).age + orderedPersons.ElementAt((count-1)/2).age;
median /= 2;
Itay Karo
10

Max, Min, and Average are part of LINQ:

int[] ints = new int[]{3,4,5};
Console.WriteLine(ints.Max());
Console.WriteLine(ints.Min());
Console.WriteLine(ints.Average());

Median is easy:

UPDATE

I have added order:

ints.OrderBy(x=>x).Skip(ints.Count()/2).First();

BEWARE

Each of these operations loops over the sequence. ints.Count(), for example, may enumerate the whole sequence, so if the length is already available (here, ints.Length), it is better to use it directly or store it in a variable.

Aliostad
  • 1
    `ints.ElementAt(ints.Count()/2)` – Cheng Chen Nov 09 '10 at 13:58
  • For the median to be right, you need to order the array first. – Itay Karo Nov 09 '10 at 13:59
  • Your implementation of `Median` also assumes that the input has an odd number of elements. It will fail for `{0, 1}` (it will give 0 instead of 0.5). – Mark Byers Nov 09 '10 at 14:00
  • 1
    @Mark - AFAIK Median will always return one of the source array elements – Itay Karo Nov 09 '10 at 14:04
  • 4
    @Itay: No, that's usually called the medoid. *A related concept, in which the outcome is forced to correspond to a member of the sample is the medoid.* http://en.wikipedia.org/wiki/Median The way you are calculating the medoid is biased as it will often return a value that is lower than the median, and never a value that is higher. – Mark Byers Nov 09 '10 at 14:04
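To make the even-count case from this comment thread concrete, here is a small sketch comparing a single-element pick with the mean of the two middle elements. The class and method names are hypothetical. Note that the `Skip(count / 2).First()` form, as the expression appears in the answer above, selects the upper of the two middle elements, so for `{0, 1}` it picks 1 while the median is 0.5.

```csharp
using System;
using System.Linq;

public static class MedianCheck
{
    // Single-element pick, mirroring the answer's expression.
    public static int PickMiddle(int[] ints)
    {
        return ints.OrderBy(x => x).Skip(ints.Length / 2).First();
    }

    // Mean of the two middle elements; the two indices coincide
    // when the count is odd, so this also covers that case.
    // Assumes a non-empty array.
    public static double Median(int[] ints)
    {
        int[] sorted = ints.OrderBy(x => x).ToArray();
        int count = sorted.Length;
        return (sorted[(count - 1) / 2] + sorted[count / 2]) / 2.0;
    }
}
```

For an odd count, such as `{3, 4, 5}`, both methods agree on 4.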
4

Get the median using LINQ (works for an even or odd number of elements):

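int count = persons.Count();

// Sorted ages; the query is re-evaluated each time it is consumed.
var ages = persons.Select(x => x.Age).OrderBy(x => x);

// Declare median once, outside the branches, so that it compiles and stays in scope.
// Throws InvalidOperationException if persons is empty.
double median = count % 2 == 0
    ? ages.Skip((count / 2) - 1).Take(2).Average()  // even: mean of the two middle ages
    : ages.ElementAt(count / 2);                    // odd: the middle age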
Josh Withee