Deborah's Developer MindScape






Tips and Techniques for Web and .NET developers.

May 7, 2010

LINQ: Mean, Median, and Mode

Filed under: C#, Lambda Expressions, LINQ, VB.NET @ 2:23 am

If you are doing any type of statistical analysis, you probably need to calculate the mean, median, and mode. There are many places on the Web where you can find these calculations. This post differs from most in that it uses LINQ and lambda expressions.

Mean is the arithmetic average of a set of numbers: the sum of the values divided by how many values there are. This one is easy with LINQ because of the Average method.

In C#:

// Requires: using System.Diagnostics; using System.Linq;
int[] numbers = { 4, 4, 4, 4, 3, 2, 2, 2, 1 };

double mean = numbers.Average();
Debug.WriteLine("Mean: " + mean);

In VB:

Dim numbers() As Integer = {4, 4, 4, 4, 3, 2, 2, 2, 1}

Dim mean As Double = numbers.Average()
Debug.WriteLine("Mean: " & mean)

The result is:

Mean: 2.88888888888889

This code uses the Average extension method, which the System.Linq.Enumerable class defines for IEnumerable<T> sequences, to calculate the mean, or average, of the numbers.
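One detail worth knowing about Average: for a sequence of int it returns a double, and it throws an InvalidOperationException when the sequence is empty. The sketch below guards against the empty case. The MeanExample class, the MeanOrDefault name, and the fallback parameter are hypothetical, introduced only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MeanExample
{
    // Hypothetical helper: returns the mean, or a caller-supplied
    // fallback when the sequence is empty (Average would throw).
    public static double MeanOrDefault(IEnumerable<int> source, double fallback = 0.0)
    {
        // Materialize once so the sequence is not enumerated twice.
        List<int> items = source.ToList();
        return items.Count == 0 ? fallback : items.Average();
    }
}
```

For example, MeanExample.MeanOrDefault(new[] { 1, 2, 3, 4 }) returns 2.5, and passing an empty array returns the fallback instead of throwing.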

Median is the middle number of a sorted set of numbers. If there is an even number of entries, it is the average of the two middle numbers.

In C#:

int[] numbers = { 4, 4, 4, 4, 3, 2, 2, 2, 1 };

int numberCount = numbers.Count();
int halfIndex = numbers.Count()/2;
var sortedNumbers = numbers.OrderBy(n=>n);
double median;
if ((numberCount % 2) == 0)
{
    // Divide by 2.0 so integer division does not truncate the average.
    median = (sortedNumbers.ElementAt(halfIndex) +
        sortedNumbers.ElementAt(halfIndex - 1)) / 2.0;
} else {
    median = sortedNumbers.ElementAt(halfIndex);
}
Debug.WriteLine("Median is: " + median);

In VB:

Dim numbers() As Integer = {4, 4, 4, 4, 3, 2, 2, 2, 1}

Dim numberCount As Integer = numbers.Count
Dim halfIndex As Integer = numbers.Count \ 2
Dim sortedNumbers = numbers.OrderBy(Function(n) n)
Dim median As Double
If (numberCount Mod 2 = 0) Then
    median = (sortedNumbers.ElementAt(halfIndex) +
       sortedNumbers.ElementAt(halfIndex - 1)) / 2
Else
    median = sortedNumbers.ElementAt(halfIndex)
End If
Debug.WriteLine("Median is: " & median)

The result is:

Median is: 3

This code first counts the numbers and divides the count by 2 to find the middle of the list. The VB code uses the backslash (\) operator to perform an integer division. In C#, the forward slash (/) already performs an integer division when both operands are integers.
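A small sketch of how the two kinds of division behave in C#. Integer division is what you want for the middle index, but averaging two middle values needs a floating-point operand to keep the fractional part. The class and method names are illustrative only.

```csharp
public static class DivisionExample
{
    // Both operands are int, so C# discards the fractional part.
    public static int IntegerHalf(int count)
    {
        return count / 2;
    }

    // One operand is a double, so the fractional part is kept.
    public static double AverageOfTwo(int a, int b)
    {
        return (a + b) / 2.0;
    }
}
```

Here DivisionExample.IntegerHalf(9) is 4, and DivisionExample.AverageOfTwo(2, 3) is 2.5, whereas the all-integer expression (2 + 3) / 2 evaluates to 2.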

It then sorts the numbers in order using the OrderBy extension method and a Lambda expression that simply orders by the numbers.

The last step is to get the element at the middle (if odd) or the average of the two middle elements (if even). The result is the median.
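The steps above can also be packaged as a reusable method. This is a minimal sketch, not the only way to do it: because OrderBy is deferred, each ElementAt call evaluates the ordered query again, so this version sorts once into an array and indexes it directly. The MedianExample class and Median method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MedianExample
{
    // Hypothetical helper: sorts once, then picks the middle value(s).
    public static double Median(IEnumerable<int> source)
    {
        int[] sorted = source.OrderBy(n => n).ToArray();
        if (sorted.Length == 0)
        {
            throw new InvalidOperationException("Sequence contains no elements.");
        }

        int halfIndex = sorted.Length / 2;
        if (sorted.Length % 2 == 0)
        {
            // Even count: average the two middle values. The cast to
            // double avoids both integer truncation and int overflow.
            return ((double)sorted[halfIndex - 1] + sorted[halfIndex]) / 2.0;
        }

        // Odd count: the single middle value.
        return sorted[halfIndex];
    }
}
```

With the array used in this post, MedianExample.Median(numbers) returns 3. For an even-length input such as { 1, 2, 3, 4 } it returns 2.5.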

Mode is the number that occurs the largest number of times.

In C#:

int[] numbers = { 4, 4, 4, 4, 3, 2, 2, 2, 1 };

var mode = numbers.GroupBy(n => n)
    .OrderByDescending(g => g.Count())
    .Select(g => g.Key)
    .FirstOrDefault();
Debug.WriteLine("Mode is: " + mode);

In VB:

Dim numbers() As Integer = {4, 4, 4, 4, 3, 2, 2, 2, 1}

Dim mode = numbers.GroupBy(Function(n) n).
     OrderByDescending(Function(g) g.Count).
     Select(Function(g) g.Key).FirstOrDefault
Debug.WriteLine("Mode is: " & mode)

The result is:

Mode is: 4

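This code uses the GroupBy extension method to group equal numbers together. It then orders the groups by their count in descending order and selects the key of the first group. This provides the number that occurs the most times. If several numbers tie for the highest count, only one of them is returned.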
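A set of numbers can have more than one mode. The following sketch returns every value tied for the highest count instead of just the first. It is one possible approach under that assumption, and the ModeExample class and Modes method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ModeExample
{
    // Hypothetical helper: returns every value tied for the highest count.
    public static List<int> Modes(IEnumerable<int> source)
    {
        // Count the occurrences of each distinct value.
        var counts = source
            .GroupBy(n => n)
            .Select(g => new { Value = g.Key, Count = g.Count() })
            .ToList();

        // An empty input has no mode.
        if (counts.Count == 0)
        {
            return new List<int>();
        }

        // Keep only the values whose count equals the maximum.
        int maxCount = counts.Max(c => c.Count);
        return counts
            .Where(c => c.Count == maxCount)
            .Select(c => c.Value)
            .ToList();
    }
}
```

For { 4, 4, 4, 4, 3, 2, 2, 2, 1, 1, 1, 1 } this returns 4 and 1, since both occur four times. For the array used in this post it returns only 4.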

Use these techniques whenever you need to calculate the mean, median, or mode.

Enjoy!

5 Comments

  1.   Andrew Morton - May 13, 2010 @ 3:02 am

    Hello Deborah,

    I don’t know if you noticed my post in the MS forums regarding the mode, but it is worth noting that a set of numbers can have more than one mode, so in the other style of LINQ:

    Dim numbers() As Integer = {4, 4, 4, 4, 3, 2, 2, 2, 1, 1, 1, 1}

    Dim modes = From a In _
    (From n In numbers _
    Group n By n Into g = Count() _
    Select g, n) _
    Where a.g = _
    (From n In numbers _
    Group n By n Into g = Count() Select g).Max _
    Select a.n

    ' gives 4 and 1

    Regards,

    Andrew

  2.   DeborahK - May 14, 2010 @ 10:27 am

    Thanks, Andrew.

  3.   clint - October 3, 2010 @ 7:58 pm

    thanks for the codes

  4.   Shahar - April 17, 2014 @ 8:14 am

    I am pretty sure Mean and average mean different things. At least in statistics…

  5.   DeborahK - April 17, 2014 @ 6:47 pm

    Hi Shahar –

    Yes, check this out:
    http://www.differencebetween.net/science/difference-between-average-and-mean/

    And although Microsoft chose to name the LINQ method “Average”, the calculation is indeed the arithmetic mean.


© 2014 Deborah's Developer MindScape   Provided by WPMU DEV -The WordPress Experts   Hosted by Microsoft MVPs