One book that I recommend the reading is Clean Code, by Robert Martin. It is a well written book with wonderful techniques to create better code and improve your current programs, so they become easier to read, maintain and understand.

While going through it again, I found an excellent opportunity to improve my skills trying to do some refactoring: in listing 4.7 there is a prime generator function that he uses to show some refactoring concepts and turn int listing 4.8. I then thought do do the same and show my results here.

We can start with the listing converted to C#. This is a very easy task. The original program is written in  Java, but converting it to C# is just a matter of one or two small fixes:

using System;

namespace PrimeNumbers
{
/**
* This class Generates prime numbers up to a user specified
* maximum. The algorithm used is the Sieve of Eratosthenes.
* <p>
* Eratosthenes of Cyrene, b. c. 276 BC, Cyrene, Libya --
* d. c. 194, Alexandria. The first man to calculate the
* circumference of the Earth. Also known for working on
* calendars with leap years and ran the library at Alexandria.
* <p>
* The algorithm is quite simple. Given an array of integers
* starting at 2. Cross out all multiples of 2. Find the next
* uncrossed integer, and cross out all of its multiples.
* Repeat untilyou have passed the square root of the maximum
* value.
*
* @author Alphonse
* @version 13 Feb 2002 atp
*/
    public class GeneratePrimes
    {
        /**
        * @param maxValue is the generation limit.
*/
        public static int[] generatePrimes(int maxValue)
        {
            if (maxValue >= 2) // the only valid case
            {
                // declarations
                int s = maxValue + 1; // size of array
                bool[] f = new bool[s];
                int i;

                // initialize array to true.
                for (i = 0; i < s; i++)
                    f[i] = true;
                // get rid of known non-primes
                f[0] = f[1] = false;
                // sieve
                int j;
                for (i = 2; i < Math.Sqrt(s) + 1; i++)
                {
                    if (f[i]) // if i is uncrossed, cross its multiples.
                    {
                        for (j = 2 * i; j < s; j += i)
                            f[j] = false; // multiple is not prime
                    }
                }
                // how many primes are there?
                int count = 0;
                for (i = 0; i < s; i++)
                {
                    if (f[i])
                        count++; // bump count.
                }
                int[] primes = new int[count];
                // move the primes into the result
                for (i = 0, j = 0; i < s; i++)
                {
                    if (f[i]) // if prime
                        primes[j++] = i;
                }
                return primes; // return the primes
            }
            else // maxValue < 2
                return new int[0]; // return null array if bad input.
        }
    }
}

The first step is to put in place some tests, so we can be sure that we are not breaking anything while refactoring the code. In the solution, I added a new Class Library project, named it GeneratePrimes.Tests and added the packages NUnit, NUnit3TestAdapter and FluentAssertions to get fluent assertions in a NUnit test project. Then I added these tests:

using NUnit.Framework;
using FluentAssertions;

namespace PrimeNumbers.Tests
{
    [TestFixture]
    public class GeneratePrimesTests
    {
        [Test]
        public void GeneratePrimes0ReturnsEmptyArray()
        {
            var actual = GeneratePrimes.generatePrimes(0);
            actual.Should().BeEmpty();
        }

        [Test]
        public void GeneratePrimes1ReturnsEmptyArray()
        {
            var actual = GeneratePrimes.generatePrimes(1);
            actual.Should().BeEmpty();
        }

        [Test]
        public void GeneratePrimes2ReturnsArrayWith2()
        {
            var actual = GeneratePrimes.generatePrimes(2);
            actual.Should().BeEquivalentTo(new[] { 2 });
        }

        [Test]
        public void GeneratePrimes10ReturnsArray()
        {
            var actual = GeneratePrimes.generatePrimes(10);
            actual.Should().BeEquivalentTo(new[] { 2,3,5,7 });
        }

        [Test]
        public void GeneratePrimes10000ReturnsArray()
        {
            var actual = GeneratePrimes.generatePrimes(10000);
            actual.Should().HaveCount(1229).And.EndWith(9973);
        }
    }
}

These tests check that there are no primes for 0 and 1, one prime for 2, the primes for 10 are 2, 3, 5, 7 and that there are 1229 primes less than 10,000 and the largest one is 9973. Once we run the tests, we can see that the pass and we can start doing our changes.

The easiest fix we can do is to revise the comments at the beginning. We don’t need the history of Erasthotenes (you can go to Wikipedia for that). We don’t need the author and version, thanks to source control technology :-). We don’t need either the initial comment:

/**
    * This class Generates prime numbers up to a user specified
    * maximum. The algorithm used is the Sieve of Eratosthenes.
    *  https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes   
*/
public class GeneratePrimes
{
    public static int[] generatePrimes(int maxValue)

Then we can invert the initial test, to reduce nesting. If we hover the mouse in the line of the first if, an arrow appears at the border, indicating a quick fix:

We can do the quick fix, then eliminate the else clause (don’t forget to remove the extra comments that are not needed):

public static int[] generatePrimes(int maxValue)
{
    if (maxValue < 2) 
        return new int[0]; 

    // declarations
    int s = maxValue + 1; // size of array
    bool[] f = new bool[s];
    int i;

Save the code and check that all tests pass. The next step is to rename the variables:

  • s can be renamed to sizeOfArray
  • f can be renamed as isPrimeArray

Go to the declaration of s and press Ctrl-R-R to rename and rename it to sizeOfArray. Do the same with the f variable. Don’t forget to remove the comments (and to run the tests):

int sizeOfArray = maxValue + 1; 
bool[] isPrimeArray = new bool[sizeOfArray];
int i;

To go to the next refactorings, we can use the comments as indicators for extracting methods. We can extract the InitializeArray method:

The extracted code isn’t what I expected, so I change it to:

private static bool[] InitializeArray(int sizeOfArray)
{
    bool[] isPrimeArray = new bool[sizeOfArray];
    // initialize array to true.
    for (var i = 0; i < sizeOfArray; i++)
        isPrimeArray[i] = true;
    return isPrimeArray;
}

I can use the code like this:

var isPrimeArray = InitializeArray(sizeOfArray);

After passing the tests, I can refactor the code of InitializeArray to:

private static bool[] InitializeArray(int sizeOfArray)
{
    return Enumerable
        .Range(0, sizeOfArray)
        .Select(n => true)
        .ToArray();
}

The next step is the sieve:

The code for the sieve is really bad:

private static void Sieve(int sizeOfArray, bool[] isPrimeArray, 
    out int i, out int j)
{
    // get rid of known non-primes
    isPrimeArray[0] = isPrimeArray[1] = false;
    for (i = 2; i < Math.Sqrt(sizeOfArray) + 1; i++)
    {
        if (isPrimeArray[i]) // if i is uncrossed, cross its multiples.
        {
            for (j = 2 * i; j < sizeOfArray; j += i)
                isPrimeArray[j] = false; // multiple is not prime
        }
    }
}

It has two out parameters (which, for me, is a code smell), and has an error (the out parameter j must be assigned) before exiting the method. So we can change it to remove the out parameters and remove the sizeOfArray parameter:

private static void Sieve(bool[] isPrimeArray)
{
    var sizeOfArray = isPrimeArray.Length;

    isPrimeArray[0] = isPrimeArray[1] = false;

    for (int i = 2; i < Math.Sqrt(sizeOfArray) + 1; i++)
    {
        if (isPrimeArray[i]) // if i is uncrossed, cross its multiples.
        {
            for (int j = 2 * i; j < sizeOfArray; j += i)
                isPrimeArray[j] = false; 
        }
    }

Then, we can extract the method to count primes:

CountPrimes has the same flaws as Sieve, so we change it to:

private static int CountPrimes(bool[] isPrimeArray)
{
    var sizeOfArray = isPrimeArray.Length;
    var count = 0;
    for (var i = 0; i < sizeOfArray; i++)
    {
        if (isPrimeArray[i])
            count++; 
    }
    return count;
}

We can refactor it to:

private static int CountPrimes(bool[] isPrimeArray) => 
    isPrimeArray.Count(i => i);

The next step is MovePrimes:

After we tweak the MovePrimes code, we get:

private static int[] MovePrimes(bool[] isPrimeArray, int count)
{
    var sizeOfArray = isPrimeArray.Length;
    var primes = new int[count];
    for (int i = 0, j = 0; i < sizeOfArray; i++)
    {
        if (isPrimeArray[i]) // if prime
            primes[j++] = i;
    }
    return primes;
}

Then we can refactor MovePrimes:

 private static int[] MovePrimes(bool[] isPrimeArray, int count) =>
     isPrimeArray
         .Select((p, i) => new { Index = i, IsPrime = p })
         .Where(v => v.IsPrime)
         .Select(v => v.Index)
         .ToArray();

Notice that we aren’t using the primes count in this case, so we can remove the calculation of the count and the parameter. After some cleaning and name changing, we get:

public static int[] GetPrimes(int maxValue)
{
    if (maxValue < 2)
        return new int[0];

    bool[] isPrimeArray = InitializeArray(maxValue);
    Sieve(isPrimeArray);
    return MovePrimes(isPrimeArray);
}

Much cleaner, no? Now, it’s easier to read the method, the details are hidden, but the code still runs the same way. We have a more maintainable method, and it shows clearly what it does.

But there is a change we can do here: we are using static methods only. We can then use extension methods and add the keyword this to allow the methods to be used as extension methods. For example, if we change MovePrimes and Sieve to:

private static int[] MovePrimes(this bool[] isPrimeArray) =>
    isPrimeArray
        .Select((p, i) => new { Index = i, IsPrime = p })
        .Where(v => v.IsPrime)
        .Select(v => v.Index)
        .ToArray();

private static bool[] Sieve(this bool[] isPrimeArray)
{
    var sizeOfArray = isPrimeArray.Length;

    isPrimeArray[0] = isPrimeArray[1] = false;

    for (int i = 2; i < Math.Sqrt(sizeOfArray) + 1; i++)
    {
        if (isPrimeArray[i]) // if i is uncrossed, cross its multiples.
        {
            for (int j = 2 * i; j < sizeOfArray; j += i)
                isPrimeArray[j] = false;
        }
    }
    return isPrimeArray;

We can have the GetPrimes method to be changed to:

public static int[] PrimesSmallerOrEqual(this int maxValue)
{
    if (maxValue < 2)
        return new int[0];

    return maxValue.InitializeArray()
        .Sieve()
        .MovePrimes();
}

Cool, no? With this change, the tests become:

public class GeneratePrimesTests
{
    [Test]
    public void GeneratePrimes0ReturnsEmptyArray()
    {
        0.PrimesSmallerOrEqual().Should().BeEmpty();
    }

    [Test]
    public void GeneratePrimes1ReturnsEmptyArray()
    {
        1.PrimesSmallerOrEqual().Should().BeEmpty();
    }

    [Test]
    public void GeneratePrimes2ReturnsArrayWith2()
    {
        2.PrimesSmallerOrEqual()
            .Should().BeEquivalentTo(new[] { 2 });
    }

    [Test]
    public void GeneratePrimes10ReturnsArray()
    {
        10.PrimesSmallerOrEqual()
            .Should().BeEquivalentTo(new[] { 2, 3, 5, 7 });
    }

    [Test]
    public void GeneratePrimes10000ReturnsArray()
    {
        10000.PrimesSmallerOrEqual()
            .Should().HaveCount(1229).And.EndWith(9973);
    }
}

The full code is at https://github.com/bsonnino/PrimeNumbers. Each commit there is a phase of the refactoring.

Recently, Microsoft introduced the new version of its test framework, MS-Test 2. With this new version, they introduced a new feature that I was waiting for a long time: parametrized tests (yes, NUnit and XUnit have had this for a long time, I know).

And what are parametrized tests? Let me show you with an example. Let’s say we have this routine to return Fibonacci numbers (source: https://www.dotnetperls.com/fibonacci)

 

public static int Fibonacci(int n)
{
    int a = 0;
    int b = 1;
    // In N steps compute Fibonacci sequence iteratively.
    for (int i = 0; i &lt; n; i++)
    {
        int temp = a;
        a = b;
        b = temp + b;
    }
    return a;
}

And we want to test it. We would like to test it with the numbers 0, 1, 2 and 80 (the first two are special cases, the third is a normal case and 80 is a large number to be sure that the routine works with large numbers). We should create a test like this:

[TestMethod]
public void Given0FibonacciReturns0()
{
   var fib = new Fib();
    var actual = fib.Fibonacci(0);
    Assert.AreEqual(0,actual);
}

This is not a bad test, but we must copy and paste to test the other results. You may argue that we could create a test like this one:

[TestMethod]
public void GivenDataFibonacciReturnsResultsOk()
{
    var numbers = new[] { 0, 1, 2, 80 };
    var results = new[] { 0L, 1L, 1L, 23416728348467685L };
    var fib = new Fib();
    for (int i = 0; i &lt; numbers.Length; i++)
    {
        var actual = fib.Fibonacci(numbers[i]);
        Assert.AreEqual(results[i], actual);
    }
}

But this has some problems:

  • If a test fails, it’s difficult to know which number failed
  • If one number fails, the next ones are not tested
  • You don’t have a clear view of what is being tested

MS-Test has had for a long time Data Driven tests (https://msdn.microsoft.com/en-us/library/ms182527.aspx), but this is very cumbersome. You must create a data file, assign it to the test and run the test using the TestContext. It’s too much work for just four tests, no?

Then it comes MS-Test 2. With it, you can create a DataTestMethod, with DataRows for each test. Let’s see how do you create a test with this new feature.

Creating Parametrized tests with MS-Test 2

In Visual Studio, create a new Console Project. In this project, create a new class and name it Fib.cs. Add this code to the class:

 public class Fib
 {
     public int Fibonacci(int n)
     {
         int a = 0;
         int b = 1;
         // In N steps compute Fibonacci sequence iteratively.
         for (int i = 0; i &lt; n; i++)
         {
             int temp = a;
             a = b;
             b = temp + b;
         }
         return a;
     }
 }

Then, in the solution, add a new Class Library project. Right click the References node in the Solution Explorer and add a reference to the console project. Then right click in the References node again and select “Manage NuGet packages”. Add the packages MsTest.TestAdapter and MsTest.TestFramework.

image

With that, you have a test project with MS-Test 2. If you are using Visual Studio 2017, the Test Project template already includes these two packages, but you must update them to the latest version, as the parametrized tests didn’t run well with the default packages.

Then, we can create our test:

[TestClass]
public class FibonacciTests
{
    [DataRow(0, 0)]
    [DataRow(1, 1)]
    [DataRow(2, 1)]
    [DataRow(80, 23416728348467685)]
    [DataTestMethod]
    public void GivenDataFibonacciReturnsResultsOk(int number, Int64 result)
    {
        var fib = new Fib();
        var actual = fib.Fibonacci(number);
        Assert.AreEqual(result, actual);
    }
}

The test is very similar to the ones we are used to create, it just has some differences:

 

  • Instead of the TestMethod attribute, it is decorated with the DataTestMethod attribute
  • The method receives two parameters
  • Each test has a DataRow attribute associated to it.

 

 

If we run this test, we get these results:

image

As you can see, we have three tests that passed and one that failed. We didn’t take in account in our routine that the results could be very large and overflow. So, we must change the routine to take this in account:

public Int64 Fibonacci(int n)
{
    Int64 a = 0;
    Int64 b = 1;
    // In N steps compute Fibonacci sequence iteratively.
    for (int i = 0; i &lt; n; i++)
    {
        Int64 temp = a;
        a = b;
        b = temp + b;
    }
    return a;
}

Now, when you run the tests, you get this:

image

All tests are passing, and we can have a clear view of which tests were run, without the need of extra files or any other tricks. Cool, no? This was a very welcome addition to MS-Test and can improve a lot our testing.

The source code for this article is in https://github.com/bsonnino/Fibonacci