One thing that I use a lot is sample data. Every article I write needs some data to explain the concepts, I need some data to see how it fits in my designs or even sample data for testing. This is a real trouble, as I must find some reliable data for my programs. Sometimes, I go to databases (Northwind and AdventureWorks are my good friends), sometimes, I use Json or XML data and other times I create the sample data by myself.

None of them are perfect, and it’s not consistent. Every time I get a new way of accessing data (yes, it can be good for learning purposes, but it’s a nightmare for maintenance). Then, looking around, I found Bogus (https://github.com/bchavez/Bogus), It’s a simple data generator for C#. All you have to do is create rules for your data and generate it. Simple as that! Then, you’ve got the data to use in your programs. It can be fixed (every time you run your program, you have the same data) or variable (every time you get a different set of data), and once you got it, you can serialize it to whichever data format you want: json files, databases, xml or plain text files.

Generating sample data

The first step to generate sample data is to create your classes. Create a new console app and add these two classes:

public class Customer
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Address { get; set; }
    public string City { get; set; }
    public string Country { get; set; }
    public string ZipCode { get; set; }
    public string Phone { get; set; }
    public string Email { get; set; }
    public string ContactName { get; set; }
    public IEnumerable<Order> Orders { get; set; }
}
public class Order
{
    public Guid Id { get; set; }
    public DateTime Date { get; set; }
    public Decimal OrderValue { get; set; }
    public bool Shipped { get; set; }
}

Once you’ve got the classes, you can add the repositories to get the sample data. To use the sample data generator, you must add the Bogus NuGet package to your project, with the command Install-Package Bogus, in the package manager console. Then we can add the repository class to retrieve the data. Add a new class to the project and name it SampleCustomerRepository. Then add this code in the class:

public IEnumerable<Customer> GetCustomers()
{
    Randomizer.Seed = new Random(123456);
    var ordergenerator = new Faker<Order>()
        .RuleFor(o => o.Id, Guid.NewGuid)
        .RuleFor(o => o.Date, f => f.Date.Past(3))
        .RuleFor(o => o.OrderValue, f => f.Finance.Amount(0, 10000))
        .RuleFor(o => o.Shipped, f => f.Random.Bool(0.9f));
    var customerGenerator = new Faker<Customer>()
        .RuleFor(c => c.Id, Guid.NewGuid())
        .RuleFor(c => c.Name, f => f.Company.CompanyName())
        .RuleFor(c => c.Address, f => f.Address.FullAddress())
        .RuleFor(c => c.City, f => f.Address.City())
        .RuleFor(c => c.Country, f => f.Address.Country())
        .RuleFor(c => c.ZipCode, f => f.Address.ZipCode())
        .RuleFor(c => c.Phone, f => f.Phone.PhoneNumber())
        .RuleFor(c => c.Email, f => f.Internet.Email())
        .RuleFor(c => c.ContactName, (f, c) => f.Name.FullName())
        .RuleFor(c => c.Orders, f => ordergenerator.Generate(f.Random.Number(10)).ToList());
    return customerGenerator.Generate(100);
}

In line 3, we have set the Randomizer.Seed to a fixed seed, so the data is always the same for all runs. If we don’t want it, we just don’t need to set it up. Then we set the rules for the order and customer generation. Then we call the Generate method to generate the sample data. Easy as that.

As you can see, the generator has a lot of classes to generate data. For example, the Company  class generates data for the company, like CompanyName. You can use this data as sample data for your programs. I can see some uses for it:

  • Test data for unit testing
  • Sample data for design purposes
  • Sample data for prototypes

but I’m sure you can find some more.

To use the data, you can add this code in the main program:

static void Main(string[] args)
{
    var repository = new SampleCustomerRepository();
    var customers = repository.GetCustomers();
    Console.WriteLine(JsonConvert.SerializeObject(customers, 
        Formatting.Indented));
}

We are serializing the data to Json, so you must add the Newtonsoft.Json Nuget package to your project. When you run this project, you will see something like this:

As you can see, it generated a whole set of customers with their orders, so you can use in your programs.

You may say that this is just dummy data, it will be cluttering your project, and it this data will remain unused when the program goes to production, and you’re right. But that’s not the way I would recommend to use this kind of data. It is disposable and will clutter the project. So, a better way would be create another project, a class library with the repository. That solves one problem, the maintenance one. When entering in production, you replace the sample dll with the real repository and you’re done. Or no? Not too fast. If you see the code in the main program, you will see that we are instantiating an instance of the SampleCustomerRepository, that won’t exist anymore. That way, you must also change the code for the main program. Not a good solution.

You can use conditional compilation to instantiate the repository you want, like this:

        static void Main(string[] args)
        {
#if SAMPLEREPO
            var repository = new SampleCustomerRepository();
#else
            var repository = new CustomerRepository();
#endif
            var customers = repository.GetCustomers();
            Console.WriteLine(JsonConvert.SerializeObject(customers, 
                Formatting.Indented));
        }

That’s better, but still not optimal: you need to compile the program twice: one time to use the sample data and the other one for going to production. With automated builds that may be less than a problem, but if you need to deploy by yourself, it can be a nightmare.

The solution? Dependency injection. Just create an interface for the repository, use a dependency injection framework (I like to use Unity or Ninject, but there are many others out there, just check this list). That way, the code will be completely detached from the data and you won’t need to recompile the project to use this or that data. Just add the correct dll and you are using the data you want. This is a nice approach, but the topic gives space for another post, just wait for it!

The code for this article is at https://github.com/bsonnino/SampleData

 

Leave a Reply

Your email address will not be published. Required fields are marked *

Post Navigation