Enhanced enums in C#


This will be an evolving post, hopefully. (If no-one comments on it, it probably
won’t change unless I come up with better ideas myself.) Since working on a Java
project last year, I’ve been increasingly fed up with C#’s enums. They’re really
not very object oriented: they’re not type-safe (you can cast from one enum to
another via a cast to their common underlying type), they don’t allow any
behaviour to be specified, etc. They’re just named constant integral values.
Until I played with Java 1.5’s enum support, that wouldn’t have struck me as being
a problem, but (at least in some cases) enums can give you so much more. This post
is a feature proposal for C# 4.0. (I suspect the lid for this kind of thing is closed
on 3.0.)


What’s the basic idea?



Instead of being a fixed set of integral values, imagine if enums were a fixed set of
objects. They could have behaviour (and state of sorts), just like other objects – the
only difference would be that there’d only ever be one instance for each value. When
I say they could have “state of sorts” I mean that two values of an enum could differ
in just what they represented. For instance, imagine an enumeration of coins – I’ll use
US coins for convenience for most readers. Each value in the enum would have a name
(Cent, Nickel, Dime, Quarter, Dollar) and a monetary value in cents (1, 5, 10, 25, 100
respectively). Each might have a colour property too, or the metal they’re made of. That’s
the kind of state I mean. In fact, there’s nothing in the proposal below to say that the
state within an enum has to stay the same. I’d recommend that it did stay the same,
but maybe someone’s got an interesting use case where mutable enums would be useful.
(Actually, it’s not terribly hard to think of an example where the underlying state mutates
even if it doesn’t appear to from a public property point of view – some properties could
be lazily initialised if they might take time to compute.)



As well as properties, the enum type could have methods, just as other types do. For instance,
an enumeration of available encryption types may have methods to encrypt data. (I’m giving
fairly general examples here – in my experience, the actual enums you might use tend to be
very domain-specific, and as such don’t make good examples. I’m also trying to steer well
clear of risking giving away any intellectual property owned by my employer.)



Now, consider the possibilities available when you bring polymorphism into the picture. Not
every implementation of a method has to be the same. Some enum values may be instances of
a type derived from the top-most one. This would be limited at compile-time to make enums
fixed – you couldn’t derive from an enum type in the normal way, so you’d
always know that if you had a reference to an instance of the enum type, you’d got one of
the values specified by the enum.


What’s the syntax?



I propose a syntax which for the simplest of cases looks very much like normal enumerations,
just with class enum instead of enum:


public class enum Example
{
    FirstValue,
    SecondValue;
}


Note the semi-colon at the end of the list of values. This could perhaps be optional,
but when using the power of “class enums” (as I’ll call them for now) in a non-trivial way,
you’d need it anyway to tell the compiler you’d reached the end of the list of values,
and that other type members were on the way. The next step is to introduce a constructor and
a property:


public class enum Coins
{
    Cent(1),
    Nickel(5),
    Dime(10),
    Quarter(25),
    Dollar(100);
    
    // Instance variable representing the monetary value
    readonly int valueInCents;
    
    // Constructor - all enum constructors are private
    // or protected
    Coins(int valueInCents)
    {
        this.valueInCents = valueInCents;
    }
    
    public int ValueInCents
    {
        get { return valueInCents; }
    }
}


Now, there would actually be some significant compiler magic going on at this point. Each of
the generated constructor calls would actually have an extra parameter – the name of the value. That
parameter would be present in every generated constructor, and if no constructors were declared, a protected
constructor would be generated which took just that parameter. Each constructor would implicitly
call the base constructor which would stash the name away and make it available through a Name
property (and no doubt through calls to ToString() too). What’s the base class in this case?
System.ClassEnum or some such. This could be an ordinary type as far as the CLR is concerned,
although language compilers would be as well to prevent direct derivation from it. This leaves room for
some compilers to allow types which aren’t really enums to derive from it, but does have the advantage
of not requiring a CLR change. Whenever you use someone else’s code you’re always taking a certain amount on
trust anyway, so arguably it’s not an awful risk. More about the services of System.ClassEnum
later…
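
To make the “compiler magic” a bit more concrete, here is a rough sketch of what the Coins declaration might expand to in today’s C#. The ClassEnum type below is a stand-in I’ve made up for the proposed System.ClassEnum, and the Values list is an assumption borrowed from Java’s values() method rather than part of the proposal, so treat this as an illustration and not as the real generated code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the proposed System.ClassEnum base type.
public abstract class ClassEnum
{
    private readonly string name;

    // Every generated constructor would pass the value's name up to here.
    protected ClassEnum(string name)
    {
        this.name = name;
    }

    public string Name
    {
        get { return name; }
    }

    public override string ToString()
    {
        return name;
    }
}

// Roughly what "public class enum Coins { Cent(1), ... }" might expand to.
public sealed class Coins : ClassEnum
{
    public static readonly Coins Cent = new Coins("Cent", 1);
    public static readonly Coins Nickel = new Coins("Nickel", 5);
    public static readonly Coins Dime = new Coins("Dime", 10);
    public static readonly Coins Quarter = new Coins("Quarter", 25);
    public static readonly Coins Dollar = new Coins("Dollar", 100);

    // Assumption: a list of all values in declaration order, like Java's values().
    public static readonly IList<Coins> Values =
        Array.AsReadOnly(new[] { Cent, Nickel, Dime, Quarter, Dollar });

    private readonly int valueInCents;

    // The hidden "name" parameter comes first; the declared parameter follows.
    private Coins(string name, int valueInCents) : base(name)
    {
        this.valueInCents = valueInCents;
    }

    public int ValueInCents
    {
        get { return valueInCents; }
    }
}
```

With this in place, Coins.Quarter.ValueInCents is 25 and Coins.Dime.ToString() returns "Dime". Because the constructor is private and the class is sealed, nothing outside Coins can create further values.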



The next piece of functionality is to have some members with overridden behaviour. The canonical example
of this is simple arithmetic operations – addition, subtraction, division and multiplication. The enumeration
contains a value for each operation, and has an Eval method to perform the operation. Here’s
what it would look like in C#:


public class enum ArithmeticOperation
{
    Addition
    {
        public override int Eval(int x, int y)
        {
            return x+y;
        }
    },
    
    Subtraction
    {
        public override int Eval(int x, int y)
        {
            return x-y;
        }
    },
    
    Multiplication
    {
        public override int Eval(int x, int y)
        {
            return x*y;
        }
    },
    
    Division
    {
        public override int Eval(int x, int y)
        {
            return x/y;
        }
    };
    
    public abstract int Eval(int x, int y);
}


Sometimes, you may wish to save a bit of space and specify the implementation of
a method as a delegate – especially if the method would otherwise be abstract (i.e.
there was no “common” implementation which most values would use). Here, C#’s
anonymous method syntax helps:


public class enum ArithmeticOperation
{    
    delegate int Int32Operation (int x, int y);
    
    Addition (delegate (int x, int y) { return x+y; }),
    Subtraction (delegate (int x, int y) { return x-y; }),
    Multiplication (delegate (int x, int y) { return x*y; }),
    Division (delegate (int x, int y) { return x/y; });
        
    Int32Operation op;
    
    ArithmeticOperation (Int32Operation op)
    {
        this.op = op;
    }
    
    public int Eval (int x, int y)
    {
        return op(x, y);
    }
}


That’s still a bit clumsy, of course – let’s try with lambda expression syntax instead:


public class enum ArithmeticOperation
{    
    Addition ( (x,y) => x+y),
    Subtraction ( (x,y) => x-y),
    Multiplication ( (x,y) => x*y),
    Division ( (x,y) => x/y);
        
    Func<int, int, int> op;
    
    ArithmeticOperation (Func<int, int, int> op)
    {
        this.op = op;
    }
    
    public int Eval (int x, int y)
    {
        return op(x, y);
    }
}


Now we’re really cooking! Of course, some of the time you’ll be able to provide a single implementation for most values,
which only some values will want to override. Something like:


public class enum InputType
{
    Integer,
    String,
    Date
    {
        // Default implementation isn't quite good enough for us
        public override string Format(object o)
        {
            return ((DateTime)o).ToString("yyyyMMdd");
        }
    };
    
    // Default implementation of formatting
    public virtual string Format(object o)
    {
        return o.ToString();
    }
}


So far, this is all quite similar to Java’s enums in appearance. Java’s enums also come with an ordinal
(the position of declaration within the enum) automatically, but in my experience this is as much of a
pain as it is a blessing. In particular, as that ordinal can’t be specified in the source code, if you
have other code relying on specific values (e.g. to pass data across a web-service) you have to leave
bogus values in the list in order to keep the ordinals of the later values the same. Java also only
allows you to specify constructors in the “top-most” enum. This can occasionally be a nuisance. Let’s extend
the enum above to include DateTime and Time – both of which need the same kind of
“special” formatting. In Java, you’d have to override Format in three different places, like
this:


public class enum InputType
{
    Integer,
    String,
    DateTime
    {
        public override string Format(object o)
        {
            return ((System.DateTime)o).ToString("yyyyMMdd HH:mm:ss");
        }
    },
    Date
    {
        public override string Format(object o)
        {
            return ((System.DateTime)o).ToString("yyyyMMdd");
        }
    },
    Time
    {
        public override string Format(object o)
        {
            return ((System.DateTime)o).ToString("HH:mm:ss");
        }
    };
    
    public virtual string Format(object o)
    {
        return o.ToString();
    }
}


I would propose that enum values could reuse each other’s implementations, possibly
parameterising them via constructors. The above could be rewritten as:


public class enum InputType
{
    Integer,
    String,
    DateTime("yyyyMMdd HH:mm:ss")
    {
        string formatSpecifier;
        
        protected DateTime(string formatSpecifier)
        {
            this.formatSpecifier = formatSpecifier;
        }
        
        public override string Format(object o)
        {
            return ((System.DateTime)o).ToString(formatSpecifier);
        }
    },
    Date : DateTime("yyyyMMdd"),
    Time : DateTime("HH:mm:ss");
    
    public virtual string Format(object o)
    {
        return o.ToString();
    }
}

If Date wanted to further specialise the class (e.g. if another method needed overriding), it could add implementation there too. Note that the DateTime constructor does not explicitly call any constructor. In this case, an implicit call to the InputType constructor which took only the name parameter would be made. Explicit calls to base class constructors could be included in the normal way – the extra parameter would be entirely hidden from the source code. Only protected constructors could be called by derived types in the normal way.
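
For comparison, here is a minimal sketch of how the same sharing can be approximated in today’s C# with a private nested class. The name DateTimeBased is purely illustrative, the explicit name parameter stands in for the one the compiler would supply, and I’ve passed CultureInfo.InvariantCulture (which the proposal code above doesn’t do) so that the ':' separators don’t vary by locale.

```csharp
using System;
using System.Globalization;

public class InputType
{
    public static readonly InputType Integer = new InputType("Integer");
    public static readonly InputType String = new InputType("String");
    public static readonly InputType DateTime = new DateTimeBased("DateTime", "yyyyMMdd HH:mm:ss");
    public static readonly InputType Date = new DateTimeBased("Date", "yyyyMMdd");
    public static readonly InputType Time = new DateTimeBased("Time", "HH:mm:ss");

    private readonly string name;

    // Private, so only InputType and its nested types can create values.
    private InputType(string name)
    {
        this.name = name;
    }

    public string Name
    {
        get { return name; }
    }

    // Default implementation of formatting.
    public virtual string Format(object o)
    {
        return o.ToString();
    }

    // Shared implementation behind the DateTime, Date and Time values.
    private sealed class DateTimeBased : InputType
    {
        private readonly string formatSpecifier;

        public DateTimeBased(string name, string formatSpecifier) : base(name)
        {
            this.formatSpecifier = formatSpecifier;
        }

        public override string Format(object o)
        {
            // Invariant culture keeps the output independent of the current locale.
            return ((System.DateTime)o).ToString(formatSpecifier, CultureInfo.InvariantCulture);
        }
    }
}
```

Calling InputType.Date.Format(new System.DateTime(2006, 1, 5)) gives "20060105", while InputType.Integer.Format(42) falls back to the default and gives "42". It works, but every value needs its name spelled out twice, which is exactly the sort of ceremony the proposed syntax would remove.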


Switch


Switch statements would appear in exactly the same way they do now, possibly without the qualification (e.g. case Time: instead of case InputType.Time:). The code is more readable without the qualification, and is unambiguous, but it would be inconsistent with the current handling of switch cases for “normal” enums. The implementation would work a lot like switching on strings – either using the equivalent of a sequence of if statements or building a map behind the scenes. This is where keeping something like Java’s ordinals would speed things up, but then I would at least want something like an attribute to be able to specify the value in source code, to avoid the problems with Java’s ordinals described earlier. Note that only reference equality needs to be checked in any of these cases, as only one instance of any value would be created.
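
The two implementation strategies could look roughly like the sketch below. Direction and SwitchLowering are names I’ve invented for illustration, and this is only a guess at what a compiler might emit for a switch over a class enum, not a description of anything a real compiler does.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for a class enum: a fixed set of singleton instances.
public sealed class Direction
{
    public static readonly Direction North = new Direction("North");
    public static readonly Direction South = new Direction("South");
    public static readonly Direction East = new Direction("East");

    private readonly string name;

    private Direction(string name)
    {
        this.name = name;
    }

    public string Name
    {
        get { return name; }
    }
}

public static class SwitchLowering
{
    // Strategy 1: a chain of reference comparisons, one per case label.
    public static string DescribeWithIfChain(Direction d)
    {
        if (ReferenceEquals(d, Direction.North)) return "up";
        if (ReferenceEquals(d, Direction.South)) return "down";
        return "sideways"; // the "default:" branch
    }

    // Strategy 2: map each value to a case index once, then switch on the int.
    // Direction does not override Equals, so the dictionary compares by reference.
    private static readonly Dictionary<Direction, int> CaseIndex =
        new Dictionary<Direction, int>
        {
            { Direction.North, 0 },
            { Direction.South, 1 }
        };

    public static string DescribeWithMap(Direction d)
    {
        int index;
        // TryGetValue rejects null keys, so null goes straight to "default:".
        if (d == null || !CaseIndex.TryGetValue(d, out index))
        {
            index = -1;
        }
        switch (index)
        {
            case 0: return "up";
            case 1: return "down";
            default: return "sideways";
        }
    }
}
```

Both methods return the same answers for every input. The map version pays for a hash lookup but stays roughly constant as the number of cases grows, which is where a compiler-assigned ordinal would be cheaper still.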


Static field initializers and static constructors


Java has restrictions on where you can use static fields in enums, because the static field initializers are executed after the enum values have been created. Static fields are useful in a surprising number of circumstances, mostly getting at an enum value dynamically by something other than name. The rules for this would need to be considered carefully – sometimes it’s useful to have code which will be run after the enums have all been set up; other times you want it beforehand (so you can use the static fields during initialization, e.g. to add a value to a map).
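
For what it’s worth, C# classes already run static field initializers in textual order, so the hand-written version of the pattern gets the “beforehand” behaviour just by declaring the map first. Here is a small sketch of a lookup by something other than name; the byCents map and the FromCents method are made up for this example.

```csharp
using System;
using System.Collections.Generic;

public sealed class Coins
{
    // Declared BEFORE the values, so it already exists when the constructors
    // below run. If it were declared after them it would still be null during
    // construction, and the first constructor call would fail with a
    // NullReferenceException (surfacing as a TypeInitializationException).
    private static readonly Dictionary<int, Coins> byCents =
        new Dictionary<int, Coins>();

    public static readonly Coins Cent = new Coins("Cent", 1);
    public static readonly Coins Nickel = new Coins("Nickel", 5);
    public static readonly Coins Dime = new Coins("Dime", 10);
    public static readonly Coins Quarter = new Coins("Quarter", 25);
    public static readonly Coins Dollar = new Coins("Dollar", 100);

    private readonly string name;
    private readonly int valueInCents;

    private Coins(string name, int valueInCents)
    {
        this.name = name;
        this.valueInCents = valueInCents;
        // Uses a static field during initialization of the values.
        byCents.Add(valueInCents, this);
    }

    public string Name
    {
        get { return name; }
    }

    public int ValueInCents
    {
        get { return valueInCents; }
    }

    // Returns the coin worth exactly the given amount, or null if there is none.
    public static Coins FromCents(int cents)
    {
        Coins coin;
        return byCents.TryGetValue(cents, out coin) ? coin : null;
    }
}
```

Coins.FromCents(25) returns the same instance as Coins.Quarter, and Coins.FromCents(3) returns null. With class enums the values would always come first in the source, so the language would need some other way of saying “initialize this field before the values”.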


Other features


Like Java, there could be an EnumSet<T> type which would be the equivalent of using FlagsAttribute on normal enums. Indeed, the compiler could even generate operator overloads to make this look nicer.
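
As a very rough sketch of the idea, an EnumSet<T> could be little more than a wrapper around HashSet<T> with a couple of operators. Everything here (the Of factory method, the operators on the set type, the Permission class) is my own assumption about how it might look; Java’s EnumSet is implemented quite differently, using the ordinals as bit positions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical set type for class-enum values, compared by reference.
public sealed class EnumSet<T> where T : class
{
    private readonly HashSet<T> items;

    private EnumSet(IEnumerable<T> items)
    {
        this.items = new HashSet<T>(items);
    }

    public static EnumSet<T> Of(params T[] values)
    {
        return new EnumSet<T>(values);
    }

    public int Count
    {
        get { return items.Count; }
    }

    public bool Contains(T value)
    {
        return items.Contains(value);
    }

    // Union, mirroring "|" on a [Flags] enum.
    public static EnumSet<T> operator |(EnumSet<T> left, EnumSet<T> right)
    {
        HashSet<T> union = new HashSet<T>(left.items);
        union.UnionWith(right.items);
        return new EnumSet<T>(union);
    }

    // Intersection, mirroring "&" on a [Flags] enum.
    public static EnumSet<T> operator &(EnumSet<T> left, EnumSet<T> right)
    {
        HashSet<T> common = new HashSet<T>(left.items);
        common.IntersectWith(right.items);
        return new EnumSet<T>(common);
    }
}

// Illustrative stand-in for a class enum.
public sealed class Permission
{
    public static readonly Permission Read = new Permission("Read");
    public static readonly Permission Write = new Permission("Write");
    public static readonly Permission Execute = new Permission("Execute");

    private readonly string name;

    private Permission(string name)
    {
        this.name = name;
    }

    public override string ToString()
    {
        return name;
    }
}
```

With these definitions, EnumSet<Permission>.Of(Permission.Read) | EnumSet<Permission>.Of(Permission.Write) contains Read and Write but not Execute. Compiler-generated operators on the enum type itself could shorten that to something like Permission.Read | Permission.Write.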


In Java, some enums with many values overriding many methods can end up being pretty large. In C#, of course, we can use partial types to split the whole enum definition over several files. (Some may object to this, others not – it would be available if you wanted it.)


Open questions


  • The potential abuse problem mentioned earlier
  • Serialization/deserialization would need to know to use the previously set up values
  • Should identifying values (like Java ordinals) be present, if only for switch performance?

I suspect there are other things I haven’t thought of, but with any luck this will be food for thought.

19 thoughts on “Enhanced enums in C#”

  1. Great idea :)

    Though I’d already settle for inheritance support and type safety. I use enums a lot in my framework and I always have to re-define enums for sub-types while they would be better suited with a derived enum from the enum of their super-type.

    So that you could have:
    enum EmployeeFieldIndex
    {
        EmployeeId,
        Name,
        StartDate
    }

    and then manager:
    enum ManagerFieldIndex : EmployeeFieldIndex
    {
        ManagesDepartment
    }

    Now I have to re-define the EmployeeFieldIndex values in the manager’s field index enum. And besides that, the lack of type safety is indeed weird, as in all other parts of the .NET framework type safety is THE first-class citizen. (I mean: there’s no co-variance in generics because it ‘could’ be type unsafe. Go figure.)

  2. Inheritance is interesting, because it goes against the idea of an enum in terms of a fixed/limited set of values. It’s sort of like having inheritance in the singleton pattern – there are times when the result is useful, but it’s not really a singleton any more.

    Does that make any sense?

  3. What does all this do better than what is currently possible with static classes?

    public abstract class ArithmeticOperation
    {
        private ArithmeticOperation()
        {
        }

        public abstract int Eval(int x, int y);

        public static readonly ArithmeticOperation Addition = new Add();

        private class Add : ArithmeticOperation
        {
            public override int Eval(int x, int y)
            {
                return x + y;
            }
        }

        public static readonly ArithmeticOperation Multiplication = new Multiply();

        // …
    }

    The only drawback here is that these are not usable in a switch statement (“A value of an integral type expected”).

    But perhaps we should ask Ms to loosen this restriction on switch statements, instead of adding yet another construct to the language.

    Geert

  4. 1) Switch statement support
    2) There’s a well-defined list of possible values
    3) Easy to tie all the similar things together
    4) Having language support makes it easy to use the design pattern.

    Number 4 is very important to me. I’ve spent the last year coming to the conclusion that software engineering can only be done “right” if the “right” way of achieving something is also the easiest (even in the short term) way of achieving it.

    It’s a useful pattern, but having to specifically ensure that nothing else can construct instances, and making those instances available universally, and getting switch equivalent possibilities is too much – so some of them tend to get sacrificed as soon as something works.

    With language support, the easiest way of following the broad pattern becomes the elegant way too.

  5. 1) Fully agreed, and I wish Ms would remove this restriction from switch cases.
    2) This is equally true for static classes with static fields. There’s no possible way for someone to create new values outside the static class (except perhaps via codedom)
    3) Same as 2
    4) Yes, but this has a cost too. Each new keyword steepens the learning curve, and must thus add sufficient value to warrant its existence. IMHO, this is not the case here.

    Please convince me otherwise.

  6. 2/3 – yes, you’re right. I don’t think I’d looked at your code carefully enough.

    4 – well, I can only go by experience. I don’t think I’d have written the code you’ve got there in the Java project I’ve just been working on, but as enums were simply available, we have several of them. It’s effectively suggesting a design pattern and making it simple to implement correctly. Experience of what happens when such a pattern is available is the only evidence I’ve got, unfortunately :(

    (Note that it doesn’t actually add any new keywords – I was quite careful about that. It doesn’t break any old code.)

  7. Let’s not forget the wonderful Java enum addition of values() which lets you easily enumerate over all the items in the enum list.

  8. Has anyone written a class that is a workaround until the language includes this functionality? I would think it would be fairly easy to write an EnumType class that would provide for much of the functionality that is desired.

  9. I’m not sure how feasible it is to write a workaround for this, in the same way that one can’t really write a “general” singleton implementation to derive from. However, it’s an interesting challenge – I’ll have a go :)

  10. Here is what I came up with so far. The main problem I found was getting the static members (the items of the enum) instantiated. If you call a static method of a base class, the static members of the derived class are not instantiated.

    So if a person called BaseValues() directly (not possible below) before calling any method defined in the derived class, the list of values would be empty because none of the static members in the derived class would have been instantiated.

    That is why I had to make the static methods in the base class protected and each derived class has to provide access methods (really ugly).

    using System;
    using System.Collections.Generic;
    using System.Collections.ObjectModel;

    public abstract class EnumType<T> where T : EnumType<T>
    {

        #region Static Fields

        private static List<T> values = new List<T>();
        private static Dictionary<string, T> nameList = new Dictionary<string, T>();

        #endregion

        #region Static Methods

        protected static ReadOnlyCollection<T> BaseValues() {

            return values.AsReadOnly();
        }

        protected static T BaseValueOf(String name) {

            return nameList[name];
        }

        #endregion

        #region Non-Public Fields

        private String name;

        #endregion

        #region Properties

        public String Name {
            get {
                return name;
            }
        }

        #endregion

        #region Constructors

        protected EnumType(String name) {

            if (nameList.ContainsKey(name)) {
                throw new ArgumentException (
                    String.Format("Class {0} already contains an element with name \"{1}\".",
                    typeof(T).Name, name));
            }

            this.name = name;

            // Register the new value so Values() and ValueOf() can find it
            values.Add((T)this);
            nameList.Add(name, (T)this);
        }

        #endregion

        #region Class Base Overrides

        public override string ToString() {
            return name;
        }

        #endregion
    }

    ////////////////////////////////////

    public class MyClass : EnumType<MyClass>
    {
        public static readonly MyClass Value1 = new MyClass("Value1");
        public static readonly MyClass Value2 = new MyClass("Value2");
        public static readonly MyClass Value3 = new MyClass("Value3");

        private MyClass(String name) : base (name){
        }

        public static ReadOnlyCollection<MyClass> Values() {
            return BaseValues();
        }

        public static MyClass ValueOf(String name) {
            return BaseValueOf(name);
        }
    }

  11. Have you proposed this on the MS feedback site? If it were posted there, it would have a better chance of being included in a future version!

  12. Better than that – I’ve spoken to Mads directly about it. Don’t worry, the C# team knows it would be nice :)

    Jon

  13. John, could this be a solution to the problem of the derived class not being inited
    (to be added in the base class):
    private static Dictionary<Type, bool> subclassInited = new Dictionary<Type, bool>();
    private static void checkSubclassInited()
    {
        Type t = typeof(T);
        if ( !subclassInited.ContainsKey(t) )
        {
            subclassInited.Add(t, true);
            // Reading any field of the derived type forces its static initializers to run
            FieldInfo[] f = t.GetFields();
            object f1 = t.InvokeMember(f[0].Name, BindingFlags.GetField, null, null, new object[] { });
        }
    }
    public static System.Collections.ObjectModel.ReadOnlyCollection<T> Values()
    {
        checkSubclassInited();
        return BaseValues();
    }
    public static T ValueOf(String name)
    {
        checkSubclassInited();
        return BaseValueOf(name);
    }

    It does add a bit of overhead to check whether the subclass is initialized each time a base class method that needs it is called, and I’m not sure how much InvokeMember adds, but this actually works.

  14. Probably
    object f1 = f[0].GetValue(null);
    yields better performance than InvokeMember according to some sources

  15. Came across your blog while investigating if anyone had ideas surrounding “hierarchical enums”. The code below explains how I’d like to use such a thing. This might be too specific a case to warrant inclusion as a language feature, but I believe that the goal might be doable using the syntax you proposed above….

    public enum Stuff
    {

    public partial enum Number = one
    {
    public partial enum Letter = exx, //X
    public partial enum Letter = wye, //Y
    },

    public partial enum Number = two
    {
    public partial enum Letter = exx, //X
    public partial enum Letter = zee, //Z
    },

    }

    //Uses

    Stuff myStuff1 = Stuff.one.exx;
    Stuff myStuff2 = Stuff.two;

    if(myStuff1.Letter == Letter.exx) //Returns true

    if(myStuff2.Letter == null) //Returns true

    myStuff2.Letter = Letter.zee;

    if(myStuff2.Letter == Letter.zee) //Returns true

    Stuff myStuff3 = Stuff.two.wye; //Can’t do this — won’t compile

    myStuff1.Letter = Letter.zee; //Throws runtime exception

  16. A very late post, but there is a hidden attribute called [BeforeFieldInit], that controls whether static fields are initialized the first time a static method is called.

    For performance reasons, this attribute is enabled by default. If you wish to disable it, just specify a static constructor (even if private and empty).

    If anyone has any leverage in the .Net team, please *please* suggest adding a public interface to this attribute, so we can control it.

  17. John B posted a question about an enum work-around. He might find my SpecializedEnum approach to this problem on CodePlex useful.

    I wrote a class that uses Generics and Reflection to emulate most of the behavior of an enum with arbitrary types.
