Lessons learned from Protocol Buffers, part 2: self-referential generic types

In the first part of this series we saw that a message type and its builder are closely related. The tricky bit comes when we want to define an interface describing messages and builders. Although some members clearly depend on the data being built (the first and last name in the person example above, for instance) others apply to all messages or all builders. For instance, a message can always provide you with a suitable builder, and a builder always allows you to build it to create the actual message. Likewise the message and builder types also have methods which return other instances of themselves – you can ask any message for the default message of the same type, or clone a builder. Many common builder methods effectively return this (i.e. the same builder) – but the declared return type needs to be the concrete type involved, not just the interface, otherwise you couldn’t then use the returned builder to set properties without casting.

(Aside: some of the members of the common interface would be more pleasant if they could be declared statically. We’ll look at that later in the series.)

We have two slightly different issues here: defining the interface to allow members to return the concrete types, and tying builders and messages together. This post will just talk about the first of these issues. Enjoy the luxury of only having to think about one type parameter at a time – it won’t last long.

First encounters of the self-referential kind

I first came across a generic constraint which referred to itself back in the early days of Java 5. Here’s the declaration for java.lang.Enum:

public abstract class Enum<E extends Enum<E>>

Assuming you’re more comfortable in C#, I’ll translate that into C# syntax:

public abstract class Enum<T> where T : Enum<T>

The constraint is easier read than understood. Any concrete, constructed class deriving from this will be an “enum of something” where something itself an “enum of something“.

Now, Java puts additional restrictions on the Enum class (it’s like System.Delegate in C# – you can’t explicitly derive from it yourself; you have to let the compiler do it for you). However, the syntax is perfectly valid in “normal” code. Typically when you encounter this kind of type constraint, you satisfy it in new classes by using the same class as the type argument for T. So, in the Enum example we might have:

public sealed class Currency : Enum<Currency>
{
   // Code
}

public sealed class Status : Enum<Status>
{
   // Code
}

There’s nothing to actually stop you from declaring class Status : Enum<Currency> – it’s not just not normally useful. Likewise you can leave the derived type as a generic one, but again that’s atypical. I don’t know any way to enforce the usual implementation – short of building into the language, as Java did – but it’s generally not a problem.

Back to Protocol Buffers

So why is this useful? Well, moving on from enums let’s look at the builder interface in Protocol Buffers. Here’s part of it – somewhat simplified, admittedly:

public interface IBuilder<TBuilder> where TBuilder : IBuilder<TBuilder>
{
    TBuilder Clear();
    TBuilder Clone();
    TBuilder ClearField(FieldDescriptor field);
    TBuilder AddRepeatedField(FieldDescriptor field, object value);
    TBuilder SetUnknownFields(UnknownFieldSet unknownFields);
    TBuilder MergeUnknownFields(UnknownFieldSet unknownFields);
    TBuilder MergeFrom(ByteString data);
    TBuilder MergeFrom(CodedInputStream input);
    TBuilder MergeFrom(CodedInputStream input, ExtensionRegistry registry);
}

None of those methods mention the actual message directly – for that we need another type parameter, as we’ll see in the next post – but all of them return a TBuilder. As it happens, the interface documentation requires that all the methods return the same reference back, just as StringBuilder methods do, but you could equally create an interface around immutable types, expecting each operation to return a new value. For instance, you could create an IArithmetic<T> interface such that int could implement IArithmetic<int>, double could implement IArithmetic<double> etc. You can then chain multiple operations together, e.g. 5.Add(10).Multiply(2) and know that you’re always within the world of integers.

It’s important to note that the return type of each of the methods in our builder interface is TBuilder, not IBuilder<TBuilder>. I point this out mostly because the latter is what I originally had. After all, it’s often best to expose fairly general return types. That works fine while you’re only using operations within the interface, but often clients know more detail about the concrete type and want to use that information. For instance, you might want to be able to write:

Person.Builder builder = …; // Get a builder from somewhere
builder = builder.Clear()
                 .SetFirstName(“Fred”)
                 .SetLastName(“Jones”)
                 .Clone();

Here SetFirstName() and SetLastName() aren’t members of the interface, but Clear() and Clone() are. We can mix and match like this (and finally reassign the builder variable) because the interface is as strongly typed as it is. Code which only knows about the interface can still do whatever it likes, because it knows that TBuilder implements IBuilder<TBuilder>. In particular, that means it’s fine for some of the interface to be implemented by an abstract class – in Protocol Buffers there can be quite a deep inheritance tree for messages and builders, and a lot of the methods (particularly the merging ones) can be written in terms of the others. (Yes, that suggests that an extension method might be appropriate – but leaving it in the interface allows for particular implementations to override the general one, which can be important for optimisation. There’s also the matter of making the whole thing play nicely for people who are still stuck with .NET 2.0 and Visual Studio 2005.)

A small diversion via Java

It’s interesting to note that while my C# port is larely a port of the Java code, there are significant differences around how generics are used. This is understandable given how different generics are in .NET and Java. (My preference being heavily towards the .NET side – but there are moments when Java has its advantages.) However, one aspect of Java which is used to great effect is covariant return types. In the Java protocol buffers, the Message and Builder interfaces aren’t generic at all. For instance, the equivalent of the earlier part of the builder interface is just this:

public interface Builder
{
    Builder clear();
    Builder clone();
    Builder clearField(FieldDescriptor field);
    Builder addRepeatedField(FieldDescriptor field, object value);
    Builder setUnknownFields(UnknownFieldSet unknownFields);
    Builder mergeUnknownFields(UnknownFieldSet unknownFields);
    Builder mergeFrom(ByteString data);
    Builder mergeFrom(CodedInputStream input);
    Builder mergeFrom(CodedInputStream input, ExtensionRegistry registry);
}

Does that mean we can’t chain operations together any more, mixing and matching “concrete-type-specific” methods (such as the setters for first name and last name) with the interface methods? Not at all – because where Person.Builder implements (say) clear() it can do it like this:

Person.Builder clear()
{
    // Implementation
    return this;
}

At that point, everything which knows at compile-time that it’s calling Person.Builder.clear() knows that it returns a Person.Builder – whereas code which only knows that it’s calling the interface method only knows that it will return some implementation of the interface. (Apologies for the naming here – it’s unfortunate from a clarity standpoint that both the interface and the implementation is called Builder, but I thought it would be worth being faithful to the real code on this point.)

It’s just about possible to do this in C# as well, with explicit interface implementation. Again, I went that way to start with – and it was a disaster. In the intermediate abstract classes I was having to cast to the interface sometimes, not cast at other times, declare new abstract protected methods of ClearImpl etc. It was simply awful. I’ve gone back to my school of thought which is that explicit interface implementation is handy when it’s absolutely required (or where you deliberately want to make it hard to call certain members), but should be largely avoided.

In fact, I do have a non-generic interface for both messages and builders, but where types would be involved I’ve renamed the methods to things like WeakClear and WeakBuild. These Weak* methods are only defined in terms of the non-generic interfaces, and are mostly used in cases where we really don’t know at compile time what kind of message we’re dealing with, even in a generic sense. Life would, however, be much simpler if only we had covariant return types in C#.

Conclusion

Self-referential generic types shouldn’t be used more widely than they really need to be – they can be tough to get your head round. However, they can be useful when you want to maintain a strongly typed API which needs to talk in terms of itself. One redeeming feature of the complexity in Protocol Buffers is that most of it is in the implementation: users of Person and Person.Builder really don’t need to know or care about the interfaces for most of the time. So long as they use a strongly typed expression to start with, they’ll keep that strong typing and be presented with appropriate members to call as if the interfaces and intermediate abstract classes didn’t even exist. It’s an API which gets out of your way when you’re not interested in it, which is always a nice sign.

While trying a number of schemes I’ve learned that there can often be a lot of subtly different options available, and their benefits and drawbacks aren’t always obvious until you try them. Oh, and covariant return types would be very welcome, and explicit interface implementation should generally be avoided where possible :)

Next time I’ll reveal a bit more about the real interfaces in my PB port. Bear in mind that messages need to know about their builders, and vice versa…

3 thoughts on “Lessons learned from Protocol Buffers, part 2: self-referential generic types”

  1. Hi Jon!

    I’ve been down this path in the Autofac builder API. I agree that self-referential types are something to use only when completely necessary.

    Things get tricky once you add additional, specialised builder interfaces (e.g. ISpecialisedBuilder : IBuilder< ...>) in a hierarchy.

    Autofac uses this approach, but the complexity of representing such types in the signatures of other methods increases quickly (e.g. extension methods.) This is one situation best avoided, I think.

  2. Jon–

    You say, “Yes, that suggests that an extension method might be appropriate – but leaving it in the interface allows for particular implementations to override the general one, which can be important for optimisation.”

    But wouldn’t a type-specific implementation of a method matching the name of the extension method always take precedence (unless callers were calling the extension method with static method syntax)?

  3. Jacob: the type-specific implementation would take precedence *if* the caller knows at compile-time which type is involved. Anything which only knows about the interface would end up calling the extension method.

Comments are closed.