Josh Twist asked me this via Twitter:
is it possible to invoke a member before a ctor is finished (eg maybe using threaded IL trickery) or is this forbidden somehow?
Now I don’t know why everyone seems to think I enjoy writing code which could have bizarre effects on either you, the compiler, the resulting execution or your co-workers… but it’s an interesting topic to look at, anyway.
The perils of partially constructed objects
Hopefully it’s reasonably obvious why it’s dangerous to access a member of an object before that object has been properly constructed – but it may be worse than you’ve considered.
In particular, immutable types are only immutable after they’re fully constructed. It’s entirely reasonable for an immutable type to change read-only fields several times during the course of initialization. The fields can only be set in the constructor itself (or a variable initializer for the field) but this can occur several times. If the constructor for the immutable type exposes the instance it’s constructing to other code, all the immutability guarantees go out of the window.
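To make that concrete, here is a minimal sketch of an "immutable" type whose constructor lets this escape. The Temperature type, its observer parameter and the placeholder value -1 are all hypothetical, invented purely for illustration; the only language behaviour relied on is that a read-only field may be assigned more than once inside the constructor.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical "immutable" type: its only state is a readonly field.
public sealed class Temperature
{
    private readonly int degrees;

    public int Degrees { get { return degrees; } }

    // The observer parameter is an assumption made for this sketch; it stands in
    // for any code that receives 'this' while the constructor is still running.
    public Temperature(int degrees, Action<Temperature> observer)
    {
        observer(this);          // 'this' escapes before anything has been assigned
        this.degrees = -1;       // a readonly field may be assigned...
        observer(this);
        this.degrees = degrees;  // ...more than once inside the constructor
        observer(this);
    }
}

public static class EscapeDemo
{
    public static void Main()
    {
        var seen = new List<int>();
        var t = new Temperature(20, x => seen.Add(x.Degrees));

        Console.WriteLine(string.Join(", ", seen)); // prints "0, -1, 20"
        Console.WriteLine(t.Degrees);               // prints "20"
    }
}
```

The observer sees three different values from a field that is supposed to never change, which is exactly the guarantee that disappears once the instance is exposed early.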
Even in mostly-mutable types, code may well assume that it’s dealing with some fixed aspects. For example, you may have some database entity type which is either freshly created with a random GUID, or created from an existing record with an ID from the database. In either case, code consuming this type wouldn’t expect to see an ID of Guid.Empty, or for the ID to change after it’s been observed… even if other properties of the object can be changed later.
What C# does to protect you
C# as a language (plus conforming compiler, of course) protects you from some of this.
When you chain to another constructor, you can’t use this to calculate any arguments you want to pass to the other constructor. The compiler knows it’s dealing with a partially constructed object at this point (none of your constructor body has been executed) so it’s protecting you from harm. Unfortunately this means you can’t even call this.GetType(), which can make it tricky to write objects which populate themselves using reflection.
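As a rough illustration of that restriction, the sketch below uses a hypothetical Report class. The rejected form is left as a comment, and the static helper DefaultTitle is an assumed workaround rather than anything prescribed by the language.

```csharp
using System;

// Hypothetical type used only to show what a constructor initializer may reference.
public class Report
{
    private readonly string title;

    public string Title { get { return title; } }

    public Report(string title)
    {
        this.title = title;
    }

    // Rejected by the compiler: 'this' is not available in a constructor initializer.
    // public Report() : this(this.GetType().Name) { }

    // Static members are allowed, because they do not touch the new instance.
    public Report() : this(DefaultTitle(typeof(Report))) { }

    private static string DefaultTitle(Type type)
    {
        return "Untitled " + type.Name;
    }
}

public static class ChainingDemo
{
    public static void Main()
    {
        Console.WriteLine(new Report().Title); // prints "Untitled Report"
    }
}
```

Note what the workaround loses: typeof(Report) is the compile-time type, so a class derived from Report would still get "Untitled Report". That is the reflection scenario which becomes awkward without this.GetType().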
During the constructor body, you have complete access to this of course – you have to, in order to set any state within the object. This is where things can get nasty.
One way in which Java, C# and C++ diverge in their constructor behaviour is with regard to virtual methods:
- In C++, the object only really "becomes" an instance of the subclass when the subclass constructor starts executing, so a virtual method called from a base class constructor will only execute the implementation from the "current" level of the type hierarchy, never a subclass override.
- In Java, the object is of the final type from the start, so the most deeply overridden implementation of the virtual method is called – but this will occur without any initialization having taken place. All fields will still have their default values (null, 0 etc).
- C# is like Java, except that variable initializers will have been executed (as they’re executed before the base class constructor is called). In other words, initialization within the constructor body won’t have taken place, but any fields which are initialized as part of the declaration will have their appropriate values.
This is really dangerous if you’re not aware of it. In particular, any time you override a virtual method in Java or C#, you need to know whether it might be called in a partially-initialized state.
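The C# case can be sketched as follows. Base, Derived and Describe are hypothetical names; the example only assumes the ordering described above, where the derived class’s variable initializers run before the base class constructor and the derived constructor body runs after it.

```csharp
using System;

public class Base
{
    public string Observed { get; private set; }

    public Base()
    {
        // Virtual call from a constructor: dispatches to the most derived override,
        // even though the Derived constructor body has not run yet.
        Observed = Describe();
    }

    protected virtual string Describe()
    {
        return "base";
    }
}

public class Derived : Base
{
    private readonly string fromInitializer = "set"; // runs before Base()
    private readonly string fromBody;                // assigned after Base() returns

    public Derived()
    {
        fromBody = "set";
    }

    protected override string Describe()
    {
        return "initializer=" + (fromInitializer ?? "null")
             + ", body=" + (fromBody ?? "null");
    }
}

public static class VirtualCallDemo
{
    public static void Main()
    {
        Console.WriteLine(new Derived().Observed); // prints "initializer=set, body=null"
    }
}
```

The override runs against a half-built Derived: the field with an initializer is populated, while the one assigned in the constructor body is still null.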
Wherever possible, try not to call virtual methods from the constructor for precisely this reason. I would advise that if you absolutely have to do it (I failed to remove this behaviour when porting Joda Time to Noda Time, for example) you document that fact very heavily and make sure that you don’t call the method in any other place. Make it protected, too. Basically it should only be part of initialization. If you need similar behaviour at other times, create another method. This allows derived classes to tailor their implementation to the expected state at the time of invocation.
You may be thinking that this is all easy: just avoid virtual methods, don’t do anything stupid like setting a static variable to this during a constructor (making it visible from other threads before initialization is completed) and you’ll be fine.
Well, I suspect that almost every Windows Forms app in existence publishes this during the constructor. Any time you have an event handler, that’s effectively providing a callback… and if that’s an instance method, it’s tied to the relevant instance, usually this.
How sure are you – really, really sure – that none of those event handlers will fire as part of the rest of the initialization? For example, if you use Visual Studio to hook up the ControlAdded event for a WinForms form, and also add a bunch of controls to the form… when is that event going to fire? Will the autogenerated code add the event handler after it adds the controls, or before? If it adds the handler at the start, then clearly the method handling the event will be called before your constructor finishes… so you need to be ready for that.
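Here is a small sketch of that ordering problem which avoids any dependency on Windows Forms. WidgetHost is a hypothetical stand-in for a container control, not the real WinForms API, and MainWindow plays the role of the form whose constructor hooks up the handler before it has finished initializing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a container control; not the real WinForms API.
public class WidgetHost
{
    private readonly List<string> children = new List<string>();

    public event Action<string> ChildAdded;

    public void Add(string child)
    {
        children.Add(child);
        var handler = ChildAdded;
        if (handler != null)
        {
            handler(child); // raised synchronously, on the caller's stack
        }
    }
}

public class MainWindow
{
    private readonly WidgetHost host = new WidgetHost();
    private readonly List<string> log;

    public bool HandlerRanBeforeLogExisted { get; private set; }
    public int LoggedCount { get { return log.Count; } }

    public MainWindow()
    {
        host.ChildAdded += OnChildAdded; // the delegate captures 'this'
        host.Add("okButton");            // the handler fires right now
        log = new List<string>();        // too late for the first event
        host.Add("cancelButton");
    }

    private void OnChildAdded(string name)
    {
        if (log == null)
        {
            // We are being called from inside our own constructor.
            HandlerRanBeforeLogExisted = true;
            return;
        }
        log.Add(name);
    }
}

public static class EventDemo
{
    public static void Main()
    {
        var window = new MainWindow();
        Console.WriteLine(window.HandlerRanBeforeLogExisted); // prints "True"
        Console.WriteLine(window.LoggedCount);                // prints "1"
    }
}
```

The null check in the handler is what "being ready for that" looks like in this sketch; without it the first event would throw a NullReferenceException from within the constructor.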
How much of a problem is this really?
Like many matters of purity, I suspect this is usually more of a theoretical issue than a practical one. In complicated situations like the Windows Forms one above, most event handlers are likely to be fired after initialization… and there’s typically not as strong a sense of invariants being set up as there would be in an immutable data type, for example.
Immutable data types, in turn, are less likely to accidentally let this escape during construction… but the consequences of them doing so are much more severe, of course.
To answer Josh’s question: Yes. At least on the simplest reading of the question: members can certainly end up being invoked on an object during its construction. They can potentially end up being invoked on multiple threads during construction. This is basically under the control of the constructors in the type hierarchy though.
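The multi-threaded case can be demonstrated with a deliberately contrived sketch. The Service type, its static Instance field and the two ManualResetEvent parameters are all assumptions made for illustration: the events exist only to force a deterministic interleaving, whereas real code would simply have a race with an unpredictable outcome.

```csharp
using System;
using System.Threading;

public sealed class Service
{
    // Hypothetical global, set from inside the constructor.
    public static Service Instance;

    private string name;

    public string Name { get { return name; } }

    public Service(ManualResetEvent published, ManualResetEvent observed)
    {
        Instance = this;     // 'this' becomes visible to other threads
        published.Set();     // tell the reader thread it can look now
        observed.WaitOne();  // stall until the reader has called a member
        name = "ready";      // the rest of initialization happens afterwards
    }
}

public static class ThreadingDemo
{
    public static string Run()
    {
        string seen = "unset";
        using (var published = new ManualResetEvent(false))
        using (var observed = new ManualResetEvent(false))
        {
            var reader = new Thread(() =>
            {
                published.WaitOne();
                // Invokes a member while the constructor is still executing.
                seen = Service.Instance.Name ?? "null";
                observed.Set();
            });
            reader.Start();

            var service = new Service(published, observed);
            reader.Join();
            return seen + " then " + service.Name;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "null then ready"
    }
}
```

Nothing here needs IL trickery: the constructor itself hands out the reference, so the other thread is free to call members before construction completes.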
As for the threading side, I believe that the .NET memory model is stricter than the ECMA specification: I believe a constructor will have completed (and all its writes retired) before the reference to the newly constructed object can be published to another variable, which was one of the concerns around double-checked locking. It’s still a valid concern to consider though.
Alternative conclusion: almost nothing is really as simple as it appears to be.