LA.NET [EN]

Basics Archive

Oct 27

The dynamic type

Posted in Basics, C#       Comments Off on The dynamic type

C# 4.0 introduced a new type whose main job is to simplify writing code that needs to use reflection. I’m talking about the new dynamic type. As we all know, C# is a type-safe programming language. In practice, this means that the compiler must resolve all expressions into types and their respective operations. Whenever the compiler detects an invalid operation (ex.: calling a method not exposed by a class), it must stop the compilation process and generate an error. The good news is that this type safety ensures that most (if not all) programmer errors are detected at compile time.

Compare this with what happens in other, dynamic languages, like JavaScript. Before going on, a disclaimer: I love JavaScript, so any errors you might end up having while writing code in it can only be attributed to the developer writing it :) Anyway, how many times have we written JS code only to find some misspelling error at runtime?

Now, there are also advantages associated with dynamic languages. For instance, compare the code you need to write to use COM components from C# with the code you have to write to consume them from, say, JavaScript…yep, C# starts to suck when you need to do that. With the new dynamic type, things get better :) Here’s an example of what I mean:

dynamic word = new Application {Visible = true};
dynamic doc = word.Documents.Add(  );
word.Selection.TypeText( "Hello, dynamic!" );

Now, if you’re an experienced C# dev, you can’t help noticing the simplicity of the new code. Just for fun, let’s see the C# 3.0 equivalent code:

Application word = new Application{Visible = true};
//now, the fun begins
Object missingValue = Missing.Value;
Document doc = word.Documents.Add(
    ref missingValue, ref missingValue, ref missingValue, ref missingValue);
word.Selection.TypeText( "Hello, dynamic!" );

And I was lucky because I picked an easy method. If I needed to replace text, things would quickly become even more boring…It’s safe to say that we all prefer version 1 of the previous example, right? And the good news is that you can use the same strategy when writing reflection code (for an example of it, check this old post).

So, what happens when you mark a variable or expression with the dynamic keyword? Whenever the compiler sees a dynamic expression, it inserts special code describing that operation, which is used at runtime to determine the real operation that needs to be performed. This special code is generated by the runtime binder. In C#, the runtime binder is defined in the Microsoft.CSharp assembly and you must reference it whenever you use the dynamic keyword in your code.

At runtime, things get rather complicated because the binder ends up consuming more memory than would have been necessary if you were using, say, plain reflection. So if you only need dynamic behavior in a small portion of your code, you should probably consider sticking with reflection, since the advantages of dynamic might not pay off.

A dynamic operation is resolved at runtime according to the real type of the object. If that object implements the IDynamicMetaObjectProvider interface, its GetMetaObject method ends up being called. It returns a DynamicMetaObject derived type which is responsible for performing the bindings for members of that type (ie, mapping the members, methods and operators specified in the code you’ve written). Dynamic languages in .NET have their own DynamicMetaObject derived classes (which allows them to be easily consumed from C#). Something similar happens with COM components (the C# runtime binder uses a DynamicMetaObject derived object which knows how to communicate with COM components). When the object doesn’t implement the interface, C# ends up using reflection to execute the required operations.
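To get a feel for the difference, here’s a minimal sketch comparing classic reflection with dynamic dispatch (the Person class and its Greet method are made up just for this demo):

```csharp
using System;
using System.Reflection;

public class Person {
    public String Greet(String name) { return "Hello, " + name; }
}

public static class Program {
    public static void Main() {
        // classic reflection: verbose, and checked only at runtime anyway
        Object p1 = new Person();
        MethodInfo greet = p1.GetType().GetMethod("Greet");
        String r1 = (String)greet.Invoke(p1, new Object[] { "Ana" });

        // dynamic: the runtime binder resolves the member access for us
        dynamic p2 = new Person();
        String r2 = p2.Greet("Ana");

        Console.WriteLine(r1 == r2); // True
    }
}
```

Note that the dynamic version still fails at runtime (with a RuntimeBinderException) if you misspell the member name; what you gain is the cleaner call site.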

Now, there are a couple of interesting operations you can do with a dynamic type. For starters, any expression can be implicitly converted into a dynamic type:

Int32 a = 10;
dynamic b = a;

Yep, you’ll end up with boxing in the previous snippet. Even more interesting is the fact that you can also implicitly convert from a dynamic into some other type, because the CLR will validate that cast at runtime:

Int32 c = b;

Notice that you cannot do this with an Object instance that resulted from boxing an integer. If the dynamic value isn’t compatible with the desired type, you’ll end up with an InvalidCastException. Another interesting thing is that evaluating a dynamic expression gives you a new dynamic expression:

dynamic a = 10;
Int32 b = 2;
var t = a + b;
t.DoesntHaveThisMethodButCompiles( );

You’ll succeed if you try to compile the previous snippet! Of course, you’ll get an exception at runtime since ints don’t have a DoesntHaveThisMethodButCompiles method. Notice that var ends up being inferred as dynamic in the previous snippet (and, btw, don’t confuse var with dynamic: var is just a shortcut that lets the compiler infer the type of a variable).

Whenever you use a dynamic variable in a foreach or using block, the compiler will automatically generate the correct code for that scenario (in the foreach, it will convert the variable into an IEnumerable; in the using case, it will cast it to IDisposable). Pretty slick, right?
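Here’s a quick sketch of the foreach case (the List<Int32> is just an arbitrary collection for the demo):

```csharp
using System;
using System.Collections.Generic;

public static class Program {
    public static void Main() {
        dynamic numbers = new List<Int32> { 1, 2, 3 };
        Int32 sum = 0;
        // the compiler converts the dynamic variable to IEnumerable for us
        foreach (Int32 n in numbers) {
            sum += n;
        }
        Console.WriteLine(sum); // 6
    }
}
```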

And that’s it. Stay tuned for more.

Oct 25

In the previous post, we started looking at the Equals method and saw that its default implementation (inherited from Object) had some flaws. We saw a better implementation for it and we also talked about some strategies for overriding the method in new custom types. In this post, we’re going to talk about a somewhat related concept: hash codes.

You see, all objects inherit a method called GetHashCode from the base Object type. This method is virtual, returns an integer and is defined by the Object type because the designers of the framework thought it would be a good idea to allow any object to be used as a key in a hashtable. The current rules governing hash code generation are quite interesting:

  • if two objects *are* equal, then they should return the *same* hash code.
  • if two objects *aren’t* equal, they don’t have to generate *different* hash codes. Many are surprised by this at first…
  • you should use at least one instance field for calculating the hash code of an object. You should rely on immutable fields for this because these fields are initialized during construction and then remain constant during the object’s lifetime. This is important and the docs should have presented this as a *must* (not a *should*).
  • the returned result must be consistent and it should be the same as long as there is no modification to the object state that determines the return value of the Equals method.
  • your method should strive to have a good random distribution.

As you can see, the rules for hash code generation imply that you’ll have to override GetHashCode whenever you override the Equals method. The GetHashCode implementation inherited from Object simply returns a number which doesn’t change during the object’s lifetime and is guaranteed to uniquely identify the object in the current application domain. As you might expect, ValueType does follow the previous rules when overriding the GetHashCode method. Unfortunately, you’ll have the same performance problem mentioned before, because it has to use reflection to ensure that all the fields of a type are used in the algorithm.
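You can see rule 1 (equal objects, equal hash codes) at work with ValueType’s reflection-based overrides (the Pair struct here is invented for the demo):

```csharp
using System;

public struct Pair {
    public Int32 A;
    public Int32 B;
}

public static class Program {
    public static void Main() {
        var p1 = new Pair { A = 1, B = 2 };
        var p2 = new Pair { A = 1, B = 2 };
        // ValueType's overrides walk the instance fields via reflection,
        // so instances with equal fields are equal and hash alike
        Console.WriteLine(p1.Equals(p2));                        // True
        Console.WriteLine(p1.GetHashCode() == p2.GetHashCode()); // True
    }
}
```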

Building your own hash code method isn’t as easy as it might look at first. If you look at the previous rules, you’ll notice that there are several constraints which make it hard to implement. One thing that isn’t mentioned in the previous list (and it should be!) is that hash codes shouldn’t change. In fact, they *can’t* change, because a changing hash code might break your application. Unfortunately, this isn’t really mentioned in the docs nor followed by the framework code. A quick example is the best way of illustrating this point. Take a look at the following code:

struct Point {
    public Int32 X { get; set; }
    public Int32 Y { get; set; }
}

This code seems simple enough and harmless, right? Well, guess what? It’s not…one of the things you should keep in mind while creating new value types is that they should be immutable! For instance, take a look at the DateTime struct…you’ll quickly notice that it doesn’t have any writable properties and that none of the existing methods change the value of its internal fields (at best, you’ll get a new instance returned). In other words, DateTime is an immutable type: after creating an instance, you can’t really change its state!
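You can see DateTime’s immutability with a couple of lines (dates picked arbitrarily):

```csharp
using System;

public static class Program {
    public static void Main() {
        var d1 = new DateTime(2010, 10, 25);
        // AddDays doesn't touch d1: it hands back a brand new instance
        DateTime d2 = d1.AddDays(1);
        Console.WriteLine(d1.Day); // still 25
        Console.WriteLine(d2.Day); // 26
    }
}
```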

Now, if you look at our Point type, you’ll notice that it reuses the base’s Equals and GetHashCode implementations. Yes, I’ve said we should always override those methods, but they should work fine for the purpose of this demo (though probably a bit slower than if we introduced our own overrides). So, let’s start simple:

var hashtable = new Dictionary<Point, String>( );
var a = new Point {X = 10, Y = 20};
hashtable.Add(a, "Hi"  );
Console.WriteLine(a.GetHashCode( ));
Console.WriteLine(hashtable[a]);

Nothing too fancy here…we’re creating a new instance of a Point and using it as the key of a Dictionary instance. Till now, everything works out perfectly! Now, suppose we do this:

a.X = 20;
Console.WriteLine(a.GetHashCode( ));

I guess that by now you’re seeing it, right? If you’re not hearing alarm bells all over the place, then you should probably pause and read the info on the Dictionary class. Specifically, the part where it says this:

Retrieving a value by using its key is very fast, close to O(1), because the Dictionary<TKey, TValue> class is implemented as a hash table.

Oops…if you’ve run the previous code, you’ll notice that a.GetHashCode no longer returns the same value you got in the previous snippet. In fact, go ahead and try to get the existing entry from the hashtable variable:

Console.WriteLine(hashtable[a]);

And here’s the result on my machine (the 1 you see is the total number of entries in the hashtable variable):

[screenshot: the lookup fails, even though the dictionary still holds one entry]

It seems like you just can’t get the existing entry from the dictionary through the Point instance variable that was used as key. Not good, right? Well, let’s see how we can improve our code to solve this kind of problem. We’ve got several options, but my favorite is turning Point into an immutable instance:

struct Point {
    private readonly Int32 _x;
    private readonly Int32 _y;

    public Int32 X { get { return _x; } }
    public Int32 Y { get { return _y; } }
    public Point(Int32 x, Int32 y) {
        _x = x;
        _y = y;
        _hashcode = null;
    }
    public override bool Equals(object obj) {
        if (obj == null) return false;
        if (obj.GetType() != GetType()) return false;
        var other = ( Point )obj;//unbox
        return other.X == X && other.Y == Y;
    }
    private Int32? _hashcode;
    public override int GetHashCode() {
        if(!_hashcode.HasValue) {
            _hashcode = X.GetHashCode( ) ^ Y.GetHashCode( );
        }
        return _hashcode.Value;
    }
}

I didn’t really follow all the recommendations I mentioned in the previous post (I’ll leave that to you 🙂 ) but we’ve now solved the previous problems. Since Point is immutable, you cannot change an instance after creating it, and the hash code now stays constant throughout the instance’s lifetime.
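With the immutable version of Point shown above, the earlier dictionary scenario behaves as you’d expect:

```csharp
using System;
using System.Collections.Generic;

public static class Program {
    public static void Main() {
        var hashtable = new Dictionary<Point, String>();
        var a = new Point(10, 20);
        hashtable.Add(a, "Hi");
        // there's no setter to call, so the key's hash code can't drift
        // and the lookup keeps working
        Console.WriteLine(hashtable[a]); // Hi
    }
}
```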

Notice that the hash code is only calculated once, and only if someone asks for it. If you’re creating a new type, you can follow several of the principles presented in this sample. For instance, you should always strive to define which fields are immutable (if you don’t have one, then you can always add one!) and rely on them for calculating the hash code. Since this has become a rather large post, I won’t bore you with an example that shows how this can be done. Instead, I’ll simply redirect you to take a look at the S#arp project, which has some interesting base classes you can reuse to solve these problems.

And that’s it. Stay tuned for more.

Oct 24

Yep, it’s true: I’m still alive! After a long pause in blogging (due to a future project which I’ll talk about in a future post), I’m back…and with a very interesting topic. Comparison between objects is something developers do a lot. By default, all objects inherit Object’s virtual Equals method, which returns true if both references refer to exactly the same object. According to the excellent CLR via C#, here’s how that code might look (I say might because equality is implemented as an extern method):

public virtual Boolean Equals(Object obj) {
    if (this == obj) return true;
    return false;
}

In other words, you’re running an identity check. This approach might not be enough for you. In fact, many people say that the base Equals method implementation should look like this:

public virtual Boolean Equals(Object obj) {
    //if obj is null, then return false
    if (obj == null) return false;
    //check for types
    if (this.GetType() != obj.GetType()) return false;
    //check for field values
    return true;
}

There’s a lot going on here. We start by checking against null (obviously, if obj is null, we can simply return false). We then proceed to compare the objects’ types. If they don’t match, then false it is. Finally, we need to check the field values of both objects. Since Object doesn’t have any fields, we can return true. Now, there are two conclusions you can take from the previous snippet:

  • the first is that you shouldn’t really use Equals to perform identity checks, because Equals is virtual, which means a derived class might change the way the method works. If you want to perform an identity check, then you should use the static ReferenceEquals method (defined by the Object type).
  • the second is that having a poor Equals implementation means that the rules for overriding it are not as simple as they should be. So, if a type overrides the Equals method, it should only call the base class’ method if the base class isn’t Object.

To make things more complex, we should also take the ValueType class into consideration. Interestingly, it overrides the Equals method and uses an algorithm similar to the last one we showed. Since it has to check the equality of its fields, it needs to resort to reflection. In practice, this means that you should provide your own implementation of the Equals method when you create new value types.

When you’re creating a new type, you should always ensure that it follows four rules:

  • Equals must be reflexive: x.Equals(x) must always return true.
  • Equals must be symmetric: x.Equals(y) should return the same result as y.Equals(x).
  • Equals must be transitive: x.Equals(y) == true && y.Equals(z) == true => x.Equals(z) == true.
  • Equals must be consistent: calling the method several times with the same values should return the same results.

Now, these are the basic guarantees you need to give. There are a couple of extra things you can do too. For instance, you could make your type implement the IEquatable<T> interface to perform equality comparisons in a type-safe manner. You *should* also overload the == and != operators. Their internal implementations should always reference the Equals override you’ve written.
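As a sketch of those extras, here’s how a type might implement IEquatable<T> and redirect the operators to the Equals override (the Money type is invented for the demo):

```csharp
using System;

public sealed class Money : IEquatable<Money> {
    public Decimal Amount { get; private set; }
    public Money(Decimal amount) { Amount = amount; }

    // type-safe comparison via IEquatable<Money>
    public Boolean Equals(Money other) {
        if (ReferenceEquals(other, null)) return false;
        return Amount == other.Amount;
    }
    public override Boolean Equals(Object obj) {
        return Equals(obj as Money);
    }
    public override Int32 GetHashCode() {
        return Amount.GetHashCode();
    }
    // the operators simply reuse the Equals override
    public static Boolean operator ==(Money a, Money b) {
        return ReferenceEquals(a, null) ? ReferenceEquals(b, null) : a.Equals(b);
    }
    public static Boolean operator !=(Money a, Money b) {
        return !(a == b);
    }
}
```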

Finally, if your object will be used in comparison operations, then you should go ahead and implement the IComparable<T> interface and overload the <, <=, > and >= operators (once again, the operators’ implementations should reuse IComparable<T>’s CompareTo method). Hash codes are also influenced by a custom Equals implementation, but we’ll leave that discussion for a future post. Stay tuned for more.

Sep 22

Back to methods: overloading

Posted in Basics, C#       Comments Off on Back to methods: overloading

After a small JavaScript detour, I’m back in .NET land. We’ve already seen several interesting details regarding methods, but we still haven’t really discussed the concept of overloading. So, what’s method overloading? The concept is simple, but first we should probably find a good enough definition for a method. For now, let’s define a method as a name that represents a piece of code which performs some operation on a type or on an instance of a type.

If you’ve been doing C# for some time, then you know that you can have several methods with the same name, provided they have a different set of arguments. Ok, here’s a small sample which shows the concept:

public class Student {
    //static overloads
    public static void SayHi(Student std) {
    }
    public static void SayHi(Student std, String msg){
    }
    //instance overloads
    public void SayHi() {
    }
    public void SayHi(String msg) {
    }
}

The previous snippet shows how you can overload static and instance methods. In C#, you can only overload based on a different set of parameters. That means you cannot add the following method to the Student type without getting a compilation error:

//can't overload based on return type
public String SayHi(String msg) {
}

Now, do keep in mind that the CLR does allow this kind of stuff, ie, it does allow you to overload based on a different return type. Anyway, we’re writing C# code, so to achieve that we’d really need to write IL (which is not going to happen right now, so we’ll consider this paragraph a small side note :)).

Ok, back to C#…so why doesn’t C# allow overloads based on the return type? That happens because overloading is based on having different method signatures. In C#, a method signature consists of its name, the number of type parameters and the type and kind of each of its parameters (considered from left to right). Besides the return type, in C# the params keyword and the type parameter constraints (when you’re using generics) aren’t part of the method signature. Another interesting gotcha is that you cannot overload solely based on the out/ref modifiers (as we’ve seen in the past). Interestingly, the out/ref keywords are considered part of the signature for hiding/overriding purposes.

Overload resolution is an important concept which is used to decide which method will be called when you write code like the following:

var std = new Student();
std.SayHi("Hello");
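To get a feel for how resolution prefers the most specific applicable candidate, here’s a small sketch (the Object overload is added just for the demo):

```csharp
using System;

public class Student {
    public void SayHi(Object msg) { Console.WriteLine("Object overload"); }
    public void SayHi(String msg) { Console.WriteLine("String overload"); }
}

public static class Program {
    public static void Main() {
        var std = new Student();
        std.SayHi("Hello");          // picks the more specific String overload
        std.SayHi((Object)"Hello");  // the cast forces the Object overload
    }
}
```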

The algorithm for finding a method involves several steps. Instead of describing the complete process here (which would involve some serious copying of the C# spec), and since this is becoming a large post, I guess I’ll just redirect you to the C# spec for the complete story on method overload resolution. So, open your C# spec and read the 7.5.5.1 section carefully to see what I mean…And that’s it for now. Tomorrow, we’ll continue our basic tour around .NET and the CLR. Stay tuned for more.

Sep 20

Operator overloading

Posted in Basics, C#       Comments Off on Operator overloading

In some languages (ex.: C#), you can customize the way an operator works. For instance, if you take a look at the String type, you’ll notice that it has these peculiar looking methods:

public static bool operator !=(string a, string b);
public static bool operator ==(string a, string b);

What you’re seeing here is known as operator overloading. Since the String class introduces these methods, you can write the following code to compare strings (btw, when comparing strings, you should probably be more explicit so that someone who reads your code in the future knows exactly which type of comparison you’re performing):

String str1 = "hi";
String str2 = "hi";
Console.WriteLine( str2 == str1);

Operator overloading isn’t a CLR feature, but a compiler feature. In other words, the compiler is responsible for mapping the operators used in the source code into a method call which can be understood by the CLR. Now, in theory, you’re probably thinking that you can call the static method shown before directly:

Console.WriteLine(String.operator==(str1,str2));

But no, the truth is that you can’t do that (if you try, you’ll get a compiler error which says something like “invalid term: operator”). Even though the CLR doesn’t understand operators, it does specify how languages should expose operator overloads. If a language decides to support operator overloading, then it must follow the CLR defined syntax and it must generate methods which match the expected signatures. In the case of the == operator, the compiler is supposed to generate an op_Equality method. In case you’re thinking of trying to call that method directly from C#, don’t: you’ll end up with a compilation error saying that you cannot access the op_Equality method directly.
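Even though C# won’t let you call op_Equality by name, the method really is there, and reflection can prove it:

```csharp
using System;
using System.Reflection;

public static class Program {
    public static void Main() {
        // str1 == str2 compiles into a call to this static method
        MethodInfo op = typeof(String).GetMethod(
            "op_Equality", new[] { typeof(String), typeof(String) });
        Console.WriteLine(op != null); // True
        // and we can even invoke it through reflection
        Console.WriteLine(op.Invoke(null, new Object[] { "hi", "hi" })); // True
    }
}
```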

Before we proceed, you should probably take a look at the complete method name list from here. If you’ve checked it, then you’ve probably noticed that the table has an extra column, called Name or alternative method. When I said that the C# compiler generates a special method for the operator overload, I didn’t mention one important detail: besides respecting the name defined by the CLR, it will also set a flag in the method’s metadata saying that this is a special method which can be used “as an operator”.

When you’re writing code in a language which doesn’t support operator overloading, you can still introduce the op_XXX methods. Now, the problem is that you also need to have that special flag applied to the method. If you don’t have it, then you won’t be able to use the operator when consuming the type from, say, C#. And that’s one of the reasons why you have the friendly name column in that table: MS recommends that you add those friendly-named methods when you overload operators so that any language can always perform the intended operation (as you might expect, these methods should simply redirect to the adequate op_XXX methods). I believe MS could have done better here, but we have to live with what we have, right? And I guess that’s it for now. Stay tuned for more.

Sep 17

In the previous posts, I’ve presented the basics of boxing and unboxing. Today, we’ll take a deep dive into several scenarios which illustrate how boxing/unboxing can occur without you even noticing it. Let’s start with a quick recap. Suppose we’ve got the same structure used in the previous examples:

public struct Student {
    public String Name { get; set; }
    public Int32 Age { get; set; }
}

As we’ve seen, you’ll get a boxing operation whenever you pass an instance of Student to an element which expects a reference type. In other words, the next instructions result in a boxing operation:

Student std = new Student {Name = "Luis", Age = 34};
Object boxedStd = std; //boxing here!

The previous snippet shows an explicit boxing operation. You might need to do this to improve performance. For instance, suppose you need to perform several calls and that all of them expect reference types. Here’s a naive way of doing that:

//all callx methods expect a reference to an Object instance
Call1( std );
Call2( std );
Call3( std );

If you do this, you’ll end up performing three boxing operations (one for each method invocation) because Call1, Call2 and Call3 are expecting to receive a reference type. In these cases, you should probably perform a manual boxing operation before calling those methods:

//all callx methods expect a reference to an Object instance
Object boxedStd = std; //1 boxing operation
Call1( boxedStd );
Call2( boxedStd );
Call3( boxedStd );

With the previous change, we’ve managed to perform a single boxing operation (instead of three). It’s a small improvement, but one that might make a difference when you have lots of calls. Besides method calls, there are other scenarios where you’ll end up with boxing.

If you look at the ValueType type definition, you’ll notice that it inherits from the Object type. So, what happens when you call one of the methods defined by the base class? The answer is: it depends. If you override one of the inherited virtual methods, then the CLR won’t box the instance and you’ll get a non-virtual method call (the CLR can do this since value types are implicitly sealed). *However*, if your override calls the base method, then the value type instance will get boxed because the base method’s this parameter expects a reference type.

When you’re calling a non-virtual inherited method (or a virtual one you haven’t overridden), you’ll get boxing for the same reason presented in the previous paragraph. There’s still one last scenario where you need to consider boxing. As you know, a struct might implement one or more interfaces. Casting an unboxed instance of a value type to one of the interfaces it implements requires boxing. Take a look at the next example:

public interface IStudent {
    String Name { get; set; }
}
public struct Student: IStudent {
    public String Name { get; set; }
    public Int32 Age { get; set; }
}

And here’s some code which explains where boxing will occur:

Student std = new Student {Name = "Luis", Age = 34};
String nameFromStruct = std.Name;//no boxing here
IStudent stdRef = std;//boxing
String nameFromInterface = stdRef.Name; //gets value from boxed ref

As an extra bonus, do you think you’ll get boxing in the next call? If so, why is that? And if there is boxing, can you do anything to prevent it from happening?

var toString = std.ToString( );
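A hint for the bonus question: whether that call boxes depends on whether Student overrides ToString. Since our Student reuses Object’s implementation, the instance gets boxed; a sketch of how you might avoid it:

```csharp
public struct Student {
    public String Name { get; set; }
    public Int32 Age { get; set; }
    // with this override in place, std.ToString() becomes a
    // non-virtual call on the value type itself: no boxing required
    public override String ToString() {
        return Name;
    }
}
```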

And I think we’ve covered most boxing related scenarios. Stay tuned for more…

Sep 16

Unboxing: is it really the opposite of boxing?

Posted in Basics, C#       Comments Off on Unboxing: is it really the opposite of boxing?

[Update: Thanks to JPC for finding (yet!) another error in my post]

Yes…well, it is, but it’s cheaper (as we’ll see). As we’ve seen in the previous post, we can convert a value type instance into a reference type instance through an operation known as boxing. Unboxing lets you convert that reference type back into a value type (provided the reference was originally obtained through a boxing operation, that is). We’ve already seen an unboxing operation in the previous post (though I didn’t really mention it at the time):

var std2 = (Student)cll[0];

In the previous snippet, we’re converting the reference type instance obtained through boxing into a value type instance (notice the cast operator). The algorithm used for unboxing involve the following steps:

  1. the first thing required is obtaining the reference. The reference is checked against null; if it is null, you’ll end up with a NullReferenceException. If it’s not, an additional check is performed: the type of the boxed instance is compared against the type indicated in the unboxing operation. If they don’t match, you’ll get a different type of exception: in this case, an InvalidCastException is thrown.
  2. if we reach this step, then the field values are copied from the managed heap to newly allocated space on the stack.

Notice that the unboxed value is *always* copied into *newly* allocated space on the stack. In other words, in the example of the previous post, we ended up with two different Student instances on the stack. And that’s why you’re wrong if you assumed that the next snippet (also copied from the example in the previous post) would print the value true:

Console.WriteLine(std.Name == std2.Name);
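A simpler sketch makes both the copy semantics and the type check visible (using a plain Int32 instead of Student):

```csharp
using System;

public static class Program {
    public static void Main() {
        Int32 value = 42;
        Object boxed = value;          // boxing: fields copied to the heap
        Int32 unboxed = (Int32)boxed;  // unboxing: copied back to the stack
        unboxed = 100;                 // only the new stack copy changes
        Console.WriteLine((Int32)boxed); // still 42

        // unboxing with the wrong type fails the runtime type check
        try {
            Int64 wrong = (Int64)boxed;
        } catch (InvalidCastException) {
            Console.WriteLine("a boxed Int32 can't be unboxed as Int64");
        }
    }
}
```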

As you can see, unboxing is cheaper than boxing. It goes without saying that boxing and unboxing won’t help improve the performance of your application, so you’d better pay attention to your code. Now that you understand boxing and unboxing, we’re ready for the next phase: understanding all the scenarios where boxing happens. Stay tuned for more.

Sep 16

Value types and boxing

Posted in Basics, C#       Comments Off on Value types and boxing

As we’ve seen, value types have better performance than reference types because they’re allocated on the stack, they’re not garbage collected, nor do they get the extra weight generally associated with reference types. There are, however, times when we need a “reference to a value object” (yes, I wanted to write “reference to a value object”). In the old days, that would happen whenever you needed a collection of value objects (as you recall, in .NET 1.0/1.1 there were no generics). Here’s a small example:

public struct Student {
    public String Name { get; set; }
    public Int32 Age { get; set; }
}

Now, and this is important, pay attention to the following snippet:

var cll = new ArrayList();
var std = new Student {Name = "Luis", Age = 10};
cll.Add(std);
var std2 = (Student)cll[0];
std2.Name = "Abreu";

If you look at the docs, you’ll notice that the Add method expects an Object instance. In other words, it requires a reference type and not a value type. Yet if you go ahead and compile the previous snippet, you won’t get any compilation errors. What’s going on here? What you’re seeing is perfectly normal and it’s called boxing. Boxing allows us to convert a value type into a reference type. Boxing involves a rather simple algorithm:

  1. allocate memory on the managed heap for the value object plus the “traditional” overhead space required for all the reference types (btw, I’m talking about the type object pointer and the sync block index).
  2. copy the values of the value type’s fields into the heap’s allocated memory.
  3. use the returned memory address as the “converted” reference type instance.

When the compiler detects that the Add method requires a reference type, it goes ahead and applies the previous algorithm in order to transform the value type and pass a reference into the method. In other words, what got added to the cll collection was the reference obtained in step 3 (and not the std variable).

This has lots of implications which might not be obvious at first. For instance, if you think that the following snippet should print true, you’re wrong:

Console.WriteLine(std.Name == std2.Name);

Before getting into why that happens, we need to understand unboxing. Since it’s 22:37 and I still haven’t had my dinner, I’ll leave the unboxing post for later 🙂

Stay tuned for more!

Sep 15

[Update: thanks to Wesner, I’ve fixed the list you should consider when using value types]

In the previous post, I talked about some basic features related to reference types. If all types were reference types, then our applications would really hurt in the performance department. In fact, hurt is really a kind way of putting it…Imagine having to go through all the hoops associated with creating a new reference type just to allocate space for a simple integer…not funny, right?

And that’s why the CLR introduced value types. They’re ideal for simple and frequently used types. By default, value types are allocated on the stack (though they can end up on the heap when they’re used as a field of a reference type). Whenever you declare a variable of a value type, that variable holds the required space for saving a value of that type (instead of holding a reference to a memory position on the heap, as happens with reference types). In practice, this means that value types aren’t garbage collected like reference types.

In C#, you create a new value type by creating a new structure (struct keyword) or a new enumeration (enum). Here are two examples:

public struct Student {
    public String Name { get; set; }
    public Int32 Age { get; set; }
}
public enum Sex {
    Male,
    Female
} 

Whenever you do that, you end up creating a new type which derives from the abstract ValueType type (notice that ValueType inherits from Object). Trying to specify a base type for a struct results in a compilation error. There’s really nothing you can do about that, so get used to it. Notice, though, that you’re free to implement one or more interfaces if you wish to. Another interesting thing to notice is that all value types are sealed, making it impossible to reuse them as a base for any other type.

Whenever you create a new enum, you’ll end up with a new type which inherits from System.Enum (which is itself derived from ValueType). There’s more to say about enums, but we’ll leave that for another post.

Creating an instance of a struct is as easy as declaring a variable of that type:

Student std = new Student();
std.Name = "Luis";

In the previous snippet, we’re forced to use new to make the C# compiler happy. With value types, the new operator doesn’t end up allocating space on the heap because the C# compiler knows that Student is a value type and that it should be allocated directly on the stack (btw, it zeroes all the fields of the instance). Notice that you must use it (or initialize the variable in some other way) before accessing its fields. If you don’t, you’ll end up with the “use of unassigned field” compilation error.
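A minimal sketch of that zeroing behavior, reusing the Student struct from above:

```csharp
using System;

public struct Student {
    public String Name { get; set; }
    public Int32 Age { get; set; }
}

public class Program {
    public static void Main() {
        Student std = new Student();         // no heap allocation: new just zeroes the fields
        Console.WriteLine(std.Age);          // 0: value-typed fields are zeroed
        Console.WriteLine(std.Name == null); // True: reference-typed fields are zeroed to null

        // Student std2; Console.WriteLine(std2.Age);
        // the two lines above wouldn't compile: use of an unassigned variable
    }
}
```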

Now that you’ve met value and reference types, you might be interested in knowing when to use one or the other. In my limited experience, I can tell you that I end up using reference types in most situations, though there are some scenarios where value types should be used:

  • simple, immutable types which behave like primitive types are good candidates for value types. DateTime is probably the best known example of such a type.
  • “small” types (ie, types which require less than 16 bytes of allocated space) might be good candidates. Don’t forget method parameters! If the type isn’t passed into methods a lot, then you can probably relax the size rule.

Before you start using value types all over the place, you should also consider that:

  • value types can be boxed (more about this in a future post).
  • you *should* customize the way these types handle equality and identity.
  • you can’t use any other base than ValueType or Enum nor can you reuse the type as a base for another type.
  • when you assign an instance of a value type to another, you’re really doing a field-by-field copy. This means that assignments will always duplicate the amount of space necessary.
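The field-by-field copy mentioned in the last point is easy to see in code. Here’s a small sketch (the PointStruct/PointClass names are mine, invented for the demo):

```csharp
using System;

public struct PointStruct { public Int32 X; }
public class PointClass { public Int32 X; }

public class Program {
    public static void Main() {
        var s1 = new PointStruct { X = 1 };
        var s2 = s1;             // field-by-field copy: s2 is an independent instance
        s2.X = 2;
        Console.WriteLine(s1.X); // 1: the original is untouched

        var c1 = new PointClass { X = 1 };
        var c2 = c1;             // only the reference is copied
        c2.X = 2;
        Console.WriteLine(c1.X); // 2: both variables point at the same object
    }
}
```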

And that sums it up pretty nicely. Stay tuned for more.

Sep 14

Still on types: reference types

Posted in Basics, C#       Comments Off on Still on types: reference types

Most types introduced by the framework are reference types. So, what is a reference type? Take a look at the following code:

Student std = new Student();

When you instantiate a reference type you end up with a…yep, that’s right: a reference to the instantiated object. Notice that I said reference, not pointer. People tend to use both as synonyms, and that’s ok for most scenarios. However, if you want to be precise, there are some differences between references and pointers. For instance, references support identity checks but, unlike pointers, they don’t support all comparison operations (for instance, with pointers you can use the < and > operators, but you cannot do that with references).

Reference types are always allocated from the heap and the new operator will always return a reference to the memory that was allocated for that object. If you look at the previous snippet, that means that std will hold a reference to the memory position where the Student object was instantiated. Even though most of the types introduced by the framework are reference types, the truth is that they come with some gotchas. As you might expect, allocating memory from the heap incurs a slight performance hit. And it might even force a garbage collection if the heap is getting “full”. Besides that, all reference types have some “additional” members associated with them that must be initialized (more on this in future posts).

In C#, all types declared using the class keyword are reference types. Here’s a possible definition for our Student type used in the previous snippet:

public class Student {
    public String Name { get; set; }
    public Int32 Age { get; set; }
}
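Using the Student class above, here’s a quick sketch of the identity checks mentioned earlier (ReferenceEquals is the identity test; note that relational operators like < simply don’t compile for references):

```csharp
using System;

public class Student {
    public String Name { get; set; }
    public Int32 Age { get; set; }
}

public class Program {
    public static void Main() {
        Student a = new Student();
        Student b = a;             // copies the reference, not the object
        Student c = new Student(); // a second, independent allocation

        Console.WriteLine(Object.ReferenceEquals(a, b)); // True: same object on the heap
        Console.WriteLine(Object.ReferenceEquals(a, c)); // False: different allocations
        // a < c  // wouldn't compile: references don't support relational operators
    }
}
```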

There’s still more to say about reference types, but before that, we need to understand value types. Stay tuned for more.

Sep 10

Checked vs unchecked operations

Posted in Basics, C#       Comments Off on Checked vs unchecked operations

In the previous post, I’ve introduced the concept of primitive type and we’ve seen how they’re treated in a special way by the compiler. One interesting topic that pops up when we talk about primitive types is operation overflow. For instance, what happens when you try to run the following snippet:

var max1 = Int32.MaxValue;
var max2 = Int32.MaxValue;
Int32 max3 = max2 + max1;
Console.WriteLine(max3);

MaxValue returns the maximum value that can be stored in an Int32 variable. It’s obvious that trying to save the result of that addition into an integer won’t produce the correct result. However, you won’t get an exception if you try to run the previous code. The justification is simple: you see, the CLR offers several IL instructions for doing arithmetic operations. Some allow overflow to occur silently (ex.: add) because they don’t check for it; others do check and, when overflow happens, you’ll get an exception (ex.: add.ovf). As you might expect, the instructions which don’t run an overflow check are faster and those are the ones used by default by the C# compiler.

If you want to turn on overflow verification for all the code in your assembly, then you can simply use the /checked+ compiler switch. If you’re only interested in checking a specific operation or block of instructions, then you can use the checked operator (there’s also an unchecked operator that does the opposite). Here’s a small snippet that shows how you can use these operators:

Int32 max3 = checked( max2 + max1 );

And yes, that should throw an exception at runtime. If you’re interested in running several instructions as checked operations, then the checked statement is for you:

checked {
    var max1 = Int32.MaxValue;
    var max2 = Int32.MaxValue;
    Int32 max3 = max2 + max1;
    Console.WriteLine(max3);
}

Notice that only the instructions placed directly inside the block will be checked; the block has no direct impact on the code of any methods called from it.
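That last point deserves a sketch: the checked block below has no effect on the method it calls, so the overflow inside that method still wraps around silently (the AddQuietly helper is mine, invented for the demo):

```csharp
using System;

public class Program {
    // compiled outside any checked block, so its add instruction doesn't check for overflow
    public static Int32 AddQuietly(Int32 a, Int32 b) {
        return a + b;
    }

    public static void Main() {
        checked {
            // no exception here: the addition happens inside the called method
            Int32 wrapped = AddQuietly(Int32.MaxValue, 1);
            Console.WriteLine(wrapped == Int32.MinValue); // True: silent wrap-around

            try {
                Int32 direct = Int32.MaxValue;
                direct = direct + 1; // this one *is* directly inside the checked block
                Console.WriteLine(direct);
            } catch (OverflowException) {
                Console.WriteLine("overflow detected"); // this line runs
            }
        }
    }
}
```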

Many people recommend turning the /checked+ compiler switch on for debug builds for detecting problems associated with code where you’re not using the checked and unchecked blocks/statements. You should probably use the /checked+ option for release builds too, unless you can’t take the small performance hit associated with the arithmetic checks.

Before ending the post, a small note on the Decimal type. You can’t run checked/unchecked operations against the Decimal type because the CLR doesn’t offer any IL instructions for arithmetic operations against it. In practice, that means that the special treatment given to the Decimal type is supported only by the compiler (ie, adding two decimals will always result in invoking the Decimal’s corresponding operator method). So, if you hit an overflow during an arithmetic operation, you’ll end up getting an exception (even when you’re using the unchecked operator or the /checked- compiler switch).

And that’s it. Stay tuned for more.

Sep 09

An intro on types: primitive types exist!

Posted in Basics, C#       Comments Off on An intro on types: primitive types exist!

If you’re not new to .NET, you’ve probably heard several times that one type can be a value type or a reference type. Well, that is true. What people miss (sometimes) is that they’ve also got some data types which are used so often that compilers allow code to treat them in a simplified way: say hello to the primitive types.

Let’s look at a small example. Nothing prevents us from writing this code:

System.Int32 myInt = new System.Int32();
myInt = 10;

But does anyone want to write that when they simply can write this:

System.Int32 myInt = 10;

Oh, yes, you can reduce it even further by using the int alias:

int myInt = 10;

Any data type that is *directly* supported by the compiler can be considered a primitive type. Notice the *directly*…it’s used here to indicate that the compiler knows what to do in order to create a new instance from “simplified” code.

Btw, you’ve surely noticed the int alias, haven’t you? In fact, it’s not really an alias but a C# keyword that is mapped into the System.Int32 FCL type. Even though you can’t introduce new keywords, you can still introduce new aliases to simplify the use of certain types. Here’s an example:

using integer = System.Int32;

And now it’s possible to use integer as an alias to the System.Int32 type (btw, I don’t recommend doing this). Currently, there are several types which are treated as primitive by the C# compiler. In that list, you’ll find plenty of numeric types (Int32, Double, etc), strings (String or string – if you prefer the C# keyword), the object (System.Object) type and even the new dynamic type (which, btw, is mapped into an Object instance). As you can see, primitive types aren’t really limited to a subset of value types.

Besides knowing how to create these types, the compiler is also able to perform other interesting operations over them. For instance, it is able to convert automatically between two types that have no relationship with each other. Here’s an example:

Int32 myInt = 10;
Int64 myLong = myInt;

Since Int64 and Int32 aren’t “related”, you can only get away with the previous code because you’re using primitive types (and the compiler has privileged knowledge about them!). The C# compiler will allow this type of implicit conversion only when it knows it’s safe (ie, when it knows that there is no data loss during the conversion – which doesn’t happen in the previous example because you can always save a 32 bit signed integer into a 64 bit signed integer).

For unsafe conversions (ie, conversions where you lose precision), you need to perform an explicit cast. For instance, if you try to convert a Single into an Int32, you’ll need to perform a cast:

Single val = 10.9f;
Int32 myInt = (Int32) val;
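One detail worth calling out about this cast: the truncation is always toward zero, which is easy to miss with negative values. A quick sketch:

```csharp
using System;

public class Program {
    public static void Main() {
        Single val = 10.9f;
        Console.WriteLine((Int32)val);          // 10: the fraction is simply discarded
        Console.WriteLine((Int32)(-10.9f));     // -10: truncation goes toward zero, not down
        Console.WriteLine(Math.Floor(-10.9f));  // -11, for comparison
    }
}
```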

In C#, myInt will end up with the value 10 since the compiler will simply truncate the Single’s value. There’s still one additional feature we have with primitive types: they can be written as literals. A literal is *always* an instance of a type and that’s why you can access its type’s members directly. Here’s an example:

String val = "Hello There".Substring(0, 5);

Finally, the compiler will also let you use several operators against primitive types and it will be responsible for interpreting them. For instance, you can write the following code:

Int32 someVal = 30;
Int32 res = 10*someVal;

There’s still an interesting discussion left (how to handle overflows), but I’ll leave it for a future post. Stay tuned for more.

Sep 07

Friend assemblies

Posted in Basics, C#       Comments Off on Friend assemblies

In the previous post, we’ve seen the difference between type visibility and member accessibility. A type can be visible to all the other types defined in the same assembly (internal) or it can be visible to any type, independently from the assembly where it’s defined. On the other hand, member accessibility controls the exposure of a member defined by a type.

By default, internal types cannot be instantiated from types defined in other assemblies. That’s why you’ll typically define your helper types as internal so that they can’t be consumed by types defined in other assemblies. There are, however, some scenarios where you’d like to grant access to all the types defined in another assembly and, at the same time, block access to all the other types defined in other assemblies. That’s where friend assemblies can help.

When you create an assembly, you can indicate its “friends” by using the InternalsVisibleToAttribute. This attribute expects a string which identifies the assembly’s name and public key (interestingly, you must pass the full key – not a hash – and you’re not supposed to pass the values of the culture, version and processor architecture which you’d normally use in a string that identifies an assembly). Here’s a quick example which considers the LA.Helpers assembly a friend:

[assembly: InternalsVisibleTo("LA.Helpers, PublicKey=12312…ef")]

As I’ve said before, you *do* need to pass the full public key (if the assembly isn’t strongly signed, then you just need to pass its name). In practice, all the types defined in the LA.Helpers assembly can now access all internal types defined in the assembly which contains the previous instruction. Besides getting access to all the internal types, friend assemblies can also access all internal members of any type maintained in that assembly.

You should probably think carefully about the accessibility of your types’ members when you start granting friend access to other assemblies. Notice also that creating a friend relationship between two assemblies ends up creating a high dependency between them, and that’s why many argue that you should only use this feature with assemblies that ship on the same schedule. In my experience, I’ve ended up using this feature *only* for testing helpers that I don’t want to expose publicly from an assembly.

I think that most of us don’t use the command line for building projects, but this post wouldn’t really be complete without mentioning some interesting details about the C# compilation process. When you’re building the friend assembly, you should use the /out parameter and pass the name of the assembly. This should improve your compilation experience because you’re giving the compiler what it needs to know to check if the types defined in the assembly can access the internal types of the other assemblies. If you’re compiling a module (is anyone doing this on a regular basis?), then don’t forget to use the /moduleassemblyname parameter to specify the name of the assembly that will contain the module (the reason is the same as the one presented for the /out parameter).

And that sums it up quite nicely (I think). Stay tuned for more.

Sep 06

Type visibility vs member accessibility

Posted in Basics, C#       Comments Off on Type visibility vs member accessibility

One of the things I’ve noticed while trying to help others get started with the .NET framework is that they tend to confuse type visibility with member accessibility. In this quick post I’ll try to point out the differences between these two concepts. Let’s start with type visibility.

When you create a type (ex.: class, struct, etc.) it may be visible to all the code (public keyword) or only to the types defined in the same assembly as that type (internal keyword). In C#, you can use the public or the internal qualifier to define the visibility of a type. By default, all types which haven’t been explicitly qualified with one of these keywords are considered to be internal:

//internal by default
struct T {
    //…
}

Member accessibility is all about specifying the visibility of the members of a type. In other words, the accessibility indicates which members might be accessed from some piece of code. Currently, C# allows you to use 5 of the 6 member accessibility options supported by the CLR:

  • private: members qualified with the private keyword (C#) are only accessible by other members defined in the same type or in a nested type.
  • family: members qualified with the protected keyword (C#) can only be accessed by methods in the defining type, nested type or one of its derived types.
  • family and assembly: you *can’t* use this accessibility in C#. This accessibility says that a member can only be used by methods in the same type, in any nested type or in any derived type defined in the *same* assembly as the current type.
  • assembly: in C#, you use the internal keyword to specify this accessibility level. In this level, the member can only be accessed by all the types defined in the same assembly as the current type.
  • family or assembly: in C#, you need two keywords to specify this level: protected internal. In practice, it means that the member is accessible by any member of the type, any nested type, any derived type (*regardless* of the assembly) or any other method in the same assembly as the current type.
  • public: members qualified with the public keyword (C#) can be used by any other member in any assembly.

Before going on, it’s important to notice that member accessibility always depends on the visibility of the type. For instance, public members exposed by an internal type in assembly A *cannot* be used from assembly B (by default) since the type isn’t visible in that assembly.

In C#, if you don’t specify the accessibility of a member, the compiler will default to private in most cases (one exception: interface members are always public!). In C#, when you override a member in a derived type, you must use the same accessibility as defined in the base class. Interestingly, this is a C# restriction, since the CLR does allow you to change the accessibility of a member in a derived class to a less restrictive level (ex.: you can go from protected to public in the override, but not the other way around).
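Here’s a small sketch of those two rules in action (the type names are mine, invented for the demo):

```csharp
using System;

public interface IGreeter {
    String Greet(); // no qualifier allowed here: interface members are implicitly public
}

public class Base {
    protected virtual Int32 Value() { return 1; }
    public Int32 Read() { return Value(); } // exposes the protected member for the demo
}

public class Derived : Base {
    // must repeat "protected": C# forces the override to keep the base accessibility
    protected override Int32 Value() { return 2; }
}

public class Greeter : IGreeter {
    public String Greet() { return "hi"; } // the implementing member must be public
}

public class Program {
    public static void Main() {
        Console.WriteLine(new Derived().Read());  // 2
        Console.WriteLine(new Greeter().Greet()); // hi
    }
}
```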

There’s still a couple of things I have to say about member accessibility, but I’ll leave it for a future post. Stay tuned for more.

Sep 06

Partial classes

Posted in Basics, C#       Comments Off on Partial classes

In the previous post, I’ve talked about partial methods and I’ve promised that the next post would be about partial classes. And here it is! Now that I think about it, I should have written this post before the one on partial methods, but this order will have to do for now, won’t it?

The first thing you need to know about partial classes is that they depend entirely on the C# compiler. The second thing you should know is that you can apply the partial qualifier to classes, structs and interfaces (so I probably should have used Partial types for the title of this post).

In practice, when the C# compiler finds the partial qualifier applied to a type (class, struct or interface), it knows that its definition may span several source “groups”, which may (or may not) be scattered across several files. Partial classes are cool for splitting the “functionalities” of a type into several “code groups” (for example, I’ve used it for splitting the definition of a class into several files to improve readability).

Partial types were introduced to solve the same problem I’ve mentioned in the previous post: customization of code built through code generators. Now that you know the basics, let’s go through a simple example. Suppose we’ve got a Student type and that you want to separate its features into different code groups. Here’s how you can do that with a partial class:

//File Student.cs
public partial class Student {
    public String Name { get; set; }
    public Int32 Age { get; set; }
}
//File Student.Validation.cs
public partial class Student {
    public void Validate(){
        //...
    }
}

In the previous snippet, we’re creating a new type (Student) which is split into 2 different files (Student.cs and Student.Validation.cs). When you compile the previous code (notice that both files must be in the same project!), you’ll end up with a single type (Student) which has all the members exposed by the partial classes.

Even though I’ve put the partial class definitions into different files, the truth is that you can place both in the same file (though I haven’t seen that used in lots of places). Since partial types don’t depend on the CLR but on the compiler, you need to write all the files in the same language and they must be compiled into the same unit (ie, all the partial definitions must be defined in the same project).

And there’s not much more to say about partial types. Stay tuned for more.

Sep 04

Partial methods

Posted in Basics, C#       Comments Off on Partial methods

In previous posts, I’ve mentioned extension methods and how you can use them for extending existing types. After talking with a friend, I’ve noticed that I didn’t mention partial methods, which are really useful for code generators. Imagine you’re writing a code generator that needs to generate C# for some specs defined through some sort of wizard, and that you need to allow customization of some portions of that code. In the “old days”, the solution was creating virtual methods which didn’t do anything and were called by the generated C# code.

Then, the developer was responsible for creating a new class that expanded the generated C# type and for overriding the required virtual methods. Unfortunately, this technique doesn’t really work all the time. For instance, if the generator is creating a struct, you’re out of luck because structs are implicitly sealed!

Btw, and before going on, you should notice that I’ve discarded the option of changing the generated code file because we all know that, sooner or later, something will happen that will force us to regenerate the code (leading to the loss of all customizations that have been written).

C# 3.0 (if I’m not mistaken) introduced the concept of partial method for helping with this kind of problem. Partial methods work together with partial classes for allowing the customization of the behavior of a method. Here’s a quick example:

//suppose this is generated by a tool
partial class Address{
    private String _street;
    public String Street {
        get { return _street; }
        set {
            if (value == _street) return;
            _street = value;
            OnStreetChanged();
        }
    }
    //always private!
    partial void OnStreetChanged();
}
//dev code for customizing
partial class Address {
    partial void OnStreetChanged() {
        //do your thing here
    }
}

If you wanted, the generated class could have been sealed. Now, customizing the method is as simple as creating a new partial class and implementing the desired partial method. When compared with the virtual method approach I mentioned earlier, there is an important improvement: if you don’t provide a custom implementation for the method, then the method definition and call will simply be removed from the source during compilation. In other words, if there isn’t an implementation of the partial method, then the compiler won’t generate any IL for performing that call.

Before you go crazy and start creating partial methods, you should consider that:

  • they can only be declared in partial classes or structs (the next post will be about partial classes).
  • the return type of a partial method is *always* void and you cannot define any out parameters (understandable since you’re not obliged to implement the partial method).
  • a delegate can only refer to a partial method if you define it explicitly (the reason is the same as the one presented for the previous item).
  • partial methods are always private (you can’t really apply any qualifier to the method and the compiler ensures that they’re always private).

And I guess this sums it up nicely. Stay tuned for more.

Sep 03

If you’re a long time programmer/developer, then you probably expect to be able to write a method that accepts a variable number of arguments. In C#, to declare such a method you need to qualify the last parameter with the params keyword. Here’s a quick example:

static void PrintNames(params String[] names) {
    foreach (var name in names) {
        Console.WriteLine(name);
    }
}

As you can see, you use an array to represent a variable number of parameters. Notice that you can only apply the params keyword to the last parameter defined by a method and that parameter’s type must be an array of a single dimension (of any type). When you add the params qualifier, you’re essentially simplifying the invocation code:

PrintNames("Luis", "Miguel", "Nunes", "Abreu");

As you can see from the previous snippet, you don’t need to pass an array to the PrintNames method. If you want, you can create an array and call the method, but I’m thinking that most people will prefer the approach used in the previous snippet.
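Here’s a sketch of what the compiler is doing with those calls (the LastReceived field is mine, added just to observe what the method actually receives):

```csharp
using System;

public class Program {
    public static String[] LastReceived;

    public static void PrintNames(params String[] names) {
        LastReceived = names;      // record exactly what the method received
        if (names == null) return; // passing null: the compiler built no array
        foreach (var name in names) Console.WriteLine(name);
    }

    public static void Main() {
        PrintNames("Luis", "Miguel");                  // compiler builds the array for us
        PrintNames(new String[] { "Luis", "Miguel" }); // explicit array: exactly the same call
        PrintNames();                                  // an empty array, not null
        Console.WriteLine(LastReceived.Length);        // 0
        PrintNames(null);                              // null matches String[] directly
        Console.WriteLine(LastReceived == null);       // True
    }
}
```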

When the compiler finds the params keyword, it applies the ParamArrayAttribute to that parameter. Whenever the C# compiler detects a call to a method, it will first try to match that call against a method which doesn’t have any parameter annotated with the ParamArrayAttribute. It will only consider methods with a variable number of parameters (ie, that use the params keyword) when it can’t find a match in the previous candidate list.

Before ending, a performance related note: calling a method with a variable number of parameters incurs a small performance hit since the compiler needs to transform the list of values into an array (in other words, the method will always receive an array, even though you haven’t built one explicitly). Btw, this does not happen if you pass null to the method. And as you’ve seen, it takes longer to find that method because the compiler will first try to find a method which doesn’t use a variable number of parameters. That’s why you’ll find several classes which define several overloads with a fixed number of parameters before introducing the overload whose last parameter is annotated with the params qualifier (ex.: String’s Concat method).

And that’s it for now. Stay tuned for more.

Sep 02

Parameters by reference

Posted in Basics, C#       Comments Off on Parameters by reference

By default, parameters are always passed by value. However, the CLR does allow you to pass parameters by reference. In C#, you can use the out or ref keywords for passing parameters by reference. When the compiler sees that you’ve used these keywords, it will emit code that passes the *address* of the argument rather than its value.

Interestingly, these two keywords are identical from the CLR’s point of view. The story is completely different for the C# compiler, since it uses them to decide who is responsible for initializing the value of the variable. Using the *ref* keyword is the same as saying that the caller is responsible for initializing the value of the parameter. On the other hand, using out leaves that responsibility to the method. Here are some really simple (dumb) methods:

static void ChangeNameWithOut(out String name){
    name = DateTime.Now.ToString();
}
static void ChangeNameWithRef(ref String name){
    name = DateTime.Now.ToString();
}

Now, here’s how you can use both methods:

String name = "Luis";
String anotherName;
ChangeNameWithOut(out anotherName);
ChangeNameWithRef(ref name);

As you can see, you’re not initializing the anotherName variable before passing it to the ChangeNameWithOut method. If you tried to pass anotherName to the ChangeNameWithRef method, you’d end up with a compilation error: Use of unassigned variable ‘anotherName’.
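Probably the most familiar use of out in the FCL is the TryParse pattern, which hands a second value back to the caller without requiring any pre-initialization:

```csharp
using System;

public class Program {
    public static void Main() {
        Int32 parsed; // no initialization required: out guarantees the method assigns it
        if (Int32.TryParse("123", out parsed)) {
            Console.WriteLine(parsed);                        // 123
        }
        Console.WriteLine(Int32.TryParse("abc", out parsed)); // False (parsed gets set to 0)
        Console.WriteLine(parsed);                            // 0
    }
}
```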

You’ve probably noticed that you’re forced to use the ref and out keywords on the call. For a long time, this puzzled me and I thought that the C# compiler should be able to infer that from the call. According to Jeffrey Richter, the designers of the language felt that the caller should make their intentions explicit. I’m not sure I agree with that, but it’s just the way it is. And, as we’ll see next, this decision allows overloading methods based on these keywords.

You can use out and ref parameters for overloading methods, though you cannot add two overloads that differ only by ref vs out. Here’s some code that illustrates these principles:

static void ChangeName(String name){}
static void ChangeName(ref String name){} //ok
static void ChangeName(out String name){} //error

Adding the second method won’t lead to a compilation error because you can overload methods by using the ref or out keywords. Adding the last method leads to a compilation error because you cannot add an overload that differs from another only by out vs ref.

Besides overloading, there are some gotchas when you use reference parameters. Here’s a small example that might catch you off guard:

static void DoSomething(ref Object someParameter){}

var str = "Luis";
DoSomething(ref str);

You can’t compile the previous code. If you try, you’ll get an error saying that you cannot convert from ‘ref string’ to ‘ref object’. In other words, the parameter’s type must match the type of the variable that is passed. In case you’re wondering, this is needed to ensure that type safety is preserved. The next snippet shows why this is a good thing: if the call were allowed, the method could store a reference to a Student in someParameter, leaving the caller’s String variable referencing something that isn’t a String.

static void DoSomething(ref Object someParameter){
    someParameter = new Student();
}

And I guess this sums it up nicely. There’s still some more about parameters, but we’ll leave it for future posts. Stay tuned for more.

Sep 01

Parameters: by value or by reference? Say what?

Posted in Basics, C#       Comments Off on Parameters: by value or by reference? Say what?

I’m guessing that I won’t be giving any news when I say that parameters are used for passing values into methods. By default, parameters are passed by value. Here’s a quick example which will let us discuss this behavior:

public class Student {
    public String Name { get; set; }
    public Int32 Age { get; set; }
}
static void ChangeName(Student std) {
    std.Name = DateTime.Now.ToString();
}

As I was saying, parameters are passed by value. And that’s true. However, many are still surprised when they discover that ChangeName will update the Name of the Student instance that was passed into the method. How can that be? Isn’t passing by value the same as “copying”? Well, that is absolutely correct. What you must keep in mind is the *type* of the variable you’re passing into the method.

In my previous example, Student is a reference type. What that means is that a variable of that type will reference some memory space, which is where the values of its fields will be stored. Now, even though I don’t like to talk about implementation details, I’d say that they help (at least, in this case). Let’s suppose we’ve got some variable std:

var std = new Student { Name = "Luis", Age = 34 };

Now, you can think of std as holding the memory address where the Student object was allocated. When you look at it from this perspective, things might start to make sense, right? Suppose you’ve got the following code:

ChangeName(std);

Since ChangeName’s only parameter is passed by value, you should be able to see what’s going on. The value of the std variable is copied to the std parameter, and that means that the parameter will hold the same memory address. When ChangeName starts executing, there are two “variables” pointing at the same “memory location”: the parameter and the variable which was passed into the method through that parameter. That’s why if you try to change the value of the std parameter (ie, make it reference another instance of Student), that change won’t be replicated outside the method (notice that changing the value isn’t the same as changing a property of the object to which the parameter “points”). Here’s a quick example of what I mean:

static void ChangeName(Student std) {
    std = new Student { Name = "John", Age = 40 };
}

var std = new Student { Name = "Luis", Age = 34 };
ChangeName(std);
Console.WriteLine(std.Name);//prints Luis, not John

See the difference? If the std parameter was passed by reference, then the std variable would point to the new Student instance allocated inside the method. If you run the previous code, you’ll see that that doesn’t happen.

How, then, can we change this behavior? Simply by indicating that the parameters should be passed by reference. We’ll see how in the next post, so stay tuned!

Aug 31

Extension methods: what, how, when

Posted in Basics, C#       Comments Off on Extension methods: what, how, when

I’ve already written a little bit about extension methods in the past. However, since I’ve decided to create a new basics series, I think that I probably should go back and write about it again. I guess that the first thing I need to do is define the what. In other words, *what* is an extension method?

An extension method is a static method that you can invoke using instance method syntax. Here’s a quick example:

namespace StringHelpers {
    public static class StringExtension {
        //don't need this in the real world!!
        public static Boolean ContainsOnlyDigits(this String str){
            if(str == null ) throw new NullReferenceException();
            return str.All(c => char.IsDigit(c));
        }
    }
}
//use the extension method
namespace ConsoleApplication1 {
    //import extension methods
    using StringHelpers;
    class Program {
        static void Main(string[] args){
            var notOnlyDigits = "123Nop"; //const is better
            Console.WriteLine(((String)null).ContainsOnlyDigits());
        }
    }
}

As you can see, an extension method is always static and must be defined in a non-generic, non-nested, static class. The first parameter of an extension method is annotated with the this qualifier and defines the type over which the method can be called using instance method syntax. In order to use an extension method (the *how*), we must first bring it into the current scope. To achieve that, we resort to the using directive (as shown in the previous example).

Whenever the compiler finds an extension method call, it will convert it into a static method call. Currently, you can use extension methods to extend classes, interfaces, delegates (probably more about this in a future post) and enumerated types. Internally, and since extension methods aren’t C# specific, there is a “standard” way to mark a method as an extension method. In C#, when you mark a static method’s first parameter with the this qualifier, the compiler will automatically apply the ExtensionAttribute to that method and to the class where that method is defined. By annotating the type with the attribute, the compiler is able to query an assembly and get all the types that have extension methods defined.
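To make that lowering concrete, here’s a small self-contained sketch (reusing the ContainsOnlyDigits method from the example above): the instance-style call and the plain static call are just two spellings of the same thing.

```csharp
using System;
using System.Linq;

public static class StringExtension {
    public static Boolean ContainsOnlyDigits(this String str) {
        if (str == null) throw new NullReferenceException();
        return str.All(c => char.IsDigit(c));
    }
}

class Program {
    static void Main() {
        var digits = "12345";
        //instance method syntax...
        var viaExtension = digits.ContainsOnlyDigits();
        //...is compiled into a plain static call:
        var viaStatic = StringExtension.ContainsOnlyDigits(digits);
        Console.WriteLine(viaExtension == viaStatic); //prints True
    }
}
```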

Extension methods are really great and do improve the quality and readability of the code. However, I’ve noticed that they’re starting to be abused…for instance, it was only recently that I’ve downloaded the source code for the Fluent NH project and I couldn’t stop noticing that the CheckXXX methods (used against PersistenceSpecification instances) are all implemented as extension methods. I’m having a hard time understanding this and here’s why. In my limited experience, PersistenceSpecification exists so that you can write tests for your NH mappings. Now, that means that you’ll *be* using these methods to test the mappings whenever you instantiate a new instance of that type. Since the guys that wrote the code for those methods *do* have access to the PersistenceSpecification class, I really can’t understand their decision of defining these methods as extension methods (and yes, I know that MS used the same approach for extending the IEnumerable<T> interface, but I believe that we’re not talking about the same type of scenario: after all, it’s perfectly fine to use IEnumerable<T> without using LINQ).

So, when should we use extension methods? Here’s my thoughts about it (the *when*):

  • use them *sparingly*. I really can’t stress this enough! We’re still doing OO and there are several options for extending types. Do remember that there are some versioning problems associated with extension methods. For instance, if the type you’re “extending” adds a new method with the same name as the extension method, the code will always use the instance method (after recompilation, of course), because extension methods are only considered as candidates when all the available instance methods have been found non-applicable.
  • they should look and feel like instance methods. The typical example of not following this recommendation is adding an IsNullOrEmpty extension method to some class and returning true when the “this” parameter is null (btw, here’s an example of what I mean). If the “this” parameter of the extension method is null, it should really throw a NullReferenceException since this is the best option for mimicking what happens with instance methods.
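To illustrate the versioning caveat from the first bullet, here’s a hypothetical sketch (the Widget and Describe names are mine, purely for illustration): imagine the extension was written when Widget had no Describe method; once a later version of Widget ships its own Describe and you recompile, callers silently switch to the instance method.

```csharp
using System;

public class Widget {
    //added in "v2" of the library: once this exists, it wins over the extension
    public String Describe() { return "instance"; }
}

public static class WidgetExtensions {
    //the old extension method is still there, but it's no longer picked
    public static String Describe(this Widget w) { return "extension"; }
}

class Program {
    static void Main() {
        var w = new Widget();
        //instance methods are always preferred over extension methods
        Console.WriteLine(w.Describe()); //prints instance
    }
}
```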

Overall, I’d say that extension methods are an excellent tool, but don’t get carried away and start using them everywhere. After all, we’re still writing object oriented code, right?

Jul 21

Back to the basics: more on casting

Posted in Basics       Comments Off on Back to the basics: more on casting

I’ve already mentioned casting several times before, but I guess I’ve jumped over several basic features associated with that operation. Since this is a basics series, I should probably have explained casting instead of assuming that everyone knows how it works. So, I’ll go back and use this post to present some features associated with casting in C#. By default, the CLR allows us to cast an object to its type or to any of its base types. If you’re using C#, then you don’t need to do anything special to cast an object to any of its base types. Take a look at the following code:

class A { }
class B : A { }
class Program {
    static void Main(string[] args){
        var b = new B();
        A a = b;
    }
}

The example is simple: we have two types, where one type (B) extends the other (A). As you can see, we can cast the variable b to the type A without using any special syntax. C# will also let you go from an A to a B (but only if that A is-a B). Take a look at the following snippet:

var a = new A();
var b = new B();
A anotherA = b; //ok
B anotherB = (B) anotherA; //also ok
B stillAnotherB = (B) a;//oops: InvalidCastException

As you can see, “going” from A to B is possible, but only when we’re talking about an instance of type B (or derived from B). The CLR will always ensure at runtime that the cast works and it ends up throwing an exception when that cast is not possible (and when you don’t define a conversion operator that allows the conversion between those types).

Why does the CLR enforce this rule? One of the cornerstones of the CLR is type safety. In practice, this means that at runtime the CLR always knows the type of each object instance. If the CLR allowed casts without enforcing these rules, then you could end up with unpredictable code that could cause security breaches in an application. I’m not sure if you’ve noticed, but a type can’t spoof another type at runtime because the method that returns the type of an instance (GetType) is not virtual. Since it’s not virtual, you cannot override it, so casting an object to Object and calling the GetType method over that instance will always return the object’s true type (even in those cases where you use the new keyword to “redefine” the method). Here’s some code that illustrates this:

class A { }
class B : A {
    public new Type GetType(){
        return typeof (A);
    }
}
class Program {
 static void Main(string[] args){
  var b = new B();
  Console.WriteLine(((Object)b).GetType());//prints DemoProj.B
  Console.WriteLine(b.GetType());//ATTENTION:prints DemoProj.A
 }
}

Do notice that if you don’t cast the variable to object (btw, this cast always succeeds), then you’ll end up getting a spoofed type. And that’s it. Stay tuned for more.

Jul 19

[Update: small update to the implicit operators code. Thanks Kevin]

In a previous post, I’ve mentioned conversion operators. But what is a conversion operator? In the past, I bet that we’ve all needed to convert from one type of object to another. When we’re talking about primitive types, the CLR knows how to perform the conversion (of course, when that is possible). However, when we’re not talking about primitive types, the CLR is only able to perform the conversion if the source object’s type is the same as (or derived from) the target type.

There are other scenarios where we’d like to convert from type A to type B but these types aren’t related. Since the types aren’t “related”, we can’t simply cast from one type to another. In these cases, we’re limited to adapting the API of our types so that they ease the “translation” between types. For instance, suppose we’ve got the following class:

public class A {
    private Int32 _someState;
    public A(){
    }
    public A(Int32 someState){
        _someState = someState;
    }
    public Int32 ToInt32(){
        return _someState;
    }
}

By writing this code, we can “convert” an integer into an A instance (through the constructor) and we can also get an integer from any A instance (by calling the ToInt32 method). Here’s an example:

var fromInteger = 10;
var someInstance = new A(fromInteger);
var wrappedInteger = someInstance.ToInt32();

Defining the methods presented in the class is really a good idea because it means that your class can be consumed from any .NET language. Now, if you’re writing code in a language like C# which supports conversion operators, you can take this a little further and create your own customized conversion operators. The idea is to be able to write code like this:

var fromInteger = 10;
var someInstance = (A)fromInteger;
var wrappedInteger = (Int32) someInstance;

The previous snippet is using an *explicit* conversion operator for converting from and to an integer. An explicit operator is defined by a special public static method where the return type or the parameter must be of the same type as the class where it’s being declared. In order to make the previous code compile, we need to add the following methods to our type:

public static explicit operator A(Int32 anInteger) {
    return new A(anInteger);
}
public static explicit operator Int32(A anA){
    return anA.ToInt32();
}

As you can see, I’ve implemented both methods by using the constructor and the helper method previously added to the class. Notice that each method either returns an instance of type A or receives a single parameter of type A. Notice also the use of the explicit keyword.

By annotating our conversion operator methods with that keyword, we’re saying that the compiler is only allowed to use them when it finds an explicit cast. There is also another option here: we can create an implicit conversion operator. We should create an implicit operator whenever there’s no precision lost during the conversion. In our simple example, that never happens, so we can add implicit operators to our classes:

public static implicit operator A(Int32 anInteger){
    return new A(anInteger);
}
public static implicit operator Int32(A anA){
    return anA.ToInt32();
}

Since the implicit and explicit keywords aren’t part of the method signature, you can’t simultaneously define an explicit and an implicit operator for the same conversion. After adding the operators, you can now run the following code without any errors:

var fromInteger = 10;
A someInstance = fromInteger;
Int32 wrappedInteger = someInstance;

I have used conversion operators in the past and they have improved the readability of my code. For instance, they’re useful when you’re using a fluent builder to create a new instance of a type. Since conversion operators are methods, I guess it would be nice to see the final result of our C# code. Implicit conversion operators end up generating methods named op_Implicit, while explicit converters end up generating op_Explicit methods. Here’s the signature of the IL generated for the implicit operators:

.method public hidebysig specialname static int32 op_Implicit(class DemoProj.Program/A anA) cil managed

.method public hidebysig specialname static class DemoProj.Program/A op_Implicit(int32 anInteger) cil managed
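Since the operators end up as ordinary op_Implicit methods, you can find them through plain reflection. Here’s a self-contained sketch (using a simplified A class equivalent to the one above) that lists both generated methods:

```csharp
using System;
using System.Linq;
using System.Reflection;

public class A {
    private Int32 _someState;
    public A(Int32 someState) { _someState = someState; }
    public Int32 ToInt32() { return _someState; }
    public static implicit operator A(Int32 anInteger) { return new A(anInteger); }
    public static implicit operator Int32(A anA) { return anA.ToInt32(); }
}

class Program {
    static void Main() {
        //operator methods are static and flagged with IsSpecialName
        var operators = typeof(A)
            .GetMethods(BindingFlags.Public | BindingFlags.Static)
            .Where(m => m.IsSpecialName && m.Name == "op_Implicit");
        foreach (var op in operators) {
            //prints "Int32 -> A" and "A -> Int32"
            Console.WriteLine("{0} -> {1}",
                op.GetParameters()[0].ParameterType.Name, op.ReturnType.Name);
        }
    }
}
```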

We could, of course, improve our code and add other operators. For instance, let’s add an explicit Single conversion operator:

public static explicit operator Single(A anA){
    return anA.ToInt32();
}
public static explicit operator A(Single aSingle){
    return new A((Int32)aSingle);
}

And now, we can write the following code:

var single = (Single) someInstance;
var anA = (A) single;

If you’ve been paying attention and you’ve been writing C# for some time, you’ve probably noticed something strange. How can we write two methods which only differ by return type?

I’ve already said that the explicit/implicit keyword isn’t part of the method signature, but “just to be sure”, you can go ahead and make all conversion methods explicit. After performing this small change, you’ll end up with two methods which differ only in the return type (the explicit converters from A to Int32 and Single). What’s going on here?

Unlike C#, the CLR does allow you to overload methods based on the return type. So, if you’re writing IL (anyone?), then you *can* overload methods by return type (though this really isn’t recommended because you won’t be able to consume those methods from C#). The C# compiler uses this knowledge and allows you to introduce these overloads when you’re writing conversion operators.

Before ending, you should also notice that there are some things to keep in mind before creating and using conversion operators:

  • you cannot create custom conversions for interfaces or for the Object type.
  • conversion operators are not executed when using the is and as operators.

And I guess that’s it for now. Stay tuned for more basics.

Jul 15

Back to the basics: type constructors

Posted in Basics       Comments Off on Back to the basics: type constructors

After the last two posts, I guess you could see this coming, right? Today it’s all about type constructors (and it’s a long post, so make sure you’re comfy). What is a type constructor (aka, class constructor)? In C#, it’s a static and private parameterless method which is named after the class where it’s defined. Here’s a small example:

class MyClass {
    static MyClass() {
        //initialize static fields here
    }
}

Type constructors must be private (in fact, in C# you can’t even use that qualifier or the compiler will start complaining) and can’t receive any arguments. In practice, this means that you can only have one type constructor. Typically, you’ll be using type constructors to initialize static fields. Here’s an example:

class MyClass {
    private static Int32 _myField;
    static MyClass() {
        _myField = 10;
    }
}

If you’re just initializing static fields with “simple” expressions, then you can use the same technique I’ve shown you before for instance constructors. In other words, you can use static field initialization:

class MyClass {
    private static Int32 _myField = 10;
}

When the previous class is compiled, you’ll end up with code that is identical to the first snippet. To be sure, you can always fire up Reflector and see the generated C#:

[Reflector screenshot: the generated static MyClass() constructor]

The previous image shows another interesting thing: type constructors are always named .cctor in the metadata table which contains the method definitions of a type. Notice that the C# compiler will never generate a type constructor automatically when the class doesn’t have any static fields which use the initialization “trick” I’ve mentioned above. Another interesting caveat is that a type constructor doesn’t call the base type’s constructor (if it exists, that is). Here’s some code that shows this:

class Program {
    static void Main(string[] args) {
        Console.WriteLine(MyDerived.SomeField);
    }
    class MyClass {
        static MyClass(){
            Console.WriteLine("MyClass");
        }
    }
    class MyDerived:MyClass{
        public static Int32 SomeField = 10;
        static MyDerived()  {
            Console.WriteLine("Derived");
        }
    }
}

When I first noticed this, I was a little bit surprised since it “seems” to go against what’s expected. Anyway, it works this way and that’s that. I was also surprised the first time I got a TypeInitializationException when trying to instantiate a new variable of a specific type. After some debugging, I noticed that my type constructor was throwing an exception which was being silently caught (poor coding, I know) before I tried to instantiate an object of that type. What I’m trying to say is that if your type constructor throws, then you can no longer create instances of that type in the current AppDomain.

Type constructors do have more surprises…The type constructor invocation is injected on the fly by the JIT compiler when it detects that a “piece” of code is trying to access a type whose static constructor hasn’t been invoked yet. Since the CLR guarantees that a static constructor is executed only once (per AppDomain), the thread that calls it does so from within a lock. Even if there are multiple threads trying to execute the constructor, only one will be able to call the type constructor while the others wait for the lock.

When the lock is released, the other threads will notice that the type constructor has already been invoked and won’t call it again. Things can get a little complicated when you have two static constructors of different classes that “reference” each other. Interestingly, the CLR does ensure that both static constructors are called, but it can’t guarantee that one type constructor runs to completion before the other is executed.
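Here’s a small sketch of the “executed only once” guarantee (the Counted and Touch names are mine, purely for illustration): several threads race to touch the type, yet the counter incremented inside the static constructor ends up at 1.

```csharp
using System;
using System.Threading;

class Counted {
    public static int CtorRuns;
    static Counted() {
        //even with many threads racing, the CLR runs this exactly once per AppDomain
        Interlocked.Increment(ref CtorRuns);
    }
    public static int Touch() { return CtorRuns; }
}

class Program {
    static void Main() {
        var threads = new Thread[8];
        for (var i = 0; i < threads.Length; i++) {
            //each thread forces the type to be initialized
            threads[i] = new Thread(() => Counted.Touch());
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        Console.WriteLine(Counted.CtorRuns); //prints 1
    }
}
```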

But there’s more! As I’ve said, the JIT needs to decide where to insert the static constructor call when it notices that some piece of code is trying to access a member of a type which defines a static constructor. This makes sense, but it was only after reading Jeffrey Richter’s fantastic book that I’ve managed to get a clear picture of what’s happening. 

The static constructor call can be made *immediately* before some piece of code that creates an instance of that type or accesses a member of the type. Another option is to  guarantee that the call will be made *sometime* before one of those things happen. These two strategies are known as precise semantics and before-field-init semantics (respectively). Even though the CLR supports both approaches, the before-field-init semantics is the preferred option since it lets the CLR decide when to make the static constructor call (and this can bring huge gains in performance).

In practice, the approach used is defined by the compiler. When the C# compiler sees that you’ve explicitly defined a type constructor, it will always use precise semantics. If you’re just using inline initialization (and have no static constructor), then the compiler will use the before-field-init semantics. Notice that this information is maintained in the metadata table of a type (there’s a flag called beforefieldinit which is set for the before-field-init semantics).
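You can check that flag yourself with reflection: a type with an explicit static constructor loses beforefieldinit, while one relying only on inline initialization keeps it. A quick sketch (the class names are mine):

```csharp
using System;
using System.Reflection;

class WithCctor {
    private static Int32 _field;
    static WithCctor() { _field = 10; } //explicit cctor => precise semantics
}

class WithInitializer {
    private static Int32 _field = 10; //inline init only => beforefieldinit
}

class Program {
    static void Main() {
        Console.WriteLine(typeof(WithCctor).Attributes.HasFlag(TypeAttributes.BeforeFieldInit));       //prints False
        Console.WriteLine(typeof(WithInitializer).Attributes.HasFlag(TypeAttributes.BeforeFieldInit)); //prints True
    }
}
```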

And that’s it for today. Stay tuned for more.

Jul 14

Back to basics: structures instance constructors

Posted in Basics       Comments Off on Back to basics: structures instance constructors

In the previous post, I’ve talked about some interesting features that explain why reference type’s instance constructors behave the way they do. Today, we’ll keep looking at instance constructors, but we’ll concentrate on structs. Before going on, I’m assuming that you know the difference between value and reference types.

The first thing you should keep in mind is that there’s simply no way for you to prevent the instantiation of a value type. And this happens because you can only add non-parameterless constructors to your custom structs. Don’t even try to add a new parameterless constructor because the C# compiler will stop you immediately when you try to compile your project. Don’t believe me? Ok, try to compile this in C#:

struct Person {
    private Int32 _age;
    public Person(){
        _age = 10;
    }
}

[Patiently waiting…]

Ok, we’re ready to keep going…You might be curious about why you can’t add a parameterless constructor to a struct. I was, especially after seeing that the CLR does allow you to add parameterless constructors to value types (i.e., you can add a parameterless constructor to a value type, but you’ll need to use another language – ex.: IL). I must confess that the best explanation I’ve read is in Jeffrey Richter’s fantastic CLR via C# book (oh damn, I’ve forgotten to write a review about it!). Suppose for a minute that we could define parameterless constructors for a value type. What would happen when you execute the following code (ok, never mind the example; concentrate only on the code):

public class Manager {
    private Person _person;
    public Manager() { }
}

If you’re expecting to see Person’s constructor invoked, then you’d better wait seated in a comfy chair :,). In order to improve runtime performance, value type constructors are only called if you call them explicitly (compare this behavior with the one we have for reference types). And according to Jeffrey, the team thought that this would confuse developers and opted for not allowing the definition of parameterless constructors in C#.
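A quick sketch of that behavior (using a parameterized constructor, since C# won’t let us write the parameterless one): creating a Manager leaves the embedded Person zeroed, and the struct’s constructor only runs when we call it explicitly.

```csharp
using System;

struct Person {
    public Int32 Age;
    public Person(Int32 age) { Age = age; }
}

class Manager {
    public Person Person; //never explicitly initialized
}

class Program {
    static void Main() {
        var manager = new Manager();
        Console.WriteLine(manager.Person.Age); //prints 0: the struct was just zeroed out

        var person = new Person(34);
        Console.WriteLine(person.Age); //prints 34: the constructor ran because we called it
    }
}
```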

Now, it’s important to understand that you can still create parameterized constructors for your custom structs. They do need to receive parameters and you must ensure that all the fields are initialized. Take a look at the following snippet:

struct Color {
    private Byte _red;
    private Byte _green;
    private Byte _blue;
    private Byte _alpha;
    public Color(Byte red, Byte blue, Byte green) {
        _red = red;
        _blue = blue;
        _green = green;
    }
}

Anything wrong? Hell, yes! Since verifiable code “insists” that all fields must be written to before they’re read, you must initialize all the fields from within every instance constructor added to a custom struct. So, the easiest thing you can do to make the previous code compile is to initialize the _alpha field explicitly. Do notice that you must put the initialization code inside your constructor since instance field initializers aren’t permitted in value types.

Before ending, there’s still time for showing you an alternate way to initialize all the fields of a value type. Take a look at the following constructor for the Color struct:

public Color(Int32 alpha){
    this = new Color();
    _alpha = alpha;
}

I must confess that the first time I saw this I was really confused 🙂 Ok, so what’s going on here? Glad you asked 🙂 Calling new results in zeroing all the memory required for holding a Color object. You can then set the current instance to that new object. Not very readable, but it might save a few keystrokes when you only want to explicitly initialize one of the fields of your struct.

One final note regarding initialization of value type fields in reference types: the CLR ensures that they’re always zeroed during initialization of that reference type. That doesn’t happen when we’re talking about value type fields in other value types. If you’re writing code in a “verifiable” language (ex.: C#), then you’re safe because the compiler will generate code that ensures that those types are zeroed out.

And I guess that’s it. After all, there were lots of things to say about structs instance constructors. Stay tuned for more!

Jul 12

[Update: updated the text that explains why you’d need to make an explicit call to the base class’ constructor. Thanks Damien]

In the last post, I’ve mentioned constructors…but what is a constructor? The first thing you should keep in  mind is that there are several types of constructors and, in this post, we’ll only be talking about instance constructors. So, what’s an instance constructor?

Constructors are special methods which are responsible for initializing an object to a “good” state (curiosity snippet: constructors are represented by .ctor methods in the method definition table of a type). In practice, you create an instance constructor by adding a method with the same name as the class and no return type. Here’s a quick example:

public class Test {
    public Test(){
    }
}

The previous snippet creates an instance constructor which doesn’t receive any parameters. When you don’t define *any* constructor, the C# compiler will automatically add a parameterless (aka, default) constructor which looks like the one shown in the previous snippet.

When you create a new reference type instance, the runtime ensures that all the fields of that type are zeroed out before the instance constructor gets called. Since constructors are never inherited, you cannot use the virtual, new, etc. qualifiers with them. What you can do is call the base class’ constructor from within your class’s constructor. The following snippet shows how:

public class Test {
    public Test(): base(){
    }
}

Once again, if you don’t add the call explicitly, the compiler will try to add one for you. However, if you’ve introduced a non-parameterless constructor in your base class and there isn’t a parameterless one, then you do need to call the base explicitly (the compiler will remind you of that because it won’t compile your code in these cases):

public class Test {
    public Test(String name){
    }
}
public class Derived:Test {
    public Derived(String name): base(name){
    }
}

Since all types inherit from Object, you’re probably wondering what happens when Object’s constructor is called (and it will be called, since Object does introduce a default constructor which will be called from within Test’s “customized” constructor). I’m sorry to disappoint you, but the truth is that that constructor simply doesn’t do anything. Yep, that is true…it does nothing, zero, nickels, rien de rien…And the reason for that is really simple: the Object class doesn’t have any state (i.e., it defines no fields), so its constructor doesn’t really do anything “visible”.

Even though constructors aren’t inherited, you can still introduce several constructors (overloads) provided they have different signatures. In these cases, you tend to reuse the more specific constructor. Here’s an example of what I mean:

public class Test {
    public Test(String name){
    }
    public Test():this(""){
    }
}

As you can see, the parameterless constructor ends up “redirecting” to the other constructor. Doing this sort of thing is really a good practice because you can centralize all your code in one place.

Before ending this post, there’s still time for talking about initialization. C# allows us to set the initial value of a field in the declaration instruction:

public class Test {
    private Int32 _number = 10;
    public Test(){
    }
}

This is just convenient syntax. During compilation, the compiler will automatically put the initialization of _number in the first line of the default constructor. Do notice that this line is inserted before the call to the base class’ constructor and it will be “replicated” through each of the constructors defined by your class (which might not really be a good thing). In practice, this means that if you have a lot of constructor overloads, you shouldn’t use the field initialization syntax: instead, you should put the initialization code in a “central” constructor which ends up being called by all the other constructors.
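A sketch of that recommendation (the Number property is mine, added just so the result is observable): instead of a field initializer whose code would be copied into every constructor, the initialization lives in the most specific constructor and the other overloads chain to it.

```csharp
using System;

public class Test {
    private Int32 _number;
    private String _name;

    public Int32 Number { get { return _number; } }

    //the "central" constructor: all the initialization code lives here...
    public Test(String name, Int32 number) {
        _name = name;
        _number = number;
    }

    //...and the other overloads simply redirect to it
    public Test(String name) : this(name, 10) { }
    public Test() : this("", 10) { }
}
```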

And I guess this covers most of the things you can do with reference type instance constructors. Stay tuned for more.

Jul 09

Ok, now that the PT Silverlight book is mostly done (I’m awaiting feedback from my reviewers – if you’re one of them and you’re reading this post, then stop right now and go back to the manuscript please! :)), I guess it’s time to return to my blog. And I’ll be writing some posts for my basics series, which is a topic that seems to make everyone happy 🙂 – at least, I’ve received several emails about the previous posts, so I assume that this is a subject in which people are interested.

We’ve all had to do some casting in the past, right? Everyone knows how to write a cast in C#: you just use the ( ) operator, stating the type you want, and that’s it. Here’s an example:

var aux = (MyClass)derived;

The previous code will compile, but you might end up getting an exception at runtime if derived isn’t a MyClass instance (or an instance of a type derived from it, and no conversion operator is defined). So, how can you check whether an object is of a specific type without getting an exception? Enter the is and as operators! The is operator never throws; it returns a Boolean which indicates if the object is of a specific type:

var isDerived = derived is MyClass;
if( isDerived ){
  //do something but cast before
  //ex.: ((MyClass)derived).CallSomeMethod();
}

You can use this, but I’m guessing that what you’d really love to have is a cast which doesn’t throw an exception when the type you’re casting to isn’t “compatible”. Enter the as operator:

var auxDerived = derived as MyClass;
if( auxDerived != null ){
  //do something
  //no need for casting
  //auxDerived.CallSomeMethod();
}

As you can see, the as operator will try to cast the object into the desired type. When that is possible, the result will automatically point to an instance of that type. When it isn’t, you’ll get null (that’s why you generally test the result of the operator against null before writing the code that accesses the members of the desired type).

Now, before you forget that the is operator exists, take a look at the following snippet:

public struct MyStruct { /*more code*/ }
var aux = instance as MyStruct;

Do you see anything wrong? No? Ok, let’s take a look at the docs:

Note that the as operator only performs reference conversions and boxing conversions. The as operator cannot perform other conversions, such as user-defined conversions, which should instead be performed using cast expressions.

In other words, the previous snippet won’t compile because MyStruct is a struct. In fact, if you do read the docs, you’ll notice that they say that:

The as operator is like a cast except that it yields null on conversion failure instead of raising an exception. More formally, an expression of the form:

expression as type

is equivalent to:

expression is type ? (type)expression : (type)null

In fact, you’ll be able to program without knowing this (since the compiler will enforce it at compilation time). However, you’ll really be amazed by the number of guys that say they’re “experts” in C# and don’t know about this behavior. And did I say that there was one C# “expert” who told me there was no way to “cast” a type into another if there wasn’t an is-a relationship between them? I guess he didn’t have the time to read the section on conversions in the C# spec…And no, I’m not really a C# expert. I’m just a curious guy who finds it interesting to write about these small things which aren’t used/known by many.

And that’s it for now. Stay tuned for more.

Apr 24

Constants are (really!) different from read-only fields. Unfortunately, there are still many people who can’t tell the difference between them, so I thought about writing a small post on this topic.

A constant is a symbol (i.e., an alias) for a value that *never* changes. That’s why you can only define constants of primitive types (wtf? Primitive types? Don’t we have only value vs reference types in .NET? Good question…I’ll return to this in a future post :)). When I say primitive type, I’m thinking about Boolean, Int32, Single, etc. Interestingly, there’s a small exception to this rule: the C# compiler does allow us to define a constant of a non-primitive type if we set its value to null (in my opinion, this is not that useful, but the truth is that you can do it!)

Whenever you use a constant, you’ll end up getting the value of the constant in the generated IL code. Suppose you’ve got assembly A with the following constant definition:

public sealed class Constants {
    public const String Message = "Hello!";
}

Whenever you write code like this:

Console.WriteLine(Constants.Message);

You’ll really end up with something like this:

Console.WriteLine("Hello!"); //you get the idea :)

Now, embedding the constant value means that the compiler won’t allocate any memory for that constant because its value is “embedded” in the calling code. There is, however, one thing which could “hurt” you. Suppose that the constant is defined in one assembly (let’s call it assembly A) and is used in another (say, B). Since constants are “copied” into the resulting IL, if you change the value of the constant you can’t simply recompile assembly A and redeploy it. With constants, you’re forced to recompile both assemblies.

The solution to the previous “problem” is to use read-only fields. A read-only field is declared with C#’s readonly modifier and it can only be written to from within a constructor (the type or instance constructor, depending on the field being static or not). So, if we need better “versioning” support for a “constant” value, we should resort to static read-only fields. Here’s some code which rewrites the previous example with fields:

public sealed class Constants {
    public static readonly String Message = "Hello!";
}

And now you should be able to change the value of Message and get away with recompiling only assembly A (in this case, there’s no need to recompile assembly B!). Btw, if you don’t believe me, you can always use .NET Reflector to see the final result (just don’t forget to compare the IL – and not the C# or VB.NET – for both cases).
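If you don’t have Reflector at hand, reflection also exposes the difference: a const becomes a literal field (its value lives in the metadata, with no storage allocated), while a readonly field is a normal static field that’s merely marked init-only. A small sketch:

```csharp
using System;
using System.Reflection;

public sealed class Constants {
    public const String ConstMessage = "Hello!";
    public static readonly String ReadOnlyMessage = "Hello!";
}

class Demo {
    static void Main() {
        FieldInfo c = typeof(Constants).GetField("ConstMessage");
        FieldInfo r = typeof(Constants).GetField("ReadOnlyMessage");
        Console.WriteLine(c.IsLiteral);  // True: a compile-time constant
        Console.WriteLine(r.IsInitOnly); // True: writable only from a constructor
    }
}
```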

I’d say that most of us don’t really care about these subtle differences. However, you should consider them whenever you start writing components that get reused by others. And that’s it for now. Stay tuned for more posts on “basic” .NET features.

Apr 14

Back to the basics: Culture

Posted in .NET, Basics, C#       Comments Off on Back to the basics: Culture

In a previous post about basic concepts, I’ve talked a little bit about the version number and about the different “types” of version you can find in an assembly. Today, I’ll proceed with the basic concepts series and we’ll take a quick detour into Cultures.

Cultures are identified by a string which can contain two parts, known as the primary and secondary parts (I’ve seen them called tags too). Here’s an example: “en-GB”. The previous string identifies the current culture as British English. Notice that the secondary part isn’t really required (in that case, you’d end up with the string “en” and you could only infer that the culture is English).
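In code, the CultureInfo class exposes both parts; the Parent property of a specific culture gives you the neutral (primary-only) one:

```csharp
using System;
using System.Globalization;

class Demo {
    static void Main() {
        var british = new CultureInfo("en-GB"); // primary + secondary part
        Console.WriteLine(british.Name);        // en-GB
        Console.WriteLine(british.Parent.Name); // en: the neutral culture
        Console.WriteLine(british.IsNeutralCulture);        // False
        Console.WriteLine(british.Parent.IsNeutralCulture); // True
    }
}
```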

When building culture-aware applications, you should always have a culture-neutral assembly (which contains code and the application’s default resources) and then create one or more separate assemblies which contain only culture-specific resources (these assemblies are known as satellite assemblies). The only thing you need to keep in mind is that you must take some care when deploying your culture assemblies: by default, they must be put in a subdirectory whose name matches the culture of the assembly (if your app is in a folder named Dumb and you’re deploying a British English assembly, then you need to put that culture-specific assembly in the Dumb\en-GB folder).
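At runtime, the ResourceManager class does the probing for you: it looks in the culture-specific subfolder first and falls back to the defaults in the neutral assembly. Here’s a sketch (the “Dumb.Resources” base name and the “Greeting” key are hypothetical, so the actual lookup is shown commented out):

```csharp
using System.Globalization;
using System.Reflection;
using System.Resources;
using System.Threading;

class Demo {
    static void Main() {
        Thread.CurrentThread.CurrentUICulture = new CultureInfo("en-GB");
        // "Dumb.Resources" is a made-up base name for this sketch
        var resources = new ResourceManager("Dumb.Resources",
                                            Assembly.GetExecutingAssembly());
        // GetString would probe en-GB\Dumb.resources.dll, then en\...,
        // and finally fall back to the neutral assembly's resources:
        // string greeting = resources.GetString("Greeting");
    }
}
```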

Before ending, don’t forget that the culture of the current assembly is also part of its identity! And I guess that’s it for now.

Apr 12

It still amazes me how so many people don’t understand (or even care about) the version number of an assembly. A version number is composed of four parts: major number, minor number, build number and revision number. For instance, here’s a valid version number:

4.10.800.1

The first two (major and minor) define what is known as the “public version” of an assembly (notice that this number is used whenever you export an assembly – ex.: COM Interop). The third number defines the build of an assembly. Suppose, for instance, that you work for a company which produces a daily build of an assembly. In this case, this number should be incremented for each day’s main build. Finally, the last part is called the revision number. You’ll change its value whenever you need to perform an “extra build” to solve a pending issue (ex.: a bug which has been found after the daily build).
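Btw, the System.Version class maps directly onto those four parts:

```csharp
using System;

class Demo {
    static void Main() {
        var v = new Version("4.10.800.1");
        Console.WriteLine(v.Major);    // 4: public version (major)
        Console.WriteLine(v.Minor);    // 10: public version (minor)
        Console.WriteLine(v.Build);    // 800: the daily build
        Console.WriteLine(v.Revision); // 1: the "extra build"
    }
}
```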

Now that we all understand version numbers, I guess I should mention that you’ll encounter several “types” of version numbers. An assembly is always associated with three version numbers:

  • AssemblyFileVersion: used for informational purposes only. It’s the number you see when you access the properties of an assembly in Windows Explorer;
  • AssemblyInformationalVersion: again, this is used for informational purposes. It indicates the version of the product that includes this assembly;
  • AssemblyVersion: this version number is stored in the metadata and it’s used by the CLR when binding to strongly named assemblies.
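In your AssemblyInfo.cs, these three numbers are set with separate attributes (the values here are illustrative only):

```csharp
using System;
using System.Reflection;

// the number shown in the file's properties dialog in Windows Explorer
[assembly: AssemblyFileVersion("4.10.800.1")]
// the version of the product that ships this assembly
[assembly: AssemblyInformationalVersion("4.1 RC")]
// stored in the metadata; used by the CLR when binding to strong names
[assembly: AssemblyVersion("4.10.0.0")]

class Demo {
    static void Main() {
        // only AssemblyVersion becomes part of the assembly's identity
        Console.WriteLine(Assembly.GetExecutingAssembly().GetName().Version);
    }
}
```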

And I guess this sums it up for assembly versions…