Mar 30

So, you know everything about text, right? – part II

Posted in Basics C# CLR

In the previous post, we started looking at how to work with text in C# and ran a rather superficial analysis of the Char type. In this post, we’ll start looking at the String type, which is probably what you’ll be using most of the time when you need to work with text.

What is a string? In .NET, a string can be seen as an immutable sequence of characters. Programmatically, it’s represented through the String type, which is sealed and extends the Object type directly (in other words, it’s a reference type). Interestingly, String is also considered a primitive type in C#, and this means that you can create new Strings through literals:

var aString = "Hello, there";

This is the preferred way to instantiate a new String. The type also offers several constructors which let you create a new String from a pointer to an unmanaged array of chars (char*) or from a pointer to an unmanaged array of 8-bit signed integers (sbyte*, aka SByte). And no, there’s no constructor that receives a string as an argument, though there’s one which creates a new String from an array of Char.
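To make the constructor story a bit more concrete, here’s a minimal sketch of the two managed constructors you’re most likely to bump into: the one that receives a Char array and the one that repeats a single char. The class and method names (StringConstructorsSketch, FromChars, Repeated) are made up for illustration only.

```csharp
using System;

public static class StringConstructorsSketch
{
    // Builds a string from a managed array of Char.
    public static string FromChars()
    {
        char[] chars = { 'H', 'e', 'l', 'l', 'o' };
        return new string(chars);
    }

    // Builds a string by repeating a single char a given number of times.
    public static string Repeated()
    {
        return new string('-', 5);
    }

    public static void Main()
    {
        Console.WriteLine(FromChars()); // Hello
        Console.WriteLine(Repeated());  // -----
    }
}
```

In day-to-day code you’ll rarely need these, since a literal is shorter and easier to read.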

Notice that using the preferred way of creating new strings (i.e., through a literal) doesn’t really result in creating a “new” instance through the newobj IL instruction. In these cases, the compiler embeds the string in the assembly’s metadata and emits an ldstr instruction which loads it at runtime.

Strings enjoy special treatment in several languages. For instance, in C# you can concatenate strings with the + operator, and that concatenation can happen at compile time or at runtime. Here’s an example where the C# compiler is smart enough to concatenate two literal Strings at compile time:

var aString = "Hello," + " there";
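One way to convince yourself that the compiler really does the work is to use the concatenation as a const initializer, since a const only accepts compile-time constants. The following is just a sketch, and the names (CompileTimeConcatSketch, Greeting, Greet) are hypothetical.

```csharp
using System;

public static class CompileTimeConcatSketch
{
    // A const initializer must be a compile-time constant, so this line
    // only compiles because the compiler joins the two literals itself.
    public const string Greeting = "Hello," + " there";

    // When one of the operands is only known at runtime,
    // the concatenation has to happen at runtime.
    public static string Greet(string name)
    {
        return "Hello, " + name;
    }

    public static void Main()
    {
        Console.WriteLine(Greeting);       // Hello, there
        Console.WriteLine(Greet("world")); // Hello, world
    }
}
```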

If you’ve had the luck to write some code in C or C++, then you’ll be right at home with the string escape sequences supported in C#:

var aString = "Hello,\tthere";

In the previous snippet, we’ve resorted to \t to introduce a tab in a string. In case you’re wondering, you can escape the \ char used in escape sequences by doubling it:

var path = "C:\\folder";

If you have lots of \ chars to escape, then you’re better off using verbatim strings, which are prefixed with @ and treat \ as a regular char:

var path = @"C:\folder";

Both snippets produce exactly the same result: you end up with the string C:\folder.
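If you’d like to check this for yourself, here’s a small sketch which compares both forms and also shows that an escape sequence like \t ends up as a single char in the resulting string. The EscapeSketch class and its fields are hypothetical names used only for this example.

```csharp
using System;

public static class EscapeSketch
{
    // Same path written with an escaped backslash and as a verbatim string.
    public static readonly string Escaped = "C:\\folder";
    public static readonly string Verbatim = @"C:\folder";

    // \t is an escape sequence which produces a single tab char.
    public static readonly string WithTab = "Hello,\tthere";

    public static void Main()
    {
        Console.WriteLine(Escaped == Verbatim); // True
        Console.WriteLine(Escaped.Length);      // 9
        Console.WriteLine(WithTab.Length);      // 12
    }
}
```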

Before ending this initial post about strings, there’s one small detail I mentioned at the beginning which is *really* important. It’s probably the most important thing you should know about strings and I wouldn’t feel right without writing about it: strings are *immutable*. Once you create a string, there’s no way to modify it. No, you can’t change one of its chars without building a new String instance. No, you can’t make it shorter or longer either!
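Here’s a quick sketch of what immutability looks like in practice: methods that seem to change a string actually hand you a new instance and leave the original alone. The ImmutabilitySketch class and its members are hypothetical names picked for the example.

```csharp
using System;

public static class ImmutabilitySketch
{
    public static readonly string Original = "hello";

    // Both calls return new String instances; Original is never touched.
    public static string Upper()
    {
        return Original.ToUpperInvariant();
    }

    public static string Replaced()
    {
        return Original.Replace('h', 'j');
    }

    public static void Main()
    {
        Console.WriteLine(Upper());    // HELLO
        Console.WriteLine(Replaced()); // jello
        Console.WriteLine(Original);   // hello
        Console.WriteLine(object.ReferenceEquals(Original, Upper())); // False

        // Original[0] = 'j'; // doesn't compile: the String indexer is read-only
    }
}
```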

This might come as a surprise, but it does bring a couple of advantages too. For instance, since strings are immutable, you don’t have to worry about synchronization when you share them in multithreaded code (IMO, this is a really big thing!). So, do you need to do a lot of char manipulation? Or concatenate lots of strings at runtime? If that is your case, then you should be using StringBuilder (we’ll be back to this in a future post).
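Until that future post arrives, here’s a minimal sketch of what building a string in a loop with StringBuilder might look like. The StringBuilderSketch class and the JoinNumbers method are hypothetical; only StringBuilder, Append and ToString come from the framework.

```csharp
using System;
using System.Text;

public static class StringBuilderSketch
{
    // Appends to a single mutable buffer instead of creating
    // a new String instance on every iteration.
    public static string JoinNumbers(int count)
    {
        var builder = new StringBuilder();
        for (var i = 0; i < count; i++)
        {
            if (i > 0)
            {
                builder.Append(", ");
            }
            builder.Append(i);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(JoinNumbers(5)); // 0, 1, 2, 3, 4
    }
}
```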

And this is it for now. Stay tuned for more!