Apr 24

In the previous post, we saw that strings are immutable. At the time, I mentioned that this brings several advantages, but there are also a couple of gotchas. For instance, concatenating strings can be an expensive operation, especially when you have lots of strings, because each concatenation produces a new string. To solve this kind of problem, we need to resort to the StringBuilder type. The idea is simple: you create a new instance of the type, add several strings through one of its methods and then retrieve the final string through its ToString method (which is declared by Object and overridden by StringBuilder). Let’s start from the beginning…
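Before going through the details, here is a minimal sketch of that idea. It builds the same string twice, once with plain concatenation and once with a StringBuilder; the variable names and the loop are only illustrative.

```csharp
using System;
using System.Text;

// Top-level statements (C# 9+); wrap in a Main method on older compilers.

// Plain concatenation: every += creates a brand new string
// and copies everything accumulated so far.
var concatenated = "";
for (var i = 0; i < 5; i++)
{
    concatenated += i + ";";
}

// StringBuilder: chars are added to an internal buffer and
// a single string is created at the end through ToString.
var builder = new StringBuilder();
for (var i = 0; i < 5; i++)
{
    builder.Append(i).Append(';');
}
var built = builder.ToString();

Console.WriteLine(concatenated);          // 0;1;2;3;4;
Console.WriteLine(built == concatenated); // True
```

For five items the difference is irrelevant; the builder only starts to pay off when the number of concatenations grows.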

When you create a new instance of StringBuilder, you can use one of several constructors which let you:

  • specify the maximum number of chars that can be kept by the StringBuilder instance;
  • indicate the initial size (capacity) of the array of chars used by the StringBuilder instance. Notice that this array might grow when you add strings or chars and its available space isn’t enough (in that case, the instance grows the buffer, roughly doubling its current capacity).
  • pass a string which is used to initialize the internal array of chars held by the type.
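The options in the list above map to constructor overloads along these lines. This is just a sketch: the sizes (64, 8, 32) and the variable names are arbitrary values I picked for illustration.

```csharp
using System;
using System.Text;

// Top-level statements (C# 9+); wrap in a Main method on older compilers.
var byCapacity = new StringBuilder(64);      // initial capacity only
var bounded = new StringBuilder(8, 32);      // initial capacity and maximum capacity
var fromString = new StringBuilder("Hello"); // initial content

Console.WriteLine(byCapacity.Capacity);      // 64
Console.WriteLine(bounded.Capacity);         // 8
Console.WriteLine(bounded.MaxCapacity);      // 32
Console.WriteLine(fromString.Length);        // 5
Console.WriteLine(fromString.ToString());    // Hello
```

Notice that the maximum capacity is always passed together with an initial capacity.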

You can mix several of those items during construction because the type offers several overloads which let you specify those values (for instance, you can specify the initial capacity of the internal array and the maximum number of chars through the StringBuilder(Int32 capacity, Int32 maxCapacity) constructor). The next snippet presents the simplest code you’ll need to instantiate a StringBuilder (which, by the way, is also its most common use):

var str = new StringBuilder();

The default constructor starts with a 16-char array and limits the maximum size of that internal array to Int32.MaxValue. After creating an instance, there are a couple of properties and methods which let you inspect and change the StringBuilder’s internal array:

var str = new StringBuilder();
str.Append("Text"); //add 4 chars to the internal array
Console.WriteLine(str.Length); //number of chars in the array: 4
Console.WriteLine(str[0]); //get char at position 0: T

You can check the number of chars in the array through the Length property. You can also get or set a single char through the indexer (exposed as the Chars property). The Append method is probably the one you’ll use most often in day-to-day operations. As you’ve probably inferred from the previous snippet, you can use it to append a value to the internal array (as you’re probably expecting, there are several overloads of this method). Besides Append, you can also use Insert, AppendFormat and AppendLine to add more chars to the internal array.
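To make the previous paragraph more concrete, here’s a small sketch which uses a few Append overloads together with AppendLine, AppendFormat, Insert and the indexer. The strings and values are made up for the example.

```csharp
using System;
using System.Text;

// Top-level statements (C# 9+); wrap in a Main method on older compilers.
var sb = new StringBuilder();

sb.Append("Total: ");                            // String overload
sb.Append(42);                                   // Int32 overload
sb.Append('!');                                  // Char overload
sb.AppendLine();                                 // appends Environment.NewLine
sb.AppendFormat("{0} + {1} = {2}", 1, 2, 1 + 2); // composite formatting
sb.Insert(0, ">> ");                             // insert a string at position 0
sb[0] = '#';                                     // set a single char through the indexer

Console.WriteLine(sb.ToString());
// #> Total: 42!
// 1 + 2 = 3
```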

You’re probably expecting to be able to remove chars from the internal array. If that is the case, you’re correct: you can remove chars by calling Clear (which empties the internal buffer used by the StringBuilder instance) or Remove (which removes a range of chars from the array). Finally, there’s also the Replace method, which replaces all instances of a char with another char, or all instances of a string with another string, in the internal buffer (yes, once again, there are several overloads of this method).
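The next sketch shows Remove, Replace and Clear working over the same instance. The initial text and the intermediate variables are there only so that you can see the buffer’s content after each step.

```csharp
using System;
using System.Text;

// Top-level statements (C# 9+); wrap in a Main method on older compilers.
var sb = new StringBuilder("Hello, cruel world");

sb.Remove(7, 6);                   // removes 6 chars starting at index 7 ("cruel ")
var afterRemove = sb.ToString();   // Hello, world

sb.Replace('l', 'L');              // Char overload
sb.Replace("worLd", "there");      // String overload
var afterReplace = sb.ToString();  // HeLLo, there

sb.Clear();                        // empties the buffer (available since .NET 4)
var lengthAfterClear = sb.Length;  // 0

Console.WriteLine(afterRemove);
Console.WriteLine(afterReplace);
Console.WriteLine(lengthAfterClear);
```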

One interesting thing regarding these methods is that most of them (if not all) return a reference to the instance they were called on. In practice, this means that you can chain several method calls:

str.Append("Hello, world") //Append returns the same StringBuilder instance
    .Replace("o", "0!");

After concatenating everything and performing all the changes you need, you can get a string by calling its ToString method:

var finalStr = str.ToString();

By default, most types inherit Object’s ToString method, which simply returns the full name of the current object’s type. The StringBuilder type overrides the ToString method because, in this case, it makes more sense to return the contents of the encapsulated char array than the name of the type. Before ending, there’s a small gotcha which makes some operations more painful than they should be: there isn’t complete parity between the methods exposed by String and StringBuilder. For instance, there’s no PadLeft method.
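Here’s a rough sketch of two ways of getting the effect of PadLeft: doing the extra work directly on the builder, or going through a String and back. The width of 6 and the '0' padding char are arbitrary choices for the example, not something dictated by the API.

```csharp
using System;
using System.Text;

// Top-level statements (C# 9+); wrap in a Main method on older compilers.
const int width = 6;

// Option 1: extra work on the builder itself.
var direct = new StringBuilder("42");
if (direct.Length < width)
{
    // Insert(index, value, count) inserts 'count' copies of 'value' at 'index'.
    direct.Insert(0, "0", width - direct.Length);
}

// Option 2: go to String, pad there, and come back to the builder.
var roundTrip = new StringBuilder("42");
var padded = roundTrip.ToString().PadLeft(width, '0');
roundTrip.Clear().Append(padded);

Console.WriteLine(direct.ToString());    // 000042
Console.WriteLine(roundTrip.ToString()); // 000042
```

The second option is shorter to write, but it allocates an intermediate string for each round trip.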

It’s really a pity that StringBuilder doesn’t expose all the methods defined by String because that means (1) doing extra work or (2) converting to a String, performing the desired operation and going back to the StringBuilder instance to continue the string manipulation work. And I guess this is it for now. Stay tuned for more!

2 comments so far

  1. David
    6:38 am - 4-25-2011

    Hi there,

    Thanks for that post. I’m an experienced developer and I still manage to find here and there tips that I wasn’t aware of in this “back to basics” series. Thanks for that.

    One thing I was sure you’ll concentrate on in this post is when to use a builder. It isn’t always recommended and it’s known that for simple (and when you don’t do a lot of) manipulations it’s better to work with plain strings.

    I’m sure you can elaborate more on that and I think that’s very practical information that all developers need.

    Thanks again,

    • luisabreu
      9:15 pm - 4-26-2011

      Yes, I believe that is a good idea and I’ll write about that in a future post…thanks for the tip!