The Small String Optimization

How do “Connie” and “meow” differ from “The Commodore 64 is a great computer”?

(Don’t get me wrong: They are all great strings! 🙂 )

In several implementations, including the Visual C++’s one, the STL string classes are empowered by an interesting optimization: The Small String Optimization (SSO).

What does that mean?

Well, it basically means that small strings get a special treatment. In other words, there’s a difference in how strings like “Connie”, “meow” or “The Commodore 64 is a great computer” are allocated and stored by std::string.

In general, a typical string class allocates the storage for the string’s text dynamically from the heap, using new[]. In Visual Studio’s C/C++ run-time implementation on Windows, new[] calls malloc, which calls HeapAlloc (…which may probably call VirtualAlloc). The bottom line is that dynamically-allocating memory with new[] is a non-trivial task, that does have an overhead, and implies a trip down the Windows memory manager.

So, the std::string class says: “OK, for small strings, instead of taking a trip down the new[]-malloc-HeapAlloc-etc. “memory lane” 🙂 , let’s do something much faster and cooler! I, the std::string class, will reserve a small chunk of memory, a “small buffer” embedded inside std::string objects, and when strings are small enough, they will be kept (deep-copied) in that buffer, without triggering dynamic memory allocations.”

That’s a big saving! For example, for something like:

std::string s{"Connie"};

there’s no memory allocated on the heap! “Connie” is just stack-allocated. No new[], no malloc, no HeapAlloc, no trip down the Windows memory manager.

That’s kind of the equivalent of this C-ish code:

char buffer[ /* some short length */ ];
strcpy_s(buffer, "Connie");

No new[], no HeapAlloc, no virtual memory manager overhead! It’s just a simple snappy stack allocation, followed by a string copy.

But there’s more! In fact, having the string’s text embedded inside the std::string object offers great locality, better than chasing pointers to scattered memory blocks allocated on the heap. This is also very good when strings are stored in a std::vector, as small strings are basically stored in contiguous memory locations, and modern CPUs love blasting contiguous data in memory!

Optimizations similar to the SSO can be applied also to other data structures, for example: to vectors. This year’s CppCon had an interesting session discussing that: “High Performance Code 201: Hybrid Data Structures”.

I’ve prepared some C++ code implementing a simple benchmark to measure the effects of SSO. The results I got for 200,000-small-string vectors clearly show a significant advantage of STL strings for small strings. For example: in 64-bit build on an Intel i7 CPU @3.40GHz: vector::push_back time for ATL (CStringW) is 29 ms, while for STL (wstring) it’s just 14 ms: one half! Moreover, sorting times are 135 ms for ATL vs. 77 ms for the STL: again, a big win for the SSO implemented in the STL!

 

5 Replies to “The Small String Optimization”

  1. How the migration from small string to big string is handled? How in the operation (say find) the data location is determined?

    1. It depends on the particular implementation. You may have a local embedded char buffer for storing small strings, and a pointer data member to the beginning of the string. If the string is small enough to fit in the local “small” buffer, the string pointer data member points to the beginning of that local embedded buffer; else, the string pointer data member points to some “external” heap-allocated string (char array).

  2. OK, so I have an old-style fixed width metadata file of positions. Lets say the first three fields are ID (5 alphanumeric characters), Date (expressed as a 10 char array e.g. 11/17/1969), and Latitude (expressed as a 9 char array with a format of ‘+00.00000;-00.00000’). For latitude, it is important to preserve any spaces as they indicate the precision for which the latitude is known. So 36.43 degrees north will be the characters 3, 6, ., 4, 3 , followed by 3 spaces on file. The on-disk file format is then “BA23Z11/17/1969+36.43 “. I’d like to create a record and field types that can read/write this using streams, has a memory footprint the same as the on-disk format, and allows each field to be a type so that validation logic can be processed by the stream reader. Just using strings creates a data structure that is much larger than necessary. Is there a way to create a string type that behaves this way?

    1. I would not recommend using a string type for this situation. In fact you can use a lot less memory (relatively speaking) if you parse the data into an internal structured format appropriate for the meaning of the input. Your example here should use 6 bytes for the ID (last byte is the null terminator), 6 bytes for the date (1 byte int each for month and date, 2 byte int for year), and 4 bytes for the latitude (fixed point integer, since min/max value is +/- 9999999 divided by 100,000) for a total of 16 bytes vs. 24 bytes on disk. Your CPU architecture’s alignment requirements may change the actual bytes used in memory.

      Actually your question made for a fun little programming challenge. Here is what I came up with to fulfill your requirements; the only oddity is that it will print out the trailing zeroes in the latitude so 36.43 is shown as 36.43000:
      https://gist.github.com/bobbymcr/ae2b09e8e9c5a2c86f62e8040f910ed7

Leave a Reply

Your email address will not be published. Required fields are marked *