How Small Is Small Enough for the SSO?

Yesterday I wrote about the SSO.

A good question would be: “What is the maximum length threshold to trigger the SSO?” In other words: How long is too long for a string to qualify for the SSO?

Well, this length limit is not specified by the standard. However, for the sake of curiosity, in VS2015’s implementation, the limit is 15 for “narrow” (i.e. char-based) strings, and 7 for “wide” (i.e. wchar_t-based) strings.

This means that “meow”, “Connie” and “Commodore” are valid candidates for the SSO, as their lengths are less than 16 chars. On the other hand, considering the corresponding wide strings, L“meow” and L“Connie” are still SSO’ed, as their lengths are less than 8 wchar_ts; however, L“Commodore” breaks the 7 wide characters limit, so it’s not SSO’ed.
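You can verify these limits empirically: a default-constructed string reports the capacity of its internal small buffer. (A quick implementation-specific check, not something the standard guarantees.)

#include <iostream>
#include <string>

int main()
{
    // On VS2015, prints 15: the char-based SSO buffer limit
    std::cout << std::string().capacity() << '\n';

    // On VS2015, prints 7: the wchar_t-based SSO buffer limit
    std::cout << std::wstring().capacity() << '\n';
}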

Thanks to Mr. STL for his clarification on the SSO length limit in Visual Studio’s implementation.

 

The Small String Optimization

How do “Connie” and “meow” differ from “The Commodore 64 is a great computer”?

(Don’t get me wrong: They are all great strings! 🙂 )

In several implementations, including the Visual C++’s one, the STL string classes are empowered by an interesting optimization: The Small String Optimization (SSO).

What does that mean?

Well, it basically means that small strings get a special treatment. In other words, there’s a difference in how strings like “Connie”, “meow” or “The Commodore 64 is a great computer” are allocated and stored by std::string.

In general, a typical string class allocates the storage for the string’s text dynamically from the heap, using new[]. In Visual Studio’s C/C++ run-time implementation on Windows, new[] calls malloc, which calls HeapAlloc (which in turn may call VirtualAlloc). The bottom line is that dynamically allocating memory with new[] is a non-trivial task that has a real overhead, and implies a trip down the Windows memory manager.

So, the std::string class says: “OK, for small strings, instead of taking a trip down the new[]-malloc-HeapAlloc-etc. “memory lane” 🙂 , let’s do something much faster and cooler! I, the std::string class, will reserve a small chunk of memory, a “small buffer” embedded inside std::string objects, and when strings are small enough, they will be kept (deep-copied) in that buffer, without triggering dynamic memory allocations.”

That’s a big saving! For example, for something like:

std::string s{"Connie"};

there’s no memory allocated on the heap! “Connie” is simply deep-copied into the small buffer embedded inside the std::string object (which here lives on the stack). No new[], no malloc, no HeapAlloc, no trip down the Windows memory manager.

That’s kind of the equivalent of this C-ish code:

char buffer[ /* some short length */ ];
strcpy_s(buffer, "Connie");

No new[], no HeapAlloc, no virtual memory manager overhead! It’s just a simple snappy stack allocation, followed by a string copy.
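If you’re curious, you can observe the SSO in action with a little implementation-dependent experiment: check whether the string’s data pointer points inside the string object itself. (IsSsoActive is just my illustrative helper name here; none of this is guaranteed by the standard.)

#include <iostream>
#include <string>

// Returns true if the string's characters are stored inside
// the string object itself (i.e. the SSO kicked in).
// NOTE: implementation-dependent check, for exploration only.
bool IsSsoActive(const std::string& s)
{
    const char* pData = s.data();
    const char* pObject = reinterpret_cast<const char*>(&s);
    return (pData >= pObject) && (pData < pObject + sizeof(s));
}

int main()
{
    std::string small{"Connie"};
    std::string big{"The Commodore 64 is a great computer"};

    std::cout << IsSsoActive(small) << '\n';  // expected: 1 (SSO)
    std::cout << IsSsoActive(big) << '\n';    // expected: 0 (heap)
}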

But there’s more! In fact, having the string’s text embedded inside the std::string object offers great locality, better than chasing pointers to scattered memory blocks allocated on the heap. This is also very good when strings are stored in a std::vector, as small strings are basically stored in contiguous memory locations, and modern CPUs love blasting contiguous data in memory!

Optimizations similar to the SSO can also be applied to other data structures, for example to vectors. This year’s CppCon had an interesting session discussing that: “High Performance Code 201: Hybrid Data Structures”. A minimal sketch of the small-buffer idea is shown below.
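To give just the flavor of the idea (SmallVector is an illustrative name; real implementations, like the ones discussed in that session, handle many more details, including copy and move semantics, which are omitted here):

#include <cstddef>
#include <new>
#include <utility>

// Sketch: the first N elements live in a buffer embedded in the
// object itself; the heap is touched only when growing past N.
template <typename T, std::size_t N>
class SmallVector
{
public:
    SmallVector() : m_data(reinterpret_cast<T*>(m_buffer)) {}

    ~SmallVector()
    {
        for (std::size_t i = 0; i < m_size; ++i)
            m_data[i].~T();
        if (IsHeapAllocated())
            ::operator delete(m_data);
    }

    // Copy and move operations omitted for brevity
    SmallVector(const SmallVector&) = delete;
    SmallVector& operator=(const SmallVector&) = delete;

    void push_back(const T& value)
    {
        if (m_size == m_capacity)
            Grow(m_capacity * 2);
        new (m_data + m_size) T(value);  // construct in place
        ++m_size;
    }

    std::size_t size() const { return m_size; }
    const T& operator[](std::size_t i) const { return m_data[i]; }

private:
    bool IsHeapAllocated() const
    {
        return m_data != reinterpret_cast<const T*>(m_buffer);
    }

    void Grow(std::size_t newCapacity)
    {
        T* newData = static_cast<T*>(::operator new(newCapacity * sizeof(T)));

        // Move the existing elements into the new heap block
        for (std::size_t i = 0; i < m_size; ++i)
        {
            new (newData + i) T(std::move(m_data[i]));
            m_data[i].~T();
        }

        if (IsHeapAllocated())
            ::operator delete(m_data);

        m_data = newData;
        m_capacity = newCapacity;
    }

    alignas(T) unsigned char m_buffer[N * sizeof(T)];  // embedded storage
    T* m_data;                  // points into m_buffer, or to the heap
    std::size_t m_size = 0;
    std::size_t m_capacity = N;
};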

I’ve prepared some C++ code implementing a simple benchmark to measure the effects of the SSO. The results I got for vectors of 200,000 small strings clearly show a significant advantage of STL strings for small strings. For example, in a 64-bit build on an Intel i7 CPU @ 3.40GHz, the vector::push_back time for ATL (CStringW) is 29 ms, while for STL (wstring) it’s just 14 ms: half the time! Moreover, sorting times are 135 ms for ATL vs. 77 ms for the STL: again, a big win for the SSO implemented in the STL!
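For reference, the core of such a benchmark can be sketched like this (not my exact benchmark code; timings obviously vary with hardware and build settings):

#include <algorithm>
#include <chrono>
#include <iostream>
#include <string>
#include <vector>

int main()
{
    const int count = 200000;
    const std::wstring small[] = { L"meow", L"Connie", L"Commodore" };

    auto start = std::chrono::steady_clock::now();

    std::vector<std::wstring> v;
    v.reserve(count);
    for (int i = 0; i < count; ++i)
        v.push_back(small[i % 3]);  // small strings: no heap allocations

    auto afterPush = std::chrono::steady_clock::now();

    std::sort(v.begin(), v.end());

    auto afterSort = std::chrono::steady_clock::now();

    using std::chrono::duration_cast;
    using std::chrono::milliseconds;
    std::cout << "push_back: "
              << duration_cast<milliseconds>(afterPush - start).count() << " ms\n";
    std::cout << "sort:      "
              << duration_cast<milliseconds>(afterSort - afterPush).count() << " ms\n";
}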

 

Beginner’s Bug: What’s Wrong with Brace-Init and std::vector?

Question: I have some old C++ code that creates a std::vector containing two NUL wchar_ts:

std::vector<wchar_t> v(2, L'\0');

This code works just fine. I tried to modernize it, using C++11’s brace-initialization syntax:

// Braces {} used instead of parentheses () 
// to initialize v
std::vector<wchar_t> v{2, L'\0'};

However, now I have a bug in my new C++ code! In fact, the brace-initialized vector now contains a wchar_t having code 2, and a NUL. But I wanted two NULs! What’s wrong with my code?

Answer: The problem is that the new C++11 brace initialization syntax (more precisely: direct-list-initialization) strongly prefers constructors taking a std::initializer_list, when one is viable. So, in this vector case, the std::initializer_list<wchar_t> constructor prevails over the initialize-with-N-copies-of-X constructor overload that was used in the original code, and the vector gets initialized with the content of the braced initialization list {2, L’\0’}, i.e. a wchar_t having code 2, and a NUL.

I’m sorry: to get the expected original behavior, you have to stick with the classical C++98 parentheses initialization syntax, as recapped below.
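Here’s a minimal side-by-side recap:

#include <vector>

// Parentheses: picks the "N copies of X" constructor
std::vector<wchar_t> v(2, L'\0');   // contains { L'\0', L'\0' }

// Braces: the std::initializer_list constructor wins
std::vector<wchar_t> w{2, L'\0'};   // contains { wchar_t(2), L'\0' }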

 

The Chinese Dictionary Loading Benchmark Revised

Available here on GitHub.

[…] All in all, I’d be happy with the optimization level reached in #2: Ditch C++ standard I/O streams and locale/codecvt in favor of memory-mapped files for reading files and MultiByteToWideChar Win32 API for UTF-8 to UTF-16 conversions, but just continue using the STL’s wstring (or CString) class!
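For reference, such a MultiByteToWideChar-based UTF-8 to UTF-16 conversion can be sketched as follows (Utf8ToUtf16 is just an illustrative name, and error handling is omitted for brevity):

#include <Windows.h>
#include <string>

std::wstring Utf8ToUtf16(const char* utf8, int utf8Length)
{
    if (utf8Length == 0)
        return std::wstring();

    // First call: ask for the required destination length, in wchar_ts
    const int utf16Length = ::MultiByteToWideChar(
        CP_UTF8, MB_ERR_INVALID_CHARS,
        utf8, utf8Length,
        nullptr, 0);

    std::wstring utf16(utf16Length, L'\0');

    // Second call: do the actual conversion
    ::MultiByteToWideChar(
        CP_UTF8, MB_ERR_INVALID_CHARS,
        utf8, utf8Length,
        &utf16[0], utf16Length);

    return utf16;
}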

Unicode UTF-16/UTF-8 Encoding Conversions: Win32 API vs. C++ Standard Library Performance

In this MSDN Magazine article, I showed how to convert Unicode text between UTF-16 and UTF-8 encodings using direct Win32 API calls (in particular, I discussed in detail the use of the MultiByteToWideChar API).

In addition to that, the C++ standard library offers some classes to perform such conversions.

For example, you can combine std::codecvt_utf8_utf16 with std::wstring_convert to convert from a UTF-16-encoded std::wstring to a UTF-8 std::string:

#include <codecvt>  // for codecvt_utf8_utf16
#include <locale>   // for wstring_convert
#include <string>   // for string, wstring

...
std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>, wchar_t> conversion;
std::wstring utf16 = L"Some UTF-16 string";
std::string utf8 = conversion.to_bytes(utf16);
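For comparison, the direct Win32 counterpart of the above conversion goes through WideCharToMultiByte (again, Utf16ToUtf8 is an illustrative name, and error handling is omitted):

#include <Windows.h>
#include <string>

std::string Utf16ToUtf8(const std::wstring& utf16)
{
    if (utf16.empty())
        return std::string();

    // First call: ask for the required destination length, in chars
    const int utf8Length = ::WideCharToMultiByte(
        CP_UTF8, 0,
        utf16.data(), static_cast<int>(utf16.length()),
        nullptr, 0,
        nullptr, nullptr);

    std::string utf8(utf8Length, '\0');

    // Second call: do the actual conversion
    ::WideCharToMultiByte(
        CP_UTF8, 0,
        utf16.data(), static_cast<int>(utf16.length()),
        &utf8[0], utf8Length,
        nullptr, nullptr);

    return utf8;
}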

I developed some C++ code to convert many strings and measure the time spent for the conversions using the aforementioned C++ standard library classes vs. direct Win32 API calls, and the result is clearly in favor of the latter (code compiled with VS2015 in release build):

STL: 1050 ms

Win32: 379 ms  <-- clearly wins

 

MSDN Magazine article on Unicode Encoding Conversions with STL Strings and Win32 APIs

Another C++ article of mine was published on MSDN Magazine (in the September 2016 issue):

“Unicode Encoding Conversions with STL Strings and Win32 APIs”

Check it out!

Thanks to my editor Sharon Terdeman for her excellent editing work, and to my tech reviewers David Cravey and Marc Gregoire for their useful suggestions and feedback.

EDIT (2016-SEP-02): A Visual Studio solution containing C++ code based on this article can be found on GitHub here.

 

 

Conflicting overloads with STL’s wstring and ATL’s CStringW

If you overload a function for both STL’s wstring and ATL’s CStringW, and then call it with a raw string literal, you’ll end up with a compiler error:

#include <iostream>  // for cout
#include <string>    // for wstring
#include <atlstr.h>  // for CStringW

using std::cout;

void f(const std::wstring& ws)
{
    cout << "f(const wstring&)\n";
}

void f(const CStringW& cs)
{
    cout << "f(const CStringW&)\n";
}

int main()
{
    f(L"Connie");
}

VS2015 emits a message like this:

error C2668: ‘f’: ambiguous call to overloaded function
could be ‘void f(const ATL::CStringW &)’
or ‘void f(const std::wstring &)’
while trying to match the argument list ‘(const wchar_t [7])’

The problem is that the raw L”Connie” string (which is a “const wchar_t[7]”) can be used to initialize both an ATL::CStringW and a std::wstring, since both those string classes have a (non-explicit) constructor overload taking raw C-style NUL-terminated string pointers as input (i.e. “const wchar_t*”).

This ambiguity can be fixed by providing another ad hoc overload for the error-generating function:

void f(const wchar_t* psz)
{
    cout << "f(const wchar_t*)\n";
}

Or maybe just try to standardize your code on one string class, or simply use functions with different names when you have to deal with both CStringW and wstring.
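Another option is disambiguating at each call site, explicitly constructing the string type you want:

// Explicit construction removes the ambiguity
f(std::wstring(L"Connie"));   // calls f(const std::wstring&)
f(CStringW(L"Connie"));       // calls f(const ATL::CStringW&)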

 

Using STL Strings in ATL/WTL/MFC-Based C++ Code

Many C++ beginners (and not only beginners…) seem to struggle when dealing with STL’s strings in Win32 C++ code.

I wrote a detailed article published on MSDN Magazine on “Using STL Strings at Win32 API Boundaries”, which may be interesting or helpful for someone.

But here I’d like to discuss a slightly different scenario: the “interoperability” of STL’s strings with ATL- or MFC-based code.

A common pattern in this context is having CString used as the default string class, instead of STL’s strings. For example, ATL, WTL and MFC’s classes use the CString class as their “native” string type.

Before moving forward, let me clarify that I’m going to discuss the case of Unicode builds, which have been the default since probably Visual Studio 2005. (“ANSI” builds are something of the past, and to me they don’t make much sense in modern C++ Windows software; they are also a big source of trouble and confusion between “ANSI” code page, several other different code pages, etc.).

In Unicode builds, CString represents a Unicode UTF-16 string. The STL’s equivalent (on the Windows platform with Visual Studio) is std::wstring.

A very common pattern in ATL/WTL/MFC-based C++ code is having:

  • Input strings passed as raw C-style NUL-terminated read-only string pointers, using the LPCTSTR Win32 typedef.
  • Output strings passed as non-const references to CString, i.e. CString&.

Let’s consider the input case first. The LPCTSTR typedef is equivalent to “const TCHAR*”. In Unicode builds, TCHAR is equivalent to wchar_t. So, in the input case, in Unicode builds, the string is usually represented as a raw C-style NUL-terminated “const wchar_t*” pointer.

How can a std::wstring instance be passed to a “const wchar_t*” parameter? Simple: just call its c_str() method!

// void DoSomething(LPCTSTR inputString);

std::wstring s = /* Some string */;

// Pass std::wstring as an 
// input C-style NUL-terminated 
// wchar_t-based string
DoSomething( s.c_str() );

Now, let’s consider the CString& output case. Here, what I suggest is to simply create an instance of CString, pass it as the output string parameter, and then just convert the returned CString to a std::wstring. In code:

// void DoSomething(CString& outputString);

// Just follow the method's prototype literally,
// and pass a CString instance that will be filled
// with the returned string.
CString cs;
DoSomething(cs);

// Convert from CString to std::wstring
std::wstring ws(cs);

// Now use the wstring ...

The last line converting from CString to std::wstring works since CString has an implicit conversion operator to LPCTSTR, which in Unicode builds is equivalent to “const wchar_t*”. So, CString is happy to be automatically converted to a “const wchar_t*”, i.e. a “raw C-style NUL-terminated wchar_t-based read-only string pointer”.

On the other side, std::wstring has an overloaded constructor expecting exactly a “const wchar_t*”, i.e. a “raw C-style NUL-terminated wchar_t-based read-only string pointer”, so there’s a perfect match here!

This conversion code can be optimized. In fact, for the previous conversion, std::wstring needs to know the exact length of the input string (i.e. its wchar_t count), and to do so it would typically call a wcslen-like function that works for wchar_t-based strings. This is typically an O(N) operation. But a CString already knows its own length: it’s cached inside the CString object, and the CString::GetLength() method returns it instantly, in O(1)! Considering that std::wstring has another overloaded constructor expecting a pointer and a length (i.e. a wchar_t count), we can combine these pieces of information, building a convenient, simple and efficient conversion function from CString to wstring:

inline std::wstring ToWString(const ATL::CStringW& s)
{
  if (!s.IsEmpty())
  {
    // Use the wstring constructor taking a pointer and a length:
    // CStringW implicitly converts to "const wchar_t*", and
    // GetLength() returns the wchar_t count in O(1).
    return std::wstring(s, s.GetLength());
  }
  else
  {
    // Empty input: just return an empty (SSO'ed) wstring
    return std::wstring();
  }
}
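A quick usage sketch, continuing the previous DoSomething example:

CString cs;
DoSomething(cs);

// Convert CString -> std::wstring without the O(N) length scan
std::wstring ws = ToWString(cs);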

(I explicitly used the more specific CStringW  class in the aforementioned code snippet, but you can freely use CString in Unicode builds. In fact, in Unicode builds, CString is equivalent to CStringW.)

P.S. This blog post discussed the specific Unicode UTF-16 case. If you want to use the STL’s std::string class, you can store Unicode text in it using UTF-8 encoding. In this case, conversions between UTF-16 and UTF-8 (for std::string) are required. This will be discussed in a future article.

EDIT (2016, September 12):  Conversions between Unicode UTF-16 and UTF-8 (for std::string) are discussed in detail in this MSDN Magazine article of mine: “Unicode Encoding Conversions with STL Strings and Win32 APIs”.