ATL::CStringW vs. std::wstring Performance: String Sorting and Comparisons

According to previous measurements, std::wstring performs better than ATL::CStringW:

  1. in all string concatenation tests
  2. when sorting string vectors that are made by small strings, thanks to std::basic_string’s SSO

So, I focused my attention on the string vector sorting scenario, and it seems to me that (at least in the VS2015 implementation), the slowdown of (non-SSO) wstrings is caused by wmemcmp calls, that are used to compare wstrings when sorting the string vectors. On the other hand, CStringW invokes wcscmp, that seems to run faster.

In fact, invoking std::sort with a custom comparator function that calls wcscmp to compare the C-style pointers returned by wstring::c_str, results in faster vector<wstring> sorting times. So, in this case wstring sorting performs better than CStringW even for non-SSO strings.

However, as Stephan T. Lavavej pointed out, std::basic_string supports embedded nulls, so wstring cannot use wcscmp (that works only for C-style null-terminated strings, without embedded nulls).

 

String Performance Tests: ATL vs. STL

I wrote some C++ code to test the performance of the ATL CStringW class vs. the C++ Standard Library’s std::wstring.

There are several aspects that can be considered when comparing string class performance: In the aforementioned code, I tested string vector sorting and string concatenation.

The code is available in this repository on GitHub.

You can take a look at the README for further details (including my test results).

 

The CStringW with wcout Bug Under the Hood

I discussed in a previous blog post a subtle bug involving CStringW and wcout, and later I showed how to fix it.

In this blog post, I’d like to discuss in more details what’s happening under the hood, and what triggers that bug.

Well, to understand the dynamics of that bug, you can consider the following simplified case of a function and a function template, implemented like this:

void f(const void*) {
  cout << "f(const void*)\n";
}

template <typename CharT> 
void f(const CharT*) {
  cout << "f(const CharT*)\n";
}

If s is a CStringW object, and you write f(s), which function will be invoked?

Well, you can write a simple compilable code containing these two functions, the required headers, and a simple main implementation like this:

int main() {
  CStringW s = L"Connie";
  f(s);
}

Then compile it, and observe the output. You know, printf-debugging™ is so cool! 🙂

Well, you’ll see that the program outputs “f(const void*)”. This means that the first function (the non-templated one, taking a const void*), is invoked.

So, why did the C++ compiler choose that overload? Why not f(const wchar_t*), synthesized from the second function template?

Well, the answer is in the rules that C++ compilers follow when doing template argument deduction. In particular, when deducing template arguments, the implicit conversions are not considered. So, in this case, the implicit CStringW conversion to const wchar_t* is not considered.

So, when overload resolution happens later, the only candidate available is f(const void*). Now, the implicit CStringW conversion to const wchar_t* is considered, and the first function is invoked.

Out of curiosity, if you comment out the first function, you’ll get a compiler error. MSVC complains with a message like this:

error C2672: ‘f’: no matching overloaded function found

error C2784: ‘void f(const CharT *)’: could not deduce template argument for ‘const CharT *’ from ‘ATL::CStringW’

The message is clear (almost…): “Could not deduce template argument for const CharT* from CStringW”: that’s because implicit conversions like this are not considered when deducing template arguments.

Well, what I’ve described above in a simplified case is basically what happens in the slightly more complex case of wcout.

wcout is an instance of wostream. wostream is declared in <iosfwd> as:

typedef basic_ostream<wchar_t, char_traits<wchar_t>> wostream;

Instead of the initial function f, in this case you have operator<<. In particular, here the candidates are an operator<< overload that is a member function of basic_ostream:

basic_ostream& basic_ostream::operator<<(const void *_Val)

and a template non-member function:

template<class _Elem, class _Traits> 
inline basic_ostream<_Elem, _Traits>& 
operator<<(basic_ostream<_Elem, _Traits>& _Ostr, const _Elem *_Val)

(This code is edited from the <ostream> standard header that comes with MSVC.)

When you write code like “wcout << s” (for a CStringW s), the implicit conversion from CStringW to const wchar_t* is not considered during template argument deduction. Then, overload resolution picks the basic_ostream::operator<<(const void*) member function (corresponding to the first f in the initial simplified case), so the string’s address is printed via this “const void*” overload (instead of the string itself).

On the other hand, when CStringW::GetString is explicitly invoked (as in “wcout << s.GetString()”), the compiler successfully deduces the template arguments for the non-member operator<< (deducing wchar_t for _Elem). And this operator<<(wostream&, const wchar_t*) prints the expected wchar_t string.

I know… There are aspects of C++ templates that are not easy.

 

Wiring CStringW with Output Streams

We saw earlier that there’s a subtle bug involving CStringW and wcout.

If you really want to make CStringW work with wcout without the additional GetString call, you can follow a common pattern that is used to enable a C++ class to work with output streams.

In particular, you can define an overload of operator<<, that takes references to the output stream and to the class of interest, e.g.:

std::wostream& operator<<(std::wostream& os, 
                          const CStringW& str)
{
    return (os << str.GetString());
}

Note that the call to CString::GetString is hidden inside the implementation of this operator<< overload.

With this overload, the following code will output the expected string:

CStringW s = L"Connie";
wcout << s;

Note that this will work also for other wcout-ish objects (output streams), like wostringstream.

 

Subtle Bug with CStringW and wcout: Where’s My String??

Someone wrote some C++ code like this to print the content of a CStringW using wcout:

CStringW s = L"Connie";
…
wcout << s << …

The code compiles fine, but the output is a hexadecimal sequence, not the string “Connie” as expected. Surprise, surprise! So, what’s going on here?

Well, wcout doesn’t have any clue on how to deal with CStringW objects. All in all, CStringW is part of ATL; instead wcout is part of the <iostream>: they are two separate worlds.

However, CStringW defines an implicit conversion operator to const wchar_t*. This makes it possible to simply pass CStringW objects to Win32 APIs expecting input C-style NUL-terminated string pointers (although, there’s a school of thought that discourages the use of implicit conversions in C++).

So, wcout has no idea how to print a CStringW object. However, wcout does know how to print raw pointers (const void*). So, in the initial code snippet, the C++ compiler first invokes the CStringW implicit const wchar_t* conversion operator. Then, the <iostream> operator<< overload that takes a const void* parameter is used to print the pointer with wcout.

In other words, the hexadecimal value printed is the raw C-style string pointer to the string wrapped by the CStringW object, instead of the string itself.

If you want to print the actual string (not the pointer), you can invoke the GetString method of CStringW. GetString returns a const wchar_t*, which is the pointer to the wrapped string; wcout does know how to print a Unicode UTF-16 wchar_t string via its raw C-style pointer. In fact, there’s a specific overload of operator<< that takes a const wchar_t*; this gets invoked, for example, when you pass wchar_t-string literals to wcout, e.g. wcout << L”Connie”, instead of picking the const void* overload that prints the pointer.

So, this is the correct code:

// Print a CStringW to wcout
wcout << s.GetString() << …

Another option is to explicitly static_cast the CStringW object to const wchar_t*, so the proper operator<< overload gets invoked:

wcout << static_cast<const wchar_t*>(s) << …

Although I prefer the shorter and clearer GetString call.

(Of course, this applies also to other wcout-ish objects, like instances of std::wostringstream.)

P.S. In my view, it would be more reasonable that “wcout << …” picked the const wchar_t* overload also for CStringW objects. It probably doesn’t happen for some reason involving templates or some lookup rule in the I/O stream library. Sometimes C++ is not reasonable (it happens also to C).

Safe Array Sample Code on GitHub

I uploaded some sample code on GitHub, that shows how to create safe arrays in C++ and consume them in a C# application.

This project contains a C++ DLL (with a C interface) that exports two functions to produce safe arrays containing bytes and strings.

Then, there’s a WinForms C# application that consumes these safe arrays using proper PInvoke declarations, and shows their content on screen.

 

Marshal STL string vectors Using Safe Arrays

Suppose you have a vector<string> in some cross-platform C++ code, and you want to marshal it across module or language boundaries on the Windows platform: Using a safe array is a valid option.

So, how can you achieve this goal?

Well, as it’s common in programming, you have to combine together some building blocks, and you get the solution to your problem.

A safe array can contain many different types, but the “natural” type for a Unicode string is BSTR. A BSTR is basically a length-prefixed Unicode string encoded using UTF-16.

ATL offers a convenient helper class to simplify safe array programming in C++: CComSafeArray. The MSDN Magazine article “Simplify Safe Array Programming in C++ with CComSafeArray” discusses with concrete sample code how to use this class. In particular, the paragraph “Producing a Safe Array of Strings” is the section of interest here.

So, this was the first building block. Now, let’s discuss the second.

You have a vector<string> as input. An important question to ask is what kind of encoding is used for the strings stored in the vector. It’s very common to store Unicode strings in std::string using the UTF-8 encoding. So, there’s an encoding impedance here: The input strings stored in the std::vector use UTF-8; but the output strings that will be stored as BSTR in the safe array use UTF-16. Ok, not a big problem: You just have to convert from UTF-8 to UTF-16. This is the other building block to solve the initial problem, and it’s discussed in the MSDN Magazine article “Unicode Encoding Conversions with STL Strings and Win32 APIs”.

So, to wrap up: You can go from a vector<string> to a safe array of BSTR strings following this path:

  1. Create a CComSafeArray<BSTR> of the same size of the input std::vector
  2. For each string in the input vector<string>, convert the UTF-8-encoded string to the corresponding UTF-16 wstring
  3. Create a CComBSTR from the previous wstring
  4. Invoke CComSafeArray::SetAt() to copy the CComBSTR into the safe array

The steps #1, #3, and #4 are discussed in the CComSafeArray MSDN article; the step #2 is discussed in the Unicode encoding conversion MSDN article.

MSDN Magazine Article: Simplify Safe Array Programming in C++

The March 2017 issue of MSDN Magazine contains a feature article of mine on simplifying safe array programming in C++ with the help of the ATL’s CComSafeArray class template.

There is also an accompanying web-only side bar introducing the SAFEARRAY C data structure and some of the basic operations available for it via Win32 API calls, although for C++ code I encourage the use of a convenient higher-level C++ object-oriented wrapper like ATL::CComSafeArray.

Safe arrays are useful for example when you have a COM component and you want to exchange array data between the component and its clients (that can be potentially written in languages even different than C++, e.g. C#, or scripting languages).

I wish I could have had such a resource available when I did some safe array programming in C++.

Some of the insights and experience I developed in that regard are distilled in the aforementioned article.

I hope it may be helpful to someone.

Check it out here!

 

Updates to the ATL/STL Unicode Encoding Conversion Code

I’ve updated my code on GitHub for converting between UTF-8, using STL std::string, and UTF-16, using ATL CStringW.

Now, on errors, the code throws instances of a custom exception class that is derived from std::runtime_error, and is capable of containing more information than a simple CAtlException.

Moreover, I’ve added a couple of overloads for converting from source string views (specified using an STL-style [start, finish) pointer range). This makes it possible to efficiently convert only portions of longer strings, without creating ad hoc CString or std::string instances to store those partial views.

 

Unicode Encoding Conversions Between ATL and STL Strings

At the Windows platform-specific level, when using C++ frameworks like ATL (or MFC), it makes sense to represent strings using CString (CStringW in Unicode builds, which should be the default in modern Windows C++ code).

On the other hand, in cross-platform C++ code, using std::string makes sense as well.

The same Unicode text can be encoded in UTF-16 when stored in CString(W), and in UTF-8 for std::string.

So, there’s a need to convert between those two Unicode representations. I discussed the details of such conversions in my MSDN Magazine article: “Unicode Encoding Conversions with STL Strings and Win32 APIs”However, the C++ code associated to that article used std::wstring for UTF-16 strings.

I’ve created a new repository on GitHub, where I uploaded reusable code (in the form of a convenient header-only module) for converting between UTF-16 and UTF-8, using CStringW for UTF-16, and std::string for UTF-8. Please feel free to check it out!