CString or std::string – that is the question

…with apologies to Shakespeare :)

So, should we use the CString class or the std::string class to store and manage strings in our C++ code?

Well, if you need to write portable C++ code, the choice should be std::string, which is part of the C++ standard library.

But in the context of C++ Win32 programming (using ATL or MFC), I find the CString class much more convenient than std::string.

These are some reasons:

1) CString allows loading strings from resources, which also helps with internationalization.

2) CString offers a convenient FormatMessage method (which is good for internationalization, too; see for example the interesting problem of “Yoda speak” discussed in this post on Mihai Nita’s blog).

3) CString integrates well with Windows APIs (the implicit LPCTSTR conversion operator comes in handy when passing instances of CString to Windows APIs such as SetWindowText).

4) CString is reference counted, so copying instances of CString around (e.g. passing or returning them by value) is cheap.

5) CString offers convenient methods for common tasks, e.g. tokenizing and trimming strings.
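To make the word-order point in item 2 more concrete, here is a minimal, portable sketch of positional placeholders. It does not call CString::FormatMessage and is not how ATL implements it; FormatPositional is a hypothetical helper written only for illustration, and it understands just the inserts %1 through %9.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (NOT part of ATL/MFC): replaces the inserts %1..%9
// in 'pattern' with the corresponding entry of 'args'.
// Everything else, including inserts with no matching argument,
// is copied to the result unchanged.
std::string FormatPositional(const std::string& pattern,
                             const std::vector<std::string>& args)
{
    std::string result;
    for (std::size_t i = 0; i < pattern.size(); ++i)
    {
        const char ch = pattern[i];
        if (ch == '%' && i + 1 < pattern.size()
            && pattern[i + 1] >= '1' && pattern[i + 1] <= '9')
        {
            // %1 maps to args[0], %2 to args[1], and so on
            const std::size_t index =
                static_cast<std::size_t>(pattern[i + 1] - '1');
            if (index < args.size())
            {
                result += args[index];
                ++i; // skip the digit that was just consumed
                continue;
            }
        }
        result += ch;
    }
    return result;
}
```

With the arguments { "3", "D:\\Backup" }, the pattern "Copied %1 files to %2" and a reordered pattern such as "%2: %1 files copied" both work without touching the calling code. That is the property a translator needs when the target language puts the words in a different order, and it is what printf-style sequential arguments cannot offer.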

 

Note:
In the Visual C++ 6 timeframe, the CString class was part of MFC. But since Visual Studio .NET 2003, CString has been factored out of MFC and made part of ATL. So, post VC6, it is possible to use CString in non-MFC code as well. To do that, just #include <atlstr.h> (a good place for that would be the “StdAfx.h” precompiled header).

Conversion between Unicode UTF-16 and UTF-8 in C++/Win32

For updated, more detailed information and modern C++ usage, please read my MSDN Magazine article (published in the September 2016 issue):

Unicode Encoding Conversions with STL Strings and Win32 APIs

Updated modern C++ code can be found here on GitHub.


Reusable C++ code for mixed ATL/STL conversions can be found here on GitHub. Basically, in that code an ATL CString (CStringW) stores Unicode text encoded in UTF-16, and a std::string stores the same text encoded in UTF-8.
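As a small illustration of why converting between the two encodings is not a simple element-by-element copy, the sketch below stores the same text both ways and compares the number of code units. It uses std::u16string as a portable stand-in for CStringW (an assumption made so the snippet also builds off Windows, where wchar_t is not 16 bits wide), and the constant names are made up for this example.

```cpp
#include <string>

// "è" (U+00E8): two bytes in UTF-8, one 16-bit code unit in UTF-16.
const std::string    kEGraveUtf8  = "\xC3\xA8";
const std::u16string kEGraveUtf16 = u"\u00E8";

// U+1F600 lies outside the Basic Multilingual Plane:
// four bytes in UTF-8, a surrogate pair (two code units) in UTF-16.
const std::string    kEmojiUtf8  = "\xF0\x9F\x98\x80";
const std::u16string kEmojiUtf16 = u"\U0001F600";
```

Here kEGraveUtf8.size() is 2 while kEGraveUtf16.size() is 1, and kEmojiUtf8.size() is 4 while kEmojiUtf16.size() is 2. Since the destination length cannot be derived from the source length alone, conversion code typically asks the API for the required size first and converts in a second step.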


Code working with ATL’s CStringW/A classes and throwing exceptions via AtlThrow() can be found here on GitHub. For convenience, the core part of that code is copied below:

//////////////////////////////////////////////////////////////////////////////
//
// *** Functions to convert between Unicode UTF-8 and Unicode UTF-16 ***
//                      using ATL CStringA/W classes
//
// By Giovanni Dicanio 
//
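//////////////////////////////////////////////////////////////////////////////

#include <windows.h>    // MultiByteToWideChar, WideCharToMultiByte
#include <atlstr.h>     // CStringA, CStringW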


//----------------------------------------------------------------------------
// FUNCTION: Utf8ToUtf16
// DESC:     Converts Unicode UTF-8 text to Unicode UTF-16 (Windows default).
//----------------------------------------------------------------------------
CStringW Utf8ToUtf16(const CStringA& utf8)
{
    // Special case of empty input string
    if (utf8.IsEmpty())
    {
        // Return empty string
        return CStringW();
    }


    // "Code page" value used with MultiByteToWideChar() for UTF-8 conversion 
    const UINT codePageUtf8 = CP_UTF8;

    // Safely fails if an invalid UTF-8 character is encountered
    const DWORD flags = MB_ERR_INVALID_CHARS;

    // Get the length, in WCHARs, of the resulting UTF-16 string
    const int utf16Length = ::MultiByteToWideChar(
            codePageUtf8,       // source string is in UTF-8
            flags,              // conversion flags
            utf8.GetString(),   // source UTF-8 string
            utf8.GetLength(),   // length of source UTF-8 string, in chars
            nullptr,            // unused - no conversion done in this step
            0);                 // request size of destination buffer, in WCHARs
    if (utf16Length == 0)
    {
        // Conversion error
        AtlThrowLastWin32();
    }


    // Allocate destination buffer to store the resulting UTF-16 string
    CStringW utf16;
    WCHAR* const utf16Buffer = utf16.GetBuffer(utf16Length);
    ATLASSERT(utf16Buffer != nullptr);


    // Do the conversion from UTF-8 to UTF-16
    int result = ::MultiByteToWideChar(
            codePageUtf8,       // source string is in UTF-8
            flags,              // conversion flags
            utf8.GetString(),   // source UTF-8 string
            utf8.GetLength(),   // length of source UTF-8 string, in chars
            utf16Buffer,        // pointer to destination buffer
            utf16Length);       // size of destination buffer, in WCHARs  
    if (result == 0)
    {
        // Conversion error
        AtlThrowLastWin32();
    }

    // Don't forget to release internal CString buffer 
    // before returning the string to the caller
    utf16.ReleaseBufferSetLength(utf16Length);

    // Return resulting UTF-16 string
    return utf16;
}



//----------------------------------------------------------------------------
// FUNCTION: Utf16ToUtf8
// DESC:     Converts Unicode UTF-16 (Windows default) text to Unicode UTF-8.
//----------------------------------------------------------------------------
CStringA Utf16ToUtf8(const CStringW& utf16)
{
    // Special case of empty input string
    if (utf16.IsEmpty())
    {
        // Return empty string
        return CStringA();
    }


    // "Code page" value used with WideCharToMultiByte() for UTF-8 conversion 
    const UINT codePageUtf8 = CP_UTF8;

    // Safely fails if an invalid UTF-16 character is encountered
    const DWORD flags = WC_ERR_INVALID_CHARS;

    // Get the length, in chars, of the resulting UTF-8 string
    const int utf8Length = ::WideCharToMultiByte(
            codePageUtf8,       // convert to UTF-8
            flags,              // conversion flags
            utf16.GetString(),  // source UTF-16 string
            utf16.GetLength(),  // length of source UTF-16 string, in WCHARs
            nullptr,            // unused - no conversion required in this step
            0,                  // request size of destination buffer, in chars
            nullptr, nullptr);  // unused
    if (utf8Length == 0)
    {
        // Conversion error
        AtlThrowLastWin32();
    }


    // Allocate destination buffer to store the resulting UTF-8 string
    CStringA utf8;
    char* const utf8Buffer = utf8.GetBuffer(utf8Length);
    ATLASSERT(utf8Buffer != nullptr);


    // Do the conversion from UTF-16 to UTF-8
    int result = ::WideCharToMultiByte(
            codePageUtf8,       // convert to UTF-8
            flags,              // conversion flags
            utf16.GetString(),  // source UTF-16 string
            utf16.GetLength(),  // length of source UTF-16 string, in WCHARs
            utf8Buffer,         // pointer to destination buffer
            utf8Length,         // size of destination buffer, in chars
            nullptr, nullptr);  // unused
    if (result == 0)
    {
        // Conversion error
        AtlThrowLastWin32();
    }


    // Don't forget to release internal CString buffer 
    // before returning the string to the caller
    utf8.ReleaseBufferSetLength(utf8Length);

    // Return resulting UTF-8 string
    return utf8;
}
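
For readers who want to experiment away from Windows, the following is a rough, portable sketch of the UTF-16 to UTF-8 direction. It is not the code from GitHub and it makes no claim about how WideCharToMultiByte works internally; Utf16ToUtf8Portable is a hypothetical name, std::u16string stands in for CStringW, and a standard exception replaces AtlThrowLastWin32(). The one behavior it tries to mirror is the spirit of WC_ERR_INVALID_CHARS: an unpaired surrogate is reported as an error instead of being silently replaced.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical portable counterpart of Utf16ToUtf8 (illustration only).
// Throws std::invalid_argument if the input contains an unpaired surrogate.
std::string Utf16ToUtf8Portable(const std::u16string& utf16)
{
    std::string utf8;
    for (std::size_t i = 0; i < utf16.size(); ++i)
    {
        char32_t codePoint = utf16[i];

        if (codePoint >= 0xD800 && codePoint <= 0xDBFF)
        {
            // High surrogate: a low surrogate must follow immediately
            if (i + 1 >= utf16.size()
                || utf16[i + 1] < 0xDC00 || utf16[i + 1] > 0xDFFF)
            {
                throw std::invalid_argument("Unpaired high surrogate");
            }
            const char32_t low = utf16[i + 1];
            codePoint = 0x10000 + ((codePoint - 0xD800) << 10) + (low - 0xDC00);
            ++i; // the low surrogate has been consumed
        }
        else if (codePoint >= 0xDC00 && codePoint <= 0xDFFF)
        {
            throw std::invalid_argument("Unpaired low surrogate");
        }

        // Encode the code point as 1 to 4 UTF-8 bytes
        if (codePoint < 0x80)
        {
            utf8 += static_cast<char>(codePoint);
        }
        else if (codePoint < 0x800)
        {
            utf8 += static_cast<char>(0xC0 | (codePoint >> 6));
            utf8 += static_cast<char>(0x80 | (codePoint & 0x3F));
        }
        else if (codePoint < 0x10000)
        {
            utf8 += static_cast<char>(0xE0 | (codePoint >> 12));
            utf8 += static_cast<char>(0x80 | ((codePoint >> 6) & 0x3F));
            utf8 += static_cast<char>(0x80 | (codePoint & 0x3F));
        }
        else
        {
            utf8 += static_cast<char>(0xF0 | (codePoint >> 18));
            utf8 += static_cast<char>(0x80 | ((codePoint >> 12) & 0x3F));
            utf8 += static_cast<char>(0x80 | ((codePoint >> 6) & 0x3F));
            utf8 += static_cast<char>(0x80 | (codePoint & 0x3F));
        }
    }
    return utf8;
}
```

In production Win32 code, the API-based functions above remain the better choice; a hand-written loop like this one is mainly useful for checking expected byte sequences in tests.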