Building Double-NUL-Terminated Strings

EDIT (2016-10-11): New modern C++11 code available.

There was an interesting post on Raymond Chen’s blog about double-null-terminated strings.

Double-NUL-terminated strings have some advantages, such as reducing heap fragmentation and offering good locality: the strings are stored sequentially in a single memory block rather than scattered around the heap.

Another benefit of double-NUL-terminated strings is marshaling an array of strings through a pure-C-interface DLL.

For a C++ programmer, the most natural way of storing an array of strings would be to use something like std::vector<CString> (or the MFC CStringArray container). But how could we pass an array of strings across the boundary of a pure-C-interface DLL? One solution could be to use a SAFEARRAY of BSTRs, from COM.

But another one (IMHO, simpler) could be to just use double-NUL-terminated strings.
This technique was proposed as an answer in a post in the Visual C++ MFC and ATL MSDN Forum.
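
To make the memory layout concrete, here is a minimal sketch of how the receiving side might walk such a block and recover the individual strings. The function name SplitDoubleNulString is hypothetical (it is not from the forum post), and std::wstring is used in place of CString only so that the sketch stays portable.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: splits a double-NUL-terminated string
// into its individual strings.
// Layout example:  H i \0 H e l l o \0 \0   ->   { "Hi", "Hello" }
std::vector<std::wstring> SplitDoubleNulString(const wchar_t* block)
{
    assert(block != nullptr);

    std::vector<std::wstring> result;

    // Each iteration consumes one NUL-terminated string. A NUL found where
    // a string should start is the second terminator, i.e. the end of the list.
    const wchar_t* current = block;
    while (*current != L'\0')
    {
        std::wstring s(current);      // copies characters up to the next NUL
        current += s.length() + 1;    // skip the string and its terminator
        result.push_back(std::move(s));
    }

    return result;
}
```

This also shows why an empty string cannot be stored inside the block: it would be indistinguishable from the final terminator.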

Here is some simple C++ code that builds a double-NUL-terminated string from a vector of CStrings:

//------------------------------------------------------------------------
// Builds a double-NUL-terminated string from a vector of CString's.
// Assumes a Unicode build, i.e. CString stores wchar_t characters.
// The caller must release allocated memory using ::LocalFree.
// Returns NULL in case of allocation error.
//------------------------------------------------------------------------

wchar_t * BuildDoubleNulString(const std::vector<CString>& strings )
{
    //
    // If the input string array is empty, just build an empty string.
    //
    if (strings.empty())
    {
        // Allocate memory using the Win32 heap allocator.
        // LocalAlloc returns NULL on failure, which is passed on to the caller.
        return reinterpret_cast<wchar_t *>(
            ::LocalAlloc(
                LPTR,                   // fixed + zero init
                2 * sizeof(wchar_t)     // 2 Unicode characters
            ));
    }

    //
    // Calculate the total number of characters to build
    // the double-NUL terminated string.
    //
    size_t totalChars = 0;
    for (size_t i = 0; i < strings.size(); i++)
    {
        // Get length of current string, including its terminating NUL
        totalChars += (strings[i].GetLength() + 1);
    }

    // Account for the final terminating NUL, so add +1 to the required char count
    totalChars++;
 
    //
    // Build the double-NUL terminated string
    //
    wchar_t * result = reinterpret_cast<wchar_t *>(
        ::LocalAlloc(
            LPTR,                        // fixed + zero init
            totalChars * sizeof(wchar_t) // totalChars Unicode characters
        ));
    if (result == NULL)
    {
        // Allocation error
        return NULL;
    }

    wchar_t * dest = result;
    for (size_t i = 0; i < strings.size(); i++)
    {
        // Get current string length
        size_t currLen = strings[i].GetLength();

        // Can't have empty strings inside a double-NUL-terminated string
        ATLASSERT(currLen != 0);

        // Copy current string to destination memory
        memcpy(dest, strings[i].GetString(), currLen * sizeof(wchar_t));

        // No need to terminate current string
        //  dest[currLen] = 0;
        // because memory was allocated with zero-init flag.    

        // Move destination to next string slot
        dest += (currLen + 1);
    }

    // No need to add the final terminating NUL,
    // because memory was allocated with zero-init flag.
    //result[totalChars - 1] = 0;

    return result;
}
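
For comparison, the same algorithm can be sketched in portable standard C++. This is only an illustration under a few assumptions, not the code from the post: std::wstring replaces CString, a std::vector<wchar_t> owns the buffer instead of LocalAlloc, and the name BuildDoubleNulBuffer is made up for this example. A buffer like this is convenient inside a single module, but it is not meant to be handed across a pure-C DLL boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical portable variant of the function above.
// The returned vector holds all strings back to back, each followed
// by a NUL, plus one extra NUL at the very end.
std::vector<wchar_t> BuildDoubleNulBuffer(const std::vector<std::wstring>& strings)
{
    // An empty list is represented by two NULs, as in the original code.
    if (strings.empty())
    {
        return std::vector<wchar_t>(2, L'\0');
    }

    // One slot per character, one per string terminator,
    // and one for the final terminator.
    std::size_t totalChars = 1;
    for (const auto& s : strings)
    {
        totalChars += s.length() + 1;
    }

    std::vector<wchar_t> buffer;
    buffer.reserve(totalChars);

    for (const auto& s : strings)
    {
        // Can't have empty strings inside a double-NUL-terminated string
        assert(!s.empty());

        buffer.insert(buffer.end(), s.begin(), s.end());
        buffer.push_back(L'\0');    // terminate the current string
    }

    buffer.push_back(L'\0');        // final terminator
    return buffer;
}
```

Since the vector is not zero-initialized in advance, the terminators are written explicitly here, unlike in the LocalAlloc version.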

Note that LocalAlloc is used to allocate heap memory for the double-NUL-terminated string because this memory is meant to be freed by client code outside the DLL, and it is fundamental that the code that allocates memory and the code that frees it use the same allocator. For example, if a DLL compiled with VC8 builds a double-NUL-terminated string allocating memory with new[], and this memory is then freed with delete[] by code built with VC9, the two modules may be using different C++ runtime heaps, so the allocators do not match. Using LocalAlloc/LocalFree prevents that kind of problem.
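
On the client side, one way to make sure the matching release function is always called is to wrap the returned pointer in a std::unique_ptr with a custom deleter. The sketch below is an assumption-laden illustration: AllocZeroed and FreeBlock are stand-ins built on calloc/free so that the example compiles anywhere, while in the scenario of this post they would be ::LocalAlloc(LPTR, ...) and ::LocalFree. All names here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Stand-ins for the allocator pair (assumption: on Windows these would be
// ::LocalAlloc(LPTR, bytes) and ::LocalFree(p), as in the post).
inline void* AllocZeroed(std::size_t bytes)
{
    assert(bytes > 0);
    return std::calloc(1, bytes);   // zero-initialized, like LPTR
}

inline void FreeBlock(void* p)
{
    std::free(p);
}

// Deleter that always releases the block with the matching function,
// never with delete[].
struct BlockDeleter
{
    void operator()(wchar_t* p) const { FreeBlock(p); }
};

using DoubleNulPtr = std::unique_ptr<wchar_t[], BlockDeleter>;

// Hypothetical producer: returns the empty double-NUL-terminated string
// (two NULs), or a null pointer on allocation failure.
DoubleNulPtr MakeEmptyDoubleNulString()
{
    return DoubleNulPtr(
        static_cast<wchar_t*>(AllocZeroed(2 * sizeof(wchar_t))));
}
```

With this in place, the block is released through FreeBlock when the DoubleNulPtr goes out of scope, so the allocation and the release cannot end up on different heaps by accident.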
