The Case of string_view and the Magic String


Someone was working on modernizing some legacy C++ code. The code base contained a function like this:

void DoSomething(const char* name)

The DoSomething function takes a read-only string expressed using a C-style string pointer: remember, this is a legacy C++ code base.

For the sake of this discussion, suppose that DoSomething contains some simple C++ code that invokes a C-interface API, like this (of course, the code would be more complex in a real code base):

void DoSomething(const char* name) 
{
    SomeCApi(name);
}

SomeCApi also expects a “const char*” that represents a read-only string parameter.

Note that SomeCApi cannot be modified: think of it as a system C-interface API, for example a Windows C API like MessageBox.

For the sake of this discussion, suppose that SomeCApi just prints out its string parameter, like this:

void SomeCApi(const char* name) 
{
    printf("Hello, %s!\n", name);
}

In the spirit of modernizing the legacy C++ code base, the maintainer decides to change the prototype of DoSomething, stepping up from “const char*” to std::string_view:

// Was: void DoSomething(const char* name)
void DoSomething(std::string_view name)

SomeCApi still expects a const char*. Remember that you cannot change the SomeCApi interface.

So, the maintainer needs to update the body of DoSomething accordingly, invoking string_view::data to access the underlying character array:

void DoSomething(std::string_view name) 
{
    // Was: SomeCApi(name);
    SomeCApi(name.data());
}

In fact, std::string_view::data returns a pointer to the underlying character array.

The code compiles fine, and the maintainer is very happy with this string_view modernization!

Then, the code is executed for testing, with a string_view name containing “Connie”. The expected output would be:

“Hello, Connie!”

But, instead, the following string is printed out:

“Hello, Connie is learning C++!”

Wow! Where does the “ is learning C++” part come from?

Is there some magic string hidden inside string_view?

As a sanity check, the maintainer simply prints out the string_view variable:

// name is a std::string_view
std::cout << "Name: " << name;

And the output is as expected: “Name: Connie”.

So, it seems that cout does print the correct string_view name.

But, somehow, when the string_view is passed deep down to a legacy C API, some string “magic” happens, showing some additional characters after “Connie”.

What’s going on here?

Figuring Out the Bug

Well, the key here is two words: null terminator.

In fact, the C API that takes a const char* expects the string to be null-terminated.

On the other hand, std::string_view does not guarantee null-termination!

So, consider a string_view that “views” only a portion of a string, like this:

std::string str = "Connie is learning C++";
auto untilFirstSpace = str.find(' ');
std::string_view name{str.data(), untilFirstSpace}; // "Connie"

The string_view certainly “views” the “Connie” part. But in memory, the character that follows “Connie” is a space, not the null terminator that the C API expects. So the C API keeps reading past the end of the view until it finds the null terminator at the end of the original string.

So, the whole string is printed out by the C API, not just the part observed by the string_view.
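The mismatch can be made visible by comparing the length stored in the string_view with the length that a null-terminator scan finds from the same starting pointer. The following is a minimal sketch; LengthSeenByCApi and ShowLayout are hypothetical names introduced only for this illustration, and the scan is safe here only because the view refers to the beginning of a std::string, whose buffer ends with a null terminator.

```cpp
#include <cstddef>
#include <cstring>
#include <iostream>
#include <string>
#include <string_view>

// Hypothetical helper: the length a C API would see when handed view.data().
// strlen ignores view.size() and scans until the first '\0', so this is only
// safe when a terminator exists at or after the viewed characters
// (true below, because the view refers to the start of a std::string).
std::size_t LengthSeenByCApi(std::string_view view)
{
    return std::strlen(view.data());
}

// Hypothetical demo: compares the two lengths for the "Connie" view.
void ShowLayout()
{
    std::string str = "Connie is learning C++";
    std::string_view name{str.data(), str.find(' ')};

    // Length tracked by the string_view itself.
    std::cout << "string_view size: " << name.size() << '\n';

    // Length found by scanning for the null terminator.
    std::cout << "C API length:     " << LengthSeenByCApi(name) << '\n';

    // The character right after the viewed range is a space, not '\0'.
    std::cout << "next char is ' ': " << std::boolalpha
              << (str[name.size()] == ' ') << '\n';
}
```

Calling ShowLayout() should report a view size of 6 and a scanned length of 22, which is the length of the whole original string.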

[Figure: Memory layout: string_view vs. C-style null-terminated strings]

This is a very subtle bug that can be hard to spot in more complex code bases.

So, remember: a std::string_view is not guaranteed to be null-terminated! Take that into consideration when calling C-interface APIs that expect C-style null-terminated strings.

P.S. As a side note, std::string::c_str guarantees that the returned pointer points to a null-terminated character array.
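One possible way to act on this, sketched here under the assumption that an extra copy is acceptable, is to copy the viewed characters into a temporary std::string and pass its c_str() to the C API. ToNullTerminated is a hypothetical helper name, and SomeCApi below is just the stand-in from this article, not a real system API.

```cpp
#include <cstdio>
#include <string>
#include <string_view>

// Stand-in for the C API from the article, which cannot be modified.
void SomeCApi(const char* name)
{
    std::printf("Hello, %s!\n", name);
}

// Hypothetical helper: copies exactly view.size() characters into a
// std::string, which owns its own null-terminated buffer.
std::string ToNullTerminated(std::string_view view)
{
    return std::string(view);
}

void DoSomething(std::string_view name)
{
    // The temporary std::string lives until the end of the full expression,
    // so the pointer returned by c_str() stays valid during the call.
    SomeCApi(ToNullTerminated(name).c_str());
}
```

With this version, passing a view of “Connie” taken from the longer string should print only the viewed part. The cost is one string copy (and possibly an allocation) per call; whether that is acceptable depends on how often the function runs.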


Repro Code

// The Case of string_view and the Magic String 
// -- by Giovanni Dicanio

#include <stdio.h>

#include <iostream>
#include <string>
#include <string_view>

void SomeCApi(const char* name)
{
    printf("Hello, %s!\n", name);
}

void DoSomething(std::string_view name)
{
    SomeCApi(name.data());
}

int main()
{
    std::string msg = "Connie is learning C++";
    auto untilFirstSpace = msg.find(' ');

    std::string_view v{ msg.data(), untilFirstSpace };

    std::cout << "String view: " << v << '\n';

    DoSomething(v);
}