Fixing the string_view-and-the-Magic-String Bug

In a previous blog post we saw an interesting and subtle bug involving std::string_view.

So, how can you fix that bug?

Well, one option is to create a std::string instance from the string_view, and then invoke the string::c_str() method to pass a properly null-terminated string to the legacy C API:

void DoSomething(std::string_view name)
{
    // BUG:
    //   SomeCApi(name.data());
    //
    // FIX:
    SomeCApi(std::string{ name.data(), name.length() }
             .c_str());
}

In fact, string::c_str() guarantees that the returned string is null-terminated.
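If you want to see the difference for yourself, here is a minimal sketch. The function names are made up for illustration and are not part of the original repro code. Note that the raw-pointer variant is only well-defined when the view points into a null-terminated buffer, as it does when the view is built on top of a std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for a C API: it measures its argument the way C code
// does, by scanning for the first '\0'.
std::size_t LengthSeenByCApi(const char* s)
{
    assert(s != nullptr);
    return std::strlen(s);
}

// Passing the view's raw pointer: the C side ignores the view's length and
// keeps reading up to the terminator of the underlying buffer.
// Assumption: the view points into a null-terminated buffer.
std::size_t LengthViaRawData(std::string_view sv)
{
    return LengthSeenByCApi(sv.data());
}

// Passing a std::string copy: c_str() is terminated right after the view's
// last character. The temporary lives until the call has returned.
std::size_t LengthViaStringCopy(std::string_view sv)
{
    return LengthSeenByCApi(std::string{ sv.data(), sv.length() }.c_str());
}
```

With a view covering just the first word of the std::string "Connie is learning C++", the raw-pointer version sees all 22 characters, while the std::string copy sees the 6 characters of the view.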

So, you might be tempted to write a simple inline helper function to hide the previous ugly code:

inline const char* StringViewToCApi(std::string_view sv)
{
    return std::string{ sv.data(), sv.length() }.c_str();
}

But, if you try it out, you get the following output:

Weird characters are printed out instead of “Connie”.

So, it looks like this time the cure is worse than the disease!

You got those weird characters instead of the expected “Connie”!

What’s going on here?

Well, if you take a look at your code in the Visual Studio IDE, you’ll note that the offending line is properly squiggled; and if you hover over that line with the mouse cursor, you get an interesting and clear explanation:

“The pointer is dangling because it points at a temporary instance which was destroyed.”

The Visual Studio 2019 IDE clearly diagnosed the problem in that code.

Basically, you created a temporary std::string instance inside the helper function, and then you invoked c_str() on it. The string::c_str() method returns a pointer to the character buffer owned by that temporary string, and the temporary is destroyed at the end of the return statement, before control goes back to the caller. As a result, the pointer returned to the caller is dangling: it points to memory that is no longer valid!

In fact, those box-drawing characters shown in the output correspond to a sequence of 0xCC bytes, which the Microsoft Visual C++ compiler uses in debug builds to fill uninitialized stack memory.
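To make the order of events visible without touching any invalid memory, here is a small sketch. Everything in it is hypothetical: Tracer stands in for std::string and simply records when it is destroyed, and UsePointer stands in for the legacy C API. Tracer::c_str() returns a string literal, so nothing actually dangles in this demo; it only shows when the temporary goes away.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log, used only to record the order of operations.
inline std::vector<std::string>& Events()
{
    static std::vector<std::string> events;
    return events;
}

// Hypothetical stand-in for std::string: it records its own destruction.
struct Tracer
{
    ~Tracer() { Events().push_back("temporary destroyed"); }
    const char* c_str() const { return "text"; }
};

// Hypothetical stand-in for the legacy C API.
void UsePointer(const char* p)
{
    assert(p != nullptr);
    Events().push_back("pointer used");
}

// Same shape as the buggy helper: the temporary is destroyed as part of the
// return statement, before the caller gets the pointer.
const char* HelperLikeTheBuggyOne()
{
    return Tracer{}.c_str();
}

void CallThroughHelper()
{
    Events().clear();
    UsePointer(HelperLikeTheBuggyOne());
}

// Same shape as the code that works: the temporary lives until the end of
// the full expression, which includes the call to UsePointer.
void CallDirectly()
{
    Events().clear();
    UsePointer(Tracer{}.c_str());
}
```

After CallThroughHelper(), the log reads "temporary destroyed" and then "pointer used"; after CallDirectly(), the order is reversed. That ordering is the whole difference between the dangling pointer and the valid one.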

So, how can you fix this bug?

Well, unfortunately, you simply cannot safely return a const char* pointer obtained from string::c_str() if the string object it came from is destroyed before the function returns.

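However, what you can do is simplify the original fix at the call site, using the std::string constructor that takes a string_view directly. The temporary std::string created this way lives until the end of the full expression, which includes the call to SomeCApi, so the pointer returned by c_str() stays valid for the whole duration of the call: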

void DoSomething(std::string_view name)
{
    // BUG: SomeCApi(name.data());
    //
    // FIX:
    //
    SomeCApi(std::string{ name }.c_str());
}

And, finally, you get the expected output!

The expected output is printed out using the fixed code.
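If you still want a reusable helper, one possible approach (not from the original code, just a sketch) is to turn the problem inside out: instead of returning a pointer, the helper keeps the std::string alive and passes the pointer to a callable that you supply. The names WithCString and LegacyLength are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical helper: passes a null-terminated copy of the view to any
// callable taking a const char*. The copy is a named local, so it is still
// alive while fn runs.
// Caveat: fn must not return or store the pointer it receives, because the
// copy is destroyed when this function returns.
template <typename Fn>
auto WithCString(std::string_view sv, Fn fn)
{
    const std::string copy{ sv };
    return fn(copy.c_str());
}

// Hypothetical stand-in for a legacy C API that expects a null-terminated
// string.
std::size_t LegacyLength(const char* name)
{
    assert(name != nullptr);
    return std::strlen(name);
}
```

For example, WithCString(v, LegacyLength) calls LegacyLength with a properly terminated copy of the view, and a lambda such as [](const char* s) { SomeCApi(s); } would work the same way. For a single call, the one-liner with std::string{ name }.c_str() is still simpler.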

Repro Code:

// FIX: The Case of string_view and the Magic String -- by Giovanni Dicanio

#include <stdio.h>

#include <iostream>
#include <string>
#include <string_view>

void SomeCApi(const char* name)
{
    printf("Hello, %s!\n", name);
}

void DoSomething(std::string_view name)
{
    // BUG: SomeCApi(name.data());
    //
    // FIX:
    //
    SomeCApi(std::string{ name }.c_str());
}

int main()
{
    std::string msg = "Connie is learning C++";
    auto untilFirstSpace = msg.find(' ');

    std::string_view v{ msg.data(), untilFirstSpace };

    std::cout << "String view: " << v << '\n';

    DoSomething(v);
}

 
