Safe Array Sample Code on GitHub

I uploaded some sample code to GitHub that shows how to create safe arrays in C++ and consume them in a C# application.

This project contains a C++ DLL (with a C interface) that exports two functions to produce safe arrays containing bytes and strings.

Then, there’s a WinForms C# application that consumes these safe arrays using proper PInvoke declarations, and shows their content on screen.

 

Marshal STL String Vectors Using Safe Arrays

Suppose you have a vector<string> in some cross-platform C++ code, and you want to marshal it across module or language boundaries on the Windows platform: Using a safe array is a valid option.

So, how can you achieve this goal?

Well, as is common in programming, you combine a few building blocks to get the solution to your problem.

A safe array can contain many different types, but the “natural” type for a Unicode string is BSTR. A BSTR is basically a length-prefixed Unicode string encoded using UTF-16.

ATL offers a convenient helper class to simplify safe array programming in C++: CComSafeArray. The MSDN Magazine article “Simplify Safe Array Programming in C++ with CComSafeArray” discusses with concrete sample code how to use this class. In particular, the paragraph “Producing a Safe Array of Strings” is the section of interest here.

So, this was the first building block. Now, let’s discuss the second.

You have a vector<string> as input. An important question to ask is what kind of encoding is used for the strings stored in the vector. It’s very common to store Unicode strings in std::string using the UTF-8 encoding. So, there’s an encoding impedance here: The input strings stored in the std::vector use UTF-8; but the output strings that will be stored as BSTR in the safe array use UTF-16. Ok, not a big problem: You just have to convert from UTF-8 to UTF-16. This is the other building block to solve the initial problem, and it’s discussed in the MSDN Magazine article “Unicode Encoding Conversions with STL Strings and Win32 APIs”.

So, to wrap up: You can go from a vector<string> to a safe array of BSTR strings following this path:

  1. Create a CComSafeArray<BSTR> of the same size as the input std::vector
  2. For each string in the input vector<string>, convert the UTF-8-encoded string to the corresponding UTF-16 wstring
  3. Create a CComBSTR from the previous wstring
  4. Invoke CComSafeArray::SetAt() to copy the CComBSTR into the safe array

Steps #1, #3, and #4 are discussed in the CComSafeArray MSDN article; step #2 is discussed in the Unicode encoding conversion MSDN article.
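
To make step #2 a bit more concrete, here is a minimal, portable sketch of the per-string conversion loop. This is not the code from the MSDN articles: Utf8ToUtf16 and ConvertAll are hypothetical names, the decoder is hand-rolled rather than built on a Win32 API call, and a std::vector<std::u16string> stands in for the CComSafeArray<BSTR>, so steps #3 and #4 appear only as comments.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: decodes one UTF-8 string into UTF-16.
// Hand-rolled, portable stand-in for a Win32-based conversion;
// throws std::invalid_argument on malformed input.
std::u16string Utf8ToUtf16(const std::string& utf8) {
    // Smallest code point that legitimately needs 0, 1, 2, 3 extra bytes
    static const char32_t kMinCodePoint[] = {0x0, 0x80, 0x800, 0x10000};

    std::u16string utf16;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("Invalid UTF-8 lead byte");

        if (extra > utf8.size() - i - 1)
            throw std::invalid_argument("Truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cont = static_cast<unsigned char>(utf8[i + k]);
            if ((cont & 0xC0) != 0x80)
                throw std::invalid_argument("Invalid UTF-8 continuation byte");
            cp = (cp << 6) | (cont & 0x3F);
        }
        i += extra + 1;

        // Reject overlong encodings, surrogates, and out-of-range values
        if (cp < kMinCodePoint[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("Invalid UTF-8 code point");

        if (cp < 0x10000) {
            utf16.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above U+FFFF become a surrogate pair in UTF-16
            cp -= 0x10000;
            utf16.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            utf16.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return utf16;
}

// Hypothetical helper mirroring the steps above, with a std::vector
// of UTF-16 strings standing in for the safe array of BSTRs.
std::vector<std::u16string> ConvertAll(const std::vector<std::string>& input) {
    std::vector<std::u16string> output;
    output.reserve(input.size());             // step 1: same size as the input
    for (const std::string& s : input) {
        output.push_back(Utf8ToUtf16(s));     // step 2: UTF-8 to UTF-16
        // Steps 3 and 4 (CComBSTR + CComSafeArray::SetAt) would go here
        // in the real Windows code.
    }
    return output;
}
```

In real Windows code, the body of the loop would end by building a CComBSTR from the UTF-16 string and storing it with CComSafeArray::SetAt, as described in the CComSafeArray article.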

Subtle Bug with std::min/max Function Templates

Suppose you have a function f that returns a double, and you want to store its return value in a variable if it is a positive number, or zero if it is negative. This line of C++ code tries to do that:

double x = std::max(0, f(/* something */));

Unfortunately, this apparently innocent code won’t compile!

The error message produced by VS2015’s MSVC compiler is not very clear, as often with C++ code involving templates.

So, what’s the problem with that code?

The problem is that the std::max function template is declared something like this:

template <typename T> 
const T& max(const T& a, const T& b)

If you look at the initial code invoking std::max, the first argument is of type int; the second argument is of type double (i.e. the return type of f).

Now, if you look at the declaration of std::max, you’ll see that both parameters are expected to be of the same type T. Implicit conversions are not considered during template argument deduction, so the C++ compiler complains because it gets conflicting results for T in the code calling std::max: should T be int or double?

This ambiguity triggers a compile-time error.

To fix this error, you can use the double literal 0.0 instead of 0.

And, what if instead of 0 there’s a variable of type int?

Well, in this case you can either static_cast that variable to double:

double x = std::max(static_cast<double>(n), f(/* something */));

or, as an alternative, you can explicitly specify the double template type for std::max:

double x = std::max<double>(n, f(/* something */));
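
For reference, here is a small compilable sketch that gathers the three fixes in one place. The function names are hypothetical and exist only for illustration; the value parameter plays the role of the result of f.

```cpp
#include <algorithm>

// Fix 1: use the double literal 0.0, so both arguments are double
double ClampToZero(double value) {
    // return std::max(0, value);  // would not compile: int vs. double
    return std::max(0.0, value);
}

// Fix 2: static_cast the int variable to double before the call
double MaxWithCast(int n, double value) {
    return std::max(static_cast<double>(n), value);
}

// Fix 3: explicitly specify the template type, so no deduction happens
// and n is implicitly converted to double
double MaxWithExplicitType(int n, double value) {
    return std::max<double>(n, value);
}
```

Note that each function returns the result by value. std::max returns a const reference to one of its arguments, and in the third form that argument may be a temporary double created from n, so it is safer to copy the result right away, as the code above does.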

Learning Modern C++ from Scratch

C++ is a language having a reputation of being hard to learn.

In this C++ course of mine published by Pluralsight, I did my best to prove the opposite: C++ can be learned in a simple, interesting, and fun way!

I used a variety of engaging visuals, metaphors and example demo code to try to teach modern, clear, good C++ from scratch, from the beginning, without any previous programming knowledge.

And, even if you already know C++, you may have fun watching this course as well.

Note: The table of contents and a brief course overview are freely available in the course page.

Sample slide: Iterating through std::vector’s elements

Here’s some of what my reviewers wrote about this course:

You sound really passionate about this technology.  It comes across in the narration and it’s quite infectious.

You’re a very talented teacher, offering lots of examples, analogies and stories that make the concepts easy to grasp.  The visuals are also really helpful for understanding the concepts.

Overall, I really enjoyed this module.  The content is logically structured, you do a great job explaining the concepts, supported by engaging visuals.  There’s also a nice mix of theory and demos.  You clearly understand your beginner audience, the knowledge they currently have, and how to lead them to a deeper understanding of this technology.  Bravo!

The demo showing the bug with implementing the swap function was excellent. It immediately reinforced your earlier lesson on the scope of local variables.

Fantastic use of Camtasia callouts in the demos.

Sample slide: Introducing the std::string class

I’d like to sincerely thank Mike Woodring of Pluralsight for approving this course idea, my fantastic editor Beth Gerard-Hess for her continuous encouragement and support during this course production (working with Beth is an absolute pleasure), Stephan T. Lavavej for interesting email conversations that provided good food for thought, all my reviewers (both peer and QA) for their quality course feedback, and all the people at Pluralsight who worked on this course.

This C++ course has been a labor of love for me, and I put my heart into it. I hope you will enjoy the result!

 

C++ Wrappers for Windows Registry APIs

I uploaded to GitHub some C++ code of mine that wraps some Windows registry C-interface APIs. The wrappers use RAII and STL classes like std::wstring and std::vector, and signal error conditions using exceptions.

Using these high-level C++ wrappers, you can easily access the Windows registry with simple code like this:

// Open a registry key
RegKey key{ 
    HKEY_CURRENT_USER, 
    LR"(SOFTWARE\MyKey\SubKey)" 
};

// Read a DWORD value
DWORD dw = key.GetDwordValue(L"MyValue1");

// Read a string value
std::wstring s = key.GetStringValue(L"MyValue2");

// Enumerate the values under the given key
auto values = key.EnumValues();

// etc.

In the May issue of MSDN Magazine you’ll find an article describing some of the techniques applied in this code.

EDIT 2017-05-02: Added link to my MSDN Magazine article.

Reorg of My Blog’s Taxonomies

Initially, I used categories as the only taxonomy to group my blog posts together based on their content.

More recently, I discovered that, in addition to categories, WordPress also offers tags. Actually, I initially thought of categories just like tags, but after some research on the Internet, I figured out there’s a difference between these taxonomies.

So, for quite a while I had wanted to reorganize my blog taxonomies, following the advice I read in several places to reduce the number of categories and add tags for finer-grained and cross-category classifications.

I spent a fair amount of time thinking about this taxonomy reorg for my blog, and re-tagging all the existing posts, and finally I was able to reduce the number of categories from the initial 19 to just 6.

In particular, I “moved” several previous categories (like Bugs, Performance, ATL, STL, Unicode, Pluralsight, etc.) to tags. Two important categories now are C++, which groups topics related to the C++ language and standard library, and Windows C++ Programming, which is focused on the application of C++ to Windows development (for example, think of Win32 C++ programming, ATL, and so on).

I hope this reorg will increase the “information organization” quality of this blog!

Enjoy! 🙂

A Few Options for Crossing Module Boundaries

It’s common to build complex software systems mixing components written in different languages.

For example, you may have a GUI written in C#, and some high-performance component written in C++, and you need to exchange data between these.

In such cases, there are several options. For example:

  1. COM: You can embed the C++ high-performance code in some COM component, exposing COM interfaces. The C# GUI subsystem talks to this high-performance component using COM interop.
  2. C-interface DLL: You can build a C-interface native DLL, “flattening” the C++ component interface using C functions. You can use PInvoke declarations on the C# side to communicate with the C++ component.
  3. C++/CLI: You can build a bridging layer between C++ and C# using C++/CLI.

Each one of these options has pros and cons.

For example, the C++/CLI approach is much easier than COM. However, C++/CLI is restricted to clients written in C# (and other .NET languages); instead COM components can be consumed by a broader audience.

The C-interface DLL option is also widely usable, as C is a great language for module boundaries, and many languages are able to “talk” with C interfaces. However, in this case you are flattening an object-oriented API to a C-style function-based interface (instead, both COM and C++/CLI maintain a more object-oriented nature).

Moreover, both COM and C++/CLI are Windows-specific technologies; on the other hand, a C interface resonates better with cross-platform code.
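
To give an idea of what “flattening” means in practice, here is a minimal sketch of the C-interface option. Everything in it is hypothetical: the Accumulator class, the Accumulator_* function names, and the convention of returning 0 on success and -1 on failure are just assumptions made for illustration. Export decorations (like __declspec(dllexport) on Windows) are omitted to keep the code portable.

```cpp
#include <vector>

// Hypothetical C++ component with an object-oriented interface
class Accumulator {
public:
    void Add(double value) { m_values.push_back(value); }

    double Sum() const {
        double total = 0.0;
        for (double v : m_values) total += v;
        return total;
    }

private:
    std::vector<double> m_values;
};

// Flattened C interface: an opaque handle plus free functions.
// C++ exceptions must not cross the C boundary, so they are
// caught here and converted to error codes.
extern "C" {

typedef struct AccumulatorOpaque* AccumulatorHandle;

AccumulatorHandle Accumulator_Create() noexcept {
    try {
        return reinterpret_cast<AccumulatorHandle>(new Accumulator());
    } catch (...) {
        return nullptr;  // e.g. out of memory
    }
}

int Accumulator_Add(AccumulatorHandle handle, double value) noexcept {
    if (handle == nullptr) return -1;
    try {
        reinterpret_cast<Accumulator*>(handle)->Add(value);
        return 0;
    } catch (...) {
        return -1;
    }
}

int Accumulator_Sum(AccumulatorHandle handle, double* result) noexcept {
    if (handle == nullptr || result == nullptr) return -1;
    *result = reinterpret_cast<const Accumulator*>(handle)->Sum();
    return 0;
}

void Accumulator_Destroy(AccumulatorHandle handle) noexcept {
    delete reinterpret_cast<Accumulator*>(handle);
}

}  // extern "C"
```

A client (for example C# code using PInvoke declarations) would call Accumulator_Create once, pass the returned handle to the other functions, and finally release the object with Accumulator_Destroy. The object-oriented structure is still there, but the client sees it only through the handle.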

 

A Subtle Bug with PInvoke and Safe Arrays Storing Variant Bytes

When exchanging array data across module boundaries using safe arrays, I tend to prefer (and suggest) safe arrays of direct types, like BYTEs or BSTR strings, instead of safe arrays storing variants (that in turn contain BYTEs, BSTRs, etc.).

However, there are some scripting clients that only understand safe arrays storing variants. So, if you want to support such clients, you have to pack the original array data items into variants, and build a safe array of variants.

If you have a COM interface method or C-interface function that produces a safe array of variants that contain BSTR strings, and you want to consume this array in C# code, the following PInvoke seems to work fine:

[DllImport("NativeDll.dll", PreserveSig = false)]
public static extern void BuildVariantStringArray(
  [Out, MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_VARIANT)]
  out string[] result);

So, if you have a safe array of variants that contain BYTEs, you may deduce that such a PInvoke declaration would work fine as well:

[DllImport("NativeDll.dll", PreserveSig = false)]
public static extern void BuildVariantByteArray(
  [Out, MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_VARIANT)]
  out byte[] result);

I’ve just changed “string[]” to “byte[]” in the declaration of the “result” out parameter.

Unfortunately, this doesn’t work. What you get as a result in the output byte array is garbage.

The fix for a safe array of variant bytes is to use an object[] array in C#, which directly maps the original safe array of variants (as variants are marshaled to objects in C#):

[DllImport("NativeDll.dll", PreserveSig = false)]
public static extern void BuildVariantByteArray(
  [Out, MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_VARIANT)]
  out object[] result);

You then manually convert the returned object[] array to a byte[] array, for example using the C# Array.CopyTo method:

// Get a safe array of variants (that contain bytes).
object[] data;
BuildVariantByteArray(out data);

// "Render" (copy) the previous object array 
// to a new byte array.
byte[] byteData = new byte[data.Length];
data.CopyTo(byteData, 0);

// Use byteData...

A variant is marshaled using object in C#. So a safe array of variants is marshaled using an object array in C#. In the case of safe arrays of variant bytes, the returned bytes are boxed in objects. Using Array.CopyTo, these bytes get unboxed and stuffed into a byte array.

The additional CopyTo step doesn’t seem necessary for a safe array of string variants, probably because strings are already reference types (objects) in C#, so no unboxing is needed.

Still, I think this aspect of the .NET/C# marshaler should be fixed, and if a PInvoke declaration clearly states byte[] on the C# side, the marshaler should automatically unbox the bytes from the safe array of variants.

 

MSDN Magazine Article: Simplify Safe Array Programming in C++

The March 2017 issue of MSDN Magazine contains a feature article of mine on simplifying safe array programming in C++ with the help of the ATL’s CComSafeArray class template.

There is also an accompanying web-only side bar introducing the SAFEARRAY C data structure and some of the basic operations available for it via Win32 API calls, although for C++ code I encourage the use of a convenient higher-level C++ object-oriented wrapper like ATL::CComSafeArray.

Safe arrays are useful, for example, when you have a COM component and you want to exchange array data between the component and its clients (which can be written in languages other than C++, e.g. C# or scripting languages).

I wish I could have had such a resource available when I did some safe array programming in C++.

Some of the insights and experience I developed in that regard are distilled in the aforementioned article.

I hope it may be helpful to someone.

Check it out here!

 

Updates to the ATL/STL Unicode Encoding Conversion Code

I’ve updated my code on GitHub for converting between UTF-8, using STL std::string, and UTF-16, using ATL CStringW.

Now, on errors, the code throws instances of a custom exception class that is derived from std::runtime_error, and is capable of containing more information than a simple CAtlException.
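
As a rough idea of what such an exception class can look like, here is a minimal sketch. It is not the actual class from the GitHub code: the name Utf8ConversionError, the error code member, and the Direction enum are all hypothetical, chosen just to show how a class derived from std::runtime_error can carry extra information.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical exception class for Unicode conversion errors.
// In addition to the message stored by std::runtime_error, it
// carries an error code and the direction of the failed conversion.
class Utf8ConversionError : public std::runtime_error {
public:
    enum class Direction { Utf8ToUtf16, Utf16ToUtf8 };

    Utf8ConversionError(const std::string& message,
                        std::uint32_t errorCode,
                        Direction direction)
        : std::runtime_error(message),
          m_errorCode(errorCode),
          m_direction(direction) {}

    // Error code associated with the failure (assumed here to be
    // something like the value returned by GetLastError on Windows)
    std::uint32_t ErrorCode() const noexcept { return m_errorCode; }

    // Which conversion was being attempted when the error occurred
    Direction ConversionDirection() const noexcept { return m_direction; }

private:
    std::uint32_t m_errorCode;
    Direction m_direction;
};
```

Since the class derives from std::runtime_error, existing handlers that catch std::runtime_error or std::exception keep working, while code that catches the more specific type can also inspect the error code and the conversion direction.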

Moreover, I’ve added a couple of overloads for converting from source string views (specified using an STL-style [start, finish) pointer range). This makes it possible to efficiently convert only portions of longer strings, without creating ad hoc CString or std::string instances to store those partial views.