Safe Array Sample Code on GitHub

I uploaded some sample code to GitHub that shows how to create safe arrays in C++ and consume them in a C# application.

This project contains a C++ DLL (with a C interface) that exports two functions to produce safe arrays containing bytes and strings.

Then, there’s a WinForms C# application that consumes these safe arrays using proper PInvoke declarations, and shows their content on screen.

 

Marshal STL string vectors Using Safe Arrays

Suppose you have a vector<string> in some cross-platform C++ code, and you want to marshal it across module or language boundaries on the Windows platform: Using a safe array is a valid option.

So, how can you achieve this goal?

Well, as is common in programming, you combine a few building blocks to get the solution to your problem.

A safe array can contain many different types, but the “natural” type for a Unicode string is BSTR. A BSTR is basically a length-prefixed Unicode string encoded using UTF-16.

ATL offers a convenient helper class to simplify safe array programming in C++: CComSafeArray. The MSDN Magazine article “Simplify Safe Array Programming in C++ with CComSafeArray” discusses with concrete sample code how to use this class. In particular, the paragraph “Producing a Safe Array of Strings” is the section of interest here.

So, this was the first building block. Now, let’s discuss the second.

You have a vector<string> as input. An important question to ask is what kind of encoding is used for the strings stored in the vector. It’s very common to store Unicode strings in std::string using the UTF-8 encoding. So, there’s an encoding impedance here: The input strings stored in the std::vector use UTF-8; but the output strings that will be stored as BSTR in the safe array use UTF-16. Ok, not a big problem: You just have to convert from UTF-8 to UTF-16. This is the other building block to solve the initial problem, and it’s discussed in the MSDN Magazine article “Unicode Encoding Conversions with STL Strings and Win32 APIs”.

So, to wrap up: You can go from a vector<string> to a safe array of BSTR strings following this path:

  1. Create a CComSafeArray<BSTR> of the same size as the input std::vector
  2. For each string in the input vector<string>, convert the UTF-8-encoded string to the corresponding UTF-16 wstring
  3. Create a CComBSTR from the previous wstring
  4. Invoke CComSafeArray::SetAt() to copy the CComBSTR into the safe array

Steps #1, #3, and #4 are discussed in the CComSafeArray MSDN article; step #2 is discussed in the Unicode encoding conversion MSDN article.
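To make step #2 more concrete, here is a minimal, portable sketch of a UTF-8 to UTF-16 conversion. It is only an illustration under some assumptions: the names Utf8ToUtf16 and Utf8VectorToUtf16 are hypothetical, std::u16string stands in for the UTF-16 wstring you would have on Windows, and the decoding is written by hand so that the code compiles anywhere. In real Windows code you would more likely call the Win32 MultiByteToWideChar API with CP_UTF8. Steps #1, #3, and #4 need ATL and are not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not from the articles): converts UTF-8 to UTF-16.
// Throws std::invalid_argument on malformed input.
std::u16string Utf8ToUtf16(const std::string& utf8)
{
    std::u16string utf16;
    const std::size_t n = utf8.size();
    std::size_t i = 0;

    while (i < n)
    {
        const auto lead = static_cast<unsigned char>(utf8[i]);
        std::uint32_t cp = 0;        // decoded code point
        std::size_t extra = 0;       // continuation bytes expected
        std::uint32_t minValue = 0;  // smallest code point valid for this length

        if (lead < 0x80)                { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; minValue = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; minValue = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; minValue = 0x10000; }
        else throw std::invalid_argument("Invalid UTF-8 lead byte");

        if (extra > n - i - 1)
            throw std::invalid_argument("Truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k)
        {
            const auto c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("Invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3Fu);
        }
        i += extra + 1;

        // Reject overlong forms, surrogate code points, and values past U+10FFFF.
        if (cp < minValue || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("Invalid Unicode code point in UTF-8 input");

        if (cp < 0x10000)
        {
            utf16.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            // Code points above the BMP are encoded as a surrogate pair.
            cp -= 0x10000;
            utf16.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            utf16.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }

    assert(i == n);  // every input byte was consumed
    return utf16;
}

// Applies the conversion to each string of the input vector (step #2 of the list).
std::vector<std::u16string> Utf8VectorToUtf16(const std::vector<std::string>& input)
{
    std::vector<std::u16string> result;
    result.reserve(input.size());
    for (const auto& s : input)
        result.push_back(Utf8ToUtf16(s));
    return result;
}
```

Each resulting UTF-16 string would then be wrapped in a CComBSTR and copied into the safe array, as described in steps #3 and #4.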

A Few Options for Crossing Module Boundaries

It’s common to build complex software systems mixing components written in different languages.

For example, you may have a GUI written in C#, and some high-performance component written in C++, and you need to exchange data between these.

In such cases, there are several options. For example:

  1. COM: You can embed the C++ high-performance code in some COM component, exposing COM interfaces. The C# GUI subsystem talks to this high-performance component using COM interop.
  2. C-interface DLL: You can build a C-interface native DLL, “flattening” the C++ component interface using C functions. You can use PInvoke declarations on the C# side to communicate with the C++ component.
  3. C++/CLI: You can build a bridging layer between C++ and C# using C++/CLI.

Each of these options has pros and cons.

For example, the C++/CLI approach is much easier than COM. However, C++/CLI is restricted to clients written in C# (and other .NET languages), whereas COM components can be consumed by a broader audience.

The C-interface DLL option is also widely usable, as C is a great language for module boundaries, and many languages are able to “talk” with C interfaces. However, in this case you are flattening an object-oriented API to a C-style function-based interface (instead, both COM and C++/CLI maintain a more object-oriented nature).
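As a rough illustration of what “flattening” means, here is a minimal sketch of a small C++ class exposed through C-style functions and an opaque handle. Everything in it is hypothetical (the ByteProducer class, the ByteProducer_* function names, the handle type); it is not the code from the GitHub sample. A real Windows DLL would also need export declarations (for example __declspec(dllexport) or a .def file), which are left out here so that the sketch compiles anywhere.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical C++ component with an object-oriented interface.
class ByteProducer
{
public:
    explicit ByteProducer(std::size_t count)
    {
        data_.reserve(count);
        for (std::size_t i = 0; i < count; ++i)
            data_.push_back(static_cast<std::uint8_t>(i & 0xFF));
    }

    std::size_t Size() const noexcept { return data_.size(); }

    // Copies at most 'capacity' bytes into 'dest'; returns the number copied.
    std::size_t CopyTo(std::uint8_t* dest, std::size_t capacity) const noexcept
    {
        const std::size_t n = std::min(capacity, data_.size());
        std::copy_n(data_.begin(), n, dest);
        return n;
    }

private:
    std::vector<std::uint8_t> data_;
};

// Flattened C interface: the object becomes an opaque handle,
// and each method becomes a free function taking that handle.
extern "C"
{
    typedef struct ByteProducerHandle_* ByteProducerHandle;

    ByteProducerHandle ByteProducer_Create(std::size_t count)
    {
        try
        {
            return reinterpret_cast<ByteProducerHandle>(new ByteProducer(count));
        }
        catch (...)
        {
            // C++ exceptions must not cross the C boundary.
            return nullptr;
        }
    }

    void ByteProducer_Destroy(ByteProducerHandle h)
    {
        delete reinterpret_cast<ByteProducer*>(h);  // deleting null is a no-op
    }

    std::size_t ByteProducer_Size(ByteProducerHandle h)
    {
        assert(h != nullptr);  // precondition: valid handle
        return reinterpret_cast<const ByteProducer*>(h)->Size();
    }

    std::size_t ByteProducer_CopyTo(ByteProducerHandle h,
                                    std::uint8_t* dest,
                                    std::size_t capacity)
    {
        assert(h != nullptr);  // precondition: valid handle
        return reinterpret_cast<const ByteProducer*>(h)->CopyTo(dest, capacity);
    }
}
```

The client only ever sees the handle and the four functions, so the object-oriented shape of the original class is lost at the boundary; that is the trade-off described above.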

Moreover, both COM and C++/CLI are Windows-specific technologies; on the other hand, a C interface resonates better with cross-platform code.

 

A Subtle Bug with PInvoke and Safe Arrays Storing Variant Bytes

When exchanging array data across module boundaries using safe arrays, I tend to prefer (and suggest) safe arrays of direct types, like BYTEs or BSTR strings, instead of safe arrays storing variants (that in turn contain BYTEs, or BSTRs, etc.).

However, there are some scripting clients that only understand safe arrays storing variants. So, if you want to support such clients, you have to pack the original array data items into variants, and build a safe array of variants.

If you have a COM interface method or C-interface function that produces a safe array of variants that contain BSTR strings, and you want to consume this array in C# code, the following PInvoke declaration seems to work fine:

[DllImport("NativeDll.dll", PreserveSig = false)]
public static extern void BuildVariantStringArray(
  [Out, MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_VARIANT)]
  out string[] result);

So, if you have a safe array of variants that contain BYTEs, you may deduce that such a PInvoke declaration would work fine as well:

[DllImport("NativeDll.dll", PreserveSig = false)]
public static extern void BuildVariantByteArray(
  [Out, MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_VARIANT)]
  out byte[] result);

I’ve just changed “string[]” to “byte[]” in the declaration of the “result” out parameter.

Unfortunately, this doesn’t work. What you get as a result in the output byte array is garbage.

The fix for a safe array of variant bytes is to use an object[] array in C#, which maps directly to the original safe array of variants (as variants are marshaled to objects in C#):

[DllImport("NativeDll.dll", PreserveSig = false)]
public static extern void BuildVariantByteArray(
  [Out, MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_VARIANT)]
  out object[] result);

And then manually convert from the returned object[] array to a byte[] array, for example using the C# Array.CopyTo method; e.g.:

// Get a safe array of variants (that contain bytes).
object[] data;
BuildVariantByteArray(out data);

// "Render" (copy) the previous object array 
// to a new byte array.
byte[] byteData = new byte[data.Length];
data.CopyTo(byteData, 0);

// Use byteData...

A variant is marshaled using object in C#. So a safe array of variants is marshaled using an object array in C#. In the case of safe arrays of variant bytes, the returned bytes are boxed in objects. Using Array.CopyTo, these bytes get unboxed and stuffed into a byte array.

The additional CopyTo step doesn’t seem necessary for the safe array of string variants, probably because strings are already reference types (objects) in C#.

Still, I think this aspect of the .NET/C# marshaler should be fixed, and if a PInvoke declaration clearly states byte[] on the C# side, the marshaler should automatically unbox the bytes from the safe array of variants.

 

MSDN Magazine Article: Simplify Safe Array Programming in C++

The March 2017 issue of MSDN Magazine contains a feature article of mine on simplifying safe array programming in C++ with the help of ATL’s CComSafeArray class template.

There is also an accompanying web-only sidebar introducing the SAFEARRAY C data structure and some of the basic operations available for it via Win32 API calls, although for C++ code I encourage the use of a convenient higher-level C++ object-oriented wrapper like ATL::CComSafeArray.

Safe arrays are useful, for example, when you have a COM component and you want to exchange array data between the component and its clients (which can potentially be written in languages other than C++, e.g. C# or scripting languages).

I wish I could have had such a resource available when I did some safe array programming in C++.

Some of the insights and experience I developed in that regard are distilled in the aforementioned article.

I hope it may be helpful to someone.

Check it out here!

 

Building In-Process Shell Extensions? C++ Is The Right Tool For The Job

In today’s world, there are so many programming languages to choose from when developing software projects. So, oftentimes, this question arises: “With so many simpler and higher-level programming languages, why should I choose C++ to do X?”

Well, programming languages (and frameworks) are just tools. And my usual guidance is: use the right tool for the job. So, if for your project a simpler/more productive/higher-level programming language is well suited, then just go for it!

But, there are cases in which C++ is just The Best Tool For The Job.

C++ great for perf

One of these contexts is the development of in-process shell extensions for Windows. There is a whole MSDN web page discussing “Guidance for Implementing In-Process Extensions”.

A key point is that the .NET Framework/CLR is a high-impact runtime, with associated performance issues for in-proc extensions:

“Performance issues can arise with runtimes that impose a significant performance penalty when they are loaded into a process. The performance penalty can be in the form of memory usage, CPU usage, elapsed time, or even address space consumption. The CLR, JavaScript/ECMAScript, and Java are known to be high-impact runtimes. Since in-process extensions can be loaded into many processes, and are often done so at performance-sensitive moments (such as when preparing a menu to be displayed the user), high-impact runtimes can negatively impact overall responsiveness. […]”

The aforementioned MSDN documentation continues with a brief discussion of issues associated with high resource consumption as well.

Then, another paragraph about “Issues Specific to the .NET Framework” briefly touches on COM interop related problems. It’s important to keep in mind that the in-proc shell extension model was designed around native code, and there is a kind of “impedance mismatch” between that and the managed .NET world.

Note that the use of .NET is considered acceptable for other types of extensions, like out-of-process extensions.

A common type of shell extension is the context-menu extension.

Example of customized context-menu via shell extensions

If you are curious about the development of this type of extension using C++, you may find my Pluralsight course on the subject interesting.

(A brief description of the course content is offered in this blog post.)

 

Pluralsight Course: Building Context-Menu Shell Extensions in C++

I “wrote” a couple of video courses published on Pluralsight (a third one is work in progress, stay tuned!).

My first Pluralsight course was “Building Context-Menu Shell Extensions in C++”. It’s a slightly less than three-hour course, in which I teach you how to build context-menu shell extensions in C++, using Visual Studio.

The course starts with a brief introduction to COM: just to those COM concepts required for the remaining course modules.

Then, in the following module, I introduce the use of IExecuteCommand to build a simple context-menu shell extension. In this module, I use just “raw” C++, without any frameworks (like ATL). This approach gives the opportunity to show how some things work “under the hood”.

In the next module, I revisit the IExecuteCommand technique, but this time with the help of ATL. ATL is a very useful, productivity-oriented framework for C++/COM programmers: comparing the work done in the previous module with the ATL-based approach presented in this module will make you appreciate the productivity improvements brought by ATL (and the Visual Studio ATL Wizards).

In the final module I introduce you to an IContextMenu-based technique for building context-menu shell extensions. There are pros and cons in using IExecuteCommand vs. IContextMenu. For example, while IContextMenu is available in Windows XP, IExecuteCommand is a Win7+ COM interface. So, if you need to develop a context-menu shell extension that supports XP, you have to use IContextMenu.

Moreover, while IExecuteCommand simplifies some common operations, more advanced techniques like building fancy UIs in the context-menu (for example, implementing owner-drawn menu items) require the use of IContextMenu and its later incarnations (like IContextMenu3).

I hope you enjoy the course.

 

Simplifying SAFEARRAY programming with CComSafeArray

EDIT 2017-03-01: Please read my MSDN Magazine article:

Simplify Safe Array Programming in C++ with CComSafeArray

for a detailed discussion.

 

Original blog post follows:

There are some questions on the MSDN forums about passing arrays between C++ and C#. This topic is rich and there are several options. There was a recent thread on this topic on the Visual C++ MFC and ATL MSDN Forum. As requested by the OP, some simple C++ code using raw C arrays, with the corresponding C# P/Invoke signature, was shown.

Another option consists of using SAFEARRAYs. While programming SAFEARRAYs using raw C functions (e.g. SafeArrayCreate, SafeArrayLock, etc.) can lead to some boilerplate code, ATL offers a convenient CComSafeArray helper class that makes the life of SAFEARRAY programmers much easier.

CComSafeArray is a C++ class template; suppose that a SAFEARRAY of LONGs is requested: it can be simply created with code like this:

       CComSafeArray<LONG> saData(count);

 

Thanks to CComSafeArray, SAFEARRAY items can be accessed using classic operator[], like this:

        for (int i = 0; i < count; i++)
            saData[i] = (i+1)*10;

 

And to transfer the created SAFEARRAY to the caller (assuming there is a SAFEARRAY ** ppsaData output parameter), simple code like this works fine:

 

      *ppsaData = saData.Detach();

 

It is worth noting that CComSafeArray can throw C++ exceptions (instances of the CAtlException class) on error. So, if the method in which the CComSafeArray is used returns an HRESULT (as is common for COM methods), it is wise to guard the code using CComSafeArray in a try/catch block, to convert CAtlException instances back to HRESULT error codes that can cross COM method boundaries, e.g.:

    try
    {
        // Create a safe array of LONGs
        CComSafeArray<LONG> saData(count);

        // Fill with some data…
        for (int i = 0; i < count; i++)
            saData[i] = (i+1)*10;

        // Copy to output parameter
        *ppsaData = saData.Detach();
    }
    catch (const CAtlException & e)
    {
        // Trap errors signaled as C++ ATL exceptions
        // and return the corresponding HRESULT
        return e; // implicit conversion to HRESULT
    }

    // All right
    return S_OK;

 

A sample solution illustrating these concepts is attached to this blog post.

 

COM Automatic Initialization and Cleanup (and Text to Speech…)

Suppose we have some COM code where instances of CComPtr are used to conveniently wrap COM interface pointers:

{
    HRESULT hr = CoInitialize(NULL);
    // check return value…

    CComPtr<ISomeInterface> sp1;
    CComPtr<IAnotherInterface> sp2;
    …
    // Do something with interface pointers
    …
    CoUninitialize();
}

 

This code hides a subtle bug: CoUninitialize is called before the CComPtr destructors run, which happens only at the closing brace of the block. Instead, correct logic requires that CoUninitialize be called after every COM interface pointer is released (in its own wrapping CComPtr destructor).

Actually, there is also a problem of exception safety here. In fact, if some exception is thrown in the middle of the code block, the call to CoUninitialize is missed.

To correct both these problems, it is possible to define a C++ class following the RAII pattern. The constructor of this class will call CoInitialize, and throw an exception if initialization failed. The class destructor will call CoUninitialize; so every successful call to CoInitialize will have a matching call to CoUninitialize, as prescribed by COM programming rules.

Moreover, assuming that instances of this class are created (on the stack) before instances of CComPtr (or other COM smart pointers), CoUninitialize will be the last call, after every CComPtr destructor is called:

{
    // COM automatic initialization and cleanup
    CComAutoInit comInit;

    CComPtr<ISomeInterface> sp1;
    CComPtr<IAnotherInterface> sp2;
    …
    // Do something with interface pointers
    …
}

 

The complete listing of this custom CComAutoInit class is attached to this blog post. There are some additional details, like a private copy constructor and operator=, to ban copy semantics for this class.

Moreover, there is an (explicit) overload of the CComAutoInit constructor which takes a DWORD parameter corresponding to the dwCoInit parameter of CoInitializeEx.
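To illustrate the destruction-order argument without requiring Windows, here is a minimal, portable sketch of the same RAII idea. It is not the attached CComAutoInit listing: FakeCoInitialize, FakeCoUninitialize, FakeInterfacePtr, and ComAutoInitSketch are hypothetical stand-ins for CoInitialize, CoUninitialize, CComPtr, and CComAutoInit, and they only record events in a log so that the order can be inspected. Copying is banned with = delete, the modern counterpart of the private copy constructor and operator= mentioned above.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Event log and init counter used only to observe the order of operations.
std::vector<std::string> g_log;
int g_initCount = 0;

// Stand-in for CoInitialize: a non-negative return value plays the role of success.
long FakeCoInitialize()
{
    ++g_initCount;
    g_log.push_back("init");
    return 0;
}

// Stand-in for CoUninitialize: must match a previous successful initialization.
void FakeCoUninitialize()
{
    assert(g_initCount > 0);
    --g_initCount;
    g_log.push_back("uninit");
}

// RAII guard: initializes in the constructor, cleans up in the destructor.
class ComAutoInitSketch
{
public:
    ComAutoInitSketch()
    {
        if (FakeCoInitialize() < 0)
            throw std::runtime_error("Initialization failed");
    }

    ~ComAutoInitSketch() { FakeCoUninitialize(); }

    ComAutoInitSketch(const ComAutoInitSketch&) = delete;
    ComAutoInitSketch& operator=(const ComAutoInitSketch&) = delete;
};

// Stand-in for a COM smart pointer: logs when its "interface" is released.
class FakeInterfacePtr
{
public:
    explicit FakeInterfacePtr(std::string name) : name_(std::move(name)) {}
    ~FakeInterfacePtr() { g_log.push_back("release " + name_); }

    FakeInterfacePtr(const FakeInterfacePtr&) = delete;
    FakeInterfacePtr& operator=(const FakeInterfacePtr&) = delete;

private:
    std::string name_;
};

// Normal path: the guard is declared first, so it is destroyed last.
std::vector<std::string> RunScope()
{
    g_log.clear();
    {
        ComAutoInitSketch comInit;
        FakeInterfacePtr sp1("sp1");
        FakeInterfacePtr sp2("sp2");
    }
    return g_log;
}

// Exception path: stack unwinding still runs the guard's destructor.
std::vector<std::string> RunScopeThatThrows()
{
    g_log.clear();
    try
    {
        ComAutoInitSketch comInit;
        FakeInterfacePtr sp1("sp1");
        throw std::runtime_error("Something went wrong");
    }
    catch (const std::runtime_error&)
    {
    }
    return g_log;
}
```

Because C++ destroys local objects in reverse order of construction, RunScope records "init", "release sp2", "release sp1", "uninit", and RunScopeThatThrows records "init", "release sp1", "uninit": the cleanup call always comes after every pointer release, even when an exception is thrown.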

A working sample showing how to use this CComAutoInit class is attached to this blog post as well. It is basically a C++ command line app that “speaks” the arguments passed to it. (A slightly more complex GUI dialog-based MFC text-speaker app can be found on MSDN Code Gallery, too.)

 

Shell Extensions Tutorials

EDIT: Check out my Context Menu Shell Extension Pluralsight course!

Writing shell extensions is one of those programming tasks in which C++ (with the help of a library like ATL) excels.

(A Microsoft guy explained here why it is better to avoid .NET for writing shell extensions.)

Michael Dunn (a former Visual C++ MVP) wrote a very interesting series of tutorials on CodeProject on developing shell extensions:

1. A step-by-step tutorial on writing shell extensions.
2. A tutorial on writing a shell extension that operates on multiple files at once.
3. A tutorial on writing a shell extension that shows pop-up info for files.
4. A tutorial on writing a shell extension that provides custom drag and drop functionality.
5. A tutorial on writing a shell extension that adds pages to the properties dialog of files.
6. A tutorial on writing a shell extension that can be used on the Send To menu.
7. A tutorial on using owner-drawn menus in context menu shell extensions, and on making a context menu extension that responds to a right-click in a directory background.
8. A tutorial on adding columns to Explorer’s details view via a column handler shell extension.
9. A tutorial on writing an extension to customize the icons displayed for a file type.