GDHasher

I developed a simple utility that offers a “right-click and hash” interface for calculating MD5, SHA-1 and SHA-256 hashes of files.


The app is written in C++ with ATL, WTL, and some STL and Boost. (The CRT is statically linked, so there is no need to download and install the Visual C++ run-time redistributables. The all-in-one installer is an .exe of just a couple of hundred kilobytes.)


I tested the app on both Windows 7 64-bit and Windows XP SP2 32-bit, and it seems to work correctly. (However, note that SHA-256 isn’t available on XP SP2.)


Thanks to David Ching for his valuable feedback and suggestions, and to Michael Dunn for his excellent CodeProject articles on WTL and shell extensions.


Installers for both 32-bit (x86) and 64-bit (x64) versions are attached to this blog post.


Enjoy!


Simplifying SAFEARRAY programming with CComSafeArray

There are some questions on the MSDN forums about passing arrays between C++ and C#. This topic is rich, and there are several options. In a recent thread on the Visual C++ MFC and ATL MSDN Forum, simple C++ code using raw C arrays, together with the corresponding C# P/Invoke signature, was shown, as requested by the OP.


Another option is to use SAFEARRAYs. While programming SAFEARRAYs with the raw C functions (e.g. SafeArrayCreate, SafeArrayLock, etc.) can lead to some boilerplate code, ATL offers a convenient CComSafeArray helper class that makes the life of SAFEARRAY programmers much easier.


CComSafeArray is a C++ class template. Suppose that a SAFEARRAY of LONGs is requested; it can be created with code like this:


       CComSafeArray<LONG> saData(count);


 


Thanks to CComSafeArray, SAFEARRAY items can be accessed using classic operator[], like this:


        for (int i = 0; i < count; i++)
            saData[i] = (i+1)*10;


 


To transfer the created SAFEARRAY to the caller (assuming there is a SAFEARRAY ** ppsaData output parameter), simple code like this works fine:


 

      *ppsaData = saData.Detach();


 


It is worth noting that CComSafeArray can throw C++ exceptions (instances of the CAtlException class) on error. So, if the method in which CComSafeArray is used returns an HRESULT (as is common for COM methods), the code using CComSafeArray should be guarded with a try/catch block that converts CAtlException instances back to HRESULT error codes, which can cross COM method boundaries, e.g.:


    try
    {
        // Create a safe array of LONGs
        CComSafeArray<LONG> saData(count);

        // Fill with some data…
        for (int i = 0; i < count; i++)
            saData[i] = (i+1)*10;

        // Copy to output parameter
        *ppsaData = saData.Detach();
    }
    catch (const CAtlException & e)
    {
        // Trap errors signaled as C++ ATL exceptions
        // and return the corresponding HRESULT
        return e;  // implicit cast to HRESULT
    }

    // All right
    return S_OK;


 


A sample solution illustrating these concepts is attached to this blog post.

Conversion between Unicode UTF-8 and UTF-16 with STL strings

Suppose there is a need to convert between Unicode UTF-8 and Unicode UTF-16 in a Windows C++ application. This can happen because it is good to use UTF-16 as the Unicode encoding inside a C++ app (in fact, UTF-16 is the encoding used by Win32 Unicode APIs), and use UTF-8 outside app boundaries (e.g. text files, etc.).



To do that, it is possible to use ATL conversion helpers like CA2W and CW2A, as shown in this blog post by Kenny Kerr. Or it is possible to directly use MultiByteToWideChar and WideCharToMultiByte and CString(A/W) class as illustrated in a previous blog post here.


Another option is to use STL strings instead of ATL/MFC CString. An advantage of this approach is that it also works with the Express editions of Visual Studio (which do not include ATL and MFC). Moreover, STL strings are better integrated with the rest of the STL and with Boost, and there are C++ programmers who simply prefer STL strings to ATL/MFC CString. The code that uses STL strings is similar to that illustrated previously for CString. Considering a conversion from UTF-8 to UTF-16, the MultiByteToWideChar API is called twice: the first call determines the length of the resulting UTF-16 string, so that enough memory can be reserved for it; the second call performs the actual conversion. The same pattern is followed for the symmetric conversion (from UTF-16 to UTF-8, this time using the WideCharToMultiByte API).


A couple of differences between CString and STL’s strings in the context of Win32 programming are worth noting.


First, Win32 APIs tend to receive input strings in the form of LPCTSTR, which is a typedef for “const TCHAR *”, i.e. these are raw, NUL-terminated C strings. CString plays well with this model: it is possible to simply pass CString instances where LPCTSTR parameters are expected (thanks to the PCXSTR conversion operator implemented by CSimpleStringT, the base class of CStringT). Instead, with std::[w]string arguments, the c_str() or data() methods must be called explicitly.


Moreover, when there is a need to reserve some memory inside a CString buffer to modify its content directly, it is possible to call the GetBuffer() or GetBufferSetLength() methods (these methods return a non-const pointer to the internal string buffer, allowing direct modification of its content). Instead, with STL strings it is possible to call the resize() method to reserve enough memory for the string content, and then use code like &myString[0] to get direct (non-const) access to the internal string content. (This technique works at least with the current Visual C++ implementation of STL strings; the C++11 standard also requires std::basic_string to store its characters contiguously.)
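The resize() and &myString[0] technique, combined with the "call once for the length, call again to fill" pattern, can be sketched in portable code. The sketch below is not the attached utf8conv.h: it uses std::snprintf as a stand-in for MultiByteToWideChar, only because both report the required length on a first call and write into a caller-supplied buffer on the second. The function name FormatValue is hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: builds "value=<n>" in a std::string using the
// two-call pattern (size query first, then fill the string's own buffer).
std::string FormatValue(int value)
{
    // First call: no buffer, size 0. The return value is the number of
    // characters required, not counting the terminating NUL.
    const int length = std::snprintf(nullptr, 0, "value=%d", value);
    if (length < 0)
        return std::string();  // formatting error

    // Reserve room inside the string (+1 because snprintf writes a NUL).
    std::string result(static_cast<std::size_t>(length) + 1, '\0');

    // Second call: write directly into the string's internal buffer.
    std::snprintf(&result[0], result.size(), "value=%d", value);

    // Drop the extra NUL so that size() matches the visible content.
    result.resize(static_cast<std::size_t>(length));
    return result;
}
```

With the Win32 conversion APIs the shape is the same: query the length, resize() the destination string, then pass &destination[0] as the output buffer.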


With these two differences between CString and STL’s strings in mind, it should be easy to follow the commented code in “utf8conv.h” file, attached to this blog post.


As a final note, the Win32 APIs used in the UTF-8 conversion process can fail; as is common in the Win32 programming model, the GetLastError function can be used to retrieve more details on the error. Instead of using return codes for error conditions, the attached source code throws C++ exceptions. For this purpose, an exception class named utf8_error is derived from std::exception and used to signal error conditions during the conversion process.
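As a rough illustration of this error-handling style, here is a minimal sketch of an exception type that carries a numeric error code next to its message. It is not the attached utf8_error class: the names conversion_error and ThrowIfFailed are hypothetical, the class derives from std::runtime_error (itself derived from std::exception) for brevity, and the error code is passed in as a plain unsigned long rather than read from GetLastError, so that the sketch stays portable.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type: a message plus the numeric error code
// that a function such as GetLastError would report.
class conversion_error : public std::runtime_error
{
public:
    conversion_error(const std::string& message, unsigned long errorCode)
        : std::runtime_error(message), m_errorCode(errorCode)
    {
    }

    unsigned long error_code() const { return m_errorCode; }

private:
    unsigned long m_errorCode;
};

// Hypothetical helper: the Win32 conversion functions return 0 on failure,
// so a zero result is turned into an exception carrying the error code.
inline void ThrowIfFailed(int apiResult, unsigned long lastError)
{
    if (apiResult == 0)
        throw conversion_error("conversion failed", lastError);
}
```

Callers can then catch std::exception (or the specific type) at a convenient level instead of checking a return code after every conversion.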


EDIT 2011, October 15th: Code Gallery sample can be found here.

COM Automatic Initialization and Cleanup (and Text to Speech…)

Suppose we have some COM code where instances of CComPtr are used to conveniently wrap COM interface pointers:


{
    HRESULT hr = CoInitialize(NULL);
    // check return value…

    CComPtr<ISomeInterface> sp1;
    CComPtr<IAnotherInterface> sp2;
    …
    // Do something with interface pointers
    …
    CoUninitialize();
}


 


This code hides a subtle bug: CoUninitialize is called before the CComPtr destructors run. Correct logic requires that CoUninitialize be called only after every COM interface pointer has been released (by the destructor of its wrapping CComPtr).


There is also a problem of exception safety here: if an exception is thrown in the middle of the code block, the call to CoUninitialize is skipped.


To correct both these problems, it is possible to define a C++ class following the RAII pattern. The constructor of this class will call CoInitialize, and throw an exception if initialization failed. The class destructor will call CoUninitialize; so every successful call to CoInitialize will have a matching call to CoUninitialize, as prescribed by COM programming rules.


Moreover, assuming that instances of this class are created (on the stack) before instances of CComPtr (or other COM smart pointers), CoUninitialize will be the last call, after every CComPtr destructor is called:


{
    // COM automatic initialization and cleanup
    CComAutoInit comInit;

    CComPtr<ISomeInterface> sp1;
    CComPtr<IAnotherInterface> sp2;
    …
    // Do something with interface pointers
    …
}

 


The complete listing of this custom CComAutoInit class is attached to this blog post. There are some additional details, such as a private copy constructor and operator=, which ban copy semantics for this class.


Moreover, there is an (explicit) overload of CComAutoInit constructor which takes a DWORD parameter corresponding to the dwCoInit parameter of CoInitializeEx.
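The ordering guarantee that this design relies on (stack objects are destroyed in reverse order of construction) can be checked with a small portable sketch. It makes no real COM calls and is not the attached CComAutoInit: FakeComInit, FakeInterfacePtr, RunScope and g_events are hypothetical stand-ins that only record when their constructors and destructors run.

```cpp
#include <string>
#include <vector>

// Records the order in which the stand-in objects are set up and torn down.
std::vector<std::string> g_events;

// Stand-in for an RAII COM initializer: "init" in the constructor,
// "uninit" in the destructor. Copying is banned, as described above.
class FakeComInit
{
public:
    FakeComInit()  { g_events.push_back("init"); }
    ~FakeComInit() { g_events.push_back("uninit"); }

private:
    FakeComInit(const FakeComInit&);             // not implemented
    FakeComInit& operator=(const FakeComInit&);  // not implemented
};

// Stand-in for a COM smart pointer: logs a "release" when destroyed.
class FakeInterfacePtr
{
public:
    explicit FakeInterfacePtr(const std::string& name) : m_name(name) {}
    ~FakeInterfacePtr() { g_events.push_back("release " + m_name); }

private:
    std::string m_name;
};

void RunScope()
{
    FakeComInit comInit;          // constructed first, destroyed last
    FakeInterfacePtr sp1("sp1");
    FakeInterfacePtr sp2("sp2");
}   // destructors run here: sp2, then sp1, then comInit
```

After RunScope() returns, g_events holds "init", "release sp2", "release sp1", "uninit", in that order. The same ordering holds if the scope is left because of an exception, which is what makes the RAII version exception-safe.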


A working sample showing how to use this CComAutoInit class is attached to this blog post as well. It is basically a C++ command line app that “speaks” the arguments passed to it. (A slightly more complex GUI dialog-based MFC text-speaker app can be found on MSDN Code Gallery, too.)


 

A simple 2D matrix class

There is a recurring question on some C++ forums about nesting std::vector’s to build a 2D matrix, i.e.:


    std::vector< std::vector<double> > someMatrix;

 


This is not very efficient, both memory-wise and speed-wise.


In fact, each vector carries some overhead, because it typically stores three pointers. So, for example, in the case of a 20-row by 30-column matrix, assuming that the inner vectors represent matrix rows, the overhead is 20 rows x 3 pointers/row = 60 pointers, i.e. 60 pointers x 4 bytes/pointer = 240 bytes on a 32-bit build.
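The same arithmetic can be done for whatever compiler and build settings are in use by asking for sizeof instead of assuming three pointers. This is only a sketch: the function name NestedVectorOverhead is hypothetical, and the result counts just the bookkeeping data of the inner vectors, not the element storage or any per-allocation heap overhead.

```cpp
#include <cstddef>
#include <vector>

// Bytes of bookkeeping data held by the inner vectors of a nested-vector
// matrix with the given number of rows (one inner vector per row).
std::size_t NestedVectorOverhead(std::size_t rows)
{
    return rows * sizeof(std::vector<double>);
}
```

Where a vector is three 8-byte pointers, as is typical for 64-bit builds, NestedVectorOverhead(20) gives 480 bytes; some debug configurations make each vector larger still.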


But there is also a speed penalty. In fact, dynamic memory allocated by each vector is in general scattered on the heap; instead, it would be better for locality to have contiguous memory allocated for the whole matrix.


So, a better technique consists in using just one instance of std::vector, storing all matrix elements in this very instance, using a proper ordering for elements, e.g. storing matrix elements row-wise.


The total size of the vector is rows * columns, and a 2D index (row, column) can be “linearized” to point to the proper vector element using the following formula:


<vector index> = <column index> + <row index> * <matrix columns>


These concepts are developed in a simple reusable C++ template class attached to this blog post. This is a simple class for simple needs (i.e. just storing 2D matrix elements in an efficient way and accessing them conveniently). For more advanced matrix classes, with template meta-programming optimizations, the Blitz++ library can be considered.
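For readers who just want to see the shape of such a class, here is a minimal sketch built on a single std::vector and the row-wise index formula given above. It is not the attached class: the name Matrix2D, its member names, and the choice to throw std::out_of_range on a bad index are all assumptions made for this illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Minimal 2D matrix stored row-wise in one contiguous std::vector.
template <typename T>
class Matrix2D
{
public:
    // All rows * columns elements are value-initialized (0 for numbers).
    Matrix2D(std::size_t rows, std::size_t columns)
        : m_rows(rows), m_columns(columns), m_data(rows * columns)
    {
    }

    std::size_t Rows() const    { return m_rows; }
    std::size_t Columns() const { return m_columns; }

    T& At(std::size_t row, std::size_t column)
    {
        return m_data[Index(row, column)];
    }

    const T& At(std::size_t row, std::size_t column) const
    {
        return m_data[Index(row, column)];
    }

private:
    // <vector index> = <column index> + <row index> * <matrix columns>
    std::size_t Index(std::size_t row, std::size_t column) const
    {
        if (row >= m_rows || column >= m_columns)
            throw std::out_of_range("Matrix2D index out of range");
        return column + row * m_columns;
    }

    std::size_t m_rows;
    std::size_t m_columns;
    std::vector<T> m_data;
};
```

For example, after Matrix2D<double> m(20, 30); the statement m.At(2, 5) = 1.5; writes to element 5 + 2 * 30 = 65 of the underlying vector, and the whole matrix lives in a single heap allocation.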


 

STL Introductory Series on Channel 9

Stephan concluded his introductory series on the STL with an interesting chapter on template metaprogramming and type traits.


Together with the previous lessons, here is the complete list of this ten-part series introducing the STL:


Part 1 is about sequence containers (like std::vector).


Part 2 is on associative containers (like std::map).


Part 3 discusses smart pointers (e.g. shared_ptr).


Parts 4 and 5 show a practical use of the aforementioned concepts applied to the development of a Nurikabe puzzle solver.


Part 6 and part 7 discuss STL algorithms.


Part 8 is about regular expressions.


In part 9, new C++0x core language features like rvalue references and move semantics are discussed.


And finally part 10 is about template metaprogramming and type traits.


Thank you Stephan and Channel 9 for this quality introduction to the STL!