Fun with templates: creating an enumerator that works on all STL containers

A week or 2 ago I was programming a parser for an FHX file that we wanted to import into a different application.


The FHX file contains module tags, and each module tag contains attributes, and then there is another configuration csv file with a number of sections, and each section has multiple node configurations, and each node configuration has multiple items…


When all this information was parsed I had to build an XML file by cross referencing the FHX file data with the configuration file data to get an XML file with the desired configuration.


The class structure of my program was hierarchical and making liberal use of vectors and map template classes.


Since I had to iterate through all those collection containers I found myself faced with 2 options:


  • Return iterators to the caller so that he could iterate over all the items himself, or
  • Add a ‘ResetIterator’ and ‘GetNext’ method to my classes for each vector or map I was using.

Both options are equally un-satisfying. The first one is plain ugly from a design point of view. It exposes implementation details to the caller, and all the iterator dereferencing syntax made my data combination algorithms look very ugly.


The second option has the problem that it forced me to copy paste the same methods at different places and give them a slightly different name to reflect that they were to be used for Nodes, Tags, Types, …


In the end I chose the second solution, but I kept thinking that there had to be a better way to solve this. I spent a significant portion of my free time to create solution number 3: to use an enumerator class that can enumerate over any type of STL container in a consistent way so that the caller never has to make assumptions on the container or iterator type. It should look elegant as well.


The only way this could work was to use template classes.


Still, this is not a trivial task because there is a significant difference between the implementations of map, vector and set for example. And of course those classes take template arguments that I have to preserve somehow, and finally, I wanted my solution to take only 1 template argument for instantiation.


To top it off, my final implementation had to be of production quality so that I could actually use it in future projects.


It took me quite some time, but now I am ready to tell the tale of my odyssey for a correct template <class _ContainerType> class StlEnumerator implementation.


Before I proceed with the technical stuff, I want to thank MVP Igor Tandetnik for pointing out that all STL containers make internal typedefs for their template arguments, and MVP Eugene Gershnik for suggesting a better specialization mechanism than the one I was using originally.


I also would like to point out that most code comments and error handling was removed from this article for clarity. The implementations you can download with this article are well documented and the final solution contains correct error handling code.


Proof of concept


Before investing my precious free time (having a newborn baby around the house…) I wanted to be sure that I knew what I was doing so I set out to create a solution for just the sequence type containers vector, list and deque.


My solution was the following template class:


template< class _Container > class SequenceEnumerator
{
private:
  typedef typename _Container::value_type T;
  _Container *m_Container;
  typename _Container::iterator m_Iterator;

public:
  //initialize the enumerator with a pointer to the container.
  SequenceEnumerator( _Container &Container)
  {
    m_Container = &Container;
    m_Iterator = Container.end();
  }

  //reset the enumerator to the begin of the container
  void Reset(void)
  {
    m_Iterator = m_Container->begin();
  }

  //overload of GetNext that returns a copy instead of a pointer.
  bool GetNext(T &Element)
  {
    if(m_Container->end() == m_Iterator)
      return false;

    Element = *m_Iterator;
    m_Iterator++;
    return true;
  }
};

The whole assumption is that the template type ‘_Container’ is a vector, list or deque. The iterator type can be deduced directly from ‘_Container’. In order to do anything useful with our class (i.e. return items contained in it) we have to know what is in it.


We don’t know what that type is, but we can extract it because the containers I mentioned already have an internal typedef ‘value_type’ that represents the contained type.


With that information if is fairly simple to build our SequenceEnumerator class.


‘Reset’ will reset the iterator to the beginning of the container, and GetNext will sequentially return all elements in the container until it gets to the end.


With this proof of concept complete I could start laying some groundwork for a decent solution.


Enumerator foundation stuff


With the proof that my idea should work –at least in theory- I could now try to design the actual implementation. I don’t use the term ‘design’ lightly. I have no qualms about throwing together some hacks and demo code to fulfill a 1 time purpose, but for library code and other stuff that gets used by people other than me I only release quality code (at least I try to J.


Not only should it be tested, but it should also be understandable, maintainable, and last but not least: look good.


Looking good is of course a subjective thing. ;-)


But I digress…


The design of the end solution is based on the following requirements:


  •  A lot of requested functionality is common across all containers, so it should be implemented only once if possible.
  • There are 2 types of containers: ones in which a number of single items is stored, and others in which key – value pairs are stored
  • Specialization of the final StlEnumerator for different containers should be very easy. There are 11 container classes, and I don’t want to have any code in any of them.

These 3 requirements bring me to the following design:


The base class


The base class contains the functionality that is identical for all containers. As a result, this template class really needs 2 template arguments: the container class itself, and the type of the value items inside.


Initially I tried to have only one template argument, but that is not possible because e.g. a map and a vector have different typedefs for their value types. To be precise, for a map the value type is the pair of key / value items. Not the value item type itself so it is impossible to extract _ValType from _Container without any knowledge about _Container.


template <class _Container, class _ValType> class EnumeratorBase
{
protected:
  _Container *m_Container;
  typename _Container::iterator m_Iterator;
 
  virtual void GetVal(_ValType ** Element) = 0;
  virtual void GetVal(_ValType & Element) = 0;

public:
  EnumeratorBase( _Container &Container) :
      m_Container(&Container),
      m_Iterator(Container.end())
  {
  }
  EnumeratorBase(EnumeratorBase<_Container, _ValType> const& original):
      m_Container(original.m_Container),
      m_Iterator(original.m_Iterator)
  {
  }
  EnumeratorBase():
      m_Container(NULL)

  {
  }
  virtual ~EnumeratorBase(){}

  inline void VerifyState(void){}
  size_t Size(void){}
  void Attach( _Container &Container){}
  void Detach(void){}
  void Reset(void){}

  bool GetNextVal(_ValType ** Val)
  {
    if(m_Container->end() == m_Iterator)
      return false;
    GetVal(Val);
    m_Iterator++;
    return true;
  }
  bool GetNextVal(_ValType &Val)
  {
    if(m_Container->end() == m_Iterator)
      return false;
    GetVal(Val);
    m_Iterator++;
    return true;
  }
};
The protected section of the class has a pointer to the container and an iterator. These are not for outside use, but derived classes need them to be able to do something.


The 2 ‘GetVal’ overloads serve the same purpose. The container type determines the iterator type, and the iterator determines the way it has to be de-referenced. At this level in the class hierarchy it is impossible to get a value out of the iterator because we either have to call iterator::second or *iterator, depending on the type of _Container.


There are some other simple methods that have a straightforward implementation. VerifyState throws an exception if the enumerator is not attached to a container.


Size return the size of the container.


Attach allows you to attach the enumerator to a different container of the same type. Detach will do the opposite.


Reset resets the internal iterator to _Container::begin().


This class also has 3 constructors:  a normal one for simply creating a new enumerator for a given container, a copy constructor to make sure that a copy has its own independent internal iterator and a parameter less constructor that can be used for delayed container attachment.


This is the functionality that is required in all container iterators.


The SequenceContainer derivation


The implementation of the ItemSequenceEnumerator is very simple, because it doesn’t support anything that is not also supported by the base class.


template<  class _Container >
class ItemSequenceEnumerator :
  public EnumeratorBase< _Container, typename _Container::value_type>
{
public:
  typedef typename _Container::value_type _ValType;
  ItemSequenceEnumerator( _Container &Container) :
      EnumeratorBase< _Container,
                      typename _Container::value_type>(Container)
  {
  }
  ItemSequenceEnumerator(
      ItemSequenceEnumerator<_Container> const& original) :
      EnumeratorBase< _Container,
                      typename _Container::value_type>(original)
  {
  }
  ItemSequenceEnumerator() :
      EnumeratorBase< _Container, typename _Container::value_type>()
  {
  }


private:
  void GetVal(_ValType ** Element)
  {
    *Element = &(*m_Iterator);
  }

  void GetVal(_ValType &Element)
  {
    Element = *m_Iterator;
  }
};

The constructors basically do nothing more than delegating to the enumerator base class, and the GetVal overloads are the implementations of the pure virtual functions that were declared by EnumeratorBase.


The KeyValue pair derivation


The key – value container enumerator is similar to the item sequence enumerator. The only extra methods are implemented to support not only the retrieval of all the values, but also of all the keys, and the both the keys and the values at the same time.


template<  class _Container >
class KeyValPairEnumerator :
  public EnumeratorBase< _Container, typename _Container::mapped_type>
{
public:
  typedef typename _Container::mapped_type _ValType;
  typedef typename _Container::key_type _KeyType;
  KeyValPairEnumerator( _Container &Container) :
      EnumeratorBase< _Container,
      typename _Container::mapped_type>(Container)
  {
  }
     
  KeyValPairEnumerator(
      KeyValPairEnumerator<_Container> const& original) :
      EnumeratorBase< _Container,
                      typename _Container::mapped_type>(original)
  {
  }
  KeyValPairEnumerator() :
      EnumeratorBase< _Container, typename _Container::mapped_type>()
  {
  }

  bool GetNextKey(_KeyType ** Key){}
  bool GetNextKey(_KeyType &Key){}
  bool GetNextPair(_KeyType ** Key, _ValType **Val){}
  bool GetNextPair(_KeyType &Key, _ValType &Val){}
private:
  void GetVal(_ValType ** Element)
  {
    *Element = &(m_Iterator->second);
  }
  void GetVal(_ValType &Element)
  {
    Element = m_Iterator->second;
  }
};

Apart from the extra methods and the typedefs, this class is pretty much identical to the previous class. As you can see, the iterator dereferencing is different for key-value iterators. This is also why the base class can’t handle that.


StlEnumerator template class


The main implementation of the StlEnumerator template class is surprisingly terse:


template<  class _Container>
class StlEnumerator
{
private:
  StlEnumerator();
};

This class contains only a private constructor declaration that is never meant to be used. Any attempt to instantiate this class will result in compiler errors, and that is how we want it to be.


The only function of this class is to act as a coat hanger to which specializations for specific containers can be attached.


The idea is that the only way to instantiate an StlEnumerator is via a supplied specialization. If there is no specialization for a container type, then it will not be possible to use an StlEnumerator with that type.


To facilitate the specialization, a specialized class will contain no code or typedefs. Instead it will only inherit from either KeyValueEnumerator or ItemSequenceEnumerator, which will do all the heavy lifting.


Even then, writing the same declaration over and over again (there are 11 standard containers I support atm) would be tedious copy / paste stuff, and a bother to maintain.


This is where the power of C style macros can be used.


Specalization macro for the ItemSequenceEnumerator


This macro defines an StlEnumerator specialization that derives from ItemSequenceEnumerator. It takes the container type as an argument and uses that as the container type for the specialization.


#define IMPLEMENT_SEQUENCE_ENUM(_SeqType)\
template< class _ValType > \
class StlEnumerator< _SeqType <_ValType> > : \
  public ItemSequenceEnumerator< _SeqType <_ValType> > \
{\
public:\
  typedef _SeqType<_ValType> _ContainerType;\
  typedef _ValType _ValType;\
  StlEnumerator(_ContainerType &Container) :\
      ItemSequenceEnumerator<_ContainerType>(Container)\
      {\
      }\
  StlEnumerator(StlEnumerator<_SeqType <_ValType> > &original) :\
      ItemSequenceEnumerator<_ContainerType>(original)\
      {\
      }\
  StlEnumerator() :\
      ItemSequenceEnumerator<_ContainerType>()\
      {\
      }\
};


There is a similar macro for KeyValuePairEnumerator.


Example of a specialization for vector


With the class hierarchy and helper macros in place, we can now easily create new specializations like this:


#ifdef _VECTOR_
#ifndef _VECTOR_ENUM_
#define _VECTOR_ENUM_
IMPLEMENT_SEQUENCE_ENUM(std::vector);
#endif
#endif

Each container type for which we have an implementation has similar conditional compilation flags to surround it.


This is done for compilation optimization.


In my header I have defined implementations for all 11 STL containers that I support at the moment. If I would define them unconditionally, any source file that uses an enumerator would automatically include all STL headers.


This has a noticeable impact on compilation time.


Instead, I opted to make the following assumption: if the header for a container is not already included when the StlEnumerator header is included, you probably don’t want to have an enumerator for it either.


This is also the reason that these implementations are not surrounded by the include guards that surround the classes from which StlEnumerator derives.


If StlEnumerator.h gets included multiple times in a class hierarchy, the base classes and macros get included the first time. Each time the file gets included after that, the specializations for new container types get added in.


Testing the code


With templates you don’t know if they compile until you use them. That’s why I created a small console application that uses an enumerator for each of the container types I support, and then uses that functionality.


I am not going to cover the test in detail here. It is included with the download.


Conclusion


The amount of effort that went into this code is perhaps disproportional to the amount of time I win by using it, but on the other hand I now have a class that I can use in any project where I have to enumerator a container that is a member of another class, and I learned several new things about template specialization that I didn’t know was possible.


I know that there are still features missing from my enumerator. From the top of my head, I can imagine that it would be nice to implement a ++ and — operator to move the iterator without retrieving its value. On top of that it could come in handy if you can retrieve the current value.


Or maybe even cast an enumerator to its item type so that you can assign it directly. I have thought about this, but you can only do that for the value types. Key values cannot be cast like that because the cast would be ambiguous if the key type is the same as the value type.


The reason I didn’t bother with all that is that it would add nothing to this article except more features. It wouldn’t make any bit of difference in the design.


The code itself is up for download under the MIT license so that you can use it in any project, should you think the code is useful to you.


In the course of writing this article, I made a couple of attempts, and found a compiler bug in the VC++ compiler. The following projects are available in the solution:


  • Console1: the proof of concept code.
  • BugRepro: a reproduction of the compiler bug I found. It is only included in the solution to demonstrate the bug that I filed on connect.
  • Console2: an implementation of StlEnumerator according to my original idea, but with a workaround that makes the whole design pretty ugly. It works, but I wanted a different solution.
  • Console3: an implementation that uses the specialization mechanism of the final solution. This one was just to prove that the proposed design works.
  • Console4: the final implementation with correct error handling, all features implemented, helper macros and specialization for all container types and test code for all features.
Should you find bugs or if use this code in a project of your own, I would be glad to hear from you.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>