Word 2013 and text files as mail merge data sources

Once again, a forum question has prompted a topic for my blog. This time, it comes from the discussions about developing for Office 2013.

For many years, a main focus of my work with Word has been related to importing and exporting data into and out of Word documents. The built-in functionality for this has always been “mail merge”. So I’ve worked with mail merge a lot. Unfortunately, it hasn’t really evolved since the early days of Word, around 1990. The only significant change occurred in Word 2003, when OLE DB became a supported connection method to the data source (and the default).

At the same time, the older connection methods, such as ODBC, have been disappearing from the default version of MDAC (Microsoft Data Access Components) installed with newer versions of Windows. The newer connection methods do not necessarily work well with traditional data sources, a case in point being character-delimited text files.

When I read the original question in the forum thread, the first thing I did was to start up Word 2013 and set up a mail merge to a semi-colon-delimited text file I typed up quickly in Notepad with a couple of records. I then recorded a macro while linking to this file as that would give me the OpenDataSource syntax Word 2013 expects.

The first thing I noticed was that Windows 7 has no Text ODBC driver, which is what we’ve always used for setting up mail merge to text files. My only choice was OLE DB, which prompted me twice to specify the character used to delimit the fields in the data file. Other than that, everything appeared to work just fine, until I saved and closed the main merge document then re-opened it.

At this point, I again had to confirm the character-delimiter (only once) and I was prompted to choose the encoding used for the data source. Accepting the default (Simple Chinese) resulted in some rather odd characters in place of the non-standard ASCII characters (German umlauts) in the data source. What’s more, the prompt wasn’t shown to me again, so that I could change my selection. Definitely not “user-friendly”…

The next step was to look at the recorded macro. The OLE DB driver that Word decided to use was that of Access. And it was apparently not able to link up to the text file directly; instead, it generated a copy of the file in a temp directory – txt5C990.txt – that I was unable to open. Here’s the recorded code, which I’ve modified in order to correctly show the Connection information:

  Dim conn as String
  conn = "Provider=Microsoft.ACE.OLEDB.12.0;"
  conn = conn & "User ID=Admin;"
  conn = conn & "Data Source=C:\Users\CINDYM~1\AppData\Local\Temp;"
  conn = conn & "Mode=Read;Extended Properties=""HDR=YES;"";"
  conn = conn & "Jet OLEDB:System database="""";Jet OLEDB:Registry Path="""";"
  conn = conn & "Jet OLEDB:Engine Type=96;Jet OLEDB:Database Locking Mode=0;Je"
  ActiveDocument.MailMerge.OpenDataSource Name:= _
     "C:\Users\Cindy Meister\Documents\TextDatasource.txt", ConfirmConversions _
     :=False, ReadOnly:=False, LinkToSource:=True, AddToRecentFiles:=False, _
     PasswordDocument:="", PasswordTemplate:="", WritePasswordDocument:="", _
     WritePasswordTemplate:="", Revert:=False, Format:=wdOpenFormatAuto, _
     Connection:=conn, _
     SQLStatement:="SELECT * FROM `txt5C990.txt`", SQLStatement1:="", _

There is some question in my mind about how well this will work in an enterprise environment. In any case, all the prompts, on top of the standard warning about running an SQL statement that Word displays when opening a mail merge document, are a nuisance.

You may ask, why use a character-delimited text file as a data source, in the first place?

There are still many legacy systems in-use, and character-delimited text files have always been the universally accepted method to send data from one system to another. “Everyone” can understand text; not all systems can work with Excel files or properietary databases. These days, XML is supposed to fill that role, but XML has never been incorporated into Word’s mail merge functionality.

(One reason, certainly, is because XML is not constrained to any particular structure that can be interpreted as a “flat table format”. Working out an algorithm that can turn XML files into data sources recognizable by Word would certainly be non-trivial.)

So the delimited-character text file is an important type of data source for Word’s mail merge. But the lack of a default ODBC data connection method has become a possible obstacle to its use.

My mind, therefore, began to turn over possible ways to work around these problems on a default Windows installation without a Text ODBC driver.

Mail merge still works very well using a Word document as the data source. Usually, when we consider that, we think about putting the data into a Word table. This seemed like a lot of work, though, and considering that you’re limited to a maximum of 63 fields for mail merge, I wanted to find an alternate approach.

I then contemplated opening the data source using a TextReader and writing the data line-for-line into a Word document, but thinking about how I’d approach this in the Word UI, it still seemed like more work than it should be.

In the Word UI, I’d use Insert/File to bring the text into the Word document, then save it. If only there were an equivalent while using the Open XML SDK… I vaguely recalled there is a possibility to embed an outside file into a Word document and finally turned it up: altChunk.

altChunk is an “anchor for imported external content”. You can import an entire file of a supported file type as a separate part into a Word document (see AlternativeFormatImportPartType), plain text files being one of them. The content of the imported file appears as Word content and, if the document is later saved from within Word, that part is fully integrated into the Word document.xml.

Here’s the Open XML SDK code to do it:

static void EmbedTextfileAsDatasource()
    string fileName1 = @"c:\Test\Destination.docx";
    string fileName2 = @"c:\Test\CSV_Data.txt";
    string testFile = @"c:\Test\Test.docx";
    File.Copy(testFile, fileName1);
    using (WordprocessingDocument myDoc = 
           WordprocessingDocument.Open(fileName1, true))
        string altChunkId = "AltChunkId1";
        MainDocumentPart mainPart = myDoc.MainDocumentPart;
        AlternativeFormatImportPart chunk =
            AlternativeFormatImportPartType.TextPlain, altChunkId);
        using (FileStream fileStream = File.Open(fileName2, FileMode.Open))
        AltChunk altChunk = new AltChunk();
        altChunk.Id = altChunkId;

This is basically the code from the MSDN article on AltChunk, but with two important differences:

  1. it imports the file type AlternativeFormatImportPartType.TextPlain
  2. it inserts the altChunk at the beginning of the document and deletes the last paragraph (which is empty and which mail merge interprets as an “empty record”, causing Word to display a warning to that effect)

This approach avoids all the problems encountered when merging directly to a character-delimited text file. In addition, the integration of the text file can be done server-side, which is important when dealing with enterprise and legacy systems.

2 Responses to “Word 2013 and text files as mail merge data sources”

  1.   Brenda Evans Says:

    Ineed to adapt the following code for use on 2013.

    ReadOnly:False, Connection:=szConnection

    Ithink I need to change the subtype

    Any help would be appreciate

    •   wordmeister Says:

      Hi Brenda

      I recommend you ask in the Word for Developers forum, where it’s possible to have a good discussion with many people who have experience in mail merge connections. Please don’t forget to provide information about the type of datasource involved and what kinds of problems your current code is having. It would also help to see the information your passing in szConnection.

Leave a Reply