Web Add-ins: Using Office Open XML to extend the JavaScript APIs

In both the posts about Coercion types in Word Web Add-ins and reading built-in document properties, a key point was using the Word Open XML file format to achieve something not available in the APIs. This approach works only in Word, but there it can be used for almost anything. Even the newer, extended APIs that will be released for Office 2016 won’t (initially) provide functionality for everything that a Word document can contain. So working with the Word Open XML file format is a useful skill for the JavaScript developer as well as the VBA and .NET developer.

If you’re not already familiar with how the Office Open XML file formats are constructed, there’s a basic, not too technical overview on MSDN. As you can see there, no special tools are required to look into what makes up an Office Open XML file, and it can be edited by anyone.

From the developer’s point of view, tools are required in order to create and edit Office Open XML files outside their host application. The .NET Framework provides these in the form of the System.IO.Packaging namespace and its standard XML namespaces, but using them is still a rather complex undertaking. The Open XML SDK makes working with OOXML files more like using the COM object model, which makes the task more accessible, but the SDK is built for VB.NET and C# and isn’t designed to work with the JavaScript APIs. (There is an Office Open XML SDK for JavaScript, but it doesn’t provide the object model-like interfaces. I plan to make it the topic of a future post.)

Another obstacle is that the “package” format of an Office Open XML file is not something that can be written directly into a Word document. Word expects the content to be in the “OPC flat-file” format – a single string rather than multiple files in a zip package.
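
To make the difference concrete, here is a minimal sketch of how the parts of a zip package map onto the flat format. The helper name toFlatOpc and the shape of its input are my own assumptions for illustration; only the pkg:package, pkg:part and pkg:xmlData wrapper elements and their namespace belong to the flat OPC format itself.

```javascript
// Hypothetical helper (not part of any Office API): wraps the XML parts of a
// package in flat OPC markup so that they form one single string.
const PKG_NS = "http://schemas.microsoft.com/office/2006/xmlPackage";

function toFlatOpc(parts) {
  // Each entry in `parts` stands for one XML file in the zip package.
  const partXml = parts.map(function (part) {
    return (
      '<pkg:part pkg:name="' + part.name + '"' +
      ' pkg:contentType="' + part.contentType + '">' +
      "<pkg:xmlData>" + part.xml + "</pkg:xmlData>" +
      "</pkg:part>"
    );
  });
  return (
    '<pkg:package xmlns:pkg="' + PKG_NS + '">' +
    partXml.join("") +
    "</pkg:package>"
  );
}

// The two files a very simple document needs: the package relationships
// and the main document part.
const flat = toFlatOpc([
  {
    name: "/_rels/.rels",
    contentType: "application/vnd.openxmlformats-package.relationships+xml",
    xml:
      '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">' +
      '<Relationship Id="rId1"' +
      ' Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"' +
      ' Target="word/document.xml"/>' +
      "</Relationships>"
  },
  {
    name: "/word/document.xml",
    contentType:
      "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml",
    xml:
      '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">' +
      "<w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body>" +
      "</w:document>"
  }
]);
```

What would be a folder path inside the zip file becomes the pkg:name attribute of a part, and the file's XML becomes the content of pkg:xmlData.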

So, how does one go about getting a handle on using WordOpenXML in a Web Add-in?

A good starting place is the recently written article “Create better add-ins for Word with Office Open XML”. Among other things, it tells you the minimum Word Open XML in the OPC flat-file format required for inserting content into a Word document and explains how this can change if you want to insert more complex information, such as a graphic.
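
As a rough sketch of what that looks like in practice, the following builds a small package with a single paragraph of text and hands it to Word. The function names minimalPackage and insertOoxml are hypothetical, and the package shown is my reading of the minimum rather than a quotation from the article. The call to setSelectedDataAsync with the "ooxml" coercion type is the Office JavaScript API; the document object is passed in as a parameter only so that the code can be tried outside of Word.

```javascript
// Hypothetical helper: a small flat OPC package with one paragraph of text.
function minimalPackage(text) {
  // Escape characters that are not allowed in XML text content.
  const safe = String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  return [
    '<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">',
    '<pkg:part pkg:name="/_rels/.rels"',
    ' pkg:contentType="application/vnd.openxmlformats-package.relationships+xml">',
    "<pkg:xmlData>",
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">',
    '<Relationship Id="rId1"',
    ' Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"',
    ' Target="word/document.xml"/>',
    "</Relationships>",
    "</pkg:xmlData>",
    "</pkg:part>",
    '<pkg:part pkg:name="/word/document.xml"',
    ' pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">',
    "<pkg:xmlData>",
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">',
    "<w:body><w:p><w:r><w:t>" + safe + "</w:t></w:r></w:p></w:body>",
    "</w:document>",
    "</pkg:xmlData>",
    "</pkg:part>",
    "</pkg:package>"
  ].join("");
}

// Hypothetical wrapper: writes the package at the current selection.
// `officeDocument` is expected to be Office.context.document in an add-in.
function insertOoxml(officeDocument, ooxml, onDone) {
  officeDocument.setSelectedDataAsync(
    ooxml,
    { coercionType: "ooxml" }, // the value of Office.CoercionType.Ooxml
    function (result) {
      // "failed" is the value of Office.AsyncResultStatus.Failed
      if (result.status === "failed") {
        onDone(new Error(result.error.message));
      } else {
        onDone(null);
      }
    }
  );
}
```

In an add-in the call would be something like insertOoxml(Office.context.document, minimalPackage("Hello"), function (err) { /* report err, if any */ });.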

This is all fine and good, but a single article cannot cover everything that a Word document can contain. So how do you find out what’s required if you want to do something else, such as inserting field codes?
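
To give an idea of the kind of markup you would be looking for, here is a sketch of a paragraph containing a simple field, using the w:fldSimple element and its w:instr attribute from WordprocessingML. The helper names escapeXml and fieldParagraph, the DATE instruction and the placeholder result are assumptions chosen for illustration. The fragment assumes the w prefix is declared on the enclosing w:document element and that it is placed inside w:body. The markup Word itself writes for a field often looks different (a sequence of w:fldChar elements), which is exactly why it pays to inspect a real document.

```javascript
// Escape a value for use in XML text or in a double-quoted attribute.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: a paragraph holding one simple field.
// `instruction` is the field code text; `cachedResult` is the text stored
// as the field's last known result.
function fieldParagraph(instruction, cachedResult) {
  return (
    "<w:p>" +
    '<w:fldSimple w:instr=" ' + escapeXml(instruction) + ' ">' +
    "<w:r><w:t>" + escapeXml(cachedResult) + "</w:t></w:r>" +
    "</w:fldSimple>" +
    "</w:p>"
  );
}

// A DATE field with a picture switch; the result text is only a placeholder.
const dateField = fieldParagraph('DATE \\@ "yyyy-MM-dd"', "2015-01-01");
```

The quotation marks in the field instruction have to be escaped because the instruction is stored in an XML attribute.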

One useful tool is the Open XML SDK Productivity Tool. This can show you the content of the XML files in a zip package without needing to rename the file and without needing to open each XML file in the package individually. After installing it, start the Productivity Tool and:

  1. Open a file.
  2. In the Document Explorer, navigate to the part you’re interested in. Generally, this will be w:document (document.xml), or somewhere in that file. Clicking on the symbols at the left of the node tree expands the child element lists.
  3. Click on the element in question then execute Reflect code. The XML will be displayed at the right, on top, nicely formatted. The Open XML SDK code appears at the bottom, but since you’re not interested in that you can drag the split pane down to show more of the XML.

It’s not possible to edit XML in the Productivity Tool, but you can select, copy and paste to an editor.

If you’re not sure what part of the XML describes the functionality you’re interested in:

  1. Create two documents that are identical, except for the bit you want to inspect.
  2. Open one of them in the Productivity Tool then use Compare documents and select the other.
  3. You’ll see the files in the zip packages listed side-by-side; the parts with differences will be shaded gray. Double-click the XML file that (should) contain the information you need and the XML content of both files will be displayed, again with the differences shaded in gray.
  4. Dragging the scroller moves the content of both panes in parallel, making it easy to compare.

If you’re unsure what the Word Open XML you’re seeing means, it’s documented in the ECMA-376 standard. The current edition (the 4th, although there will probably be an update when Office 2016 is released) and older editions can be downloaded free of charge as a zipped set of PDF files. For most research into what the XML means, Part 1 is all you’ll need. If you have questions about how the file format specification is supposed to work, there’s a forum on MSDN supported by knowledgeable engineers.

The next post will look at saving “template” Word Open XML in a resource file, to be loaded and modified by JavaScript code before being inserted into a Word document.
