Grouping with LINQ to XML

One of the major improvements in XSLT 2.0 is grouping: the xsl:for-each-group element supports four different ways of grouping, with group-by, group-starting-with, group-ending-with and group-adjacent. Unfortunately Microsoft does not support XSLT 2.0. So if you need to group some XML input and don’t want to write XSLT 1.0, you will need to use a third-party XSLT 2.0 implementation like Saxon 9 or the AltovaXML tools, or you will need to use the grouping support in LINQ with LINQ to XML.

In this blog post I will look into solving the grouping examples in the XSLT 2.0 specification with LINQ to XML to find out whether/how LINQ to XML allows you to solve the same problems that group-by, group-starting-with, group-ending-with and group-adjacent in XSLT 2.0 allow you to solve.

The first example is rather straightforward: the group-by in XSLT 2.0 can be “translated” into a group by in LINQ. So with cities.xml being

<cities>
<city name="Milano" country="Italia" pop="5"/>
<city name="Paris" country="France" pop="7"/>
<city name="München" country="Deutschland" pop="4"/>
<city name="Lyon" country="France" pop="2"/>
<city name="Venezia" country="Italia" pop="1"/>
</cities>

we can use the following C# code

XDocument input = XDocument.Load(@"cities.xml");

XDocument output = new XDocument(
    new XElement("table",
        // header row, closed before the data rows are added
        new XElement("tr",
            new XElement("th", "Position"),
            new XElement("th", "Country"),
            new XElement("th", "List of Cities"),
            new XElement("th", "Population")),
        // one row per country; the Select overload with an index supplies the position
        (from city in input.Root.Elements("city")
         group city by (string)city.Attribute("country"))
        .Select((g, i) =>
            new XElement("tr",
                new XElement("td", i + 1),
                new XElement("td", g.Key),
                new XElement("td", string.Join(", ", g.Select(c => (string)c.Attribute("name")).ToArray())),
                new XElement("td", g.Sum(c => (decimal)c.Attribute("pop")))))));

output.Save(@"cities.html");

to get cities.html as follows:

<table>
  <tr>
    <th>Position</th>
    <th>Country</th>
    <th>List of Cities</th>
    <th>Population</th>
  </tr>
  <tr>
    <td>1</td>
    <td>Italia</td>
    <td>Milano, Venezia</td>
    <td>6</td>
  </tr>
  <tr>
    <td>2</td>
    <td>France</td>
    <td>Paris, Lyon</td>
    <td>9</td>
  </tr>
  <tr>
    <td>3</td>
    <td>Deutschland</td>
    <td>München</td>
    <td>4</td>
  </tr>
</table>

The only nuisance with the LINQ solution is that, to get the positional index, we need to switch from the query expression syntax to the functional (method) syntax, as only the Select method has an overload that passes the index to the lambda. And the index in XSLT/XPath is one-based whereas in .NET it is zero-based, so we need to add one to get the same result.

The second example is called “A Composite Grouping Key” as it wants to compute the average population value for each (country, name) combination. However, the XSLT 2.0 solution then suggests nesting two xsl:for-each-group elements, as XSLT 2.0 does not allow a grouping key consisting of two separate values. LINQ does allow that, so with LINQ to XML we can use the approach shown below. Assuming the XML input cities2.xml is

<cities>
<city name="Milano" country="Italia" year="1950" pop="5.23"/>
<city name="Milano" country="Italia" year="1960" pop="5.29"/>
<city name="Padova" country="Italia" year="1950" pop="0.69"/>
<city name="Padova" country="Italia" year="1960" pop="0.93"/>
<city name="Paris" country="France" year="1951" pop="7.2"/>
<city name="Paris" country="France" year="1961" pop="7.6"/>
</cities>

then this C# code

// CultureInfo requires: using System.Globalization;
XDocument input = XDocument.Load(@"cities2.xml");

XDocument output = new XDocument(
    new XElement("div",
        from city in input.Root.Elements("city")
        // anonymous type as a composite grouping key
        group city by new
        {
            country = (string)city.Attribute("country"),
            name = (string)city.Attribute("name")
        } into g
        select new XElement("p",
            g.Key.name + ", "
            + g.Key.country + ": "
            + g.Average(c => (decimal)c.Attribute("pop")).ToString(CultureInfo.InvariantCulture))));

output.Save(@"cities2.html");

produces the following output snippet:

<div>
<p>Milano, Italia: 5.26</p>
<p>Padova, Italia: 0.81</p>
<p>Paris, France: 7.4</p>
</div>

The last group-by example in the XSLT 2.0 specification is called “Adding an Element to Several Groups”. Here XSLT’s “for-each-group group-by” excels, as it simply allows the group-by expression to evaluate to a sequence of grouping keys, so that an item can belong to several groups. The key selector of a LINQ group by yields a single key per item, so we need to follow a different approach: we first find the distinct keys, then for each distinct key we process the elements containing that key. So assuming the XML input is

<titles>
<title>A Beginner's Guide to <ix>Java</ix></title>
<title>Learning <ix>XML</ix></title>
<title>Using <ix>XML</ix> with <ix>Java</ix></title>
</titles>

the following C# code

XDocument input = XDocument.Load(@"titles.xml");

XDocument output = new XDocument(
    new XElement("div",
        // first collect the distinct keys ...
        from key in input.Root.Elements("title").Elements("ix").Select(i => i.Value).Distinct()
        select
            // ... then emit a heading followed by every title containing that key
            new[] { new XElement("h2", key) }.Concat(
                from title in input.Root.Elements("title")
                where title.Elements("ix").Any(ix => (string)ix == key)
                select new XElement("p", title.Value))));

output.Save("titles.html");

 

produces the output fragment

<div>
<h2>Java</h2>
<p>A Beginner's Guide to Java</p>
<p>Using XML with Java</p>
<h2>XML</h2>
<p>Learning XML</p>
<p>Using XML with Java</p>
</div>
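As an alternative to collecting the distinct keys first, the titles can be flattened into (key, title) pairs with a second from clause and then grouped with an ordinary group by. The following is only a minimal sketch of that idea; the class name TitleIndex and the method name Build are made up for illustration and do not appear in the code above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class TitleIndex
{
    // Pairs every title with each of its distinct <ix> values, then groups
    // the titles by that value, so one title can end up in several groups.
    public static XElement Build(XElement titles)
    {
        return new XElement("div",
            from title in titles.Elements("title")
            from key in title.Elements("ix").Select(ix => ix.Value).Distinct()
            group title by key into g
            select new object[]
            {
                new XElement("h2", g.Key),
                g.Select(t => new XElement("p", t.Value))
            });
    }

    public static void Main()
    {
        XElement titles = XElement.Parse(
            @"<titles>
<title>A Beginner's Guide to <ix>Java</ix></title>
<title>Learning <ix>XML</ix></title>
<title>Using <ix>XML</ix> with <ix>Java</ix></title>
</titles>");

        Console.WriteLine(Build(titles));
    }
}
```

The groups come out in the order in which each key is first seen, which for the sample input is the same order as with the Distinct() approach.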

 

Now let’s look at group-starting-with. With the sample input being

<body>
<h2>Introduction</h2>
<p>XSLT is used to write stylesheets.</p>
<p>XQuery is used to query XML databases.</p>
<h2>What is a stylesheet?</h2>
<p>A stylesheet is an XML document used to define a transformation.</p>
<p>Stylesheets may be written in XSLT.</p>
<p>XSLT 2.0 introduces new grouping constructs.</p>
</body>

the XSLT 2.0 solution has a template for the body element doing an <xsl:for-each-group select="*" group-starting-with="h2">, so it selects all child elements of the body and specifies a pattern 'h2' to identify the element type starting a group. The LINQ grouping construct does not allow us to follow this same approach, but we can use a different approach instead: we process the 'p' child elements and group them by the immediately preceding 'h2' sibling. We can select that with LINQ to XML as p.NodesBeforeSelf().OfType<XElement>().LastOrDefault(e => e.Name == "h2"), which takes the preceding sibling nodes (NodesBeforeSelf()), restricts them to XElement element nodes (OfType<XElement>()) and then takes the last one in document order that has the name 'h2'. With that approach the C# code looks as follows:

XDocument input = XDocument.Load("input.xml");

XDocument output = new XDocument(
    new XElement("chapter",
        from p in input.Root.Elements("p")
        group p by p.NodesBeforeSelf().OfType<XElement>().LastOrDefault(e => e.Name == "h2") into g
        select new XElement("section",
            new XAttribute("title", g.Key != null ? g.Key.Value : ""),
            from p2 in g
            select new XElement("para", p2.Value))));

output.Save("output.xml");

 

and creates the following XML output:

<chapter>
<section title="Introduction">
<para>XSLT is used to write stylesheets.</para>
<para>XQuery is used to query XML databases.</para>
</section>
<section title="What is a stylesheet?">
<para>A stylesheet is an XML document used to define a transformation.</para>
<para>Stylesheets may be written in XSLT.</para>
<para>XSLT 2.0 introduces new grouping constructs.</para>
</section>
</chapter>

which is the same result the XSLT stylesheet creates. So basically, as LINQ allows the grouping key to be an object, we simply need to identify the XElement object (or generally the XNode object) starting a group and then we can make use of the LINQ group by construct.

Not surprisingly a similar approach can be applied for the group-ending-with example. It has the following input

<doc>
<page continued="yes">Some text</page>
<page continued="yes">More text</page>
<page>Yet more text</page>
<page continued="yes">Some words</page>
<page continued="yes">More words</page>
<page>Yet more words</page>
</doc>

and then has a template matching the 'doc' element with an xsl:for-each-group select="*" group-ending-with="page[not(@continued='yes')]", meaning it processes all child elements and provides a pattern to identify the element node ending a group. With LINQ to XML we can select the 'page' elements having the continued="yes" attribute and group them by the immediately following sibling 'page' element not having that attribute. That can be found using page.NodesAfterSelf().OfType<XElement>().FirstOrDefault(p => !((string)p.Attribute("continued") == "yes")), which selects the following sibling nodes, restricts them to XElement nodes and then takes the first in document order that does not have the continued="yes" attribute. So the C# code looks as follows:

XDocument input = XDocument.Load("input.xml");

XDocument output = new XDocument(
    new XElement(input.Root.Name,
        from page in input.Root.Elements("page")
        where (string)page.Attribute("continued") == "yes"
        group page by page.NodesAfterSelf().OfType<XElement>().FirstOrDefault(p => !((string)p.Attribute("continued") == "yes")) into g
        select new XElement("pageset",
            from page2 in g
            select new XElement("page", page2.Value),
            g.Key)));

output.Save("output.xml");

 

and produces the following output:

<doc>
<pageset>
<page>Some text</page>
<page>More text</page>
<page>Yet more text</page>
</pageset>
<pageset>
<page>Some words</page>
<page>More words</page>
<page>Yet more words</page>
</pageset>
</doc>

Again, the fact that LINQ allows us to have an object (e.g. an XElement node) as the grouping key provides us with a way to use the LINQ group by to solve that problem.
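One limitation of the query above is that it only selects the pages with continued="yes", so a page without that attribute that directly follows another such page is neither selected nor used as a key, and would be dropped. A possible variation, sketched here under my own assumptions, is to group all pages by the page that ends their group, where a page that is not continued ends its own group. The names PageSets, IsContinued, GroupEnd and Build are hypothetical helpers, not part of the example above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class PageSets
{
    static bool IsContinued(XElement page)
    {
        return (string)page.Attribute("continued") == "yes";
    }

    // The page that ends the group the given page belongs to:
    // the page itself if it is not continued, otherwise the first
    // following sibling page that is not continued (null if there is none).
    static XElement GroupEnd(XElement page)
    {
        return IsContinued(page)
            ? page.ElementsAfterSelf("page").FirstOrDefault(p => !IsContinued(p))
            : page;
    }

    public static XElement Build(XElement doc)
    {
        return new XElement(doc.Name,
            from page in doc.Elements("page")
            group page by GroupEnd(page) into g
            select new XElement("pageset",
                g.Select(p => new XElement("page", p.Value))));
    }

    public static void Main()
    {
        XElement doc = XElement.Parse(
            @"<doc>
<page continued=""yes"">Some text</page>
<page>Yet more text</page>
<page>A page on its own</page>
</doc>");

        Console.WriteLine(Build(doc));
    }
}
```

Trailing continued pages that are never followed by a closing page all share the key null and so end up together in one last group.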

That leaves us with the group-adjacent XSLT example to be solved with LINQ to XML. As doing that looks to be harder and I am thinking of implementing that with an extension method I will end this article here and treat group-adjacent in another article.


Parsing XHTML documents with .NET 4.0 and XmlPreloadedResolver

When I looked at “What’s new in System.Xml in .NET 4.0/Visual Studio 2010” with the beta 1 release I presented an example that shows how parsing an XHTML document referencing one of the W3C XHTML 1.0 DTDs can be sped up by using the new XmlReaderSettings.DtdProcessing set to DtdProcessing.Ignore. The drawback I mentioned is that any referenced entity in the document would then throw an exception.

What I overlooked at the time of the beta 1 release but I have found now in the recent beta 2 release is the new class XmlPreloadedResolver in System.Xml.Resolvers. It allows you to avoid any network access to the W3C’s server for the XHTML DTDs but nevertheless parse any XHTML document having entity references as it uses copies of those DTDs stored in an assembly deployed with the .NET framework.

If I use that class with an adaptation of the older example the code looks as follows:

Stopwatch watch = new Stopwatch();
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Parse;

string xhtml = @"<!DOCTYPE html
PUBLIC ""-//W3C//DTD XHTML 1.0 Strict//EN""
""http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"">
<html xml:lang=""en"">
<head>
<title>Example</title>
</head>
<body>
<p>Price is: 100 &euro;</p>
</body>
</html>";

// first parse: the DTD is fetched from the W3C server
watch.Start();
using (XmlReader reader = XmlReader.Create(new StringReader(xhtml), settings))
{
    while (reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Text)
        {
            Console.WriteLine(reader.Value);
        }
    }
}
watch.Stop();
Console.WriteLine("First parse: elapsed time: {0}", watch.Elapsed);

watch.Reset();

// second parse: the DTD comes from the copies shipped with the framework
settings.XmlResolver = new XmlPreloadedResolver(XmlKnownDtds.Xhtml10);

watch.Start();
using (XmlReader reader = XmlReader.Create(new StringReader(xhtml), settings))
{
    while (reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Text)
        {
            Console.WriteLine(reader.Value);
        }
    }
}
watch.Stop();
Console.WriteLine("Second parse: elapsed time: {0}", watch.Elapsed);

Running that code here with Visual Studio 2010 Beta 2 in a virtual machine outputs numbers clearly showing the speed gained by parsing with the XmlPreloadedResolver:

First parse: elapsed time: 00:00:04.6378648
Second parse: elapsed time: 00:00:00.0441933
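The same resolver can also be used when loading an XHTML document into LINQ to XML, by creating the XmlReader with these settings and passing it to XDocument.Load. The following is a minimal sketch of that idea; the class name XhtmlLoader and the method name LoadXhtml are made up for illustration. It looks up the 'p' element by local name only, so that it does not depend on which namespace the elements end up in.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Resolvers;

public static class XhtmlLoader
{
    // Loads XHTML into an XDocument, resolving the W3C XHTML 1.0 DTDs
    // from the copies preloaded in the framework instead of the network.
    public static XDocument LoadXhtml(TextReader input)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse;
        settings.XmlResolver = new XmlPreloadedResolver(XmlKnownDtds.Xhtml10);

        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            return XDocument.Load(reader);
        }
    }

    public static void Main()
    {
        string xhtml = @"<!DOCTYPE html
PUBLIC ""-//W3C//DTD XHTML 1.0 Strict//EN""
""http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"">
<html xml:lang=""en"">
<head><title>Example</title></head>
<body><p>Price is: 100 &euro;</p></body>
</html>";

        XDocument doc = LoadXhtml(new StringReader(xhtml));

        // The entity reference has been expanded by the reader at this point.
        XElement p = doc.Descendants().First(e => e.Name.LocalName == "p");
        Console.WriteLine(p.Value);
    }
}
```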

 

Exploiting covariance with LINQ to XML

In my last post I showed how the new contravariance feature in .NET 4.0/Visual Studio 2010 for type parameters of generic interfaces makes coding with LINQ to XML easier and more straightforward. In this post I will show how the covariance of the type parameter T of IEnumerable<T> also allows us to write LINQ to XML queries in a more straightforward way.

Let’s assume we have the following XML document:

<?xml version="1.0" encoding="utf-8" ?>
<root>
<!-- comment 1 -->
<foo>foo 1</foo>
<bar>bar 1</bar>
<!-- comment 2 -->
<foo>foo 2</foo>
<bar>2</bar>
<!-- comment 3
-->
</root>

and we want to transform that document into a second one with the same root element and the same child nodes, except that all 'bar' child elements of the 'root' element are removed. So the result should look as follows:

<?xml version="1.0" encoding="utf-8" ?>
<root>
<!-- comment 1 -->
<foo>foo 1</foo>
<!-- comment 2 -->
<foo>foo 2</foo>
<!-- comment 3
-->
</root>

A first attempt to achieve that with LINQ to XML could look as follows:

XDocument doc1 = XDocument.Load(@"XMLFile1.xml");

XDocument doc2 =
    new XDocument(
        new XElement(doc1.Root.Name,
            doc1
            .Root
            .Nodes()
            .Except(
                doc1
                .Root
                .Elements("bar"))));

doc2.Save(Console.Out);

So we create a new XDocument with a new root XElement having the same name as the Root of the first XDocument, into which all child nodes except the 'bar' child elements are copied.

That looks nice and straightforward, but if you try to compile it with Visual Studio 2008/.NET 3.5 you get the following error: “Argument '2': cannot convert from 'System.Collections.Generic.IEnumerable<System.Xml.Linq.XElement>' to 'System.Collections.Generic.IEnumerable<System.Xml.Linq.XNode>'”.

The problem is that the Nodes() call returns an IEnumerable<XNode> and then the following Except() call also needs an IEnumerable<XNode> as its argument, while Elements("bar") gives us an IEnumerable<XElement>. With generic interfaces being invariant in .NET 3.5 we can’t pass that IEnumerable<XElement> in for an IEnumerable<XNode>, although XElement is a class derived from XNode.

As a workaround we can first cast the IEnumerable<XElement> to an IEnumerable<XNode>:

XDocument doc1 = XDocument.Load(@"XMLFile1.xml");

XDocument doc2 =
    new XDocument(
        new XElement(doc1.Root.Name,
            doc1
            .Root
            .Nodes()
            .Except(
                doc1
                .Root
                .Elements("bar")
                .Cast<XNode>())));

doc2.Save(Console.Out);

That way it compiles fine and produces the wanted result with .NET 3.5, but it would be nicer not to need that Cast<XNode>() call.

The good news is that starting with .NET 4.0 the type parameter T of IEnumerable<T> is covariant, meaning that where an IEnumerable<T> of a certain type T is expected we can always pass in an IEnumerable<T2> where T2 is a reference type derived from T, as in our example where XElement is a subclass of XNode (or a sub-subclass to be precise, as XElement derives from XContainer, which derives from XNode).

Thus with .NET 4.0 the following compiles and works fine:

XDocument doc1 = XDocument.Load(@"XMLFile1.xml");

XDocument doc2 =
    new XDocument(
        new XElement(doc1.Root.Name,
            doc1
            .Root
            .Nodes()
            .Except(
                doc1
                .Root
                .Elements("bar"))));

doc2.Save(Console.Out);
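To see the covariant conversion in isolation, here is a minimal sketch that needs no input file. The class name CovarianceDemo and the method name CountNodes are made up for illustration; the sample XML is a cut-down, whitespace-free version of the document above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CovarianceDemo
{
    // A method that only knows about XNode.
    public static int CountNodes(IEnumerable<XNode> nodes)
    {
        return nodes.Count();
    }

    public static void Main()
    {
        XElement root = XElement.Parse("<root><!-- c --><foo/><bar/></root>");

        IEnumerable<XElement> bars = root.Elements("bar");

        // Implicit conversion thanks to covariance of IEnumerable<T>
        // (compiles with C# 4/.NET 4.0 and later, not with .NET 3.5).
        IEnumerable<XNode> barsAsNodes = bars;

        Console.WriteLine(CountNodes(barsAsNodes));            // prints 1
        Console.WriteLine(CountNodes(bars));                   // prints 1
        Console.WriteLine(root.Nodes().Except(bars).Count());  // prints 2
    }
}
```

No copy of the sequence is made: barsAsNodes refers to the very same lazy query object as bars, only seen through a more general type.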


Exploiting contravariance with LINQ to XML

Covariance and contravariance for generic interfaces are new features in C# and VB.NET in Visual Studio 2010 and the .NET Framework 4.0. Generic interfaces like IEnumerable<T> or IEqualityComparer<T> in the .NET Framework 4.0 use these new features. Starting with .NET 4.0 the type parameter T in IEqualityComparer<T> is contravariant. That can make coding with LINQ to XML easier, as the class XNodeEqualityComparer implements IEqualityComparer<XNode>, where XNode is a common base class for other LINQ to XML classes like XElement.

Let’s look at an example. Assume we have the following XML document

<?xml version="1.0" encoding="utf-8" ?>
<root>
<items>
<item>
<foo>a</foo>
<bar>1</bar>
</item>
<item>
<foo>b</foo>
<bar>2</bar>
</item>
<item>
<foo>a</foo>
<bar>1</bar>
</item>
<item>
<foo>c</foo>
<bar>3</bar>
</item>
<item>
<foo>c</foo>
<bar>3</bar>
</item>
</items>
</root>

and we want to use LINQ to XML to extract distinct items where we use XNodeEqualityComparer to compare the ‘item’ elements in the XML document.

You could be tempted to try it as follows:

XDocument doc = XDocument.Load("XMLFile1.xml");

var distinctItems =
    doc
    .Root
    .Element("items")
    .Elements("item")
    .Distinct(new XNodeEqualityComparer())
    .Select(i => new { foo = (string)i.Element("foo"), bar = (int)i.Element("bar") });

foreach (var item in distinctItems)
{
    Console.WriteLine(item);
}

but with .NET 3.5 that does not compile, complaining “Instance argument: cannot convert from 'System.Collections.Generic.IEnumerable<System.Xml.Linq.XElement>' to 'System.Collections.Generic.IEnumerable<System.Xml.Linq.XNode>'” on the Distinct(new XNodeEqualityComparer()) call. That happens because Elements("item") gives us an IEnumerable<XElement>, so the Distinct method wants an IEqualityComparer<XElement> to be passed in, while we only pass in an IEqualityComparer<XNode>.

With .NET 3.5 to work around that problem we first have to cast IEnumerable<XElement> up to IEnumerable<XNode> before we call Distinct(new XNodeEqualityComparer()) and then down again after the Distinct() call:

XDocument doc = XDocument.Load("XMLFile1.xml");

var distinctItems =
    doc
    .Root
    .Element("items")
    .Elements("item")
    .Cast<XNode>()
    .Distinct(new XNodeEqualityComparer())
    .Cast<XElement>()
    .Select(i => new { foo = (string)i.Element("foo"), bar = (int)i.Element("bar") });

foreach (var item in distinctItems)
{
    Console.WriteLine(item);
}

That compiles fine and nicely returns only distinct items:

{ foo = a, bar = 1 }
{ foo = b, bar = 2 }
{ foo = c, bar = 3 }

With .NET 4.0 however the type parameter T of IEqualityComparer<T> is contravariant, meaning that if a method expects an IEqualityComparer<XElement> it suffices to pass in a comparer for a base type of XElement, like XNode. Thus with .NET 4.0 our original attempt compiles and runs fine:

XDocument doc = XDocument.Load("XMLFile1.xml");

var distinctItems =
    doc
    .Root
    .Element("items")
    .Elements("item")
    .Distinct(new XNodeEqualityComparer())
    .Select(i => new { foo = (string)i.Element("foo"), bar = (int)i.Element("bar") });

foreach (var item in distinctItems)
{
    Console.WriteLine(item);
}
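The contravariant conversion itself can be shown in a few lines without any input file. This is only a minimal sketch; the class name ContravarianceDemo and the method name CountDistinct are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ContravarianceDemo
{
    // Counts the child elements that differ according to XNodeEqualityComparer.
    public static int CountDistinct(XElement parent)
    {
        // A comparer written for XNode is accepted where a comparer for
        // XElement is expected (C# 4/.NET 4.0 and later).
        IEqualityComparer<XElement> comparer = new XNodeEqualityComparer();

        return parent.Elements().Distinct(comparer).Count();
    }

    public static void Main()
    {
        XElement items = XElement.Parse(
            "<items><item>a</item><item>a</item><item>b</item></items>");

        Console.WriteLine(CountDistinct(items)); // prints 2
    }
}
```

The direction is the opposite of the IEnumerable<T> case: a comparer that can handle any XNode can certainly handle an XElement, so the more general comparer may stand in for the more specific one.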


What is new in System.Xml in .NET 4.0/Visual Studio 2010

Beta 1 of the .NET Framework 4.0 and of Visual Studio 2010 was released a few days ago. Although the “What’s new” document does not list any new features in System.Xml or LINQ to XML, I am browsing through the documentation to find new features or changes in APIs.

So far I have found the following:

With LINQ to XML the SaveOptions enumeration has a new flag named OmitDuplicateNamespaces. That is particularly useful with VB.NET XML literals, as using them you might end up with more namespace declaration attributes than you want, resulting in superfluous namespace declarations on child or descendant elements when you save/serialize a LINQ to XML XDocument or XElement.

Here is an example in VB.NET with .NET 3.5:

Imports System
Imports System.Xml.Linq
Imports <xmlns="http://www.w3.org/1999/xhtml">

Module Module1

    Sub Main()
        Dim html As XElement = _
            <html>
                <head>
                    <title>Example</title>
                </head>
                <body>
                </body>
            </html>

        html.<body>(0).Add(GetParagraphs())

        html.Save(Console.Out)

    End Sub

    Function GetParagraphs() As IEnumerable(Of XElement)
        Dim ps() As String = {"Paragraph 1.", "Paragraph 2."}
        Return (From p In ps Select <p><%= p %></p>)
    End Function

End Module

Its output is as follows:

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Example</title>
</head>
<body>
<p xmlns="http://www.w3.org/1999/xhtml">Paragraph 1.</p>
<p xmlns="http://www.w3.org/1999/xhtml">Paragraph 2.</p>
</body>
</html>

As you can see, the namespace declarations on the ‘p’ elements are redundant as the namespace is already defined on the ‘html’ root element.

With .NET 4.0 and the SaveOptions.OmitDuplicateNamespaces flag you can avoid them as follows:

Imports System
Imports System.Xml.Linq
Imports <xmlns="http://www.w3.org/1999/xhtml">

Module Module1

    Sub Main()
        Dim html As XElement = _
            <html>
                <head>
                    <title>Example</title>
                </head>
                <body>
                </body>
            </html>

        html.<body>(0).Add(GetParagraphs())

        html.Save(Console.Out, SaveOptions.OmitDuplicateNamespaces)

    End Sub

    Function GetParagraphs() As IEnumerable(Of XElement)
        Dim ps() As String = {"Paragraph 1.", "Paragraph 2."}
        Return (From p In ps Select <p><%= p %></p>)
    End Function

End Module

Now the output is fine:

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Example</title>
</head>
<body>
<p>Paragraph 1.</p>
<p>Paragraph 2.</p>
</body>
</html>

Have you ever wondered why an XDocument or XElement in .NET 3.5 could be saved to a TextWriter or a file or an XmlWriter but not directly to a Stream? In .NET 3.5 you need to construct an XmlWriter or TextWriter over a Stream but now in .NET 4.0 you can save directly to a Stream: XDocument.Save(Stream), XElement.Save(Stream). No functionality gain but a convenient addition, for instance when you want to send the serialization of an XDocument or XElement to the request stream of an HttpWebRequest. There are also corresponding Load methods taking a Stream as the input, XDocument.Load(Stream), XElement.Load(Stream).
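As a small illustration of the new Stream overloads, the following sketch saves an XDocument to a MemoryStream and loads it back. The class name StreamRoundTrip and the method name RoundTrip are made up for illustration; a MemoryStream simply stands in for whatever stream you actually have, such as a request stream.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class StreamRoundTrip
{
    // Serializes the document to a stream and parses it back,
    // using the Save(Stream) and Load(Stream) overloads added in .NET 4.0.
    public static XDocument RoundTrip(XDocument doc)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            doc.Save(stream);

            // Rewind before reading what has just been written.
            stream.Position = 0;

            return XDocument.Load(stream);
        }
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse("<root><foo>foo 1</foo><bar>bar 1</bar></root>");

        XDocument copy = RoundTrip(doc);

        Console.WriteLine(XNode.DeepEquals(doc, copy));
    }
}
```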

 

There is also a new enumeration ReaderOptions in System.Xml.Linq but so far I have not found any method or property using that enumeration.

 

XmlReaderSettings has a new property DtdProcessing that replaces the now obsolete ProhibitDtd property. With the boolean property ProhibitDtd you could choose to either allow DTD parsing/processing or to prohibit it. With the new DtdProcessing property you now have three choices: prohibit, parse, or ignore. Ignore can give you performance benefits over parse. For instance, the following parses the W3C home page twice, once ignoring the DTD, once parsing/processing it:

Stopwatch watch = new Stopwatch();
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Ignore;
watch.Start();
using (XmlReader reader = XmlReader.Create(@"http://www.w3.org/", settings))
{
    while (reader.Read())
    {
    }
}
watch.Stop();
Console.WriteLine("DtdProcessing.Ignore: Elapsed time: {0}", watch.Elapsed);

settings.DtdProcessing = DtdProcessing.Parse;
// reset the stopwatch so the second measurement does not include the first
watch.Reset();
watch.Start();
using (XmlReader reader = XmlReader.Create(@"http://www.w3.org/", settings))
{
    while (reader.Read())
    {
    }
}
watch.Stop();
Console.WriteLine("DtdProcessing.Parse: Elapsed time: {0}", watch.Elapsed);

 

The output for me here is

DtdProcessing.Ignore: Elapsed time: 00:00:01.5245222
DtdProcessing.Parse: Elapsed time: 00:00:09.1677892

so ignoring the DTD is about six times faster for that sample document. On the other hand, if the DTD defines any entities that are then referenced in the XML document, ignoring the DTD gives you an exception:

XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Ignore;

string xhtml = @"<!DOCTYPE html
PUBLIC ""-//W3C//DTD XHTML 1.0 Strict//EN""
""http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"">
<html xml:lang=""en"">
<head>
<title>Example</title>
</head>
<body>
<p>Price is: 100 &euro;</p>
</body>
</html>";

using (XmlReader reader = XmlReader.Create(new StringReader(xhtml), settings))
{
    while (reader.Read()) { }
}

throws the exception “Reference to undeclared entity ‘euro'”.
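To compare the three DtdProcessing choices side by side without any network access, here is a minimal sketch that uses a small document with an internal DTD subset. The class name DtdModes, the method name TryParse and the sample document are my own assumptions for illustration. With a referenced entity, Prohibit and Ignore should both end in an XmlException (for different reasons), while Parse should read the document through.

```csharp
using System;
using System.IO;
using System.Xml;

public static class DtdModes
{
    // Reads the whole document with the given DtdProcessing setting and
    // reports whether that worked ("ok") or ended in an XmlException.
    public static string TryParse(string xml, DtdProcessing mode)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.DtdProcessing = mode;

        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
            {
                while (reader.Read()) { }
            }
            return "ok";
        }
        catch (XmlException)
        {
            return "XmlException";
        }
    }

    public static void Main()
    {
        // The entity is declared in the internal subset, so nothing is downloaded.
        string xml = "<!DOCTYPE doc [<!ENTITY greeting \"hello\">]><doc>&greeting;</doc>";

        DtdProcessing[] modes = { DtdProcessing.Prohibit, DtdProcessing.Ignore, DtdProcessing.Parse };
        foreach (DtdProcessing mode in modes)
        {
            Console.WriteLine("{0}: {1}", mode, TryParse(xml, mode));
        }
    }
}
```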

 

That’s all I have found so far; I will edit this post when I find more.

[edit 2009-05-26] I have now found a page “What’s new in System.Xml” in the .NET framework 4 Beta 1 documentation. Oddly enough it lists LINQ to XML and the XSLT compiler as new features although both were introduced in .NET 3.5. It also mentions new methods in the XmlConvert class.

Creating XML with namespaces with JavaScript and MSXML

In my previous post I showed how to use the W3C DOM API to create XML with namespaces, using namespace aware methods like createElementNS or setAttributeNS of the W3C DOM Level 2 and 3 Core API. While MSXML (all versions including the latest, MSXML 6) implements the W3C DOM Level 1 Core API it does not implement any of the methods used in the previous post, like createElementNS or setAttributeNS. Instead it has a single method createNode that takes as its first argument the node type, as its second argument the qualified name and as its third argument the namespace you want to create a node in. This post shows how to use that method to create XML with namespaces with MSXML and JavaScript. The examples use MSXML 3 but later versions of MSXML (e.g. 4, 5, 6) expose exactly the same API.

Let’s assume you want to create the following XML document with JavaScript and the MSXML DOM API:

<root xmlns="http://example.com/ns1"><foo><bar>foobar</bar></foo></root>

The key to doing that properly is to understand the following: in the XML markup there is an XML default namespace declaration attribute xmlns="http://example.com/ns1" on the “root” element that is in scope for the “root” element and all its descendant elements (e.g. the “foo” and the “bar” element), meaning all three elements, the “root” element, the “foo” element and the “bar” element, are in that namespace http://example.com/ns1. To create that XML document programmatically you have to create all three elements in that namespace. Thus to create those three elements, you need to call createNode three times, each time passing in the node type (1 for element node), the element name (e.g. 'root') and the namespace (e.g. 'http://example.com/ns1'):

    var doc = new ActiveXObject('Msxml2.DOMDocument.3.0');

var ns1 = 'http://example.com/ns1';

var root = doc.createNode(1, 'root', ns1);
var foo = doc.createNode(1, 'foo', ns1);
var bar = doc.createNode(1, 'bar', ns1);
bar.appendChild(doc.createTextNode('foobar'));
foo.appendChild(bar);
root.appendChild(foo);
doc.appendChild(root);

That creates an XML DOM document that, when serialized, looks like the above document. Here is an example showing that. As you can see, the DOM code did not need to create the default namespace declaration attribute at all; nevertheless, when the document is serialized by accessing the xml property, the serialization adds it.

Let’s look at a further example that includes elements and attributes in two namespaces:

<root xmlns="http://example.com/ns1"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://example.com/ns1 schema.xsd">
  <foo>
    <bar>foobar</bar>
  </foo>
</root>

[Note: for better reading I have inserted whitespace in the XML markup but the code I will show will focus on creating the elements and attributes only, not the whitespace.] In the above XML sample we now have two additional attributes, one namespace declaration attribute xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" and one attribute in that namespace, the xsi:schemaLocation attribute. Again we do not have to create any namespace declaration attributes at all. It suffices to use the previous code, add a call to createNode, passing in 2 as the node type for attribute, the qualified name of the attribute (e.g. 'xsi:schemaLocation') and the namespace (e.g. 'http://www.w3.org/2001/XMLSchema-instance') to create that attribute, and use setAttributeNode to add the attribute to the 'root' element:

var doc = new ActiveXObject('Msxml2.DOMDocument.3.0');

var ns1 = 'http://example.com/ns1';
var xsi = 'http://www.w3.org/2001/XMLSchema-instance';

var root = doc.createNode(1, 'root', ns1);
// node type 2 creates an attribute node
var schemaLocation = doc.createNode(2, 'xsi:schemaLocation', xsi);
schemaLocation.nodeValue = 'http://example.com/ns1 schema.xsd';
root.setAttributeNode(schemaLocation);

var foo = doc.createNode(1, 'foo', ns1);
var bar = doc.createNode(1, 'bar', ns1);
bar.appendChild(doc.createTextNode('foobar'));
foo.appendChild(bar);
root.appendChild(foo);
doc.appendChild(root);

 

Here is an example showing the result. Again, serialization (i.e. accessing the xml property) creates all necessary namespace declaration attributes, as long as elements and attributes have been created in the namespaces they belong to. The only reason to create a namespace declaration attribute explicitly is to enforce its output on an element where the namespace is not used. Let’s look at a further example:

<svg xmlns="http://www.w3.org/2000/svg"
xmlns:xlink="http://www.w3.org/1999/xlink">
<script type="text/ecmascript" xlink:href="foo.js"/>
</svg>

In that XML document the XLink namespace http://www.w3.org/1999/xlink is declared on the root element, the 'svg' element, although the namespace is only used on the 'script' child element’s attribute xlink:href. This time, if we want to ensure the namespace declaration attribute appears on the 'svg' element, we have to explicitly set it on that element (and need to know that namespace declaration attributes are by definition in the namespace http://www.w3.org/2000/xmlns/):

    var doc = new ActiveXObject('Msxml2.DOMDocument.3.0');

var svgNs = 'http://www.w3.org/2000/svg';
var xlinkNs = 'http://www.w3.org/1999/xlink';

var root = doc.createNode(1, 'svg', svgNs);
var xlink = doc.createNode(2, 'xmlns:xlink', 'http://www.w3.org/2000/xmlns/');
xlink.nodeValue = xlinkNs;
root.setAttributeNode(xlink);

var script = doc.createNode(1, 'script', svgNs);
script.setAttribute('type', 'text/ecmascript');
var href = doc.createNode(2, 'xlink:href', xlinkNs);
href.nodeValue = 'foo.js';
script.setAttributeNode(href);

root.appendChild(script);
doc.appendChild(root);

 

Here is an example showing the result. If we did not set the namespace declaration attribute on the 'svg' element then the resulting serialized document would nevertheless be namespace well-formed XML, only serialization would add the namespace declaration on the 'script' element.

So keep two things in mind when creating XML with namespaces programmatically with the MSXML DOM API: you need to create each element and attribute in the namespace it belongs to, passing in the node type, the qualified name and the namespace URI to the createNode method. And you do not need to create namespace declaration attributes explicitly, unless you want to enforce their appearance on an element where the namespace is not used (like the root element of your document).

The first rule is also of importance when you want to add elements or attributes to an already loaded document. Assuming we have loaded the following document

<root xmlns="http://example.com/ns1">
<foo/>
</root>

and want to add a 'bar' element in the same namespace as the other elements, then people often assume they can create a 'bar' element with createElement('bar'), add it to the root element, and that it then takes on the namespace of the root element. That is not the case, however: createElement('bar') creates a 'bar' element in no namespace, and when you insert that as a child of the above 'root' element and serialize, the serialization will add <bar xmlns=""/> to ensure the created element is serialized in no namespace. So to properly add a 'bar' element in the same namespace as the 'root' element you again need to use createNode and pass in the namespace URI:

var bar = doc.createNode(1, 'bar', doc.documentElement.namespaceURI);
doc.documentElement.appendChild(bar);

 

Creating XML with namespaces with JavaScript and the W3C DOM

Let’s assume you want to create the following XML document with JavaScript and the W3C DOM API:

<root xmlns="http://example.com/ns1"><foo><bar>foobar</bar></foo></root>

The key to doing that properly is to understand the following: in the XML markup there is an XML default namespace declaration attribute xmlns="http://example.com/ns1" on the “root” element that is in scope for the “root” element and all its descendant elements (e.g. the “foo” and the “bar” element), meaning all three elements, the “root” element, the “foo” element and the “bar” element, are in that namespace http://example.com/ns1. To create that XML document programmatically you have to create all three elements in that namespace. Thus to create the “root” element you use the createDocument method and pass in the namespace and the name, and to create the descendant elements you use the createElementNS method and each time pass in the namespace and the name:

var ns1 = 'http://example.com/ns1';

var doc = document.implementation.createDocument(ns1, 'root', null);
var foo = doc.createElementNS(ns1, 'foo');
var bar = doc.createElementNS(ns1, 'bar');
bar.appendChild(doc.createTextNode('foobar'));
foo.appendChild(bar);
doc.documentElement.appendChild(foo);

That creates an XML DOM document that, when serialized, looks like the document above. Here is an example showing that. As you can see, the DOM code did not need to create the default namespace declaration attribute at all; nevertheless the serializer adds it when serializing the DOM tree.
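If you want to check the serialized result yourself, a small helper along the following lines may do. This is only a sketch: the function name serializeXml is my own invention, and it assumes that the environment either provides the XMLSerializer object (as Mozilla, Opera and Safari do) or hands you MSXML nodes, which expose their markup through the xml property.

```javascript
// Sketch only: serializeXml is a hypothetical helper name, not part of any API.
function serializeXml(node) {
  if (typeof XMLSerializer !== 'undefined') {
    // W3C-style browsers: the serializer emits the needed namespace declarations.
    return new XMLSerializer().serializeToString(node);
  }
  if (typeof node.xml === 'string') {
    // MSXML nodes expose their serialization as the xml property.
    return node.xml;
  }
  throw new Error('No XML serializer available');
}
```

Calling serializeXml(doc) on the document built above should then show the xmlns declaration on the root element, although the DOM code never created that attribute.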

Let’s look at a further example that includes elements and attributes in two namespaces:

<root xmlns="http://example.com/ns1"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://example.com/n1 schema.xsd">
<foo>
<bar>foobar</bar>
</foo>
</root>

[Note: for better reading I have inserted whitespace in the XML markup but the code I will show will focus on creating the elements and attributes only, not the whitespace.] In the above XML sample we now have two additional attributes, one namespace declaration attribute xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" and one attribute in that namespace, the xsi:schemaLocation attribute. Again we do not have to create any namespace declaration attributes at all. It suffices to use the previous code and add a call to setAttributeNS, passing in the namespace, the qualified name and the value, to create the xsi:schemaLocation attribute:

var ns1 = 'http://example.com/ns1';
var xsi = 'http://www.w3.org/2001/XMLSchema-instance';

var doc = document.implementation.createDocument(ns1, 'root', null);
doc.documentElement.setAttributeNS(xsi, 'xsi:schemaLocation', 'http://example.com/n1 schema.xsd');

var foo = doc.createElementNS(ns1, 'foo');
var bar = doc.createElementNS(ns1, 'bar');
bar.appendChild(doc.createTextNode('foobar'));
foo.appendChild(bar);
doc.documentElement.appendChild(foo);

 

Here is an example showing the result. Again, the serializer creates all necessary namespace declaration attributes, as long as elements and attributes have been created in the namespaces they belong to. The only reason to create a namespace declaration attribute explicitly is to enforce its output on an element where the namespace is not used. Let’s look at a further example:

<svg xmlns="http://www.w3.org/2000/svg"
xmlns:xlink="http://www.w3.org/1999/xlink">
<script type="text/ecmascript" xlink:href="foo.js"/>
</svg>

In that XML document the XLink namespace http://www.w3.org/1999/xlink is declared on the root element, the ‘svg’ element, although the namespace is only used on the ‘script’ child element’s attribute xlink:href. This time, if we want to ensure the namespace declaration attribute appears on the ‘svg’ element, we have to explicitly set it on that element (and need to know that namespace declaration attributes are by definition in the namespace http://www.w3.org/2000/xmlns/):

var svgNs = 'http://www.w3.org/2000/svg';
var xlinkNs = 'http://www.w3.org/1999/xlink';

var doc = document.implementation.createDocument(svgNs, 'svg', null);
doc.documentElement.setAttributeNS('http://www.w3.org/2000/xmlns/', 'xmlns:xlink', xlinkNs);

var script = doc.createElementNS(svgNs, 'script');
script.setAttributeNS(null, 'type', 'text/ecmascript');
script.setAttributeNS(xlinkNs, 'xlink:href', 'foo.js');

doc.documentElement.appendChild(script);

 

Here is an example showing the result. If we did not set the namespace declaration attribute on the ‘svg’ element then the resulting serialized document would nevertheless be namespace well-formed XML, only the serializer would add the namespace declaration on the ‘script’ element.

So keep two things in mind when creating XML with namespaces programmatically with the W3C DOM API: you need to create each element and attribute in the namespace it belongs to, passing in the namespace URI to methods like createElementNS or setAttributeNS. And you do not need to create a namespace declaration attribute explicitly, unless you want to enforce its appearance on an element where the namespace is not used (like the root element of your document).

The first rule is also of importance when you want to add elements or attributes to an already loaded document. Assuming we have loaded the following document

<root xmlns="http://example.com/ns1">
<foo/>
</root>

and want to add a ‘bar’ element in the same namespace as the other elements, people often assume they can create a ‘bar’ element with createElement('bar'), add it to the root element, and have it take on the namespace of the root element. That is not the case, however: createElement('bar') creates a ‘bar’ element in no namespace, and when you insert it as a child of the above ‘root’ element and serialize, the serializer will add <bar xmlns=""/> to ensure the created element is serialized in no namespace. So to properly add a ‘bar’ element in the same namespace as the ‘root’ element you again need to use createElementNS and pass in the namespace URI:

var bar = doc.createElementNS(doc.documentElement.namespaceURI, 'bar');
doc.documentElement.appendChild(bar);
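If you add children in several places, the same idea can be wrapped into a helper. The following is a minimal sketch; the name appendInSameNamespace is hypothetical, and it assumes the parent is an element node that has an ownerDocument supporting createElementNS.

```javascript
// Sketch only: appendInSameNamespace is a hypothetical helper name.
function appendInSameNamespace(parent, localName) {
  // Create the child in the parent's namespace instead of in no namespace.
  var child = parent.ownerDocument.createElementNS(parent.namespaceURI, localName);
  // appendChild returns the node that was appended.
  return parent.appendChild(child);
}
```

With the document above, appendInSameNamespace(doc.documentElement, 'bar') would do the same as the two lines shown before.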

 

XPath over DOM with adjacent text nodes or CDATA section nodes

XPath has its own data model that differs from the W3C DOM data model. The DOM data model distinguishes between Text nodes and CDATASection nodes and allows adjacent Text nodes or CDATASection nodes. The XPath data model on the other hand knows only one type of text node, and there can never be adjacent text nodes. So with the following XML snippet

<root>foo and bar <![CDATA[foo & bar]]></root>

the ‘root’ element in the W3C DOM model has two child nodes, a Text node with node value “foo and bar ” and a CDATASection node with node value “foo & bar”, whilst in the XPath data model the ‘root’ element has only one child node, a Text node with string value “foo and bar foo & bar” which could be selected by the XPath expression /root/text().

There are however lots of XPath implementations that operate on the W3C DOM model. In particular, browsers like the Mozilla browsers (e.g. Firefox, SeaMonkey), Opera and Safari use the W3C DOM to represent XML or HTML documents and expose an XPath 1.0 API to JavaScript, based on the W3C DOM Level 3 XPath specification. Microsoft has various implementations of the W3C DOM: in the ActiveX/COM world there are the various MSXML versions, where MSXML 3, 4, 5, and 6 expose an XPath 1.0 API that can be used inside of Internet Explorer for instance, and in the .NET framework there is System.Xml.XmlNode and its subclasses as a DOM implementation also exposing an XPath 1.0 API. In the Java world there is the JAXP XPath API in the package javax.xml.xpath, which is said to be object model neutral but uses the W3C DOM model as its default model.

With the XPath data model the XPath expression /root/text() returns a node-set with a single text node, but what should an API operating on a W3C DOM model return? The various implementations seem to have chosen two approaches: one is to return all W3C DOM Text and CDATASection nodes where XPath sees only a single text node, the other is to return only the first W3C DOM Text or CDATASection node. The latter is what the W3C DOM Level 3 XPath specification says in its mapping between the DOM and the XPath model: “Instead of returning multiple nodes where XPath sees a single logical text node, only the first non-empty DOM Text or CDATASection node of any logical XPath text will be returned in the node set.”
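To find out which of the two approaches a given browser takes, you can evaluate the expression through the W3C DOM Level 3 XPath API and look at the nodes you get back. The following is a rough sketch: selectNodes is a hypothetical helper name of mine, and the numeric constant 7 stands in for XPathResult.ORDERED_NODE_SNAPSHOT_TYPE so that the function does not depend on the XPathResult global.

```javascript
// Sketch only: selectNodes is a hypothetical helper name.
var ORDERED_NODE_SNAPSHOT_TYPE = 7; // value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE

function selectNodes(doc, path) {
  // Evaluate the path against the document node, without a namespace resolver.
  var result = doc.evaluate(path, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  var nodes = [];
  for (var i = 0; i < result.snapshotLength; i++) {
    nodes.push(result.snapshotItem(i));
  }
  return nodes;
}
```

For the snippet above, selectNodes(doc, '/root/text()').length would be 1 in an implementation that follows the specification and 2 in one that returns every Text and CDATASection node.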

Interestingly enough, a test case for the W3C DOM Level 3 XPath API shows that only Opera implements it that way, while Mozilla and Safari don’t follow the specification but instead return a node-set with the two adjacent DOM nodes, a Text node and a CDATASection node. Tests were done with Firefox 3.0, Opera 9.6 and Safari 3.2, all on Windows. If only the first Text node is returned you need a way to access the remaining text that a pure XPath model would give you; the W3C DOM Level 3 XPath specification recommends “Applications using XPath in an environment with fragmented text nodes must manually gather the text of a single logical text node possibly from multiple nodes beginning with the first Text node or CDATASection node returned by the implementation. … In an attempt to better implement the XML Information Set, DOM Level 3 Core [DOM Level 3 Core] adds the attribute wholeText on the Text interface for retrieving the whole text for logically-adjacent Text nodes”. Of the three tested browsers only Opera currently supports that wholeText property. Whether that is the reason that Opera follows the W3C DOM Level 3 XPath specification but Mozilla and Safari do not, I don’t know.
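The manual gathering the specification talks about could look roughly like the following sketch. The helper name logicalText is hypothetical; the function uses wholeText where the implementation provides it and otherwise walks the following siblings for as long as they are Text or CDATASection nodes.

```javascript
// Sketch only: logicalText is a hypothetical helper name.
var TEXT_NODE = 3;          // Node.TEXT_NODE
var CDATA_SECTION_NODE = 4; // Node.CDATA_SECTION_NODE

function isTextLike(node) {
  return node != null &&
    (node.nodeType === TEXT_NODE || node.nodeType === CDATA_SECTION_NODE);
}

function logicalText(first) {
  if (typeof first.wholeText === 'string') {
    // DOM Level 3 Core: text of all logically adjacent text nodes.
    return first.wholeText;
  }
  var text = '';
  for (var n = first; isTextLike(n); n = n.nextSibling) {
    text += n.nodeValue;
  }
  return text;
}
```

In browsers that return every Text and CDATASection node, you would presumably want to call it only for nodes whose previousSibling is not itself a Text or CDATASection node, otherwise the same text is collected twice.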

A similar test case for Internet Explorer, testing all available MSXML versions 3, 4, and 6, shows that the different MSXML versions consistently return two DOM nodes. On the other hand if we translate the test case into the .NET world to test a different DOM implementation from the same vendor:

string xml = @"<root>foo and bar <![CDATA[foo & bar]]></root>";
string path = "/root/text()";

XmlDocument doc1 = new XmlDocument();
doc1.LoadXml(xml);

XmlNodeList nodes1 = doc1.SelectNodes(path);
Console.WriteLine("Found {0} node(s):", nodes1.Count);
foreach (XmlNode node in nodes1)
{
    Console.WriteLine("NodeType: {0}; Value: \"{1}\"", node.NodeType, node.Value);
}

then it shows that only the first XmlText node is returned.

 

What is the best way to deal with such differences, in particular with client-side JavaScript where the same code might run in different browsers? First of all, consider whether you need to access Text or CDATASection nodes at all. Often you only need to access Element nodes, so in the example you could choose the XPath expression /root instead of /root/text() and then use properties of the Element node to access its text content. That would be the textContent property for implementations following the W3C DOM (although it is part of W3C DOM Level 3 Core, which as a whole is not supported very well, the textContent property is supported at least in current versions of Mozilla, Safari and Opera) and the text property for MSXML. In the .NET framework you could use the InnerText property. Keep in mind, however, that all those properties concatenate the text of all descendants of the element, while the XPath text() selects only child text nodes.
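If you select the element but only want the text of its child nodes, as text() would give you, you can walk the children yourself. The following is a sketch under the assumption that firstChild, nextSibling, nodeType and nodeValue are available, which should hold for the W3C DOM implementations in browsers as well as for MSXML; the name childText is hypothetical.

```javascript
// Sketch only: childText is a hypothetical helper name.
function childText(element) {
  var text = '';
  for (var n = element.firstChild; n != null; n = n.nextSibling) {
    // 3 = Node.TEXT_NODE, 4 = Node.CDATA_SECTION_NODE
    if (n.nodeType === 3 || n.nodeType === 4) {
      text += n.nodeValue;
    }
  }
  return text;
}
```

Unlike textContent, text or InnerText, this skips text inside child elements, so it does not depend on how the XPath implementation maps text nodes.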