Deborah's Developer MindScape

         Tips and Techniques for Web and .NET developers.

February 19, 2010

XML Literals: Escaping Characters

Filed under: VB.NET,XML @ 8:14 pm

In XML, several characters have special meaning, such as the less than sign (<), the greater than sign (>), the ampersand (&) and the quotation mark ("). If you just type these characters into your XML literal, the compiler may treat them as markup rather than as text.

NOTE: All of the code in this post is in Visual Basic since C# does not directly support XML literals.

For example, the following code will not compile. The compiler tries to interpret the less than and greater than signs as XML markup.

In VB:

Dim myXML As XElement = <Product>
                      This is a product that has special characters: 
                      Ampersand: &, Apostrophe: ', Quote: " 
                      Less than: <, Greater than: >"
                        </Product>

To get this code to compile, replace the problematic symbols with XML character entities:

  • < becomes &lt;
  • > becomes &gt;
  • & becomes &amp;

The resulting code is as follows:

Dim myXML As XElement = <Product>
                      This is a product that has special characters: 
                      Ampersand: &amp;, Apostrophe: ', Quote: " 
                      Less than: &lt;, Greater than: &gt;"
                        </Product>

This works, but is not very nice to look at, especially if you are not familiar with XML character entities.
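The entities only appear in the source code. As a minimal sketch (the module name EntityDemo and the helper BuildProduct are hypothetical, chosen just for this illustration), reading the element's Value gives back the plain characters:

```vb
Imports System
Imports System.Xml.Linq

Module EntityDemo
    ' Hypothetical helper: builds the element with entities in the literal.
    Function BuildProduct() As XElement
        Return <Product>Ampersand: &amp;, Less than: &lt;, Greater than: &gt;</Product>
    End Function

    Sub Main()
        ' Value returns the decoded characters, not the entities.
        Console.WriteLine(BuildProduct().Value)
        ' Prints: Ampersand: &, Less than: <, Greater than: >
    End Sub
End Module
```

So the entities are needed only where the text is typed directly into the literal; code that reads the element does not have to decode anything.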

Another option is to use the Regex.Escape and Regex.Unescape methods.

NOTE: Be sure to import the System.Text.RegularExpressions namespace.

In VB:

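' Build the text as an ordinary string; only the quotation mark is doubled.
Dim s As String = "This is a product that has special characters: " & _
                    Environment.NewLine & _
                    "Ampersand: &, Apostrophe: ', Quote: "" " & _
                    Environment.NewLine & _
                    "Less than: <, Greater than: >"

' Insert the escaped string into the literal with an embedded expression.
Dim myXML As XElement = <Product>
                            <Description><%= Regex.Escape(s) %></Description>
                        </Product>

' Read the value back and reverse the escaping for display.
Dim description As String = Regex.Unescape(myXML.<Description>.Value)
MessageBox.Show(description)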


This code displays the original string, special characters included, in a MessageBox.


The only character that you still need to escape is the quotation mark, because it delimits the Visual Basic string literal. You can escape it in the string by doubling it ("").

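The embedded expression (<%= %>) inserts the string as text, so the special characters are no longer parsed as markup. Regex.Escape backslash-escapes whitespace and regular expression metacharacters (here, the spaces, periods and line breaks), and Regex.Unescape reverses that for viewing.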
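For comparison, here is a minimal sketch that embeds the string without Regex.Escape. The module name EmbeddedDemo and the helper BuildProduct are hypothetical. The sketch relies on LINQ to XML storing text added through an embedded expression as-is and encoding it only when the element is serialized.

```vb
Imports System
Imports System.Xml.Linq

Module EmbeddedDemo
    ' Hypothetical helper: embeds the raw string, with no Regex.Escape call.
    Function BuildProduct(content As String) As XElement
        Return <Product>
                   <Description><%= content %></Description>
               </Product>
    End Function

    Sub Main()
        Dim s As String = "Ampersand: &, Quote: "", Less than: <, Greater than: >"
        Dim myXML As XElement = BuildProduct(s)

        ' Reading the value back yields the original string.
        Console.WriteLine(myXML.<Description>.Value = s)

        ' The serialized form shows how the text is encoded as XML.
        Console.WriteLine(myXML.ToString())
    End Sub
End Module
```

If the XML is going to another system, this variant may be preferable, since the receiver does not need to know about the regular expression escapes.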



  1.   Emmanuel Huna - February 22, 2010 @ 1:09 pm

    What about the single quote? It does not seem to be escaped either.

  2.   Emmanuel Huna - February 22, 2010 @ 1:22 pm

    Deborah: this works great if I’m doing an internal implementation of the XML – I can encode and decode it.

    But if I’m sending the XML to a 3rd party, Regex.Escape is not really encoding the data to XML standards.

    For example, there’s no need to add a backslash to a space ("\ "), and both single and double quotes should be escaped using &apos; and &quot;.

    Also, a carriage return / line feed should be escaped using &#xD; and &#xA;.

    I tried posting this twice on this site, but both times the site crashed and the comment was not accepted. Maybe it’s the special characters in the comment?

  3.   DeborahK - February 22, 2010 @ 5:07 pm

    Emmanuel -

    Regarding the single quote, I called it an apostrophe in my example above. But it is the same as a single quote.

    Hope this helps.

  4.   DeborahK - February 22, 2010 @ 5:08 pm

    Emmanuel –

    Regarding the comment not showing up: due to a HUGE amount of spam, I have the blog set up so that I have to approve any comments before they are posted to the site. This sometimes causes a delay if I am away from the office.

    Hope this helps.
