What does a <![CDATA[]]> node do in an XML document?

What it is and why use it

CDATA stands for Character Data. You can use it to escape characters that would otherwise be treated as XML markup: the parser passes everything inside a CDATA section through as plain text instead of interpreting it. For example, if you want to include a math expression that contains < or &, you can wrap it in CDATA. Otherwise the parser reads those characters as the start of a tag or an entity reference, and the document is no longer well-formed.

The following XML document will not parse correctly, because the description of the Belgian Waffles contains "5 < 6":

<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
<food>
  <name>Belgian Waffles</name>
  <price>$5.95</price>
  <description>
    Our famous Belgian Waffles with plenty of real maple syrup. 5 < 6
  </description>
  <calories>650</calories>
</food>
<food>
  <name>French Toast</name>
  <price>$4.50</price>
  <description>Thick slices made from our homemade sourdough bread</description>
  <calories>600</calories>
</food>
</breakfast_menu>

If you validate the previous XML document, you'll get an error like this:

XML Invalid data or markup
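
The same failure can be reproduced from code. The snippet below is a minimal sketch using Python's standard xml.etree.ElementTree module; the menu is trimmed down, and the helper name try_parse is made up for illustration. The exact wording of the error depends on the parser, so it will not match the validator message above.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the menu above; the bare "<" in the
# description is what makes the document ill-formed.
BROKEN = """<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
<food>
  <name>Belgian Waffles</name>
  <description>
    Our famous Belgian Waffles with plenty of real maple syrup. 5 < 6
  </description>
</food>
</breakfast_menu>"""


def try_parse(xml_text):
    """Return None if the text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text.encode("utf-8"))
        return None
    except ET.ParseError as err:
        return str(err)


error = try_parse(BROKEN)
print(error)  # a parser-specific message pointing at the offending line
```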

Note that you can also solve this issue by encoding those characters as entities (using htmlspecialchars in PHP, for example):

<example-code>
    while (x &lt; len &amp;&amp; !done) {
        print( &quot;Hello.&quot; );
        ++x;
    }
</example-code>
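
For comparison, here is a small sketch of the same encoding step in Python, using escape and unescape from the standard xml.sax.saxutils module. The sample string is a shortened, one-line version of the loop above.

```python
from xml.sax.saxutils import escape, unescape

raw = 'while (x < len && !done) { print( "Hello." ); }'

# escape() handles &, < and > by default; extra replacements
# (here the double quote) go in the optional mapping.
encoded = escape(raw, {'"': "&quot;"})
print(encoded)
# while (x &lt; len &amp;&amp; !done) { print( &quot;Hello.&quot; ); }

# unescape() reverses it, given the inverse mapping for the extra entity.
decoded = unescape(encoded, {"&quot;": '"'})
print(decoded == raw)  # True
```

Every place that writes the text has to encode it, and every place that reads it as raw text has to decode it exactly once.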

However, this can become tricky, since you need to watch out for when to encode and decode those entities, so it is often preferable to use CDATA instead.

How to use it

CDATA sections may be added anywhere character data may occur; they are used to escape blocks of text containing characters which would otherwise be recognized as markup. CDATA sections begin with the string " <![CDATA[ " and end with the string " ]]> ".

Just wrap the content of a node in a CDATA section:

<node><![CDATA[
  Hello, here is my <b>HTML content</b>, kept as plain data with its HTML tags and without issues. And yes, 5 < 6 :)
]]></node>
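
To see what a parser makes of such a node, here is a short sketch with Python's xml.etree.ElementTree (the text is shortened for the example). Inside CDATA the <b> tag stays part of the text; without CDATA the same tag becomes a child element.

```python
import xml.etree.ElementTree as ET

wrapped = ET.fromstring(
    "<node><![CDATA[my <b>HTML content</b>, and 5 < 6]]></node>"
)
plain = ET.fromstring("<node>my <b>HTML content</b></node>")

# With CDATA: no child elements, the markup is just characters.
print(len(wrapped))   # 0
print(wrapped.text)   # my <b>HTML content</b>, and 5 < 6

# Without CDATA: <b> is parsed as a real child element.
print(len(plain))     # 1
print(plain[0].tag)   # b
```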

So the problem in the first example document can be solved by wrapping the content in a CDATA section:

<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
<food>
  <name>Belgian Waffles</name>
  <price>$5.95</price>
  <description><![CDATA[
    Our famous Belgian Waffles with plenty of real maple syrup. 5 < 6
  ]]></description>
  <calories>650</calories>
</food>
<food>
  <name>French Toast</name>
  <price>$4.50</price>
  <description>Thick slices made from our homemade sourdough bread</description>
  <calories>600</calories>
</food>
</breakfast_menu>
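
As a quick check, the sketch below parses a trimmed-down copy of this menu with Python's xml.etree.ElementTree and reads the description back. It also shows one thing to be aware of with this particular library: it does not keep track of CDATA sections, so when the element is written out again the text is entity-encoded instead.

```python
import xml.etree.ElementTree as ET

FIXED = """<breakfast_menu>
<food>
  <name>Belgian Waffles</name>
  <description><![CDATA[
    Our famous Belgian Waffles with plenty of real maple syrup. 5 < 6
  ]]></description>
</food>
</breakfast_menu>"""

menu = ET.fromstring(FIXED)

# The application sees the plain text, with "<" intact.
description = menu.findtext("food/description").strip()
print(description)
# Our famous Belgian Waffles with plenty of real maple syrup. 5 < 6

# Serializing again: ElementTree escapes the text rather than emitting CDATA.
serialized = ET.tostring(menu.find("food/description"), encoding="unicode")
print("5 &lt; 6" in serialized)   # True
print("CDATA" in serialized)      # False
```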

And instead of converting the characters to entities, we could write:

<example-code><![CDATA[
while (x < len && !done) {
    print( "Hello." );
    ++x;
}
]]></example-code>
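
Both spellings carry the same text as far as a parser is concerned. The sketch below, again using Python's xml.etree.ElementTree, parses the entity-encoded version and the CDATA version and compares them; whitespace is collapsed first because the two examples are indented differently. The helper name normalized_text is made up for illustration.

```python
import xml.etree.ElementTree as ET

ENTITY_VERSION = """<example-code>
    while (x &lt; len &amp;&amp; !done) {
        print( &quot;Hello.&quot; );
        ++x;
    }
</example-code>"""

CDATA_VERSION = """<example-code><![CDATA[
while (x < len && !done) {
    print( "Hello." );
    ++x;
}
]]></example-code>"""


def normalized_text(xml_text):
    """Parse a one-element document and collapse runs of whitespace."""
    return " ".join(ET.fromstring(xml_text).text.split())


print(normalized_text(CDATA_VERSION))
# while (x < len && !done) { print( "Hello." ); ++x; }
print(normalized_text(ENTITY_VERSION) == normalized_text(CDATA_VERSION))  # True
```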

Comes in handy, doesn't it?

Final tips and conclusions

  • A CDATA section cannot contain the string ]]> (CDEnd), because that sequence ends the section; if it appears in the content, the XML will not parse correctly either.
  • Syntactically, it behaves similarly to a comment, but the content of a CDATA section is still part of the document.
  • CDATA sections cannot contain nested document nodes: even valid XML tags inside one are not parsed as elements, because they are just the text content of the enclosing node.
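
If the content may contain ]]>, a common workaround is to split that sequence across two adjacent CDATA sections, so that neither section contains it in full. The sketch below shows the idea in Python; the helper name to_cdata and the sample payload are made up for illustration.

```python
import xml.etree.ElementTree as ET


def to_cdata(text):
    """Wrap text in CDATA, splitting any "]]>" across two sections."""
    # "]]>" becomes "]]" + end of section + start of a new section + ">".
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


payload = "if (a[b[0]]> 1) { <tag/> }"
document = "<code>" + to_cdata(payload) + "</code>"
print(document)
# <code><![CDATA[if (a[b[0]]]]><![CDATA[> 1) { <tag/> }]]></code>

code = ET.fromstring(document)
print(code.text == payload)  # True: the parser joins the two sections
print(len(code))             # 0: <tag/> was never parsed as an element
```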