XML External Entity (XXE) Vulnerability

Overview

Imagine that your website goes down because of a DoS attack. An investigation reveals that a handful of nested "lol" entities expanded to consume 3 GB of memory, which caused the outage. The attackers used an XXE vulnerability to launch the well-known Billion Laughs attack against your site. Although XXE vulnerabilities have existed since the early 2000s, XXE was ranked fourth in the OWASP Top 10 in 2017 because of the ubiquity of its underlying vector, XML, and the significant risk posed by the default configuration of many XML parsers.

What makes XXE attacks so powerful is that they can be used against applications written for many languages and platforms, including C/C++, Java, .NET, and iOS. Identifying and addressing these vulnerabilities is essential to improving web application security. This article covers XXE vulnerabilities in depth, including how they work, their types, detection, and mitigation.

What is XXE Vulnerability?

The XML external entity vulnerability, often known as XXE, is a web security flaw that enables an attacker to interfere with how an application processes XML input. It often allows an attacker to view files on the application server's filesystem and to interact with any external or back-end systems that the application itself can access.

In some circumstances, an attacker can use the XXE vulnerability to launch server-side request forgery (SSRF) attacks, which can escalate an XXE attack into a compromise of the underlying server or other back-end infrastructure.

How Does XXE Work?

XML is a common standard among developers for transferring data between a web browser and a server.

XML requires a parser, and this is frequently where weaknesses are introduced. With XXE, an entity can be defined to take its value from a file path or URL. When the server receives the XML attack payload, it resolves the external entity, incorporates the result into the parsed document, and may then send that document, now containing sensitive data, back to the user.

Because the entity can point at a URL as well as a file, the same mechanism lets an attacker launch SSRF attacks against systems that the server can reach.

How Do XXE Vulnerabilities Get Executed?

The XXE vulnerability exists because the XML standard includes potentially harmful features and XML parsers support them. Attackers identify vulnerable applications by submitting XML requests that carry malicious payloads within the Document Type Definition (DTD). A weakly configured XML parser retrieves, validates, and resolves the harmful external entities declared inside these DTDs.

As a result, the attacker is given the ability to access private information or resources and achieve their objectives.

Types of XXE Attacks (With Code Examples)

Attackers frequently target external XML entities because XML parsers typically do not inspect the external content they resolve. That content can be anything, including malicious payloads, which is what makes XXE attacks dangerous.

XXE attacks can be carried out via a number of techniques, including:

Billion Laughs Attack

Two standards, XSD and DTD, may be used to describe the structure of an XML document. A document that is processed with DTD support enabled is susceptible to XXE attacks.

Consider an XML document that declares a DTD called mytype, which defines an XML entity called name. When the document references this entity, the XML parser reads the DTD and replaces the reference with the entity's value in the parsed output.

Let's now examine how an attacker may execute the "billion laughs" attack. This attack uses nested entity definitions, each of which expands to several copies of the previous one, to exhaust the memory of an XML parser that does not cap memory use.
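As a minimal sketch of this substitution, the following Python snippet parses a small document with the standard library's xml.etree.ElementTree. The DTD name mytype and the entity name come from the text; the entity value "Alice" and the surrounding greeting are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The internal DTD subset declares one internal entity called "name".
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE mytype [
  <!ENTITY name "Alice">
]>
<mytype>Hello, &name;!</mytype>"""

root = ET.fromstring(DOCUMENT)

# The parser has already replaced &name; with the entity's value.
greeting = root.text
print(greeting)
```

The entity here is internal, meaning its value is written in the document itself. The attacks described in this article abuse the same substitution step with nested or external values.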

This is essentially a type of denial of service (DoS) attack that can prevent users from accessing an application that uses an XML parser.
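To make the amplification concrete, here is a rough sketch that builds a billion-laughs style payload and computes how large it would become once fully expanded. It assumes the commonly cited shape of the attack (a "lol" seed and nine layers of ten references each); the function names are hypothetical, and the payload is deliberately never parsed.

```python
def build_billion_laughs(levels=9, fanout=10, seed="lol"):
    """Build a payload in which every entity expands to `fanout` copies
    of the entity declared on the previous line."""
    declarations = ['<!ENTITY lol "%s">' % seed]
    previous = "lol"
    for level in range(1, levels + 1):
        name = "lol%d" % level
        references = ("&%s;" % previous) * fanout
        declarations.append('<!ENTITY %s "%s">' % (name, references))
        previous = name
    dtd = "\n  ".join(declarations)
    return (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE lolz [\n  %s\n]>\n"
        "<lolz>&%s;</lolz>" % (dtd, previous)
    )


def expanded_size(levels=9, fanout=10, seed="lol"):
    """Number of characters the root element holds after full expansion."""
    return len(seed) * fanout ** levels


payload = build_billion_laughs()
print("payload size on the wire:", len(payload), "characters")
print("size after expansion:    ", expanded_size(), "characters")
```

With these defaults the request body stays under a kilobyte, while the expanded text is 3 × 10^9 characters, which is where the roughly 3 GB figure associated with this attack comes from.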

XXE SSRF Attack

Let's now examine how a similar technique may be utilized to carry out server-side request forgery (SSRF).

In this case the attacker declares an external entity, that is, an entity whose value is loaded from a location outside the document. When that location is a URL the server can reach, XXE turns into a server-side request forgery (SSRF) attack.

The attacker uses the entity's SYSTEM identifier to name the resource to load. Because many XML parsers resolve external entities automatically, the server fetches that resource on the attacker's behalf.

The same mechanism can be used to retrieve the contents of a sensitive file such as /etc/hosts: the payload declares an external entity whose SYSTEM identifier is file:///etc/hosts and references it in an element that the application echoes back.
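The following sketch shows what such a payload looks like and which resource it asks the parser to load. It uses Python's xml.parsers.expat with a handler that only records the system identifier and never opens it, so it illustrates the request a vulnerable parser would receive rather than performing the attack. The element name foo, the entity name xxe, and the helper name are assumptions made for this example.

```python
from xml.parsers import expat

# An external entity whose SYSTEM identifier points at a local file.
PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/hosts">
]>
<foo>&xxe;</foo>"""


def requested_external_entities(xml_text):
    """Return the system identifiers the document asks the parser to load."""
    requested = []

    def on_external_entity(context, base, system_id, public_id):
        # Record the request only; nothing is read from disk or the network.
        requested.append(system_id)
        return 1  # non-zero tells expat to carry on parsing

    parser = expat.ParserCreate()
    parser.ExternalEntityRefHandler = on_external_entity
    parser.Parse(xml_text, True)
    return requested


print(requested_external_entities(PAYLOAD))
```

A parser that resolves external entities would open this identifier and splice the file's contents into the foo element. Replacing the file URI with an http URL for an internal host gives the SSRF variant described above.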

This technique may be expanded to acquire access to files other than system files on the server. Some XML parsers enable the retrieval of directory listings, which can then be used to locate additional sensitive material on the system.

XXE attacks of this kind, however, can generally only retrieve files that contain valid XML or plain text. Binary files, and files that contain XML-like markup but are not well-formed XML, cause a parser error instead, so the attacker cannot view their contents.

Blind XXE vulnerabilities occur when an application processes external XML entities in an unsafe manner but does not return the values of those entities in its responses. As a result, attackers need more sophisticated techniques to find and exploit the vulnerability.

Blind XXE still allows for data exfiltration by attackers, for instance by forcing the server to connect to an attacker-controlled URL.

Real-Life Examples of XXE Vulnerability

Here are some cases of XXE vulnerabilities in the real world:

Android development tools - Android Studio, Eclipse, and APKTool are among the most popular Android development tools. All of them parsed XML in a way that let attackers gain access through external entities, a serious vulnerability in these programs. It was fixed in newer versions, but it remains a good reminder that security must be a priority during development, even when using well-regarded third-party tools.
WordPress - WordPress is the most widely used content management system in the world and powers about 40% of all websites. It has also had an XXE vulnerability, which affected WordPress versions older than 5.7.1 and allowed remote attackers to carry out server-side request forgery (SSRF) and arbitrary file disclosure attacks. The WordPress security team was made aware of the vulnerability and fixed it in version 5.7.1.

Examples of XXE Attack Payloads

Accessing a Local Resource That May Not Return

Execution of Remote Code

Remote code execution (RCE) may be possible if luck favours the attacker and the PHP "expect" module is loaded, because the payload can then point the external entity at an expect:// URI that runs a command.

Disclosing /etc/passwd or Other Targeted Files
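The three payload types listed in this section share one template and differ only in the system identifier. The sketch below generates them from a hypothetical helper; the identifiers are the ones commonly used to illustrate these cases (file:///dev/random, expect://id, and file:///etc/passwd), and the element and entity names are assumptions. As a sanity check it also feeds each payload to Python's xml.etree.ElementTree, which does not expand external entities and raises a ParseError instead.

```python
import xml.etree.ElementTree as ET


def xxe_payload(system_id):
    """Wrap a system identifier in a minimal external-entity payload."""
    return (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE foo [\n"
        '  <!ENTITY xxe SYSTEM "%s">\n'
        "]>\n"
        "<foo>&xxe;</foo>" % system_id
    )


# Illustrative identifiers for the three cases in this section.
VARIANTS = {
    "local resource that may not return": "file:///dev/random",
    "remote code execution (PHP expect module)": "expect://id",
    "file disclosure": "file:///etc/passwd",
}


def is_rejected(payload):
    """True if ElementTree refuses to parse the payload."""
    try:
        ET.fromstring(payload)
    except ET.ParseError:
        return True
    return False


for label, system_id in VARIANTS.items():
    payload = xxe_payload(system_id)
    print(label, "-> rejected by ElementTree:", is_rejected(payload))
```

A parser that does resolve external entities would block on /dev/random, run the id command through the expect wrapper if PHP provides it, or return the password file, depending on the variant.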

Identifying XXE Vulnerabilities

Most XXE vulnerabilities can be found efficiently with a thorough web application scanner. Scanners that draw on machine learning and global threat intelligence can identify the majority of these vulnerabilities quickly and consistently.

Because automated scanning may miss some of these vulnerabilities, manual testing by qualified security specialists remains crucial.

Manual testing typically covers the following XXE types:

File retrieval: The tester defines an external entity that points at a well-known OS file and uses that entity in data that is returned in the application's response.
Blind XXE vulnerabilities: The tester defines an external entity based on a URL to a system that they control and monitors that system for interactions.
XInclude attacks: The tester checks whether the application embeds user-supplied non-XML data in a server-side XML document, and if so uses an XInclude directive to try to retrieve a well-known OS file.

How to Prevent XXE Vulnerability?

Because user-supplied XML input originates from an untrusted source, it is exceedingly difficult to validate the XML document thoroughly enough to prevent this sort of attack.

Instead, the XML processor should be set up to only accept locally defined Document Type Definitions (DTDs) and to reject any inline DTDs that are included in XML documents that users supply.

Since there are several XML parsing engines for various programming languages, each has a unique way of turning off inline DTD to avoid XXE. The documentation for your XML parser may include instructions on how to selectively deactivate inline DTD.
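As one possible illustration of rejecting inline DTDs, the sketch below wraps Python's xml.parsers.expat so that any DOCTYPE declaration aborts parsing before an entity can be declared. The function and exception names are hypothetical, and real applications would normally rely on their parser's documented switch rather than a hand-written wrapper.

```python
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def parse_without_dtd(xml_text):
    """Parse xml_text and return its element names, refusing any DTD."""
    element_names = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Runs when the parser reaches <!DOCTYPE ...>, before any
        # entity declared in the DTD can be used.
        raise DoctypeForbidden("DOCTYPE %r is not allowed" % name)

    def on_start_element(name, attributes):
        element_names.append(name)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start_element
    parser.Parse(xml_text, True)
    return element_names


print(parse_without_dtd("<order><item/></order>"))

try:
    parse_without_dtd('<!DOCTYPE foo [<!ENTITY x "y">]><foo>&x;</foo>')
except DoctypeForbidden as error:
    print("rejected:", error)
```

Refusing the DOCTYPE outright blocks both entity-expansion and external-entity payloads, at the cost of rejecting the rare legitimate document that carries its own DTD. The third-party defusedxml package offers a comparable option for Python's standard parsers.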

Preventing XXE in C/C++

XXE often occurs in C/C++ applications, usually through the XML parser libxml2. Versions of libxml2 before 2.9 resolve external entities by default, and newer versions still do so when options such as XML_PARSE_NOENT or XML_PARSE_DTDLOAD are enabled.

Fortunately, there is a way to stop this from happening. With xmlSetExternalEntityLoader you can install your own entity loader and choose which URLs may be loaded, protecting your application from unintended behavior.

Preventing XXE in Java

Many of Java's XML parsers are vulnerable to XXE in their default configuration, which makes things challenging for you and easy for attackers.

For instance, dom4j, a widely used Java XML library, was vulnerable to XXE in its default configuration, and many Java applications that depend on older versions likely still are. To stop XXE attacks through it, update dom4j to at least version 2.1.3.

Unsafe Java code typically creates a DocumentBuilderFactory (or a similar parser factory) with its default settings and parses user-supplied XML directly, which leaves DOCTYPE declarations and external entities enabled.

This is easy to avoid by disabling DOCTYPE declarations on the factory before parsing, for example by setting the http://apache.org/xml/features/disallow-doctype-decl feature to true.

Preventing XXE in PHP

PHP is one of the most widely used server-side languages. Because so many online applications are built with it, it is an attractive target for malicious attacks.

This is particularly true of XML processing, which is frequently done in PHP. The good news is that XXE prevention is fairly simple here. When you use PHP's built-in libxml-based XML parsers on PHP versions older than 8.0, call libxml_disable_entity_loader(true) before parsing. From PHP 8.0 the function is deprecated, because the required libxml 2.9.0 or later no longer loads external entities by default.

This keeps your application secure by disabling the ability to load external entities.

Extra Prevention Advice

Here are some general recommendations to help you avoid XXE:

Disable DTDs manually. Configure your applications' XML parsers to disallow custom document type definitions (DTDs). Since most applications don't use DTDs, this shouldn't affect functionality, and it thwarts XXE attempts.
Instrument your application server. Add checkpoints to specific sections of your code to track how it executes at runtime and to detect and block unsafe behavior in the classes involved in XML processing. This covers any XML parsers you may have overlooked when hardening your application's code and guards against the worst XXE attacks, those that result in remote code execution.
Deploy security tools. Built-in rules in Web Application Firewalls (WAFs) can block obvious XXE inputs. Dynamic Application Security Testing (DAST) tools can check for XXE vulnerabilities early in the development phase and suggest fixes.
Harden the configuration against XXE. The standard recommended approaches for application hardening also work against XXE. Limit access rights, validate all inputs so that malicious input does not reach the XML processing logic, handle failures gracefully, use authentication and encryption, restrict outbound traffic, and restrict DNS communications.
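To give a feel for the kind of rule a WAF might apply to obvious XXE inputs, here is a deliberately naive pre-filter for request bodies. The marker list and the function name are assumptions for this sketch, not the rule set of any particular product, and the check only sees markers in single-byte-compatible encodings such as UTF-8.

```python
import re

# Byte patterns that rarely appear in ordinary API payloads but are
# present in DTD-based and XInclude-based attacks.
_XXE_MARKERS = re.compile(
    rb"<!DOCTYPE|<!ENTITY|http://www\.w3\.org/2001/XInclude"
)


def looks_like_xxe(body):
    """Return True if the raw request body contains an XXE marker."""
    return _XXE_MARKERS.search(body) is not None


samples = [
    b"<order><id>42</id></order>",
    b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>',
    b'<foo xmlns:xi="http://www.w3.org/2001/XInclude">'
    b'<xi:include parse="text" href="file:///etc/passwd"/></foo>',
]

for sample in samples:
    print(looks_like_xxe(sample), sample[:40])
```

A filter like this is easy to evade, for example by sending the document in UTF-16, so it complements a safely configured parser rather than replacing it.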

Conclusion

Successful exploitation of XXE vulnerabilities has a significant impact. Not only can application availability be compromised, but the door is also opened to various other cyber-attacks and to data theft.
Web applications cannot be properly secured without protecting them against XML External Entity vulnerabilities.
Although XXE attacks constitute a significant danger, they can be prevented by using a well-maintained XML parser and configuring it correctly, together with input validation, appropriate error handling, and restricted filesystem permissions.