XXE and SSRF Detection
The following files were used as context for generating this wiki page:
- .github/workflows/ghcr-publish.yml
- README.md
- requirements.txt
- wshawk/main.py
- wshawk/advanced_cli.py
- wshawk/scanner_v2.py
This page documents WSHawk's detection methodologies for XML External Entity (XXE) and Server-Side Request Forgery (SSRF) vulnerabilities in WebSocket applications. These vulnerabilities enable attackers to read local files, access internal network resources, and exfiltrate data through out-of-band channels.
For general injection vulnerability detection (SQL, NoSQL, Command, LDAP), see Injection Vulnerabilities. For blind vulnerability detection infrastructure, see OAST Blind Vulnerability Detection.
XXE Detection Overview
WSHawk detects XXE vulnerabilities by sending XML payloads containing entity definitions that trigger external resource loading. Detection occurs through two mechanisms:
- Direct Response Analysis: Observing entity content reflected in responses
- OAST Callbacks: Out-of-band detection when entities trigger external DNS/HTTP requests
The scanner operates in both legacy mode (wshawk/main.py:665-704) and enhanced v2 mode (wshawk/scanner_v2.py:450-504) with optional OAST integration.
Detection Confidence Levels:
- HIGH: Entity content reflected in response OR OAST callback received
- MEDIUM: XML parsing errors suggesting entity processing
- LOW: No definitive indicators
XXE Detection Architecture
graph TB
subgraph "XXE Detection Pipeline"
PayloadSource["WSPayloads.get_xxe()<br/>payloads/xxe.txt"]
Injector["Message Injector<br/>MessageAnalyzer<br/>inject_payload_into_message()"]
OASTGen["OASTProvider<br/>generate_payload('xxe')"]
WSConn["WebSocket Connection<br/>ws.send()"]
Response["Response Collector<br/>ws.recv()"]
Verifier["VulnerabilityVerifier<br/>verify_xxe()"]
OASTCheck["OAST Callback Monitor<br/>check_callbacks()"]
end
subgraph "Detection Indicators"
DirectInd["Direct Indicators<br/><!entity<br/>system<br/>file://<br/>root:"]
ErrorInd["Error Indicators<br/>XML Parse Error<br/>Entity Error"]
OASTInd["OAST Indicators<br/>DNS Query<br/>HTTP Callback"]
end
PayloadSource -->|"Static payloads"| Injector
OASTGen -->|"OAST-enabled payload"| Injector
Injector -->|"JSON-wrapped XML"| WSConn
WSConn --> Response
Response --> Verifier
Response --> OASTCheck
Verifier --> DirectInd
Verifier --> ErrorInd
OASTCheck --> OASTInd
DirectInd -->|"HIGH confidence"| VulnReport["Vulnerability Report<br/>type: XXE<br/>severity: CRITICAL"]
ErrorInd -->|"MEDIUM confidence"| VulnReport
OASTInd -->|"HIGH confidence"| VulnReport
Sources: wshawk/scanner_v2.py:450-504, wshawk/main.py:665-704
XXE Payload Sources and Injection
Payload Collection
XXE payloads are loaded from the static collection at runtime:
| Method | File | Purpose |
|--------|------|---------|
| WSPayloads.get_xxe() | wshawk/payloads/xxe.txt | Loads entity injection vectors |
| Default limit | 30 payloads | Performance-optimized subset in v2 |
Sources: wshawk/main.py:129-130, wshawk/scanner_v2.py:455
Injection Strategy
graph LR
subgraph "Legacy Scanner Flow"
XXEPayload["XXE Payload<br/><!DOCTYPE...>"]
DirectSend["Direct WebSocket Send<br/>payload as string"]
end
subgraph "V2 Scanner Flow"
XXEPayloadV2["XXE Payload"]
MessageWrap["JSON Wrapper<br/>{action: 'parse_xml'<br/>xml: payload}"]
OASTSubst["OAST Substitution<br/>Replace callback URL"]
ContextInject["Context-Aware Injection<br/>MessageAnalyzer"]
end
XXEPayload --> DirectSend
XXEPayloadV2 --> OASTSubst
OASTSubst --> MessageWrap
MessageWrap --> ContextInject
ContextInject -->|"Learns from samples"| Structured["JSON Field Injection<br/>xml, data, content"]
Key Differences:
- Legacy: Sends raw XML payloads directly (wshawk/main.py:682)
- V2: Wraps in JSON message structure: {"action": "parse_xml", "xml": payload} (wshawk/scanner_v2.py:472-474)
- OAST-Enabled: Replaces entity URLs with OAST callback endpoints (wshawk/scanner_v2.py:471)
Sources: wshawk/scanner_v2.py:468-477, wshawk/main.py:682
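The V2 flow can be sketched roughly as follows. This is a minimal illustration rather than WSHawk's actual code: the helper name `build_xxe_message`, the regular expression, and the `oast.example` callback URL are assumptions made for the example. Only the `{"action": "parse_xml", "xml": payload}` envelope comes from the source.

```python
import json
import re
from typing import Optional

# Matches the quoted URL in `SYSTEM "..."` (hypothetical pattern, for illustration).
_SYSTEM_URL = re.compile(r'(SYSTEM\s+")[^"]*(")')


def build_xxe_message(payload: str, oast_url: Optional[str] = None) -> str:
    """Wrap an XXE payload in the JSON envelope, optionally swapping in an OAST URL."""
    if oast_url is not None:
        # Use a function as the replacement so the URL is inserted literally.
        payload = _SYSTEM_URL.sub(
            lambda m: m.group(1) + oast_url + m.group(2), payload
        )
    return json.dumps({"action": "parse_xml", "xml": payload})


raw = '<!DOCTYPE r [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><r>&xxe;</r>'
message = build_xxe_message(raw, oast_url="http://abc123.oast.example/xxe")
```

Calling `build_xxe_message(raw)` without an OAST URL leaves the payload unchanged, which corresponds to the static-payload path.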
XXE Detection Methodology
Response Analysis
The VulnerabilityVerifier class performs pattern matching on responses to identify entity processing:
Detection Indicators (wshawk/scanner_v2.py:484-485):
xxe_indicators = ['<!entity', 'system', 'file://', 'root:', 'XML Parse Error']
Detection Logic:
- Convert response to lowercase
- Check if any indicator present
- If matched → HIGH confidence vulnerability
- Report entity processing detected
Sources: wshawk/scanner_v2.py:484-495, wshawk/main.py:685-686
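A minimal sketch of this matching step, combined with the confidence levels listed in the overview, might look like the following. The function name `classify_xxe` and the split into two indicator groups are assumptions for illustration. Note that every indicator is stored in lowercase here, because a mixed-case string such as 'XML Parse Error' can never match a response that has already been lowercased.

```python
from typing import Tuple

# Illustrative indicator groups: the first mirrors the direct indicators above,
# the second holds parser errors, which the overview rates as MEDIUM.
DIRECT_INDICATORS = ["<!entity", "system", "file://", "root:"]
ERROR_INDICATORS = ["xml parse error", "entity error"]


def classify_xxe(response: str, oast_callback: bool = False) -> Tuple[str, str]:
    """Return (confidence, matched evidence) for one response."""
    text = response.lower()
    if oast_callback:
        return "HIGH", "oast callback"
    for indicator in DIRECT_INDICATORS:
        if indicator in text:
            return "HIGH", indicator
    for indicator in ERROR_INDICATORS:
        if indicator in text:
            return "MEDIUM", indicator
    return "LOW", ""


passwd_leak = classify_xxe("root:x:0:0:root:/root:/bin/bash")
parser_error = classify_xxe("XML Parse Error at line 1")
```

Generic words such as `system` can also appear in unrelated error messages, so a match on them alone is weaker evidence than reflected file content.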
OAST Integration Flow
sequenceDiagram
participant Scanner as "WSHawkV2"
participant OAST as "OASTProvider"
participant Target as "WebSocket Target"
participant DNS as "interact.sh DNS"
Scanner->>OAST: start()
OAST->>DNS: Register callback domain
DNS-->>OAST: unique_id.interact.sh
Scanner->>OAST: generate_payload('xxe', 'test0')
OAST-->>Scanner: <!ENTITY xxe SYSTEM "http://unique.interact.sh">
Scanner->>Target: send(xml_payload)
Note over Target: Parser processes entity
Target->>DNS: DNS query for unique.interact.sh
Scanner->>OAST: check_callbacks('test0')
OAST->>DNS: Poll for callbacks
DNS-->>OAST: [DNS query detected]
OAST-->>Scanner: True (blind XXE confirmed)
Scanner->>Scanner: vulnerabilities.append({type: 'XXE', confidence: 'HIGH'})
OAST Start Condition (wshawk/scanner_v2.py:458-465):
if self.use_oast and not self.oast_provider:
    self.oast_provider = OASTProvider(use_interactsh=False, custom_server="localhost:8888")
    await self.oast_provider.start()
Payload Generation (wshawk/scanner_v2.py:471):
oast_payload = self.oast_provider.generate_payload('xxe', f'test{len(results)}')
Sources: wshawk/scanner_v2.py:458-477
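The register, generate, send, poll sequence can be illustrated with an in-memory stand-in. `FakeOASTProvider`, its `simulate_callback` helper, and the `oast.example` domain are hypothetical and exist only for this sketch. The method names `start`, `generate_payload`, and `check_callbacks` follow the diagram, but their real signatures and whether they are coroutines are assumptions.

```python
import asyncio
import uuid
from typing import Dict, Set


class FakeOASTProvider:
    """In-memory stand-in for an OAST provider (hypothetical, illustration only)."""

    def __init__(self, domain: str = "oast.example") -> None:
        self.domain = domain
        self._ids: Dict[str, str] = {}  # test label -> unique subdomain
        self._hits: Set[str] = set()    # subdomains that received a callback

    async def start(self) -> None:
        # A real provider would register a callback domain here; this only yields.
        await asyncio.sleep(0)

    def generate_payload(self, kind: str, label: str) -> str:
        sub = f"{uuid.uuid4().hex[:8]}.{self.domain}"
        self._ids[label] = sub
        return (
            f'<!DOCTYPE r [<!ENTITY xxe SYSTEM "http://{sub}/{kind}">]>'
            "<r>&xxe;</r>"
        )

    def simulate_callback(self, label: str) -> None:
        # Stands in for the target's XML parser resolving the external entity.
        self._hits.add(self._ids[label])

    async def check_callbacks(self, label: str) -> bool:
        return self._ids.get(label) in self._hits


async def demo() -> bool:
    oast = FakeOASTProvider()
    await oast.start()
    payload = oast.generate_payload("xxe", "test0")
    # ... a scanner would send `payload` over the WebSocket at this point ...
    oast.simulate_callback("test0")
    return await oast.check_callbacks("test0")


confirmed = asyncio.run(demo())
```

Keying callbacks by a per-test label is what lets a blind finding be attributed to the specific payload that triggered it.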
SSRF Detection Overview
SSRF detection tests whether the WebSocket application fetches arbitrary URLs provided by the client, potentially exposing internal network resources and cloud metadata services.
Attack Surface: Any WebSocket message field that accepts URLs or triggers server-side HTTP requests:
- Image fetching
- URL preview generation
- Webhook notifications
- External API integration
Sources: wshawk/scanner_v2.py:546-591
SSRF Target Selection
WSHawk tests a curated list of high-value internal targets:
graph TB
subgraph "Internal Network Targets"
Localhost["http://localhost<br/>http://127.0.0.1<br/>Loopback access"]
Private["http://192.168.1.1<br/>http://10.0.0.1<br/>http://172.16.0.1<br/>RFC1918 ranges"]
end
subgraph "Cloud Metadata Services"
AWS["http://169.254.169.254/latest/meta-data/<br/>AWS EC2 metadata"]
GCP["http://metadata.google.internal<br/>Google Cloud metadata"]
end
subgraph "Detection Strategy"
Localhost --> ResponseCheck["Response Analysis<br/>Connection refused?<br/>Timeout?<br/>Internal data?"]
Private --> ResponseCheck
AWS --> ResponseCheck
GCP --> ResponseCheck
ResponseCheck --> Indicators["SSRF Indicators<br/>connection refused<br/>timeout<br/>metadata<br/>instance-id"]
end
Target List (wshawk/scanner_v2.py:551-556):
internal_targets = [
'http://localhost',
'http://127.0.0.1',
'http://169.254.169.254/latest/meta-data/', # AWS metadata
'http://metadata.google.internal', # GCP metadata
]
Sources: wshawk/scanner_v2.py:551-556
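To show why these targets are considered internal, the sketch below classifies each URL using Python's standard `ipaddress` module. The `classify_target` helper is an assumption for illustration and is not part of WSHawk.

```python
import ipaddress
from urllib.parse import urlsplit


def classify_target(url: str) -> str:
    """Label a URL's host as loopback, link-local, private, public, or hostname."""
    host = urlsplit(url).hostname or ""
    if host == "localhost":
        return "loopback"
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return "hostname"  # not an IP literal; would need DNS resolution
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:  # checked before is_private: 169.254.0.0/16 is both
        return "link-local"
    if ip.is_private:
        return "private"
    return "public"


internal_targets = [
    "http://localhost",
    "http://127.0.0.1",
    "http://169.254.169.254/latest/meta-data/",
    "http://metadata.google.internal",
]
labels = [classify_target(t) for t in internal_targets]
```

The AWS metadata address falls in the link-local range, while the GCP endpoint is a hostname that only resolves from inside the cloud network.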
SSRF Detection Methodology
Message Construction
SSRF payloads are injected into URL-accepting JSON fields:
Injection Pattern (wshawk/scanner_v2.py:562):
{
"action": "fetch_url",
"url": "http://169.254.169.254/latest/meta-data/"
}
Response Verification
graph TB
SendSSRF["Send SSRF Payload<br/>ws.send(msg)"]
ReceiveResp["Receive Response<br/>ws.recv() with 3s timeout"]
AnalyzeResp["Response Analysis<br/>Check for indicators"]
subgraph "Response Indicators"
ConnRefused["'connection refused'<br/>Internal port accessible but closed"]
Timeout["'timeout'<br/>Firewall blocking egress"]
Metadata["'metadata'<br/>'instance-id'<br/>Cloud metadata leaked"]
Localhost["'localhost'<br/>Loopback reference"]
end
SendSSRF --> ReceiveResp
ReceiveResp --> AnalyzeResp
AnalyzeResp --> ConnRefused
AnalyzeResp --> Timeout
AnalyzeResp --> Metadata
AnalyzeResp --> Localhost
ConnRefused --> HighConf["HIGH Confidence<br/>SSRF Vulnerability"]
Timeout --> HighConf
Metadata --> HighConf
Localhost --> HighConf
Detection Logic (wshawk/scanner_v2.py:570-572):
ssrf_indicators = ['connection refused', 'timeout', 'metadata', 'instance-id', 'localhost']
if any(ind.lower() in response.lower() for ind in ssrf_indicators):
    # SSRF detected
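A runnable variant of this check is sketched below. It returns the matched indicators instead of a boolean, and it optionally removes a verbatim echo of the payload before matching. That echo handling is an addition for illustration and is not described in the source; the function name is also hypothetical.

```python
from typing import List

SSRF_INDICATORS = ["connection refused", "timeout", "metadata", "instance-id", "localhost"]


def match_ssrf_indicators(response: str, payload: str = "") -> List[str]:
    """Return indicators found in the response, ignoring a verbatim payload echo."""
    text = response.lower()
    if payload:
        # Assumption: a server that merely echoes the URL has not fetched it.
        text = text.replace(payload.lower(), "")
    return [ind for ind in SSRF_INDICATORS if ind in text]


refused = match_ssrf_indicators("Error: Connection refused")
echoed = match_ssrf_indicators('{"echo": "http://localhost"}', "http://localhost")
```

Without the echo handling, a response that simply repeats `http://localhost` would match the `localhost` indicator even though no server-side request took place.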
Vulnerability Report Structure (wshawk/scanner_v2.py:573-581):
{
'type': 'Server-Side Request Forgery (SSRF)',
'severity': 'HIGH',
'confidence': 'HIGH',
'description': f'SSRF vulnerability - accessed {target}',
'payload': target,
'response_snippet': response[:200],
'recommendation': 'Validate and whitelist allowed URLs'
}
Sources: wshawk/scanner_v2.py:546-591
Rate Limiting and Resilience
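Both tests pace their requests so that the scan does not overload the target or trip rate-based defenses. The XXE test uses a fixed delay between payloads, while the SSRF test goes through the configured rate limiter: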
XXE Rate Limiting (wshawk/scanner_v2.py:500):
await asyncio.sleep(0.05) # 50ms delay between payloads
SSRF Rate Limiting (wshawk/scanner_v2.py:560):
await self.rate_limiter.acquire() # Token bucket rate control
Timeout Configuration:
- XXE: 2-second response timeout (wshawk/scanner_v2.py:480)
- SSRF: 3-second response timeout for slow metadata services (wshawk/scanner_v2.py:567)
Sources: wshawk/scanner_v2.py:500, wshawk/scanner_v2.py:560, wshawk/scanner_v2.py:480, wshawk/scanner_v2.py:567
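For readers unfamiliar with token-bucket rate control, here is a minimal asyncio sketch of the idea. The `TokenBucket` class and its parameters are assumptions for illustration; WSHawk's own rate limiter may differ in both interface and behavior.

```python
import asyncio
import time


class TokenBucket:
    """Minimal asyncio token bucket (illustrative; not WSHawk's rate limiter)."""

    def __init__(self, rate: float, capacity: int) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def _refill(self) -> None:
        now = time.monotonic()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    async def acquire(self) -> None:
        while True:
            self._refill()
            if self.tokens >= 1:
                self.tokens -= 1
                return
            # Sleep roughly long enough for the missing fraction of a token.
            await asyncio.sleep((1 - self.tokens) / self.rate)


async def send_all(payloads, limiter):
    sent = []
    for payload in payloads:
        await limiter.acquire()
        sent.append(payload)  # a real scanner would call ws.send(payload) here
    return sent


sent = asyncio.run(send_all(["a", "b", "c"], TokenBucket(rate=200, capacity=2)))
```

Compared with a fixed `asyncio.sleep`, a bucket allows short bursts up to `capacity` while still bounding the long-run request rate.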
Integration with Scan Workflow
Heuristic Scan Integration
Both XXE and SSRF tests are integrated into the main run_heuristic_scan() workflow:
graph TB
Start["run_heuristic_scan()"]
Connect["WebSocket Connect"]
Learning["Learning Phase<br/>5 seconds"]
SQLi["test_sql_injection_v2()"]
XSS["test_xss_v2()"]
Cmd["test_command_injection_v2()"]
PathTrav["test_path_traversal_v2()"]
XXE["test_xxe_v2()<br/>Line 628"]
NoSQL["test_nosql_injection_v2()"]
SSRF["test_ssrf_v2()<br/>Line 634"]
Session["SessionHijackingTester"]
Report["Generate Report"]
Start --> Connect
Connect --> Learning
Learning --> SQLi
SQLi --> XSS
XSS --> Cmd
Cmd --> PathTrav
PathTrav --> XXE
XXE --> NoSQL
NoSQL --> SSRF
SSRF --> Session
Session --> Report
Call Sites:
- XXE: wshawk/scanner_v2.py:628
- SSRF: wshawk/scanner_v2.py:634
Sources: wshawk/scanner_v2.py:593-852
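The sequential ordering in the diagram can be sketched as a simple loop over test coroutines. The `run_in_order` helper and the stub tests are hypothetical, and the choice to continue after a failing test is an assumption rather than documented WSHawk behavior.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

TestFn = Callable[[], Awaitable[list]]


async def run_in_order(tests: List[Tuple[str, TestFn]]) -> Dict[str, list]:
    """Run each test coroutine in sequence and collect its findings by name."""
    results: Dict[str, list] = {}
    for name, test in tests:
        try:
            results[name] = await test()
        except Exception as exc:  # assumption: one failing test should not end the scan
            results[name] = [{"error": str(exc)}]
    return results


async def fake_xxe() -> list:
    return [{"type": "XXE"}]


async def fake_nosql() -> list:
    raise RuntimeError("connection closed")


async def fake_ssrf() -> list:
    return []


results = asyncio.run(
    run_in_order([("xxe", fake_xxe), ("nosql", fake_nosql), ("ssrf", fake_ssrf)])
)
```

Running the tests one after another over a single connection keeps responses attributable to the payload that was just sent.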
Configuration
Scanner V2 Configuration
XXE and SSRF detection are controlled through the scanner configuration:
OAST Toggle (wshawk/scanner_v2.py:82-83):
self.use_oast = True
self.oast_provider = None
CLI Override (wshawk/advanced_cli.py:41-42):
wshawk-advanced ws://target.com --no-oast # Disable OAST testing
Full Feature Mode (wshawk/advanced_cli.py:174-177):
wshawk-advanced ws://target.com --full # Enables OAST + Playwright + Smart Payloads
Configuration File
The hierarchical configuration system supports OAST and scanning feature toggles:
scanner:
  rate_limit: 10
  features:
    oast: true
    playwright: true
    smart_payloads: false
Configuration Loading (wshawk/scanner_v2.py:48-53):
if config is None:
    from .config import WSHawkConfig
    self.config = WSHawkConfig.load()
Sources: wshawk/scanner_v2.py:82-83, wshawk/advanced_cli.py:41-42, wshawk/advanced_cli.py:174-177, wshawk/scanner_v2.py:48-53
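One way the CLI flags and the configuration file could combine is sketched below with `argparse`. The precedence order (`--no-oast`, then `--full`, then the file, then a default of enabled), the `resolve_oast` helper, and the `scanner.features.oast` key path are all assumptions for illustration. Only the flag names come from the commands shown above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="wshawk-advanced")
    parser.add_argument("url")
    parser.add_argument("--no-oast", action="store_true", help="Disable OAST testing")
    parser.add_argument("--full", action="store_true", help="Enable all features")
    return parser


def resolve_oast(config: dict, args: argparse.Namespace) -> bool:
    """Decide whether OAST is enabled (assumed precedence: CLI over config file)."""
    if args.no_oast:
        return False
    if args.full:
        return True
    features = config.get("scanner", {}).get("features", {})
    return bool(features.get("oast", True))


parser = build_parser()
file_config = {"scanner": {"features": {"oast": False}}}
from_file = resolve_oast(file_config, parser.parse_args(["ws://target.com"]))
forced_on = resolve_oast(file_config, parser.parse_args(["ws://target.com", "--full"]))
```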
Vulnerability Reporting
Report Structure
XXE and SSRF findings are appended to the vulnerabilities list with standardized metadata:
XXE Report Fields:
- type: "XML External Entity (XXE)"
- severity: "HIGH" or "CRITICAL"
- confidence: "HIGH" (direct detection) or "MEDIUM" (error-based)
- description: Entity processing evidence
- payload: Triggering XML payload (truncated to 80 chars)
- response_snippet: First 200 chars of response
- recommendation: "Disable external entity processing"
SSRF Report Fields:
- type: "Server-Side Request Forgery (SSRF)"
- severity: "HIGH"
- confidence: "HIGH"
- description: Accessed internal target details
- payload: Internal URL that was fetched
- response_snippet: First 200 chars of response
- recommendation: "Validate and whitelist allowed URLs"
Vulnerability Append (wshawk/scanner_v2.py:486-495, wshawk/scanner_v2.py:573-581):
self.vulnerabilities.append({...})
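A small helper that produces entries in this shape is sketched below. `build_finding` and its `payload_limit` parameter are hypothetical. The field names and the 80 and 200 character limits are taken from the field lists above.

```python
from typing import Dict, Optional


def build_finding(
    vuln_type: str,
    severity: str,
    confidence: str,
    description: str,
    payload: str,
    response: str,
    recommendation: str,
    payload_limit: Optional[int] = None,
) -> Dict[str, str]:
    """Assemble one report entry, truncating the payload and response snippet."""
    shown_payload = payload if payload_limit is None else payload[:payload_limit]
    return {
        "type": vuln_type,
        "severity": severity,
        "confidence": confidence,
        "description": description,
        "payload": shown_payload,
        "response_snippet": response[:200],
        "recommendation": recommendation,
    }


vulnerabilities = []
vulnerabilities.append(
    build_finding(
        "XML External Entity (XXE)",
        "HIGH",
        "HIGH",
        "Entity processing detected",
        payload="A" * 300,
        response="B" * 500,
        recommendation="Disable external entity processing",
        payload_limit=80,
    )
)
```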
CVSS Scoring
XXE and SSRF vulnerabilities receive CVSS v3.1 scores based on their severity:
| Vulnerability | Base Score | Vector |
|---------------|------------|--------|
| XXE (File Read) | 7.5 - 8.6 | AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N |
| XXE (RCE) | 9.8 | AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H |
| SSRF (Metadata) | 8.6 | AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:N |
| SSRF (Internal) | 7.5 | AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N |
The vector shown for XXE (File Read) scores 7.5; the 8.6 upper bound applies when the scope is changed (S:C).
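The base scores can be reproduced from the vectors with the CVSS v3.1 base-score equations. The sketch below implements only the base metrics and expects vectors without the `CVSS:3.1/` prefix, as written in the table. It is an illustration of the published formula, not code from WSHawk.

```python
import math

WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when the scope changes.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (
        (1 - WEIGHTS["C"][m["C"]])
        * (1 - WEIGHTS["I"][m["I"]])
        * (1 - WEIGHTS["A"][m["A"]])
    )
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (
        8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]] * pr * WEIGHTS["UI"][m["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


ssrf_metadata = base_score("AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:N")
```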
Sources: wshawk/scanner_v2.py:486-495, wshawk/scanner_v2.py:573-581
Legacy Scanner Comparison
The legacy scanner (wshawk/main.py:665-704) implements basic XXE detection without OAST support:
Key Differences:
| Feature | Legacy Scanner | V2 Scanner |
|---------|---------------|------------|
| OAST Integration | ❌ No | ✅ Yes |
| Context-Aware Injection | ❌ No | ✅ Yes with MessageAnalyzer |
| Rate Limiting | ✅ Fixed 0.1s delay | ✅ Token bucket with adaptive control |
| SSRF Detection | ❌ Not implemented | ✅ Implemented |
| Payload Limit | Configurable via max_payloads | Fixed 30 for performance |
| Confidence Scoring | ❌ Basic severity | ✅ ConfidenceLevel enum |
Legacy XXE Test (wshawk/main.py:665-704):
- Uses WSPayloads.get_xxe() without OAST
- Direct pattern matching on indicators
- Appends to self.vulnerabilities with basic structure
Sources: wshawk/main.py:665-704, wshawk/scanner_v2.py:450-504, wshawk/scanner_v2.py:546-591
Remediation Recommendations
WSHawk provides actionable remediation guidance for each detected vulnerability:
XXE Remediation
Recommendation (wshawk/scanner_v2.py:493):
"Disable external entity processing"
Technical Implementation:
- Disable DTD processing entirely
- Disable external entity resolution
- Use safe XML parser configurations:
  - Python: defusedxml library
  - Java: XMLConstants.FEATURE_SECURE_PROCESSING
  - PHP: libxml_disable_entity_loader(true) (deprecated as of PHP 8.0, where external entity loading is disabled by default)
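As an illustration of disabling DTD processing entirely, the sketch below uses only the standard library's expat bindings and rejects any document that declares a DOCTYPE. The `parse_without_dtd` helper and the `DTDForbidden` exception are names chosen for this example. In practice the defusedxml library listed above packages the same kind of protection behind drop-in parser replacements.

```python
from typing import List
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def parse_without_dtd(xml_text: str) -> List[str]:
    """Parse XML and return element names, refusing any document with a DTD."""
    names: List[str] = []
    parser = expat.ParserCreate()

    def forbid_doctype(name, sysid, pubid, has_internal_subset):
        # Entities are declared inside the DTD, so rejecting it blocks XXE outright.
        raise DTDForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = forbid_doctype
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names


safe_names = parse_without_dtd("<order><item/></order>")

try:
    parse_without_dtd(
        '<!DOCTYPE r [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><r>&xxe;</r>'
    )
    blocked = False
except DTDForbidden:
    blocked = True
```

Rejecting the DOCTYPE up front is stricter than disabling external entities alone, and it also rules out entity-expansion attacks that rely on internal DTD subsets.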
SSRF Remediation
Recommendation (wshawk/scanner_v2.py:580):
"Validate and whitelist allowed URLs"
Technical Implementation:
- Implement URL whitelist validation
- Block access to private IP ranges (RFC1918)
- Block cloud metadata endpoints
- Use network-level egress filtering
- Implement request timeout limits
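A minimal sketch of the allowlist and IP-range checks is shown below, using `urllib.parse` and `ipaddress` from the standard library. The `is_url_allowed` helper and the example host names are assumptions for illustration. The sketch does not resolve DNS, so a production check would also need to validate the addresses a hostname resolves to.

```python
import ipaddress
from typing import Set
from urllib.parse import urlsplit


def is_url_allowed(url: str, allowed_hosts: Set[str]) -> bool:
    """Allow only http(s) URLs whose host is allowlisted and not an internal IP."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    if not host:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # host is a name, not an IP literal
    if ip is not None and not ip.is_global:
        # Covers loopback, RFC1918, and link-local (cloud metadata) addresses.
        return False
    return host in allowed_hosts


allowed = {"api.example.com", "169.254.169.254"}
ok = is_url_allowed("https://api.example.com/preview", allowed)
metadata = is_url_allowed("http://169.254.169.254/latest/meta-data/", allowed)
```

Here the metadata address is rejected even though it was mistakenly added to the allowlist, because the IP-range check runs first.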