XXE and SSRF Detection

This page documents WSHawk's detection methodologies for XML External Entity (XXE) and Server-Side Request Forgery (SSRF) vulnerabilities in WebSocket applications. These vulnerabilities enable attackers to read local files, access internal network resources, and exfiltrate data through out-of-band channels.

For general injection vulnerability detection (SQL, NoSQL, Command, LDAP), see Injection Vulnerabilities. For blind vulnerability detection infrastructure, see OAST Blind Vulnerability Detection.

XXE Detection Overview

WSHawk detects XXE vulnerabilities by sending XML payloads containing entity definitions that trigger external resource loading. Detection occurs through two mechanisms:

Direct Response Analysis: Observing entity content reflected in responses
OAST Callbacks: Out-of-band detection when entities trigger external DNS/HTTP requests

The scanner operates in both legacy mode (wshawk/main.py:665-704) and enhanced v2 mode (wshawk/scanner_v2.py:450-504) with optional OAST integration.

Detection Confidence Levels:

HIGH: Entity content reflected in response OR OAST callback received
MEDIUM: XML parsing errors suggesting entity processing
LOW: No definitive indicators
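
The three levels above amount to a small decision rule. The sketch below is illustrative only: the `ConfidenceLevel` name is mentioned in the comparison table on this page, but its members and the `classify_xxe_confidence` helper are assumptions, not WSHawk's actual code.

```python
from enum import Enum


class ConfidenceLevel(Enum):
    # Member names mirror the levels listed above; the values are assumed.
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"


def classify_xxe_confidence(
    reflected: bool, oast_callback: bool, parse_error: bool
) -> ConfidenceLevel:
    """Hypothetical helper mapping observed evidence to a confidence level."""
    if reflected or oast_callback:
        # Entity content came back, or the target called out to the OAST host.
        return ConfidenceLevel.HIGH
    if parse_error:
        # The parser complained, which hints at entity processing.
        return ConfidenceLevel.MEDIUM
    return ConfidenceLevel.LOW
```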

XXE Detection Architecture

graph TB
    subgraph "XXE Detection Pipeline"
        PayloadSource["WSPayloads.get_xxe()<br/>payloads/xxe.txt"]
        Injector["Message Injector<br/>MessageAnalyzer<br/>inject_payload_into_message()"]
        OASTGen["OASTProvider<br/>generate_payload('xxe')"]
        WSConn["WebSocket Connection<br/>ws.send()"]
        Response["Response Collector<br/>ws.recv()"]
        Verifier["VulnerabilityVerifier<br/>verify_xxe()"]
        OASTCheck["OAST Callback Monitor<br/>check_callbacks()"]
    end
    
    subgraph "Detection Indicators"
        DirectInd["Direct Indicators<br/><!entity<br/>system<br/>file://<br/>root:"]
        ErrorInd["Error Indicators<br/>XML Parse Error<br/>Entity Error"]
        OASTInd["OAST Indicators<br/>DNS Query<br/>HTTP Callback"]
    end
    
    PayloadSource -->|"Static payloads"| Injector
    OASTGen -->|"OAST-enabled payload"| Injector
    Injector -->|"JSON-wrapped XML"| WSConn
    WSConn --> Response
    Response --> Verifier
    Response --> OASTCheck
    
    Verifier --> DirectInd
    Verifier --> ErrorInd
    OASTCheck --> OASTInd
    
    DirectInd -->|"HIGH confidence"| VulnReport["Vulnerability Report<br/>type: XXE<br/>severity: CRITICAL"]
    ErrorInd -->|"MEDIUM confidence"| VulnReport
    OASTInd -->|"HIGH confidence"| VulnReport

Sources: wshawk/scanner_v2.py:450-504, wshawk/main.py:665-704

XXE Payload Sources and Injection

Payload Collection

XXE payloads are loaded from the static collection at runtime:

| Item | Source / Value | Purpose |
|------|----------------|---------|
| WSPayloads.get_xxe() | wshawk/payloads/xxe.txt | Loads entity injection vectors |
| Default limit | 30 payloads | Performance-optimized subset in v2 |
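
As a rough illustration of loading a capped payload set, the sketch below reads a text file and keeps the first 30 entries. The `load_payloads` function and the one-payload-per-line layout are assumptions made for this example; the real `WSPayloads.get_xxe()` may parse the file differently.

```python
from pathlib import Path
from typing import List


def load_payloads(path: str, limit: int = 30) -> List[str]:
    """Read one payload per line (assumed layout), skip blanks, cap at `limit`."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    payloads = [line.strip() for line in lines if line.strip()]
    return payloads[:limit]
```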

Sources: wshawk/main.py:129-130, wshawk/scanner_v2.py:455

Injection Strategy

graph LR
    subgraph "Legacy Scanner Flow"
        XXEPayload["XXE Payload<br/><!DOCTYPE...>"]
        DirectSend["Direct WebSocket Send<br/>payload as string"]
    end
    
    subgraph "V2 Scanner Flow"
        XXEPayloadV2["XXE Payload"]
        MessageWrap["JSON Wrapper<br/>{action: 'parse_xml'<br/>xml: payload}"]
        OASTSubst["OAST Substitution<br/>Replace callback URL"]
        ContextInject["Context-Aware Injection<br/>MessageAnalyzer"]
    end
    
    XXEPayload --> DirectSend
    XXEPayloadV2 --> OASTSubst
    OASTSubst --> MessageWrap
    MessageWrap --> ContextInject
    
    ContextInject -->|"Learns from samples"| Structured["JSON Field Injection<br/>xml, data, content"]

Key Differences:

Legacy: Sends raw XML payloads directly (wshawk/main.py:682)
V2: Wraps in JSON message structure: {"action": "parse_xml", "xml": payload} (wshawk/scanner_v2.py:472-474)
OAST-Enabled: Replaces entity URLs with OAST callback endpoints (wshawk/scanner_v2.py:471)
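
The V2 wrapping step can be sketched as follows. The envelope keys come from the description above; the `wrap_xxe_payload` helper name and the sample payload are illustrative assumptions.

```python
import json


def wrap_xxe_payload(payload: str) -> str:
    """Serialize the payload into the JSON envelope described for the V2 scanner."""
    return json.dumps({"action": "parse_xml", "xml": payload})


# Sample payload for illustration only; not taken from payloads/xxe.txt.
raw = '<?xml version="1.0"?><!DOCTYPE d [<!ENTITY x SYSTEM "file:///etc/passwd">]><d>&x;</d>'
message = wrap_xxe_payload(raw)
```

Using `json.dumps` rather than string concatenation keeps the quotes inside the XML correctly escaped, so the target receives valid JSON.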

Sources: wshawk/scanner_v2.py:468-477, wshawk/main.py:682

XXE Detection Methodology

Response Analysis

The VulnerabilityVerifier class performs pattern matching on responses to identify entity processing:

Detection Indicators (wshawk/scanner_v2.py:484-485):

xxe_indicators = ['<!entity', 'system', 'file://', 'root:', 'XML Parse Error']

Detection Logic:

Convert response to lowercase
Check if any indicator present
If matched → HIGH confidence vulnerability
Report entity processing detected
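
A minimal sketch of the matching steps above, assuming a standalone helper named `find_xxe_indicator` (hypothetical). It lowercases both sides of the comparison, because a mixed-case entry such as 'XML Parse Error' could never match a response that has already been lowercased.

```python
from typing import Optional

XXE_INDICATORS = ['<!entity', 'system', 'file://', 'root:', 'XML Parse Error']


def find_xxe_indicator(response: str) -> Optional[str]:
    """Return the first indicator found in the response, ignoring case."""
    lowered = response.lower()
    for indicator in XXE_INDICATORS:
        # Lowercase the indicator too, so mixed-case entries still match.
        if indicator.lower() in lowered:
            return indicator
    return None
```

Broad tokens such as 'system' can also appear in benign responses, so a match is probably best treated as a lead to verify rather than proof on its own.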

Sources: wshawk/scanner_v2.py:484-495, wshawk/main.py:685-686

OAST Integration Flow

sequenceDiagram
    participant Scanner as "WSHawkV2"
    participant OAST as "OASTProvider"
    participant Target as "WebSocket Target"
    participant DNS as "interact.sh DNS"
    
    Scanner->>OAST: start()
    OAST->>DNS: Register callback domain
    DNS-->>OAST: unique_id.interact.sh
    
    Scanner->>OAST: generate_payload('xxe', 'test0')
    OAST-->>Scanner: <!ENTITY xxe SYSTEM "http://unique.interact.sh">
    
    Scanner->>Target: send(xml_payload)
    Note over Target: Parser processes entity
    Target->>DNS: DNS query for unique.interact.sh
    
    Scanner->>OAST: check_callbacks('test0')
    OAST->>DNS: Poll for callbacks
    DNS-->>OAST: [DNS query detected]
    OAST-->>Scanner: True (blind XXE confirmed)
    
    Scanner->>Scanner: vulnerabilities.append({type: 'XXE', confidence: 'HIGH'})

OAST Start Condition (wshawk/scanner_v2.py:458-465):

if self.use_oast and not self.oast_provider:
    self.oast_provider = OASTProvider(use_interactsh=False, custom_server="localhost:8888")
    await self.oast_provider.start()

Payload Generation (wshawk/scanner_v2.py:471):

oast_payload = self.oast_provider.generate_payload('xxe', f'test{len(results)}')
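
To make the idea of an OAST-enabled payload concrete, the sketch below builds an XML document whose external entity points at a per-test callback host. The `build_oast_xxe_payload` function, the subdomain-per-test scheme, and the `oast.example` domain are assumptions; the actual output of `generate_payload()` is not shown in the source.

```python
def build_oast_xxe_payload(callback_domain: str, test_id: str) -> str:
    """Build an XXE document whose external entity targets a per-test callback host."""
    # Assumed convention: the test id becomes a subdomain, so a DNS lookup
    # alone is enough to attribute the callback to this payload.
    callback_url = f"http://{test_id}.{callback_domain}/"
    return (
        '<?xml version="1.0"?>'
        f'<!DOCTYPE data [<!ENTITY xxe SYSTEM "{callback_url}">]>'
        "<data>&xxe;</data>"
    )


payload = build_oast_xxe_payload("oast.example", "test0")
```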

Sources: wshawk/scanner_v2.py:458-477

SSRF Detection Overview

SSRF detection tests whether the WebSocket application fetches arbitrary URLs provided by the client, potentially exposing internal network resources and cloud metadata services.

Attack Surface: Any WebSocket message field that accepts URLs or triggers server-side HTTP requests:

Image fetching
URL preview generation
Webhook notifications
External API integration

Sources: wshawk/scanner_v2.py:546-591

SSRF Target Selection

WSHawk tests a curated list of high-value internal targets:

graph TB
    subgraph "Internal Network Targets"
        Localhost["http://localhost<br/>http://127.0.0.1<br/>Loopback access"]
        Private["http://192.168.1.1<br/>http://10.0.0.1<br/>http://172.16.0.1<br/>RFC1918 ranges"]
    end
    
    subgraph "Cloud Metadata Services"
        AWS["http://169.254.169.254/latest/meta-data/<br/>AWS EC2 metadata"]
        GCP["http://metadata.google.internal<br/>Google Cloud metadata"]
    end
    
    subgraph "Detection Strategy"
        Localhost --> ResponseCheck["Response Analysis<br/>Connection refused?<br/>Timeout?<br/>Internal data?"]
        Private --> ResponseCheck
        AWS --> ResponseCheck
        GCP --> ResponseCheck
        
        ResponseCheck --> Indicators["SSRF Indicators<br/>connection refused<br/>timeout<br/>metadata<br/>instance-id"]
    end

Target List (wshawk/scanner_v2.py:551-556):

internal_targets = [
    'http://localhost',
    'http://127.0.0.1',
    'http://169.254.169.254/latest/meta-data/',  # AWS metadata
    'http://metadata.google.internal',            # GCP metadata
]

Sources: wshawk/scanner_v2.py:551-556

SSRF Detection Methodology

Message Construction

SSRF payloads are injected into URL-accepting JSON fields:

Injection Pattern (wshawk/scanner_v2.py:562):

{
  "action": "fetch_url",
  "url": "http://169.254.169.254/latest/meta-data/"
}
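
Combining the target list with the injection pattern, message construction might look like the sketch below. The `build_ssrf_messages` generator is a hypothetical name; the targets and JSON keys are the ones quoted on this page.

```python
import json
from typing import Iterable, Iterator

INTERNAL_TARGETS = [
    "http://localhost",
    "http://127.0.0.1",
    "http://169.254.169.254/latest/meta-data/",  # AWS metadata
    "http://metadata.google.internal",            # GCP metadata
]


def build_ssrf_messages(targets: Iterable[str] = INTERNAL_TARGETS) -> Iterator[str]:
    """Yield one JSON message per internal target."""
    for target in targets:
        yield json.dumps({"action": "fetch_url", "url": target})


messages = list(build_ssrf_messages())
```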

Response Verification

graph TB
    SendSSRF["Send SSRF Payload<br/>ws.send(msg)"]
    ReceiveResp["Receive Response<br/>ws.recv() with 3s timeout"]
    AnalyzeResp["Response Analysis<br/>Check for indicators"]
    
    subgraph "Response Indicators"
        ConnRefused["'connection refused'<br/>Internal port accessible but closed"]
        Timeout["'timeout'<br/>Firewall blocking egress"]
        Metadata["'metadata'<br/>'instance-id'<br/>Cloud metadata leaked"]
        Localhost["'localhost'<br/>Loopback reference"]
    end
    
    SendSSRF --> ReceiveResp
    ReceiveResp --> AnalyzeResp
    
    AnalyzeResp --> ConnRefused
    AnalyzeResp --> Timeout
    AnalyzeResp --> Metadata
    AnalyzeResp --> Localhost
    
    ConnRefused --> HighConf["HIGH Confidence<br/>SSRF Vulnerability"]
    Timeout --> HighConf
    Metadata --> HighConf
    Localhost --> HighConf

Detection Logic (wshawk/scanner_v2.py:570-572):

ssrf_indicators = ['connection refused', 'timeout', 'metadata', 'instance-id', 'localhost']
if any(ind.lower() in response.lower() for ind in ssrf_indicators):
    ...  # SSRF detected: record the finding

Vulnerability Report Structure (wshawk/scanner_v2.py:573-581):

{
    'type': 'Server-Side Request Forgery (SSRF)',
    'severity': 'HIGH',
    'confidence': 'HIGH',
    'description': f'SSRF vulnerability - accessed {target}',
    'payload': target,
    'response_snippet': response[:200],
    'recommendation': 'Validate and whitelist allowed URLs'
}
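
Putting the indicator check and the report structure together, a self-contained version could look like this. The `check_ssrf_response` function is a hypothetical wrapper; the indicator list and report fields are those quoted above.

```python
from typing import Optional

SSRF_INDICATORS = ['connection refused', 'timeout', 'metadata', 'instance-id', 'localhost']


def check_ssrf_response(target: str, response: str) -> Optional[dict]:
    """Return a report dict if the response contains an SSRF indicator, else None."""
    lowered = response.lower()
    if not any(indicator in lowered for indicator in SSRF_INDICATORS):
        return None
    return {
        'type': 'Server-Side Request Forgery (SSRF)',
        'severity': 'HIGH',
        'confidence': 'HIGH',
        'description': f'SSRF vulnerability - accessed {target}',
        'payload': target,
        'response_snippet': response[:200],  # keep the report compact
        'recommendation': 'Validate and whitelist allowed URLs',
    }
```

A server that simply echoes the submitted URL would also match indicators such as 'localhost', so comparing against a baseline response may help reduce false positives.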

Sources: wshawk/scanner_v2.py:546-591

Rate Limiting and Resilience

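Both tests throttle their sends to avoid overloading the target and to reduce the chance of triggering defenses, but the cited lines use different mechanisms: the XXE loop sleeps for a fixed interval, while the SSRF loop acquires a token from the configured rate limiter.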

XXE Rate Limiting (wshawk/scanner_v2.py:500):

await asyncio.sleep(0.05)  # 50ms delay between payloads

SSRF Rate Limiting (wshawk/scanner_v2.py:560):

await self.rate_limiter.acquire()  # Token bucket rate control

Timeout Configuration:

XXE: 2-second response timeout (wshawk/scanner_v2.py:480)
SSRF: 3-second response timeout for slow metadata services (wshawk/scanner_v2.py:567)
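
For readers unfamiliar with the two mechanisms, the sketch below shows a minimal asyncio token bucket and a receive call bounded by a timeout. Both `TokenBucket` and `recv_with_timeout` are illustrative assumptions; WSHawk's own rate limiter is not reproduced on this page and may behave differently.

```python
import asyncio
import time


class TokenBucket:
    """Minimal token bucket: `rate` tokens per second, up to `capacity` stored."""

    def __init__(self, rate: float, capacity: int) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    async def acquire(self) -> None:
        while True:
            now = time.monotonic()
            # Refill in proportion to elapsed time, never above capacity.
            self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
            self.updated = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            # Sleep roughly until the next token is due, then re-check.
            await asyncio.sleep((1 - self.tokens) / self.rate)


async def recv_with_timeout(recv, seconds: float):
    """Await `recv()` and return None if no reply arrives within `seconds`."""
    try:
        return await asyncio.wait_for(recv(), timeout=seconds)
    except asyncio.TimeoutError:
        return None
```

In a scan loop, `await bucket.acquire()` would precede each send, and `recv_with_timeout(ws.recv, 3)` would stand in for the 3-second SSRF receive.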

Sources: wshawk/scanner_v2.py:500, wshawk/scanner_v2.py:560, wshawk/scanner_v2.py:480, wshawk/scanner_v2.py:567

Integration with Scan Workflow

Heuristic Scan Integration

Both XXE and SSRF tests are integrated into the main run_heuristic_scan() workflow:

graph TB
    Start["run_heuristic_scan()"]
    Connect["WebSocket Connect"]
    Learning["Learning Phase<br/>5 seconds"]
    SQLi["test_sql_injection_v2()"]
    XSS["test_xss_v2()"]
    Cmd["test_command_injection_v2()"]
    PathTrav["test_path_traversal_v2()"]
    XXE["test_xxe_v2()<br/>Line 628"]
    NoSQL["test_nosql_injection_v2()"]
    SSRF["test_ssrf_v2()<br/>Line 634"]
    Session["SessionHijackingTester"]
    Report["Generate Report"]
    
    Start --> Connect
    Connect --> Learning
    Learning --> SQLi
    SQLi --> XSS
    XSS --> Cmd
    Cmd --> PathTrav
    PathTrav --> XXE
    XXE --> NoSQL
    NoSQL --> SSRF
    SSRF --> Session
    Session --> Report

Call Sites:

XXE: wshawk/scanner_v2.py:628
SSRF: wshawk/scanner_v2.py:634

Sources: wshawk/scanner_v2.py:593-852

Configuration

Scanner V2 Configuration

XXE and SSRF detection are controlled through the scanner configuration:

OAST Toggle (wshawk/scanner_v2.py:82-83):

self.use_oast = True
self.oast_provider = None

CLI Override (wshawk/advanced_cli.py:41-42):

wshawk-advanced ws://target.com --no-oast  # Disable OAST testing

Full Feature Mode (wshawk/advanced_cli.py:174-177):

wshawk-advanced ws://target.com --full  # Enables OAST + Playwright + Smart Payloads

Configuration File

The hierarchical configuration system supports OAST and scanning feature toggles:

scanner:
  rate_limit: 10
  features:
    oast: true
    playwright: true
    smart_payloads: false

Configuration Loading (wshawk/scanner_v2.py:48-53):

if config is None:
    from .config import WSHawkConfig
    self.config = WSHawkConfig.load()

Sources: wshawk/scanner_v2.py:82-83, wshawk/advanced_cli.py:41-42, wshawk/advanced_cli.py:174-177, wshawk/scanner_v2.py:48-53

Vulnerability Reporting

Report Structure

XXE and SSRF findings are appended to the vulnerabilities list with standardized metadata:

XXE Report Fields:

type: "XML External Entity (XXE)"
severity: "HIGH" or "CRITICAL"
confidence: "HIGH" (direct detection) or "MEDIUM" (error-based)
description: Entity processing evidence
payload: Triggering XML payload (truncated to 80 chars)
response_snippet: First 200 chars of response
recommendation: "Disable external entity processing"

SSRF Report Fields:

type: "Server-Side Request Forgery (SSRF)"
severity: "HIGH"
confidence: "HIGH"
description: Accessed internal target details
payload: Internal URL that was fetched
response_snippet: First 200 chars of response
recommendation: "Validate and whitelist allowed URLs"

Vulnerability Append (wshawk/scanner_v2.py:486-495, wshawk/scanner_v2.py:573-581):

self.vulnerabilities.append({...})

CVSS Scoring

XXE and SSRF findings map to the following CVSS v3.1 base scores and vectors:

| Vulnerability | Base Score | Vector |
|---------------|------------|--------|
| XXE (File Read) | 7.5 - 8.6 | AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N |
| XXE (RCE) | 9.8 | AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H |
| SSRF (Metadata) | 8.6 | AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:N |
| SSRF (Internal) | 7.5 | AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N |
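
The base scores can be reproduced from the vectors with the CVSS v3.1 base-score equations. The calculator below is a compact sketch written for this page (the `base_score` and `roundup` names are ours); it covers base metrics only and is not part of WSHawk.

```python
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (scaled // 10000 + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score from a base-metric vector string."""
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08  # scope-changed scores are scaled before rounding
    return roundup(min(total, 10))
```

With the vectors as written, the file-read XXE vector evaluates to 7.5. The 8.6 upper bound in that row would correspond to a changed scope (S:C), which is an inference rather than something the table states.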

Sources: wshawk/scanner_v2.py:486-495, wshawk/scanner_v2.py:573-581

Legacy Scanner Comparison

The legacy scanner (wshawk/main.py:665-704) implements basic XXE detection without OAST support:

Key Differences:

| Feature | Legacy Scanner | V2 Scanner |
|---------|---------------|------------|
| OAST Integration | ❌ No | ✅ Yes |
| Context-Aware Injection | ❌ No | ✅ Yes with MessageAnalyzer |
| Rate Limiting | ✅ Fixed 0.1s delay | ✅ Token bucket with adaptive control |
| SSRF Detection | ❌ Not implemented | ✅ Implemented |
| Payload Limit | Configurable via max_payloads | Fixed 30 for performance |
| Confidence Scoring | ❌ Basic severity | ✅ ConfidenceLevel enum |

Legacy XXE Test (wshawk/main.py:665-704):

Uses WSPayloads.get_xxe() without OAST
Direct pattern matching on indicators
Appends to self.vulnerabilities with basic structure

Sources: wshawk/main.py:665-704, wshawk/scanner_v2.py:450-504, wshawk/scanner_v2.py:546-591

Remediation Recommendations

WSHawk provides actionable remediation guidance for each detected vulnerability:

XXE Remediation

Recommendation (wshawk/scanner_v2.py:493):

"Disable external entity processing"

Technical Implementation:

Disable DTD processing entirely
Disable external entity resolution
Use safe XML parser configurations:
- Python: defusedxml library
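- Java: XMLConstants.FEATURE_SECURE_PROCESSING, combined with the disallow-doctype-decl feature where the parser supports it
- PHP: libxml_disable_entity_loader(true) on PHP versions before 8.0 (the function is deprecated in 8.0, where libxml 2.9+ no longer loads external entities by default)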
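
As one illustration of disabling DTD processing entirely, the sketch below uses only Python's standard library to reject any document that declares a DOCTYPE before it reaches ElementTree. The `parse_xml_without_dtd` name is hypothetical; the defusedxml library listed above offers comparable protections in packaged form.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


def parse_xml_without_dtd(xml_text: str) -> ET.Element:
    """Parse XML, refusing any document that contains a DOCTYPE declaration."""

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Without a DOCTYPE there is nowhere to declare custom entities.
        raise ValueError("DOCTYPE declarations are not allowed")

    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = reject_doctype
    checker.Parse(xml_text, True)  # first pass: raises ValueError on a DOCTYPE
    return ET.fromstring(xml_text)  # second pass: build the element tree
```

The two-pass approach trades a little performance for simplicity; applications that legitimately need DTDs would need a narrower policy.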

SSRF Remediation

Recommendation (wshawk/scanner_v2.py:580):

"Validate and whitelist allowed URLs"

Technical Implementation:

Implement URL whitelist validation
Block access to loopback addresses and private IP ranges (RFC1918)
Block cloud metadata endpoints
Use network-level egress filtering
Implement request timeout limits
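
The first three items can be combined into a single validation step, sketched below with the standard library. `ALLOWED_HOSTS`, `is_internal_address`, and `is_url_allowed` are hypothetical names, and `api.example.com` is a placeholder.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allowlist


def is_internal_address(host: str) -> bool:
    """True for IP literals in loopback, private, or link-local ranges."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal; hostnames are handled by the allowlist
    return ip.is_private or ip.is_loopback or ip.is_link_local


def is_url_allowed(url: str) -> bool:
    """Accept only http(s) URLs whose host is explicitly allowlisted."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    if is_internal_address(parts.hostname):
        return False  # covers 127.0.0.1, RFC1918 ranges, and 169.254.169.254
    return parts.hostname in ALLOWED_HOSTS
```

This check inspects only the URL text. A hostname can still resolve to an internal address, so resolving it and re-checking the resulting IP at connection time, alongside network-level egress filtering, remains advisable.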

Sources: wshawk/scanner_v2.py:493, wshawk/scanner_v2.py:580