Cross-Site Scripting (XSS) Detection

Purpose and Scope

This document details WSHawk's Cross-Site Scripting (XSS) detection methodology, including payload injection strategies, reflection analysis, confidence scoring, and real browser verification using Playwright. The system implements a multi-stage detection pipeline that progresses from basic reflection detection to automated browser execution confirmation.

For information about other injection vulnerabilities, see Injection Vulnerabilities. For details on OAST-based blind vulnerability detection, see OAST Blind Vulnerability Detection. For payload mutation and WAF evasion techniques, see Payload Mutation and WAF Evasion.

XSS Detection Architecture

WSHawk implements a three-tier XSS detection system that separates static payload testing, heuristic analysis, and browser-based verification into distinct stages.

Detection Pipeline Overview

graph TB
    subgraph "Payload Source"
        StaticPayloads["WSPayloads.get_xss()"]
        SmartGen["ContextAwareGenerator<br/>Smart Payload Evolution"]
        Evolver["PayloadEvolver<br/>Genetic Algorithm"]
    end
    
    subgraph "Message Preparation"
        Analyzer["MessageAnalyzer<br/>Format Detection"]
        Injector["inject_payload_into_message()<br/>Field-Aware Injection"]
    end
    
    subgraph "Transmission Layer"
        Send["WebSocket Send"]
        Receive["WebSocket Receive"]
        RateLimit["TokenBucketRateLimiter<br/>Adaptive Rate Control"]
    end
    
    subgraph "Detection Engine"
        Verifier["VulnerabilityVerifier.verify_xss()"]
        ConfScore["ConfidenceLevel Scoring<br/>LOW/MEDIUM/HIGH/CRITICAL"]
        Context["Context Analysis<br/>HTML/JS/Attribute"]
    end
    
    subgraph "Advanced Verification"
        BrowserCheck{"confidence == HIGH?"}
        Playwright["HeadlessBrowserXSSVerifier"]
        Execute["Real Browser Execution"]
        Screenshot["Screenshot Capture"]
        Evidence["Evidence Collection"]
    end
    
    subgraph "Result Storage"
        VulnList["vulnerabilities[]<br/>severity + confidence"]
        FeedbackLoop["FeedbackLoop<br/>Response Analysis"]
        EvolutionUpdate["update_fitness()<br/>Seed Successful Payloads"]
    end
    
    StaticPayloads --> Injector
    SmartGen --> Injector
    Evolver --> Injector
    
    Analyzer --> Injector
    
    Injector --> Send
    Send --> RateLimit
    RateLimit --> Receive
    
    Receive --> Verifier
    Verifier --> ConfScore
    ConfScore --> Context
    
    Context --> BrowserCheck
    BrowserCheck -->|Yes| Playwright
    BrowserCheck -->|No| VulnList
    
    Playwright --> Execute
    Execute --> Screenshot
    Screenshot --> Evidence
    
    Evidence --> VulnList
    VulnList --> FeedbackLoop
    FeedbackLoop --> EvolutionUpdate
    EvolutionUpdate --> Evolver

Sources: wshawk/scanner_v2.py:258-341, wshawk/main.py:399-438

XSS Payload System

Static Payload Collection

WSHawk loads its XSS payloads from an external text file (xss.txt). The loader is lazy and caches what it reads, so repeated calls do not re-read the file.

| Component | Description | Location |
|-----------|-------------|----------|
| Payload File | xss.txt containing XSS vectors | wshawk/payloads/xss.txt |
| Loader Method | WSPayloads.get_xss() | wshawk/main.py:109-110 |
| Caching | _payloads_cache dictionary | wshawk/main.py:67 |
| Count | 22,000+ total attack vectors (subset used per scan) | README.md |

The loader implements fallback logic for both pip-installed packages (using importlib.resources) and development environments (filesystem access).
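
The lazy-loading and fallback behaviour described here can be sketched roughly as follows. This is a minimal illustration, not WSHawk's actual loader: the helper names `parse_payload_lines` and `load_payloads`, the package name `wshawk.payloads`, and the use of `lru_cache` in place of the `_payloads_cache` dictionary are all assumptions made for the example.

```python
from functools import lru_cache
from importlib import resources
from pathlib import Path


def parse_payload_lines(text: str) -> list[str]:
    """Split a payload file into entries, dropping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


@lru_cache(maxsize=None)  # stands in for a cache dictionary (assumption)
def load_payloads(filename: str,
                  package: str = "wshawk.payloads",   # assumed package name
                  dev_dir: str = "wshawk/payloads") -> tuple[str, ...]:
    """Hypothetical loader: read a payload file once, then serve it from cache."""
    try:
        # Installed-package path: read the file bundled with the package.
        text = resources.files(package).joinpath(filename).read_text(encoding="utf-8")
    except (ModuleNotFoundError, FileNotFoundError, TypeError):
        # Development path: fall back to the source tree on disk.
        path = Path(dev_dir) / filename
        text = path.read_text(encoding="utf-8") if path.exists() else ""
    return tuple(parse_payload_lines(text))
```

Returning a tuple keeps the cached value immutable, so one caller cannot accidentally modify the list that later callers receive.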

Example Payload Loading:

from wshawk.main import WSPayloads

# Legacy scanner usage: only the first 100 payloads are tested
payloads = WSPayloads.get_xss()[:100]

# V2 scanner: the full XSS collection is used
all_payloads = WSPayloads.get_xss()

Sources: wshawk/main.py:61-142, wshawk/scanner_v2.py:265

Reflection Detection Methodology

Legacy Scanner Approach

The original scanner (WSHawk class) implements simple reflection detection by checking if the payload appears verbatim in the response.

Detection Logic:

graph LR
    Payload["XSS Payload"] --> JSON["JSON Wrapper<br/>{message: payload}"]
    JSON --> Send["ws.send()"]
    Send --> Response["WebSocket Response"]
    Response --> Check{"payload in response?"}
    Check -->|Yes| Vuln["Mark as XSS<br/>severity: HIGH"]
    Check -->|No| Next["Next Payload"]

Implementation: wshawk/main.py:399-438

# Simplified reflection check from legacy scanner
if response and payload in response:
    Logger.vuln(f"XSS reflection detected: {payload[:50]}")
    vulnerabilities.append({
        'type': 'Cross-Site Scripting (XSS)',
        'severity': 'HIGH',
        'description': 'XSS payload reflected in WebSocket response',
        'payload': payload,
        'recommendation': 'Sanitize and encode all user input'
    })

Limitations:

- No context analysis
- No encoding detection
- Binary vulnerable/not-vulnerable classification
- No browser verification

Sources: wshawk/main.py:399-438
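
To make the "no encoding detection" limitation concrete, the sketch below separates a verbatim reflection from one that the server has HTML-entity encoded. The function name `classify_reflection` and its three return labels are hypothetical and are not part of WSHawk; the legacy check described above only distinguishes the first case from everything else.

```python
import html


def classify_reflection(payload: str, response: str) -> str:
    """Return 'raw', 'encoded', or 'none' for how a payload is reflected."""
    if payload in response:
        return "raw"        # reflected verbatim: the only case the legacy check sees
    if html.escape(payload) in response:
        return "encoded"    # reflected, but HTML-entity encoded by the server
    return "none"


payload = "<script>alert(1)</script>"
print(classify_reflection(payload, "echo: " + payload))
print(classify_reflection(payload, "echo: &lt;script&gt;alert(1)&lt;/script&gt;"))
```

An "encoded" result suggests that the input reaches the output but is being neutralised, which is useful context that a plain substring test discards.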

Enhanced XSS Detection (V2 Scanner)

Automated Context-Aware Analysis

The V2 scanner (WSHawkV2 class) implements sophisticated XSS detection with multiple verification stages.

Message Format Intelligence

Before payload injection, the scanner analyzes message structure during the learning phase:

graph TB
    Learning["Learning Phase<br/>5-10 seconds"]
    Samples["Sample Messages"]
    Analyze["MessageAnalyzer.learn_from_messages()"]
    Format["Detect Format<br/>JSON/XML/Binary/Plain"]
    Fields["Extract Injectable Fields"]
    
    Learning --> Samples
    Samples --> Analyze
    Analyze --> Format
    Analyze --> Fields
    
    Format --> Injection["Smart Injection Strategy"]
    Fields --> Injection

Sources: wshawk/scanner_v2.py:112-175
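
One plausible way to implement the format-detection step is to try each parser in turn and take a majority vote over the sampled messages. The sketch below is an assumption about how such a step could work, not the MessageAnalyzer implementation: `detect_format` and `learn_format` are hypothetical names, and only `MessageFormat.JSON` is confirmed by the scanner code; the other enum members are inferred from the diagram.

```python
import json
from collections import Counter
from enum import Enum
from xml.etree import ElementTree


class MessageFormat(Enum):
    JSON = "json"
    XML = "xml"        # assumed member
    BINARY = "binary"  # assumed member
    PLAIN = "plain"    # assumed member


def detect_format(message) -> MessageFormat:
    """Classify a single WebSocket message by attempting to parse it."""
    if isinstance(message, (bytes, bytearray)):
        return MessageFormat.BINARY
    text = message.strip()
    if text.startswith(("{", "[")):
        try:
            json.loads(text)
            return MessageFormat.JSON
        except json.JSONDecodeError:
            pass
    if text.startswith("<"):
        try:
            ElementTree.fromstring(text)
            return MessageFormat.XML
        except ElementTree.ParseError:
            pass
    return MessageFormat.PLAIN


def learn_format(samples) -> MessageFormat:
    """Pick the most common format among the sampled messages."""
    counts = Counter(detect_format(s) for s in samples)
    return counts.most_common(1)[0][0] if counts else MessageFormat.PLAIN
```

Voting over several samples makes the result robust to an occasional heartbeat or status message that does not match the main protocol format.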

Field-Aware Payload Injection

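For structured formats such as JSON, the payload is injected into each field detected during the learning phase. The implementation excerpt in this section shows the JSON branch; when learning has not completed or the detected format is not JSON, that code sends the raw payload unchanged.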

| Message Format | Injection Strategy | Example |
|----------------|-------------------|---------|
| JSON | Inject into each detected field | {"message": "<xss>", "user": "<xss>"} |
| XML | Inject into text nodes and attributes | <msg><text><xss></text></msg> |
| Plain Text | Direct substitution | <script>alert(1)</script> |
| Binary | Skip (handled by BinaryMessageHandler) | N/A |

Implementation: wshawk/scanner_v2.py:271-276

if self.learning_complete and self.message_analyzer.detected_format == MessageFormat.JSON:
    injected_messages = self.message_analyzer.inject_payload_into_message(
        base_message, payload
    )
else:
    injected_messages = [payload]

Sources: wshawk/scanner_v2.py:258-341
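
As an illustration of field-aware injection for JSON, the sketch below produces one message per top-level string field, each with that single field replaced by the payload. The function name `inject_into_json_fields` is hypothetical, and injecting one field per message (rather than all fields at once) is an assumption; the real inject_payload_into_message() may behave differently.

```python
import copy
import json


def inject_into_json_fields(base_message: str, payload: str) -> list[str]:
    """Return one JSON message per string field, with that field set to the payload."""
    base = json.loads(base_message)
    if not isinstance(base, dict):
        return [payload]  # nothing field-like to target: send the raw payload
    variants = []
    for key, value in base.items():
        if isinstance(value, str):
            mutated = copy.copy(base)  # shallow copy suffices: only the top level changes
            mutated[key] = payload
            variants.append(json.dumps(mutated))
    return variants or [payload]


for message in inject_into_json_fields('{"message": "hi", "user": "bob", "id": 7}',
                                        "<script>alert(1)</script>"):
    print(message)
```

Changing one field at a time keeps the rest of the message valid, which makes it easier to attribute a reflected payload to the field that carried it.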

Confidence Scoring System

VulnerabilityVerifier Integration

The VulnerabilityVerifier class provides automated confidence scoring based on multiple heuristics.

graph TB
    Response["WebSocket Response"]
    Payload["Injected Payload"]
    
    Response --> Verify["VulnerabilityVerifier.verify_xss()"]
    Payload --> Verify
    
    Verify --> Reflect{"Exact Reflection?"}
    Verify --> Context{"HTML/JS Context?"}
    Verify --> Encoded{"Encoding Applied?"}
    Verify --> Dangerous{"Dangerous Tags?"}
    
    Reflect -->|Yes| ScoreHigh["Score += 3"]
    Reflect -->|No| ScoreLow["Score += 0"]
    
    Context -->|Unquoted Attr| ScoreHigh2["Score += 2"]
    Context -->|JS Context| ScoreHigh3["Score += 3"]
    Context -->|HTML Context| ScoreMed["Score += 2"]
    
    Encoded -->|No Encoding| ScoreHigh4["Score += 2"]
    Encoded -->|Partial| ScoreMed2["Score += 1"]
    
    Dangerous -->|script/img/svg| ScoreHigh5["Score += 2"]
    
    ScoreHigh --> Calculate["Calculate Total Score"]
    ScoreLow --> Calculate
    ScoreHigh2 --> Calculate
    ScoreHigh3 --> Calculate
    ScoreMed --> Calculate
    ScoreHigh4 --> Calculate
    ScoreMed2 --> Calculate
    ScoreHigh5 --> Calculate
    
    Calculate --> Level{"Score Range"}
    Level -->|0-2| LOW["ConfidenceLevel.LOW"]
    Level -->|3-5| MEDIUM["ConfidenceLevel.MEDIUM"]
    Level -->|6-8| HIGH["ConfidenceLevel.HIGH"]
    Level -->|9+| CRITICAL["ConfidenceLevel.CRITICAL"]

Confidence Levels:

| Level | Score Range | Meaning | Action Taken |
|-------|-------------|---------|--------------|
| LOW | 0-2 | Payload reflected but likely sanitized | Logged, not reported |
| MEDIUM | 3-5 | Partial reflection, possible filter bypass | Reported without verification |
| HIGH | 6-8 | Strong indicators of exploitation | Browser verification triggered |
| CRITICAL | 9+ | Browser-confirmed execution | Reported with screenshot evidence |

Implementation: wshawk/scanner_v2.py:287-289

is_vuln, confidence, description = self.verifier.verify_xss(
    response, payload
)

Sources: wshawk/scanner_v2.py:258-341
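
The point values and score ranges from the diagram and table can be combined into a small scoring function. The sketch below uses those numbers as given, but everything else is an assumption: the function names `score_reflection` and `level_for`, the string labels for context and encoding, and the enum values are hypothetical, since the VulnerabilityVerifier implementation is not shown here.

```python
from enum import Enum
from typing import Optional


class ConfidenceLevel(Enum):
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"
    CRITICAL = "CRITICAL"


# Point values taken from the scoring diagram; the label strings are assumed.
CONTEXT_POINTS = {"js": 3, "html": 2, "unquoted_attr": 2}
ENCODING_POINTS = {"none": 2, "partial": 1}


def score_reflection(exact_reflection: bool, context: Optional[str],
                     encoding: str, dangerous_tag: bool) -> int:
    """Add up the heuristic points for one response."""
    score = 3 if exact_reflection else 0
    score += CONTEXT_POINTS.get(context, 0)
    score += ENCODING_POINTS.get(encoding, 0)  # fully encoded output scores 0
    score += 2 if dangerous_tag else 0         # e.g. script/img/svg present
    return score


def level_for(score: int) -> ConfidenceLevel:
    """Map a total score onto the documented ranges."""
    if score >= 9:
        return ConfidenceLevel.CRITICAL
    if score >= 6:
        return ConfidenceLevel.HIGH
    if score >= 3:
        return ConfidenceLevel.MEDIUM
    return ConfidenceLevel.LOW
```

For example, an exact reflection (3) in an HTML context (2) with partial encoding (1) and no dangerous tag totals 6, which falls in the HIGH range and would trigger browser verification.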

Browser Verification with Playwright

HeadlessBrowserXSSVerifier

When browser verification is enabled, WSHawk launches a headless Chromium instance through Playwright for each HIGH-confidence XSS finding and checks whether the payload actually executes.

Verification Architecture

graph TB
    subgraph "Detection Phase"
        HighConf["HIGH Confidence XSS<br/>from VulnerabilityVerifier"]
    end
    
    subgraph "Browser Initialization"
        Check{"headless_verifier<br/>exists?"}
        Create["HeadlessBrowserXSSVerifier()"]
        Start["await start()<br/>Launch Playwright"]
    end
    
    subgraph "Execution Test"
        Inject["Inject Payload into Page"]
        HTML["Create Test HTML<br/>with Response Content"]
        Navigate["page.goto(test_html)"]
        Wait["Wait for JS Execution"]
        Monitor["Monitor Console Events"]
        CheckAlert{"alert() or<br/>console.log?"}
    end
    
    subgraph "Evidence Collection"
        Screenshot["page.screenshot()"]
        Console["Console Logs"]
        Evidence["Execution Evidence Object"]
    end
    
    subgraph "Result Processing"
        Upgrade["confidence = CRITICAL"]
        Report["Add to vulnerabilities[]<br/>browser_verified: true"]
    end
    
    HighConf --> Check
    Check -->|No| Create
    Check -->|Yes| Inject
    Create --> Start
    Start --> Inject
    
    Inject --> HTML
    HTML --> Navigate
    Navigate --> Wait
    Wait --> Monitor
    Monitor --> CheckAlert
    
    CheckAlert -->|Yes| Screenshot
    CheckAlert -->|No| Next["Not Executed"]
    
    Screenshot --> Console
    Console --> Evidence
    
    Evidence --> Upgrade
    Upgrade --> Report

Sources: wshawk/scanner_v2.py:294-314

Browser Verification Code Flow

Initialization:

# Lazy initialization on first HIGH-confidence finding
if not self.headless_verifier:
    self.headless_verifier = HeadlessBrowserXSSVerifier()
    await self.headless_verifier.start()  # Launches Chromium

Execution Test:

is_executed, evidence = await self.headless_verifier.verify_xss_execution(
    response, payload
)

if is_executed:
    browser_verified = True
    confidence = ConfidenceLevel.CRITICAL
    description = f"REAL EXECUTION: {evidence}"

Result Enhancement:

self.vulnerabilities.append({
    'type': 'Cross-Site Scripting (XSS)',
    'severity': confidence.value,  # 'CRITICAL' if browser-verified
    'confidence': confidence.value,
    'description': description,
    'payload': payload,
    'response_snippet': response[:200],
    'browser_verified': browser_verified,  # True/False flag
    'recommendation': 'Sanitize and encode all user input'
})

Sources: wshawk/scanner_v2.py:294-330
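
Because the HeadlessBrowserXSSVerifier implementation is not included in the referenced files, the following is only a rough sketch of how execution could be confirmed with Playwright's async API. The helper names `build_test_page` and `executes_in_browser` are hypothetical, and the sketch loads the page with `page.set_content()` rather than the `page.goto()` call shown in the diagram. It treats any JavaScript dialog (such as `alert()`) as evidence of execution and does not capture a screenshot.

```python
def build_test_page(response: str) -> str:
    """Embed the raw server response, unescaped, in a minimal HTML document."""
    return f"<!DOCTYPE html><html><body>{response}</body></html>"


async def executes_in_browser(response: str, wait_ms: int = 500) -> tuple[bool, list[str]]:
    """Return (executed, evidence) after rendering the response in Chromium."""
    # Imported lazily so this module still loads when Playwright is not installed.
    from playwright.async_api import async_playwright

    evidence: list[str] = []

    async def on_dialog(dialog):
        # Record the dialog, then dismiss it so the page does not stay blocked.
        evidence.append(f"{dialog.type}: {dialog.message}")
        await dialog.dismiss()

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        try:
            page = await browser.new_page()
            page.on("dialog", on_dialog)
            await page.set_content(build_test_page(response))
            await page.wait_for_timeout(wait_ms)  # give deferred handlers time to fire
        finally:
            await browser.close()
    return bool(evidence), evidence
```

With Playwright and its Chromium build installed, the coroutine can be driven with `asyncio.run(executes_in_browser("<script>alert(1)</script>"))`.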

Smart Payload Evolution

Adaptive XSS Payload Generation

When smart payloads are enabled (--smart-payloads flag), WSHawk seeds a genetic algorithm with payloads that succeeded and evolves new variants from them.

graph LR
    subgraph "Seed Population"
        Success["Successful XSS Payload"]
        Seed["payload_evolver.seed()"]
    end
    
    subgraph "Fitness Tracking"
        Update["update_fitness(payload, 1.0)"]
        Score["Fitness Score Database"]
    end
    
    subgraph "Evolution"
        Crossover["Crossover Operation"]
        Mutation["Mutation Strategies"]
        Selection["Parent Selection"]
        Evolve["evolve(count=30)"]
    end
    
    subgraph "Context Generation"
        ContextGen["ContextAwareGenerator"]
        Format["Learned Message Format"]
        Priority["get_priority_categories()"]
        Generate["generate_payloads(xss)"]
    end
    
    Success --> Seed
    Seed --> Score
    Success --> Update
    
    Score --> Selection
    Selection --> Crossover
    Crossover --> Mutation
    Mutation --> Evolve
    
    Format --> ContextGen
    Priority --> Generate
    ContextGen --> Generate
    
    Evolve --> Testing["Re-test Evolved Payloads"]
    Generate --> Testing

Implementation: wshawk/scanner_v2.py:317-319

# Seed successful payload into evolver
if self.use_smart_payloads:
    self.payload_evolver.seed([payload])
    self.payload_evolver.update_fitness(payload, 1.0)

Evolution Phase: wshawk/scanner_v2.py:638-703

if self.use_smart_payloads and len(self.payload_evolver.population) > 0:
    Logger.info("Running evolved payload phase...")
    evolved = self.payload_evolver.evolve(count=30)
    
    # Generate context-aware payloads
    priorities = self.feedback_loop.get_priority_categories()
    for category, _ in priorities[:3]:
        ctx_payloads = self.context_generator.generate_payloads(category, count=10)
        evolved.extend(ctx_payloads)

Sources: wshawk/scanner_v2.py:638-703, wshawk/scanner_v2.py:317-319
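
The seed, fitness, selection, crossover, and mutation steps from the diagram can be illustrated with a toy evolver. This is a simplified sketch under stated assumptions, not the PayloadEvolver implementation: the class name `TinyPayloadEvolver`, the mutation operators, the baseline fitness of 0.1, and the single-point crossover are all invented for the example. Only the method names seed(), update_fitness(), and evolve(count=...) mirror calls that appear in the scanner code.

```python
import random
from typing import Optional

# Hypothetical mutation operators; real WAF-evasion mutations would be richer.
MUTATIONS = [
    lambda p: p.replace("script", "ScRiPt"),
    lambda p: p.replace(" ", "/"),
    lambda p: p.replace("alert(1)", "alert`1`"),
    lambda p: p + "<!--",
]


class TinyPayloadEvolver:
    def __init__(self, rng_seed: Optional[int] = None):
        self.population: list[str] = []
        self.fitness: dict[str, float] = {}
        self.rng = random.Random(rng_seed)  # seedable for reproducible runs

    def seed(self, payloads):
        """Add payloads to the population with a small baseline fitness."""
        for p in payloads:
            if p not in self.fitness:
                self.population.append(p)
                self.fitness[p] = 0.1

    def update_fitness(self, payload, score):
        """Record how well a known payload performed (clamped above zero)."""
        if payload in self.fitness:
            self.fitness[payload] = max(score, 0.01)

    def _select(self):
        # Fitness-proportional (roulette-wheel) parent selection.
        weights = [self.fitness[p] for p in self.population]
        return self.rng.choices(self.population, weights=weights, k=1)[0]

    def _crossover(self, a, b):
        # Single-point crossover: head of one parent, tail of the other.
        return a[:self.rng.randint(0, len(a))] + b[self.rng.randint(0, len(b)):]

    def evolve(self, count=30):
        """Produce `count` children via selection, crossover, then mutation."""
        if not self.population:
            return []
        children = []
        for _ in range(count):
            child = self._crossover(self._select(), self._select())
            children.append(self.rng.choice(MUTATIONS)(child))
        return children
```

Fitness-proportional selection means a payload marked with fitness 1.0 is chosen as a parent about ten times as often as one left at the 0.1 baseline, so successful payloads dominate the next generation.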

XSS Detection in Practice

Test Execution Flow

Legacy Scanner (WSHawk)

sequenceDiagram
    participant CLI as wshawk CLI
    participant Scanner as WSHawk.test_xss()
    participant WS as WebSocket Connection
    participant Payloads as WSPayloads.get_xss()
    participant Logger as Logger
    
    CLI->>Scanner: await test_xss()
    Scanner->>Payloads: Load XSS payloads
    Payloads-->>Scanner: payloads[:100]
    
    Scanner->>WS: await connect()
    
    loop For each payload
        Scanner->>Scanner: json.dumps({"message": payload})
        Scanner->>WS: await send(message)
        WS-->>Scanner: response
        
        alt payload in response
            Scanner->>Logger: vuln("XSS reflection detected")
            Scanner->>Scanner: vulnerabilities.append({...})
        end
        
        Scanner->>Scanner: await asyncio.sleep(0.1)
    end
    
    Scanner->>WS: await close()

Sources: wshawk/main.py:399-438
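
The legacy loop from the sequence diagram can be exercised without a network by substituting a stand-in connection. In the sketch below, `EchoConnection` and `reflection_scan` are hypothetical names introduced for illustration; the echo object simply returns whatever was last sent, which simulates a server that reflects input unchanged.

```python
import asyncio
import json


class EchoConnection:
    """Stand-in for a WebSocket connection that echoes each message back."""

    def __init__(self):
        self._last = ""

    async def send(self, message: str) -> None:
        self._last = message

    async def recv(self) -> str:
        return self._last


async def reflection_scan(ws, payloads, delay: float = 0.1) -> list[str]:
    """Send each payload in a JSON wrapper and collect those reflected verbatim."""
    reflected = []
    for payload in payloads:
        await ws.send(json.dumps({"message": payload}))
        response = await ws.recv()
        if response and payload in response:
            reflected.append(payload)
        await asyncio.sleep(delay)  # fixed pause between payloads
    return reflected


found = asyncio.run(reflection_scan(EchoConnection(),
                                    ["<script>alert(1)</script>"], delay=0))
print(found)
```

Passing the connection in as a parameter keeps the loop testable: any object with async `send` and `recv` methods can stand in for the real socket.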

V2 Scanner with Browser Verification (WSHawkV2)

sequenceDiagram
    participant CLI as wshawk-advanced
    participant Scanner as WSHawkV2
    participant Learning as Learning Phase
    participant Verifier as VulnerabilityVerifier
    participant Browser as HeadlessBrowserXSSVerifier
    participant Storage as vulnerabilities[]
    
    CLI->>Scanner: await run_heuristic_scan()
    Scanner->>Learning: await learning_phase(5s)
    Learning-->>Scanner: Message format + fields
    
    Scanner->>Scanner: await test_xss_v2()
    
    loop For each payload
        Scanner->>Scanner: inject_payload_into_message()
        Scanner->>Scanner: await ws.send()
        Scanner->>Scanner: await ws.recv()
        
        Scanner->>Verifier: verify_xss(response, payload)
        Verifier-->>Scanner: (is_vuln, confidence, desc)
        
        alt confidence == HIGH
            Scanner->>Browser: verify_xss_execution()
            Browser-->>Scanner: (is_executed, evidence)
            
            alt is_executed == True
                Scanner->>Storage: Add with confidence=CRITICAL
                Scanner->>Storage: browser_verified=True
            end
        else confidence >= MEDIUM
            Scanner->>Storage: Add with original confidence
        end
    end

Sources: wshawk/scanner_v2.py:258-341, wshawk/scanner_v2.py:593-852

Configuration and Options

Enabling Browser Verification

Browser verification can be enabled via multiple interfaces:

| Interface | Method | Example |
|-----------|--------|---------|
| CLI Flag | --playwright | wshawk-advanced ws://target --playwright |
| Full Mode | --full | wshawk-advanced ws://target --full |
| Python API | use_headless_browser = True | scanner.use_headless_browser = True |
| Configuration File | wshawk.yaml | scanner.features.playwright: true |

Python API Usage:

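import asyncio
from wshawk.scanner_v2 import WSHawkV2

async def main():
    scanner = WSHawkV2("ws://target.com")
    scanner.use_headless_browser = True  # Enable Playwright
    await scanner.run_heuristic_scan()

asyncio.run(main())  # the scan is a coroutine, so it needs an event loop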

Configuration File:

scanner:
  features:
    playwright: true
    smart_payloads: true

Sources: wshawk/advanced_cli.py:39-40, wshawk/scanner_v2.py:78, README.md:256-263

Rate Limiting and Performance

Adaptive Request Control

XSS testing respects the adaptive rate limiter to avoid overwhelming target servers or triggering rate-limiting protections.

Default Settings:

| Parameter | Default | Purpose |
|-----------|---------|---------|
| Tokens per Second | 10 | Max requests/second |
| Bucket Size | 20 | Burst capacity |
| Adaptive Mode | Enabled | Slows down on errors |
| Sleep Between Requests | 0.05s (50ms) | Additional delay |

Implementation:

# Rate limiter initialization
self.rate_limiter = TokenBucketRateLimiter(
    tokens_per_second=rate_limit,
    bucket_size=rate_limit * 2,
    enable_adaptive=True
)

# Usage in test loop
await asyncio.sleep(0.05)  # Rate limiting between payloads

Sources: wshawk/scanner_v2.py:62-66, wshawk/scanner_v2.py:336
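
For readers unfamiliar with the technique, a token bucket with the defaults above (10 tokens per second, burst capacity 20) can be sketched as follows. This is a generic illustration, not the TokenBucketRateLimiter implementation: the class name `SimpleTokenBucket`, the `try_acquire` and `on_error` methods, and the halving backoff are assumptions.

```python
import time


class SimpleTokenBucket:
    def __init__(self, tokens_per_second: float, bucket_size: float,
                 clock=time.monotonic):
        self.rate = tokens_per_second
        self.capacity = bucket_size
        self.tokens = bucket_size  # start full so an initial burst is allowed
        self.clock = clock         # injectable for deterministic tests
        self.last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last check, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self) -> bool:
        """Consume one token if available; return False when the bucket is empty."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def on_error(self, factor: float = 0.5, floor: float = 1.0) -> None:
        """Adaptive step (assumed): cut the refill rate after an error, down to a floor."""
        self.rate = max(floor, self.rate * factor)
```

With these settings a scan can send up to 20 messages back to back, after which it is held to the refill rate until the bucket recovers.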

Output and Reporting

Vulnerability Record Structure

XSS findings are stored with detailed metadata for reporting:

{
    'type': 'Cross-Site Scripting (XSS)',
    'severity': 'CRITICAL',           # CRITICAL/HIGH/MEDIUM/LOW
    'confidence': 'CRITICAL',          # ConfidenceLevel enum value
    'description': 'REAL EXECUTION: alert dialog captured',
    'payload': '<script>alert(1)</script>',
    'response_snippet': 'Server echoed: <script>alert(1)</script>...',
    'browser_verified': True,          # True only when Playwright confirmed execution
    'recommendation': 'Sanitize and encode all user input'
}

Browser Verification Evidence

When browser verification succeeds, additional evidence is collected:

| Evidence Type | Description | Storage |
|---------------|-------------|---------|
| Screenshot | PNG image of XSS execution | Embedded in HTML report |
| Console Logs | JavaScript console output | Appended to description |
| Execution Flag | Boolean confirmation | browser_verified field |
| Timing Data | Time to execution | Metadata |

Sources: wshawk/scanner_v2.py:321-330
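
Records with this structure are straightforward to post-process. As a small example, the sketch below orders findings for triage by severity and, within the same severity, puts browser-verified findings first. The `prioritize` function and the `SEVERITY_ORDER` mapping are hypothetical helpers and are not part of WSHawk's reporting code.

```python
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def prioritize(findings: list[dict]) -> list[dict]:
    """Sort by severity, then browser-verified findings before unverified ones."""
    return sorted(
        findings,
        key=lambda f: (
            SEVERITY_ORDER.get(f.get("severity"), len(SEVERITY_ORDER)),
            not f.get("browser_verified", False),
        ),
    )


findings = [
    {"severity": "MEDIUM", "payload": "a"},
    {"severity": "CRITICAL", "browser_verified": True, "payload": "b"},
    {"severity": "HIGH", "browser_verified": False, "payload": "c"},
]
print([f["payload"] for f in prioritize(findings)])
```

Unknown or missing severities sort last, so a malformed record never hides a confirmed finding.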

Summary

WSHawk's XSS detection system implements a multi-stage pipeline that progresses from basic reflection detection to real browser execution confirmation:

1. Static Payload Testing: XSS vectors from WSPayloads.get_xss(), part of a collection of 22,000+ total attack vectors
2. Context-Aware Injection: Automatic field detection and smart injection via MessageAnalyzer
3. Confidence Scoring: Heuristic analysis via VulnerabilityVerifier (LOW/MEDIUM/HIGH/CRITICAL)
4. Browser Verification: Automatic Playwright testing for HIGH-confidence findings
5. Smart Evolution: Genetic algorithm optimization for WAF bypass

Combining automated heuristics with real browser confirmation is intended to keep detection rates high while reducing false positives, and the confidence levels let security teams prioritize the findings that need immediate remediation.

Key Classes and Methods:

- WSHawk.test_xss() - Legacy reflection detection (wshawk/main.py:399-438)
- WSHawkV2.test_xss_v2() - Enhanced detection pipeline (wshawk/scanner_v2.py:258-341)
- VulnerabilityVerifier.verify_xss() - Confidence scoring (referenced but implementation not in provided files)
- HeadlessBrowserXSSVerifier - Playwright browser automation (referenced but implementation not in provided files)
- WSPayloads - Static payload collection (wshawk/main.py:61-142)