Cross-Site Scripting (XSS) Detection
The following files were used as context for generating this wiki page:
- .github/workflows/ghcr-publish.yml
- README.md
- requirements.txt
- wshawk/main.py
- wshawk/advanced_cli.py
- wshawk/scanner_v2.py
Purpose and Scope
This document details WSHawk's Cross-Site Scripting (XSS) detection methodology, including payload injection strategies, reflection analysis, confidence scoring, and real browser verification using Playwright. The system implements a multi-stage detection pipeline that progresses from basic reflection detection to automated browser execution confirmation.
For information about other injection vulnerabilities, see Injection Vulnerabilities. For details on OAST-based blind vulnerability detection, see OAST Blind Vulnerability Detection. For payload mutation and WAF evasion techniques, see Payload Mutation and WAF Evasion.
XSS Detection Architecture
WSHawk implements a three-tier XSS detection system that separates static payload testing, heuristic analysis, and browser-based verification into distinct stages.
Detection Pipeline Overview
graph TB
subgraph "Payload Source"
StaticPayloads["WSPayloads.get_xss()"]
SmartGen["ContextAwareGenerator<br/>Smart Payload Evolution"]
Evolver["PayloadEvolver<br/>Genetic Algorithm"]
end
subgraph "Message Preparation"
Analyzer["MessageAnalyzer<br/>Format Detection"]
Injector["inject_payload_into_message()<br/>Field-Aware Injection"]
end
subgraph "Transmission Layer"
Send["WebSocket Send"]
Receive["WebSocket Receive"]
RateLimit["TokenBucketRateLimiter<br/>Adaptive Rate Control"]
end
subgraph "Detection Engine"
Verifier["VulnerabilityVerifier.verify_xss()"]
ConfScore["ConfidenceLevel Scoring<br/>LOW/MEDIUM/HIGH/CRITICAL"]
Context["Context Analysis<br/>HTML/JS/Attribute"]
end
subgraph "Advanced Verification"
BrowserCheck{"confidence == HIGH?"}
Playwright["HeadlessBrowserXSSVerifier"]
Execute["Real Browser Execution"]
Screenshot["Screenshot Capture"]
Evidence["Evidence Collection"]
end
subgraph "Result Storage"
VulnList["vulnerabilities[]<br/>severity + confidence"]
FeedbackLoop["FeedbackLoop<br/>Response Analysis"]
EvolutionUpdate["update_fitness()<br/>Seed Successful Payloads"]
end
StaticPayloads --> Injector
SmartGen --> Injector
Evolver --> Injector
Analyzer --> Injector
Injector --> Send
Send --> RateLimit
RateLimit --> Receive
Receive --> Verifier
Verifier --> ConfScore
ConfScore --> Context
Context --> BrowserCheck
BrowserCheck -->|Yes| Playwright
BrowserCheck -->|No| VulnList
Playwright --> Execute
Execute --> Screenshot
Screenshot --> Evidence
Evidence --> VulnList
VulnList --> FeedbackLoop
FeedbackLoop --> EvolutionUpdate
EvolutionUpdate --> Evolver
Sources: wshawk/scanner_v2.py:258-341, wshawk/main.py:399-438
XSS Payload System
Static Payload Collection
WSHawk maintains a comprehensive XSS payload database loaded from external files. The payload system uses lazy loading with caching for performance.
| Component | Description | Location |
|-----------|-------------|----------|
| Payload File | xss.txt containing XSS vectors | wshawk/payloads/xss.txt |
| Loader Method | WSPayloads.get_xss() | wshawk/main.py:109-110 |
| Caching | _payloads_cache dictionary | wshawk/main.py:67 |
| Count | 22,000+ total attack vectors (subset used per scan) | README.md |
The loader implements fallback logic for both pip-installed packages (using importlib.resources) and development environments (filesystem access).
Example Payload Loading:
# Legacy scanner usage
payloads = WSPayloads.get_xss()[:100] # First 100 payloads
# V2 scanner with full collection
all_payloads = WSPayloads.get_xss()
Sources: wshawk/main.py:61-142, wshawk/scanner_v2.py:265
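The lazy-loading and fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in wshawk/main.py: the names `_cache`, `_read_text`, and `load_payloads` are hypothetical, and the choice of exceptions that trigger the filesystem fallback is an assumption.

```python
from importlib import resources
from pathlib import Path
from typing import Dict, List

# Hypothetical module-level cache, playing the role of _payloads_cache.
_cache: Dict[str, List[str]] = {}


def _read_text(package: str, filename: str, fallback_dir: Path) -> str:
    """Try the installed package first, then fall back to a local directory."""
    try:
        # Works for pip-installed packages (Python 3.9+).
        return resources.files(package).joinpath(filename).read_text(encoding="utf-8")
    except (ModuleNotFoundError, FileNotFoundError, TypeError):
        # Development checkout: read straight from the filesystem.
        return (fallback_dir / filename).read_text(encoding="utf-8")


def load_payloads(package: str, filename: str, fallback_dir: Path) -> List[str]:
    """Load a payload file on first use and serve later calls from the cache."""
    if filename not in _cache:
        text = _read_text(package, filename, fallback_dir)
        # Drop blank lines so every entry is a usable payload.
        _cache[filename] = [line for line in text.splitlines() if line.strip()]
    return _cache[filename]
```

Because the cache is keyed by filename, a second call returns the same list object without touching the disk, which is what makes slicing such as `[:100]` cheap on repeated scans.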
Reflection Detection Methodology
Legacy Scanner Approach
The original scanner (WSHawk class) implements simple reflection detection by checking if the payload appears verbatim in the response.
Detection Logic:
graph LR
Payload["XSS Payload"] --> JSON["JSON Wrapper<br/>{message: payload}"]
JSON --> Send["ws.send()"]
Send --> Response["WebSocket Response"]
Response --> Check{"payload in response?"}
Check -->|Yes| Vuln["Mark as XSS<br/>severity: HIGH"]
Check -->|No| Next["Next Payload"]
Implementation: wshawk/main.py:399-438
# Simplified reflection check from legacy scanner
if response and payload in response:
    Logger.vuln(f"XSS reflection detected: {payload[:50]}")
    vulnerabilities.append({
        'type': 'Cross-Site Scripting (XSS)',
        'severity': 'HIGH',
        'description': 'XSS payload reflected in WebSocket response',
        'payload': payload,
        'recommendation': 'Sanitize and encode all user input'
    })
Limitations:
- No context analysis
- No encoding detection
- Binary vulnerable/not-vulnerable classification
- No browser verification
Sources: wshawk/main.py:399-438
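To make the "no encoding detection" limitation concrete, the sketch below distinguishes a verbatim reflection from one the server has HTML- or URL-encoded. The function name `classify_reflection` and its return labels are hypothetical and are not part of WSHawk; a plain `payload in response` check would report only the first case and silently miss the other two.

```python
import html
from urllib.parse import quote


def classify_reflection(payload: str, response: str) -> str:
    """Return how (if at all) the payload appears in the response."""
    if not response:
        return "none"
    if payload in response:
        # Exactly what the legacy scanner looks for.
        return "verbatim"
    if html.escape(payload) in response:
        # Reflected, but neutralised by HTML entity encoding.
        return "html_encoded"
    if quote(payload, safe="") in response:
        # Reflected in percent-encoded form.
        return "url_encoded"
    return "none"
```

An encoded reflection is usually not exploitable on its own, but it is still useful signal: it shows the input reaches the output and that some filter is in place.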
Enhanced XSS Detection (V2 Scanner)
Automated Context-Aware Analysis
The V2 scanner (WSHawkV2 class) implements sophisticated XSS detection with multiple verification stages.
Message Format Intelligence
Before payload injection, the scanner analyzes message structure during the learning phase:
graph TB
Learning["Learning Phase<br/>5-10 seconds"]
Samples["Sample Messages"]
Analyze["MessageAnalyzer.learn_from_messages()"]
Format["Detect Format<br/>JSON/XML/Binary/Plain"]
Fields["Extract Injectable Fields"]
Learning --> Samples
Samples --> Analyze
Analyze --> Format
Analyze --> Fields
Format --> Injection["Smart Injection Strategy"]
Fields --> Injection
Sources: wshawk/scanner_v2.py:112-175
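A format-detection step of this kind could look roughly like the sketch below. It is an assumption about the general approach, not the MessageAnalyzer implementation: `SketchFormat` and `detect_format` are hypothetical names, and the real analyzer learns from several sample messages rather than a single one.

```python
import json
import xml.etree.ElementTree as ET
from enum import Enum
from typing import Union


class SketchFormat(Enum):
    """Hypothetical stand-in for the scanner's message format enum."""
    JSON = "json"
    XML = "xml"
    BINARY = "binary"
    PLAIN = "plain"


def detect_format(message: Union[str, bytes, bytearray]) -> SketchFormat:
    """Classify one sample message by trying the strictest parsers first."""
    if isinstance(message, (bytes, bytearray)):
        return SketchFormat.BINARY
    text = message.strip()
    try:
        # Only objects and arrays count; a bare number is treated as plain text.
        if isinstance(json.loads(text), (dict, list)):
            return SketchFormat.JSON
    except json.JSONDecodeError:
        pass
    if text.startswith("<"):
        try:
            ET.fromstring(text)
            return SketchFormat.XML
        except ET.ParseError:
            pass
    return SketchFormat.PLAIN
```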
Field-Aware Payload Injection
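For structured formats, payloads are injected into each detected field rather than sent as a bare string. In the excerpt below, this path is taken only once the learning phase has completed and the detected format is JSON; otherwise the raw payload is sent as-is.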
| Message Format | Injection Strategy | Example |
|----------------|-------------------|---------|
| JSON | Inject into each detected field | {"message": "<xss>", "user": "<xss>"} |
| XML | Inject into text nodes and attributes | <msg><text><xss></text></msg> |
| Plain Text | Direct substitution | <script>alert(1)</script> |
| Binary | Skip (handled by BinaryMessageHandler) | N/A |
Implementation: wshawk/scanner_v2.py:271-276
if self.learning_complete and self.message_analyzer.detected_format == MessageFormat.JSON:
    injected_messages = self.message_analyzer.inject_payload_into_message(
        base_message, payload
    )
else:
    injected_messages = [payload]
Sources: wshawk/scanner_v2.py:258-341
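As an illustration of what field-aware injection can mean for a JSON message, the sketch below produces one message variant per string field. The function `inject_into_json_fields` is hypothetical, and injecting one field at a time (rather than all fields at once) is an assumption; the behaviour of inject_payload_into_message() may differ.

```python
import json
from typing import List


def inject_into_json_fields(base_message: str, payload: str) -> List[str]:
    """Return one JSON message per top-level string field, with that field replaced."""
    data = json.loads(base_message)
    if not isinstance(data, dict):
        # Nothing field-like to target; fall back to the raw payload.
        return [payload]
    variants = []
    for key, value in data.items():
        if isinstance(value, str):
            mutated = dict(data)  # shallow copy keeps the other fields intact
            mutated[key] = payload
            variants.append(json.dumps(mutated))
    return variants or [payload]
```

Replacing a single field per message keeps the rest of the message valid, so a reflection can be attributed to a specific field.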
Confidence Scoring System
VulnerabilityVerifier Integration
The VulnerabilityVerifier class provides automated confidence scoring based on multiple heuristics.
graph TB
Response["WebSocket Response"]
Payload["Injected Payload"]
Response --> Verify["VulnerabilityVerifier.verify_xss()"]
Payload --> Verify
Verify --> Reflect{"Exact Reflection?"}
Verify --> Context{"HTML/JS Context?"}
Verify --> Encoded{"Encoding Applied?"}
Verify --> Dangerous{"Dangerous Tags?"}
Reflect -->|Yes| ScoreHigh["Score += 3"]
Reflect -->|No| ScoreLow["Score += 0"]
Context -->|Unquoted Attr| ScoreHigh2["Score += 2"]
Context -->|JS Context| ScoreHigh3["Score += 3"]
Context -->|HTML Context| ScoreMed["Score += 2"]
Encoded -->|No Encoding| ScoreHigh4["Score += 2"]
Encoded -->|Partial| ScoreMed2["Score += 1"]
Dangerous -->|script/img/svg| ScoreHigh5["Score += 2"]
ScoreHigh --> Calculate["Calculate Total Score"]
ScoreLow --> Calculate
ScoreHigh2 --> Calculate
ScoreHigh3 --> Calculate
ScoreMed --> Calculate
ScoreHigh4 --> Calculate
ScoreMed2 --> Calculate
ScoreHigh5 --> Calculate
Calculate --> Level{"Score Range"}
Level -->|0-2| LOW["ConfidenceLevel.LOW"]
Level -->|3-5| MEDIUM["ConfidenceLevel.MEDIUM"]
Level -->|6-8| HIGH["ConfidenceLevel.HIGH"]
Level -->|9+| CRITICAL["ConfidenceLevel.CRITICAL"]
Confidence Levels:
| Level | Score Range | Meaning | Action Taken |
|-------|-------------|---------|--------------|
| LOW | 0-2 | Payload reflected but likely sanitized | Logged, not reported |
| MEDIUM | 3-5 | Partial reflection, possible filter bypass | Reported without verification |
| HIGH | 6-8 | Strong indicators of exploitation | Browser verification triggered |
| CRITICAL | 9+ | Browser-confirmed execution | Reported with screenshot evidence |
Implementation: wshawk/scanner_v2.py:287-289
is_vuln, confidence, description = self.verifier.verify_xss(
    response, payload
)
Sources: wshawk/scanner_v2.py:258-341
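The additive scoring shown in the diagram can be expressed in a few lines. The sketch below uses the point values and score ranges from the diagram and table above; the names `Level`, `score_signals`, and `level_for`, and the signal labels, are hypothetical, since the VulnerabilityVerifier implementation is not among the referenced files.

```python
from enum import Enum
from typing import Optional


class Level(Enum):
    """Hypothetical stand-in for ConfidenceLevel."""
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"
    CRITICAL = "CRITICAL"


# Point values taken from the scoring diagram.
CONTEXT_POINTS = {"js": 3, "html": 2, "unquoted_attr": 2}
ENCODING_POINTS = {"none": 2, "partial": 1, "full": 0}


def score_signals(exact_reflection: bool, context: Optional[str],
                  encoding: str, dangerous_tag: bool) -> int:
    """Sum the heuristic signals into a single integer score."""
    score = 3 if exact_reflection else 0
    score += CONTEXT_POINTS.get(context, 0)
    score += ENCODING_POINTS.get(encoding, 0)
    score += 2 if dangerous_tag else 0
    return score


def level_for(score: int) -> Level:
    """Map a score onto the 0-2 / 3-5 / 6-8 / 9+ ranges."""
    if score >= 9:
        return Level.CRITICAL
    if score >= 6:
        return Level.HIGH
    if score >= 3:
        return Level.MEDIUM
    return Level.LOW
```

With these weights, an unencoded verbatim reflection of a script tag in a JavaScript context scores 3 + 3 + 2 + 2 = 10.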
Browser Verification with Playwright
HeadlessBrowserXSSVerifier
When browser verification is enabled (see Configuration and Options), WSHawk spawns a real Chromium browser for each HIGH-confidence XSS finding to confirm that the payload actually executes.
Verification Architecture
graph TB
subgraph "Detection Phase"
HighConf["HIGH Confidence XSS<br/>from VulnerabilityVerifier"]
end
subgraph "Browser Initialization"
Check{"headless_verifier<br/>exists?"}
Create["HeadlessBrowserXSSVerifier()"]
Start["await start()<br/>Launch Playwright"]
end
subgraph "Execution Test"
Inject["Inject Payload into Page"]
HTML["Create Test HTML<br/>with Response Content"]
Navigate["page.goto(test_html)"]
Wait["Wait for JS Execution"]
Monitor["Monitor Console Events"]
CheckAlert{"alert() or<br/>console.log?"}
end
subgraph "Evidence Collection"
Screenshot["page.screenshot()"]
Console["Console Logs"]
Evidence["Execution Evidence Object"]
end
subgraph "Result Processing"
Upgrade["confidence = CRITICAL"]
Report["Add to vulnerabilities[]<br/>browser_verified: true"]
end
HighConf --> Check
Check -->|No| Create
Check -->|Yes| Inject
Create --> Start
Start --> Inject
Inject --> HTML
HTML --> Navigate
Navigate --> Wait
Wait --> Monitor
Monitor --> CheckAlert
CheckAlert -->|Yes| Screenshot
CheckAlert -->|No| Next["Not Executed"]
Screenshot --> Console
Console --> Evidence
Evidence --> Upgrade
Upgrade --> Report
Sources: wshawk/scanner_v2.py:294-314
Browser Verification Code Flow
Initialization:
# Lazy initialization on first HIGH-confidence finding
if not self.headless_verifier:
    self.headless_verifier = HeadlessBrowserXSSVerifier()
    await self.headless_verifier.start()  # Launches Chromium
Execution Test:
is_executed, evidence = await self.headless_verifier.verify_xss_execution(
    response, payload
)
if is_executed:
    browser_verified = True
    confidence = ConfidenceLevel.CRITICAL
    description = f"REAL EXECUTION: {evidence}"
Result Enhancement:
self.vulnerabilities.append({
    'type': 'Cross-Site Scripting (XSS)',
    'severity': confidence.value,  # 'CRITICAL' if browser-verified
    'confidence': confidence.value,
    'description': description,
    'payload': payload,
    'response_snippet': response[:200],
    'browser_verified': browser_verified,  # True/False flag
    'recommendation': 'Sanitize and encode all user input'
})
Sources: wshawk/scanner_v2.py:294-330
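The execution test can be approximated with Playwright's async API as sketched below. This is an assumption about how such a verifier might work, not the HeadlessBrowserXSSVerifier source: `build_test_page` and `executes_in_browser` are hypothetical names, only dialog events (alert, confirm, prompt) are treated as proof of execution, and the Playwright import is deferred so the page-building helper can be used without the package installed.

```python
from typing import List


def build_test_page(response: str) -> str:
    """Embed the raw server response in a minimal HTML page, deliberately unescaped."""
    return f"<!DOCTYPE html><html><body>{response}</body></html>"


async def executes_in_browser(response: str, wait_ms: int = 1000) -> bool:
    """Render the response in headless Chromium and report whether a dialog fired."""
    # Third-party dependency, imported lazily: pip install playwright
    from playwright.async_api import async_playwright

    dialogs: List[str] = []

    async def on_dialog(dialog) -> None:
        # Record the message and dismiss so the page does not block.
        dialogs.append(dialog.message)
        await dialog.dismiss()

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        try:
            page = await browser.new_page()
            page.on("dialog", on_dialog)
            await page.set_content(build_test_page(response))
            # Give deferred handlers (onerror, onload) a moment to run.
            await page.wait_for_timeout(wait_ms)
        finally:
            await browser.close()
    return bool(dialogs)
```

From synchronous code this would be driven with `asyncio.run(executes_in_browser(response))`. A real verifier would also capture a screenshot and console output as evidence, which this sketch omits.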
Smart Payload Evolution
Adaptive XSS Payload Generation
When smart payloads are enabled (--smart-payloads flag), WSHawk uses genetic algorithms to evolve successful payloads.
graph LR
subgraph "Seed Population"
Success["Successful XSS Payload"]
Seed["payload_evolver.seed()"]
end
subgraph "Fitness Tracking"
Update["update_fitness(payload, 1.0)"]
Score["Fitness Score Database"]
end
subgraph "Evolution"
Crossover["Crossover Operation"]
Mutation["Mutation Strategies"]
Selection["Parent Selection"]
Evolve["evolve(count=30)"]
end
subgraph "Context Generation"
ContextGen["ContextAwareGenerator"]
Format["Learned Message Format"]
Priority["get_priority_categories()"]
Generate["generate_payloads(xss)"]
end
Success --> Seed
Seed --> Score
Success --> Update
Score --> Selection
Selection --> Crossover
Crossover --> Mutation
Mutation --> Evolve
Format --> ContextGen
Priority --> Generate
ContextGen --> Generate
Evolve --> Testing["Re-test Evolved Payloads"]
Generate --> Testing
Implementation: wshawk/scanner_v2.py:317-319
# Seed successful payload into evolver
if self.use_smart_payloads:
    self.payload_evolver.seed([payload])
    self.payload_evolver.update_fitness(payload, 1.0)
Evolution Phase: wshawk/scanner_v2.py:638-703
if self.use_smart_payloads and len(self.payload_evolver.population) > 0:
    Logger.info("Running evolved payload phase...")
    evolved = self.payload_evolver.evolve(count=30)
    # Generate context-aware payloads
    priorities = self.feedback_loop.get_priority_categories()
    for category, _ in priorities[:3]:
        ctx_payloads = self.context_generator.generate_payloads(category, count=10)
        evolved.extend(ctx_payloads)
Sources: wshawk/scanner_v2.py:638-703, wshawk/scanner_v2.py:317-319
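The selection, crossover, and mutation steps in the diagram follow the usual genetic-algorithm pattern, which the sketch below illustrates. It is not the PayloadEvolver implementation: the functions, the single-point crossover, and the case-flipping mutation are all assumptions chosen for brevity.

```python
import random
from typing import Dict, List


def crossover(a: str, b: str, rng: random.Random) -> str:
    """Single-point crossover: head of one parent, tail of the other."""
    return a[:rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]


def mutate(payload: str, rng: random.Random, rate: float = 0.2) -> str:
    """Randomly flip letter case, a simple filter-evasion style mutation."""
    return "".join(
        ch.swapcase() if ch.isalpha() and rng.random() < rate else ch
        for ch in payload
    )


def evolve(fitness: Dict[str, float], count: int, rng: random.Random) -> List[str]:
    """Breed `count` children, picking parents in proportion to their fitness."""
    parents = list(fitness)
    if not parents:
        return []
    # Floor the weights so zero-fitness payloads can still be selected.
    weights = [max(fitness[p], 0.01) for p in parents]
    children = []
    for _ in range(count):
        a, b = rng.choices(parents, weights=weights, k=2)
        children.append(mutate(crossover(a, b, rng), rng))
    return children
```

Passing an explicit `random.Random` instance keeps a run reproducible, which helps when an evolved payload needs to be regenerated for a report.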
XSS Detection in Practice
Test Execution Flow
Legacy Scanner (WSHawk)
sequenceDiagram
participant CLI as wshawk CLI
participant Scanner as WSHawk.test_xss()
participant WS as WebSocket Connection
participant Payloads as WSPayloads.get_xss()
participant Logger as Logger
CLI->>Scanner: await test_xss()
Scanner->>Payloads: Load XSS payloads
Payloads-->>Scanner: payloads[:100]
Scanner->>WS: await connect()
loop For each payload
Scanner->>Scanner: json.dumps({"message": payload})
Scanner->>WS: await send(message)
WS-->>Scanner: response
alt payload in response
Scanner->>Logger: vuln("XSS reflection detected")
Scanner->>Scanner: vulnerabilities.append({...})
end
Scanner->>Scanner: await asyncio.sleep(0.1)
end
Scanner->>WS: await close()
Sources: wshawk/main.py:399-438
V2 Scanner with Browser Verification (WSHawkV2)
sequenceDiagram
participant CLI as wshawk-advanced
participant Scanner as WSHawkV2
participant Learning as Learning Phase
participant Verifier as VulnerabilityVerifier
participant Browser as HeadlessBrowserXSSVerifier
participant Storage as vulnerabilities[]
CLI->>Scanner: await run_heuristic_scan()
Scanner->>Learning: await learning_phase(5s)
Learning-->>Scanner: Message format + fields
Scanner->>Scanner: await test_xss_v2()
loop For each payload
Scanner->>Scanner: inject_payload_into_message()
Scanner->>Scanner: await ws.send()
Scanner->>Scanner: await ws.recv()
Scanner->>Verifier: verify_xss(response, payload)
Verifier-->>Scanner: (is_vuln, confidence, desc)
alt confidence == HIGH
Scanner->>Browser: verify_xss_execution()
Browser-->>Scanner: (is_executed, evidence)
alt is_executed == True
Scanner->>Storage: Add with confidence=CRITICAL
Scanner->>Storage: browser_verified=True
end
else confidence >= MEDIUM
Scanner->>Storage: Add with original confidence
end
end
Sources: wshawk/scanner_v2.py:258-341, wshawk/scanner_v2.py:593-852
Configuration and Options
Enabling Browser Verification
Browser verification can be enabled via multiple interfaces:
| Interface | Method | Example |
|-----------|--------|---------|
| CLI Flag | --playwright | wshawk-advanced ws://target --playwright |
| Full Mode | --full | wshawk-advanced ws://target --full |
| Python API | use_headless_browser = True | scanner.use_headless_browser = True |
| Configuration File | wshawk.yaml | scanner.features.playwright: true |
Python API Usage:
import asyncio
from wshawk.scanner_v2 import WSHawkV2

async def main():
    scanner = WSHawkV2("ws://target.com")
    scanner.use_headless_browser = True  # Enable Playwright
    await scanner.run_heuristic_scan()

asyncio.run(main())
Configuration File:
scanner:
  features:
    playwright: true
    smart_payloads: true
Sources: wshawk/advanced_cli.py:39-40, wshawk/scanner_v2.py:78, README.md:256-263
Rate Limiting and Performance
Adaptive Request Control
XSS testing respects the adaptive rate limiter to avoid overwhelming target servers or triggering rate-limiting protections.
Default Settings:
| Parameter | Default | Purpose |
|-----------|---------|---------|
| Tokens per Second | 10 | Max requests/second |
| Bucket Size | 20 | Burst capacity |
| Adaptive Mode | Enabled | Slows down on errors |
| Sleep Between Requests | 0.05s (50ms) | Additional delay |
Implementation:
# Rate limiter initialization
self.rate_limiter = TokenBucketRateLimiter(
    tokens_per_second=rate_limit,
    bucket_size=rate_limit * 2,
    enable_adaptive=True
)
# Usage in test loop
await asyncio.sleep(0.05) # Rate limiting between payloads
Sources: wshawk/scanner_v2.py:62-66, wshawk/scanner_v2.py:336
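For readers unfamiliar with the token-bucket algorithm, the sketch below shows how the two main parameters interact: the refill rate caps sustained throughput, and the bucket size caps bursts. The `TokenBucket` class is a hypothetical minimal version, not TokenBucketRateLimiter itself, and it leaves out the adaptive slow-down on errors.

```python
import time
from typing import Callable


class TokenBucket:
    """Minimal non-adaptive token bucket with an injectable clock."""

    def __init__(self, tokens_per_second: float, bucket_size: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = tokens_per_second
        self.capacity = bucket_size
        self.tokens = bucket_size  # start full so an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def try_acquire(self) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With the defaults from the table (10 tokens per second, bucket size 20), a scan can send 20 messages back to back and is then held to roughly 10 per second.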
Output and Reporting
Vulnerability Record Structure
XSS findings are stored with detailed metadata for reporting:
{
    'type': 'Cross-Site Scripting (XSS)',
    'severity': 'CRITICAL',  # CRITICAL/HIGH/MEDIUM/LOW
    'confidence': 'CRITICAL',  # ConfidenceLevel enum value
    'description': 'REAL EXECUTION: alert dialog captured',
    'payload': '<script>alert(1)</script>',
    'response_snippet': 'Server echoed: <script>alert(1)</script>...',
    'browser_verified': True,  # True only when Playwright confirmed execution
    'recommendation': 'Sanitize and encode all user input'
}
Browser Verification Evidence
When browser verification succeeds, additional evidence is collected:
| Evidence Type | Description | Storage |
|---------------|-------------|---------|
| Screenshot | PNG image of XSS execution | Embedded in HTML report |
| Console Logs | JavaScript console output | Appended to description |
| Execution Flag | Boolean confirmation | browser_verified field |
| Timing Data | Time to execution | Metadata |
Sources: wshawk/scanner_v2.py:321-330
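Because each finding is a plain dictionary, downstream triage needs very little code. The sketch below orders findings by severity and puts browser-verified results first within each level; `SEVERITY_ORDER` and `prioritize` are hypothetical helpers, and the field names are the ones from the record structure above.

```python
from typing import Any, Dict, List

# Lower rank sorts first; unknown severities sort last.
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def prioritize(findings: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort by severity, then by browser verification (verified first)."""
    return sorted(
        findings,
        key=lambda f: (
            SEVERITY_ORDER.get(f.get("severity"), len(SEVERITY_ORDER)),
            not f.get("browser_verified", False),
        ),
    )
```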
Summary
WSHawk's XSS detection system implements a multi-stage pipeline that progresses from basic reflection detection to real browser execution confirmation:
- Static Payload Testing: XSS vectors loaded via WSPayloads.get_xss(), drawn from a database of 22,000+ total attack vectors
- Context-Aware Injection: Automatic field detection and smart injection via MessageAnalyzer
- Confidence Scoring: Heuristic analysis via VulnerabilityVerifier (LOW/MEDIUM/HIGH/CRITICAL)
- Browser Verification: Automatic Playwright testing for HIGH-confidence findings
- Smart Evolution: Genetic algorithm optimization for WAF bypass
Combining automated heuristics with real browser confirmation is intended to keep detection coverage high while reducing false positives: reflection-only findings keep their heuristic confidence, and only payloads that actually execute in the browser are promoted to CRITICAL. The confidence levels let security teams prioritize the findings that need immediate remediation.
Key Classes:
- WSHawk.test_xss() - Legacy reflection detection (wshawk/main.py:399-438)
- WSHawkV2.test_xss_v2() - Enhanced detection pipeline (wshawk/scanner_v2.py:258-341)
- VulnerabilityVerifier.verify_xss() - Confidence scoring (referenced but implementation not in provided files)
- HeadlessBrowserXSSVerifier - Playwright browser automation (referenced but implementation not in provided files)
- WSPayloads - Static payload collection (wshawk/main.py:61-142)