Scanner Engine (WSHawkV2)
Purpose and Scope
This document describes the WSHawkV2 class, which serves as the central orchestrator for WebSocket security scanning. It covers the scanner's initialization, the intelligence-driven learning phase, test execution workflow, and coordination with supporting modules.
For details about the intelligence modules that the scanner coordinates with, see Intelligence Modules. For information about the payload system used during testing, see Payload Management System. For descriptions of individual vulnerability tests, see Vulnerability Detection Modules.
Overview
The WSHawkV2 class (wshawk/scanner_v2.py L28-L680) is the primary entry point for executing security scans against WebSocket endpoints. It implements an intelligence-driven testing approach that adapts to the target application's message structure, server technology stack, and response patterns.
Core Responsibilities
| Responsibility | Implementation |
| --- | --- |
| Connection Management | Establishes and maintains WebSocket connections via the websockets library |
| Intelligence Gathering | Coordinates MessageIntelligence, ServerFingerprinter, and SessionStateMachine |
| Test Orchestration | Executes vulnerability tests in sequence with rate limiting |
| Verification | Integrates VulnerabilityVerifier, HeadlessBrowserXSSVerifier, and OASTProvider |
| Result Aggregation | Collects findings and generates HTML reports via EnhancedHTMLReporter |
Sources: wshawk/scanner_v2.py L28-L75
Class Structure and Initialization
WSHawkV2 Constructor
WSHawkV2(
    url: str,
    headers: Optional[Dict] = None,
    auth_sequence: Optional[str] = None,
    max_rps: int = 10
)
Parameters:
| Parameter | Type | Default | Purpose |
| --- | --- | --- | --- |
| url | str | Required | Target WebSocket URL (ws:// or wss://) |
| headers | Optional[Dict] | None | Additional HTTP headers for connection |
| auth_sequence | Optional[str] | None | Path to YAML file defining authentication sequence |
| max_rps | int | 10 | Maximum requests per second for rate limiting |
Initialization Workflow:
flowchart TD
Init["WSHawkV2.__init__()"]
Intel["Initialize Intelligence Modules"]
MI["self.message_intel = MessageIntelligence()"]
VV["self.verifier = VulnerabilityVerifier()"]
SF["self.fingerprinter = ServerFingerprinter()"]
SM["self.state_machine = SessionStateMachine()"]
RateLimit["Initialize Rate Limiting"]
RL["self.rate_limiter = TokenBucketRateLimiter(max_rps)"]
Optional["Initialize Optional Features"]
HB["self.use_headless_browser = True<br>self.headless_verifier = None"]
OAST["self.use_oast = True<br>self.oast_provider = None"]
Report["Initialize Reporting"]
Rep["self.reporter = EnhancedHTMLReporter()"]
Stats["Initialize Statistics"]
StatsInit["messages_sent = 0<br>messages_received = 0<br>start_time = None<br>end_time = None"]
Learning["Initialize Learning State"]
Learn["learning_complete = False<br>sample_messages = []"]
Auth["Load Auth Sequence (if provided)"]
AuthLoad["state_machine.load_sequence_from_yaml()"]
Init -.-> Intel
Intel -.-> MI
Intel -.-> VV
Intel -.-> SF
Intel -.-> SM
Init -.-> RateLimit
RateLimit -.-> RL
Init -.-> Optional
Optional -.-> HB
Optional -.-> OAST
Init -.-> Report
Report -.-> Rep
Init -.-> Stats
Stats -.-> StatsInit
Init -.-> Learning
Learning -.-> Learn
Init -.-> Auth
Auth -.-> AuthLoad
Key Initialization Components:
The constructor sets up seven components (wshawk/scanner_v2.py L40-L50):
- MessageIntelligence: Analyzes message format (JSON, XML, binary) and identifies injectable fields
- VulnerabilityVerifier: Performs context-aware verification of potential vulnerabilities
- ServerFingerprinter: Identifies server technology stack (language, framework, database)
- SessionStateMachine: Tracks connection state and authentication flow
- TokenBucketRateLimiter: Implements adaptive rate limiting (tokens_per_second=max_rps, enable_adaptive=True)
- EnhancedHTMLReporter: Generates HTML reports with CVSS scoring
- HeadlessBrowserXSSVerifier and OASTProvider: Optional verification tools (lazily initialized, so the constructor only stores None placeholders)
Sources: wshawk/scanner_v2.py L28-L75
The Learning Phase
Purpose
The learning phase (wshawk/scanner_v2.py L87-L141) is a pre-testing step in which the scanner passively observes WebSocket traffic to understand the application's communication patterns. This intelligence enables context-aware payload injection that adapts to the target's message structure.
Learning Phase Workflow
sequenceDiagram
participant p1 as WSHawkV2
participant p2 as WebSocket Connection
participant p3 as MessageIntelligence
participant p4 as ServerFingerprinter
p1->>p1: "learning_phase(ws, duration=5)"
p1->>p1: "start = time.monotonic()"
loop "While time < duration"
p1->>p2: "ws.recv() with timeout=1.0"
p2-->>p1: "message"
p1->>p1: "samples.append(message)"
p1->>p1: "messages_received += 1"
p1->>p4: "fingerprinter.add_response(message)"
note over p1: "Log first 3 samples"
end
p1->>p3: "message_intel.learn_from_messages(samples)"
p3-->>p1: "Format learned"
p1->>p3: "get_format_info()"
p3-->>p1: "format, injectable_fields"
note over p1: "Logger.success(Detected format)"
p1->>p4: "fingerprinter.fingerprint()"
p4-->>p1: "language, framework, database"
note over p1: "Logger.success(Server fingerprint)"
p1->>p1: "learning_complete = True"
Learning Phase Implementation:
- Message Collection (wshawk/scanner_v2.py L94-L114):
  * Listens for messages with 1-second timeout per receive
  * Default collection duration: 5 seconds
  * Stores samples in self.sample_messages
  * Increments self.messages_received counter
- Format Detection (wshawk/scanner_v2.py L119-L129):
  * Invokes message_intel.learn_from_messages(samples)
  * Calls get_format_info() to extract detected format and injectable fields
  * Logs detected format (JSON, XML, Binary, Text)
  * Logs up to 5 injectable field names
- Server Fingerprinting (wshawk/scanner_v2.py L131-L136):
  * Accumulates responses via fingerprinter.add_response(message)
  * Calls fingerprinter.fingerprint() to identify technology stack
  * Logs detected language, framework, and database
  * Enables database-specific payload selection for SQL injection tests
- Completion (wshawk/scanner_v2.py L138-L141):
  * Sets self.learning_complete = True if samples were collected
  * Falls back to basic payload injection if no messages were received
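The message collection step can be sketched as a timed receive loop. This is a minimal illustration rather than the code in scanner_v2.py: FakeWebSocket and collect_samples are hypothetical names, and only the per-receive timeout and the overall duration cap are taken from the description above.

```python
import asyncio
import time


class FakeWebSocket:
    """Hypothetical stand-in for a connection: returns canned messages, then goes idle."""

    def __init__(self, messages):
        self._messages = list(messages)

    async def recv(self):
        if self._messages:
            return self._messages.pop(0)
        await asyncio.sleep(3600)  # simulate a server that has nothing more to say


async def collect_samples(ws, duration=5.0, recv_timeout=1.0):
    """Collect messages until `duration` seconds have passed."""
    samples = []
    start = time.monotonic()
    while time.monotonic() - start < duration:
        try:
            message = await asyncio.wait_for(ws.recv(), timeout=recv_timeout)
        except asyncio.TimeoutError:
            continue  # nothing arrived in this window; keep listening
        samples.append(message)
    return samples
```

The short per-receive timeout keeps the loop responsive: a silent server costs at most one timeout window beyond the configured duration instead of blocking the scan indefinitely.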
Impact on Testing:
When learning_complete is True, vulnerability test methods inject payloads using message_intel.inject_payload_into_message() instead of sending raw payloads. This ensures payloads are properly embedded in the application's expected message structure.
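To illustrate what embedding a payload in the expected message structure can look like, here is a minimal sketch for JSON messages. The function inject_into_json_fields is hypothetical and much simpler than whatever MessageIntelligence actually does; it assumes that only top-level string fields are injectable and falls back to the raw payload otherwise.

```python
import json


def inject_into_json_fields(base_message, payload):
    """Return one JSON message per top-level string field, with that field set to the payload."""
    try:
        parsed = json.loads(base_message)
    except json.JSONDecodeError:
        return [payload]  # not JSON: fall back to sending the raw payload
    if not isinstance(parsed, dict):
        return [payload]

    injected = []
    for key, value in parsed.items():
        if isinstance(value, str):
            mutated = dict(parsed)  # shallow copy so other fields stay intact
            mutated[key] = payload
            injected.append(json.dumps(mutated))
    return injected or [payload]
```

Keeping the remaining fields unchanged means the server is more likely to route the message to its normal handler, so the payload reaches the code path that processes that field.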
Sources: wshawk/scanner_v2.py L87-L141
Test Orchestration Architecture
Main Scan Execution Flow
The run_intelligent_scan() method (wshawk/scanner_v2.py L545-L680) orchestrates the complete testing workflow:
flowchart TD
Start["run_intelligent_scan()"]
Init["Initialize Scan"]
StartTime["self.start_time = datetime.now()"]
Banner["Logger.banner()"]
Connect["Connect to WebSocket"]
ConnectCall["ws = await self.connect()"]
CheckConn["Connection<br>Successful?"]
Learn["Learning Phase"]
LearnCall["await learning_phase(ws, duration=5)"]
Tests["Execute Vulnerability Tests"]
SQL["test_sql_injection_v2(ws)"]
XSS["test_xss_v2(ws)"]
CMD["test_command_injection_v2(ws)"]
Path["test_path_traversal_v2(ws)"]
XXE["test_xxe_v2(ws)"]
NoSQL["test_nosql_injection_v2(ws)"]
SSRF["test_ssrf_v2(ws)"]
CloseWS["Close WebSocket"]
CloseCall["await ws.close()"]
Session["Session Hijacking Tests"]
SessionCall["SessionHijackingTester.run_all_tests()"]
Cleanup["Cleanup Resources"]
CleanBrowser["headless_verifier.stop()"]
CleanOAST["oast_provider.stop()"]
Report["Generate Report"]
Summary["Calculate duration, stats"]
HTML["reporter.generate_report()"]
Save["Save to wshawk_report_TIMESTAMP.html"]
Return["Return Vulnerabilities"]
Start -.-> Init
Init -.-> StartTime
StartTime -.-> Banner
Banner -.-> Connect
Connect -.-> ConnectCall
ConnectCall -.-> CheckConn
CheckConn -.->|"Yes"| Learn
CheckConn -.->|"No"| Return
Learn -.-> LearnCall
LearnCall -.-> Tests
Tests -.-> SQL
SQL -.-> XSS
XSS -.-> CMD
CMD -.-> Path
Path -.-> XXE
XXE -.-> NoSQL
NoSQL -.-> SSRF
SSRF -.-> CloseWS
CloseWS -.-> CloseCall
CloseCall -.-> Session
Session -.-> SessionCall
SessionCall -.-> Cleanup
Cleanup -.-> CleanBrowser
CleanBrowser -.-> CleanOAST
CleanOAST -.-> Report
Report -.-> Summary
Summary -.-> HTML
HTML -.-> Save
Save -.-> Return
Execution Phases:
- Initialization (wshawk/scanner_v2.py L549-L553):
  * Records start_time for duration calculation
  * Displays banner via Logger.banner()
  * Logs target URL and scan parameters
- Connection (wshawk/scanner_v2.py L556-L561):
  * Calls await self.connect() to establish WebSocket connection
  * Returns None if connection fails
  * Updates state_machine to 'connected' state
- Learning Phase (wshawk/scanner_v2.py L564):
  * Executes 5-second passive observation
  * Builds intelligence about message format and server technology
- Vulnerability Testing (wshawk/scanner_v2.py L568-L587):
  * Executes seven test methods sequentially:
    * SQL Injection
    * Cross-Site Scripting (XSS)
    * Command Injection
    * Path Traversal
    * XML External Entity (XXE)
    * NoSQL Injection
    * Server-Side Request Forgery (SSRF)
  * Each test uses intelligence from learning phase
  * Print statements separate test output visually
- Session Security Testing (wshawk/scanner_v2.py L593-L616):
  * Closes main WebSocket connection
  * Instantiates SessionHijackingTester
  * Runs six session security tests
  * Appends session vulnerabilities to main results
- Resource Cleanup (wshawk/scanner_v2.py L618-L631):
  * Stops headless_verifier browser if initialized
  * Stops oast_provider server if running
  * Logs cleanup status
- Reporting (wshawk/scanner_v2.py L633-L678):
  * Calculates scan duration
  * Displays summary statistics (messages sent/received, vulnerability count)
  * Shows confidence breakdown (CRITICAL/HIGH/MEDIUM/LOW)
  * Generates HTML report with reporter.generate_report()
  * Saves to wshawk_report_YYYYMMDD_HHMMSS.html
  * Displays rate limiter statistics
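The confidence breakdown shown in the reporting phase can be computed with a simple tally. The sketch below is an assumption about the shape of the data: it treats each finding as a dict whose confidence value is stored as a plain string, whereas the real scanner may store ConfidenceLevel enum members. The name confidence_breakdown is hypothetical.

```python
from collections import Counter

# Display order used in the summary, from most to least certain.
CONFIDENCE_ORDER = ("CRITICAL", "HIGH", "MEDIUM", "LOW")


def confidence_breakdown(vulnerabilities):
    """Count findings per confidence level, including levels with zero findings."""
    counts = Counter(v.get("confidence", "LOW") for v in vulnerabilities)
    return {level: counts.get(level, 0) for level in CONFIDENCE_ORDER}
```

Returning every level, even when its count is zero, keeps the summary layout stable between scans.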
Sources: wshawk/scanner_v2.py L545-L680
Vulnerability Test Method Pattern
Common Testing Pattern
All vulnerability test methods follow a consistent architectural pattern. Using SQL injection as an example:
flowchart TD
TestMethod["test_sql_injection_v2(ws)"]
Init["Initialize Test"]
LogStart["Logger.info('Testing SQL injection...')"]
LoadPayloads["payloads = WSPayloads.get_sql_injection()[:100]"]
Fingerprint["Server<br>Fingerprinted?"]
GetRecommended["recommended = fingerprinter.get_recommended_payloads()"]
UseDB["Use database-specific payloads"]
UseGeneric["Use generic payloads"]
BaseMsg["Get Base Message"]
GetBase["base_message = sample_messages[0]"]
Loop["For Each Payload"]
CheckLearning["learning_complete<br>and JSON format?"]
SmartInject["injected_messages = message_intel.inject_payload_into_message()"]
RawInject["injected_messages = [payload]"]
SendLoop["For Each Injected Message"]
Send["await ws.send(msg)"]
IncrementSent["messages_sent += 1"]
Receive["await asyncio.wait_for(ws.recv(), timeout=2.0)"]
IncrementRecv["messages_received += 1"]
AddFingerprint["fingerprinter.add_response(response)"]
Verify["Verify Vulnerability"]
VerifyCall["is_vuln, confidence, description = verifier.verify_sql_injection()"]
CheckVuln["is_vuln and<br>confidence != LOW?"]
LogVuln["Logger.vuln(description)"]
Append["vulnerabilities.append(finding)"]
Skip["Continue"]
RateLimit["await asyncio.sleep(0.05)"]
Return["Return Results"]
TestMethod -.-> Init
Init -.-> LogStart
LogStart -.-> LoadPayloads
LoadPayloads -.-> Fingerprint
Fingerprint -.->|"Yes"| GetRecommended
Fingerprint -.->|"No"| UseGeneric
GetRecommended -.-> UseDB
UseDB -.-> BaseMsg
UseGeneric -.-> BaseMsg
BaseMsg -.-> GetBase
GetBase -.-> Loop
Loop -.-> CheckLearning
CheckLearning -.->|"Yes"| SmartInject
CheckLearning -.->|"No"| RawInject
SmartInject -.-> SendLoop
RawInject -.-> SendLoop
SendLoop -.-> Send
Send -.-> IncrementSent
IncrementSent -.-> Receive
Receive -.-> IncrementRecv
IncrementRecv -.-> AddFingerprint
AddFingerprint -.-> Verify
Verify -.-> VerifyCall
VerifyCall -.-> CheckVuln
CheckVuln -.->|"Yes"| LogVuln
CheckVuln -.->|"No"| Skip
LogVuln -.-> Append
Append -.-> RateLimit
Skip -.-> RateLimit
RateLimit -.-> Loop
Loop -.-> Return
Pattern Components:
- Payload Selection (wshawk/scanner_v2.py L149-L158):
  * Load base payloads from WSPayloads class
  * Check fingerprinter.fingerprint() for technology detection
  * If database detected, prepend database-specific payloads via get_recommended_payloads()
- Intelligent Injection (wshawk/scanner_v2.py L161-L172):
  * Use sample_messages[0] as template for payload embedding
  * If learning_complete and format is JSON: call message_intel.inject_payload_into_message()
  * Otherwise: send raw payload strings
- Send/Receive Cycle (wshawk/scanner_v2.py L174-L182):
  * Send injected message via await ws.send(msg)
  * Increment messages_sent counter
  * Receive response with 2-second timeout
  * Increment messages_received counter
  * Add response to fingerprinter for continuous learning
- Verification (wshawk/scanner_v2.py L184-L188):
  * Call vulnerability-specific verifier method (e.g., verifier.verify_sql_injection())
  * Returns tuple: (is_vuln: bool, confidence: ConfidenceLevel, description: str)
  * Filter out LOW confidence findings to reduce false positives
- Result Recording (wshawk/scanner_v2.py L189-L202):
  * Log vulnerability with Logger.vuln()
  * Append structured finding to self.vulnerabilities list with fields: type, severity, confidence, description, payload, response_snippet, recommendation
- Rate Limiting (wshawk/scanner_v2.py L207):
  * Sleep 0.05 seconds (50ms) between payloads
  * Prevents overwhelming target server
  * Complements TokenBucketRateLimiter for adaptive rate control
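The send, receive, verify, and record cycle can be condensed into a short loop. The sketch below is illustrative only: FakeWebSocket, toy_verifier, and run_payloads are hypothetical names, confidence is modeled as a plain string, and the fake server simply returns a SQL error whenever a message contains a single quote.

```python
import asyncio


class FakeWebSocket:
    """Hypothetical target: answers with a SQL error when a message contains a quote."""

    def __init__(self):
        self._responses = []

    async def send(self, message):
        if "'" in message:
            self._responses.append("You have an error in your SQL syntax")
        else:
            self._responses.append("ok")

    async def recv(self):
        if self._responses:
            return self._responses.pop(0)
        await asyncio.sleep(3600)  # idle when there is nothing to return


def toy_verifier(response, payload):
    """Hypothetical verifier returning (is_vuln, confidence, description)."""
    if "SQL syntax" in response:
        return True, "HIGH", f"SQL error triggered by {payload!r}"
    return False, "LOW", ""


async def run_payloads(ws, payloads, verify, recv_timeout=2.0, delay=0.05):
    """Send each payload, verify the response, and keep findings above LOW confidence."""
    findings = []
    for payload in payloads:
        await ws.send(payload)
        try:
            response = await asyncio.wait_for(ws.recv(), timeout=recv_timeout)
        except asyncio.TimeoutError:
            response = None  # unresponsive server: skip verification for this payload
        if response is not None:
            is_vuln, confidence, description = verify(response, payload)
            if is_vuln and confidence != "LOW":
                findings.append({
                    "type": "SQL Injection",
                    "confidence": confidence,
                    "description": description,
                    "payload": payload,
                    "response_snippet": response[:200],
                })
        await asyncio.sleep(delay)  # fixed pause between payloads
    return findings
```

Passing the verifier in as a parameter mirrors the idea that every test type shares the same loop and differs mainly in its payload list and verification method.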
Sources: wshawk/scanner_v2.py L143-L213
Advanced Verification Integration
Browser-Based XSS Verification
The scanner integrates optional browser-based verification for XSS payloads:
flowchart TD
XSSTest["test_xss_v2()"]
Pattern["Pattern-Based Detection"]
PatternVerify["verifier.verify_xss()"]
CheckConfidence["confidence == HIGH<br>and use_headless_browser?"]
BrowserInit["Initialize Browser (if needed)"]
InitBrowser["headless_verifier = HeadlessBrowserXSSVerifier()<br>await headless_verifier.start()"]
BrowserVerify["Browser Verification"]
VerifyExec["is_executed, evidence = await verify_xss_execution()"]
CheckExec["is_executed?"]
Upgrade["Upgrade to CRITICAL"]
UpgradeConf["confidence = ConfidenceLevel.CRITICAL<br>browser_verified = True"]
Record["Record Finding"]
Append["vulnerabilities.append()<br>with browser_verified flag"]
XSSTest -.-> Pattern
Pattern -.-> PatternVerify
PatternVerify -.-> CheckConfidence
CheckConfidence -.->|"Yes"| BrowserInit
CheckConfidence -.->|"No"| Record
BrowserInit -.-> InitBrowser
InitBrowser -.-> BrowserVerify
BrowserVerify -.-> VerifyExec
VerifyExec -.-> CheckExec
CheckExec -.->|"Yes"| Upgrade
CheckExec -.->|"No"| Record
Upgrade -.-> UpgradeConf
UpgradeConf -.-> Record
Record -.-> Append
Implementation Details (wshawk/scanner_v2.py L250-L271):
- Only triggered for HIGH confidence XSS findings
- Lazily initializes HeadlessBrowserXSSVerifier on first use
- Calls verify_xss_execution(response, payload) to test actual JavaScript execution
- If executed, upgrades confidence to CRITICAL and sets browser_verified flag
- Screenshot evidence captured automatically by HeadlessBrowserXSSVerifier
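The upgrade rule can be expressed as a small decision function. This is a sketch under stated assumptions: maybe_upgrade_xss is a hypothetical helper, confidence is modeled as a string, the evidence key is invented for illustration, and verify_execution stands in for the browser check (any coroutine returning an (is_executed, evidence) pair).

```python
import asyncio


async def maybe_upgrade_xss(finding, use_headless_browser, verify_execution):
    """Upgrade a HIGH-confidence XSS finding to CRITICAL if the payload really executes."""
    if finding["confidence"] != "HIGH" or not use_headless_browser:
        return finding  # browser check is skipped for everything else

    is_executed, evidence = await verify_execution(finding["response"], finding["payload"])
    if not is_executed:
        return finding

    # Return a new dict so the original finding is not mutated.
    return {
        **finding,
        "confidence": "CRITICAL",
        "browser_verified": True,
        "evidence": evidence,
    }
```

Restricting the browser check to HIGH-confidence findings keeps the comparatively slow browser launch off the common path.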
OAST Integration for Blind Vulnerabilities
The scanner uses Out-of-Band Application Security Testing (OAST) for XXE and SSRF:
OAST Workflow (wshawk/scanner_v2.py L409-L425):
- Check if use_oast is enabled
- Initialize OASTProvider if not already running
- Generate OAST-enabled payload with unique identifier: oast_provider.generate_payload('xxe', 'test{id}')
- Embed payload in message and send
- OAST server listens for DNS/HTTP callbacks
- If callback received, confirms blind vulnerability
The OAST provider runs on localhost:8888 by default (wshawk/scanner_v2.py L412).
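The core idea behind out-of-band confirmation is correlating a callback with the payload that caused it. The sketch below shows that correlation in isolation. ToyOASTTracker and its methods are hypothetical and do not reflect the OASTProvider API; the sketch assumes callbacks arrive as HTTP request paths that end in the unique token, and it does not run a listener.

```python
import uuid


class ToyOASTTracker:
    """Hypothetical tracker that maps unique tokens to the test that issued them."""

    def __init__(self, callback_host="localhost:8888"):
        self.callback_host = callback_host
        self._pending = {}  # token -> vulnerability type

    def generate_url(self, vuln_type):
        """Create a callback URL containing a fresh token for this test."""
        token = uuid.uuid4().hex
        self._pending[token] = vuln_type
        return f"http://{self.callback_host}/{token}"

    def confirm(self, callback_paths):
        """Return the vulnerability types whose tokens appear in observed callback paths."""
        confirmed = []
        for path in callback_paths:
            token = path.strip("/").split("/")[-1]
            if token in self._pending:
                confirmed.append(self._pending[token])
        return confirmed
```

Because each token is unique, a callback identifies the exact payload that triggered it even when the WebSocket response itself reveals nothing.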
Sources: wshawk/scanner_v2.py L215-L293, wshawk/scanner_v2.py L402-L456
Connection Management
WebSocket Connection Lifecycle
[State diagram: connection lifecycle. States: Disconnected, Connecting, Connected, Error, Learning, Testing, Closed. Transition labels: connect(), websockets.connect() success, Connection failed, learning_phase(), Learning complete, Execute test methods, ws.close(), Session complete, Return None, Scan complete.]
Connection Method (wshawk/scanner_v2.py L77-L85):
async def connect(self):
    """Establish WebSocket connection"""
    try:
        ws = await websockets.connect(self.url, additional_headers=self.headers)
        self.state_machine._update_state('connected')
        return ws
    except Exception as e:
        Logger.error(f"Connection failed: {e}")
        return None
- Uses websockets.connect() with additional headers from self.headers
- Updates SessionStateMachine to 'connected' state
- Returns WebSocket object or None on failure
- Connection failure causes run_intelligent_scan() to exit early
State Tracking:
The SessionStateMachine tracks connection state throughout the scan lifecycle, enabling authentication sequence replay and state-dependent testing logic.
Sources: wshawk/scanner_v2.py L77-L85
Statistics and Metrics
Runtime Statistics Collection
The scanner tracks several metrics during execution:
| Metric | Type | Purpose |
| --- | --- | --- |
| messages_sent | int | Total WebSocket messages sent to target |
| messages_received | int | Total WebSocket messages received from target |
| start_time | datetime | Scan start timestamp |
| end_time | datetime | Scan completion timestamp |
| vulnerabilities | List[Dict] | Aggregated vulnerability findings |
| sample_messages | List[str] | Messages collected during learning phase |
| traffic_logs | List | Detailed request/response pairs for reporting |
Duration Calculation (wshawk/scanner_v2.py L634-L636):
self.end_time = datetime.now()
duration = (self.end_time - self.start_time).total_seconds()
Rate Limiter Statistics (wshawk/scanner_v2.py L676-L678):
rate_stats = self.rate_limiter.get_stats()
Logger.info(f"Rate limiter: {rate_stats['total_requests']} requests, {rate_stats['total_waits']} waits")
Logger.info(f" Current rate: {rate_stats['current_rate']}, Adaptive adjustments: {rate_stats['adaptive_adjustments']}")
These metrics are included in the final HTML report for audit and analysis purposes.
Sources: wshawk/scanner_v2.py L64-L75, wshawk/scanner_v2.py L634-L678
Integration with Intelligence Modules
Module Coordination Map
flowchart TD
Scanner["WSHawkV2"]
MI["MessageIntelligence<br>message_intel"]
SF["ServerFingerprinter<br>fingerprinter"]
VV["VulnerabilityVerifier<br>verifier"]
SM["SessionStateMachine<br>state_machine"]
RL["TokenBucketRateLimiter<br>rate_limiter"]
HB["HeadlessBrowserXSSVerifier<br>headless_verifier"]
OAST["OASTProvider<br>oast_provider"]
Rep["EnhancedHTMLReporter<br>reporter"]
Scanner -.->|"learn_from_messages()"| MI
Scanner -.->|"inject_payload_into_message()"| MI
Scanner -.->|"get_format_info()"| MI
Scanner -.->|"add_response()"| SF
Scanner -.->|"fingerprint()"| SF
Scanner -.->|"get_recommended_payloads()"| SF
Scanner -.->|"verify_sql_injection()"| VV
Scanner -.->|"verify_xss()"| VV
Scanner -.->|"verify_command_injection()"| VV
Scanner -.->|"verify_path_traversal()"| VV
Scanner -.->|"_update_state()"| SM
Scanner -.->|"load_sequence_from_yaml()"| SM
Scanner -.->|"acquire()"| RL
Scanner -.->|"get_stats()"| RL
Scanner -.->|"start()"| HB
Scanner -.->|"stop()"| HB
Scanner -.->|"verify_xss_execution()"| HB
Scanner -.->|"start()"| OAST
Scanner -.->|"stop()"| OAST
Scanner -.->|"generate_payload()"| OAST
Scanner -.->|"generate_report()"| Rep
subgraph Reporting ["Reporting"]
Rep
end
subgraph subGraph2 ["Advanced Verification"]
HB
OAST
end
subgraph subGraph1 ["Rate Limiting"]
RL
end
subgraph subGraph0 ["Intelligence Modules"]
MI
SF
VV
SM
end
Key Integration Points:
- MessageIntelligence (wshawk/scanner_v2.py L41):
  * Learning phase: learn_from_messages(samples)
  * Payload injection: inject_payload_into_message(base_message, payload)
  * Format query: get_format_info()
- ServerFingerprinter (wshawk/scanner_v2.py L43):
  * Response accumulation: add_response(message) (called continuously)
  * Fingerprint extraction: fingerprint() returns language/framework/database
  * Payload recommendations: get_recommended_payloads(fingerprint)
- VulnerabilityVerifier (wshawk/scanner_v2.py L42):
  * SQL verification: verify_sql_injection(response, payload)
  * XSS verification: verify_xss(response, payload)
  * Command injection: verify_command_injection(response, payload)
  * Path traversal: verify_path_traversal(response, payload)
  * Returns: (is_vuln: bool, confidence: ConfidenceLevel, description: str)
- TokenBucketRateLimiter (wshawk/scanner_v2.py L45-L49):
  * Configured with tokens_per_second=max_rps, enable_adaptive=True
  * Acquire token: await rate_limiter.acquire() (used in SSRF test)
  * Statistics: get_stats() returns request counts and adaptive adjustments
- EnhancedHTMLReporter (wshawk/scanner_v2.py L50):
  * Report generation: generate_report(vulnerabilities, scan_info, fingerprint_info)
  * Returns HTML string for file output
For detailed documentation of these modules, see Intelligence Modules.
Sources: wshawk/scanner_v2.py L14-L26
Report Generation
HTML Report Structure
The scanner generates comprehensive HTML reports via the EnhancedHTMLReporter:
Report Generation Process (wshawk/scanner_v2.py L652-L673):
- Prepare Scan Information:
  scan_info = {
      'target': self.url,
      'duration': duration,
      'messages_sent': self.messages_sent,
      'messages_received': self.messages_received
  }
- Extract Fingerprint:
  fingerprint_info = self.fingerprinter.get_info()
- Generate HTML:
  report_html = self.reporter.generate_report(
      self.vulnerabilities, scan_info, fingerprint_info
  )
- Save to File:
  report_filename = f"wshawk_report_{datetime.now().strftime('%Y%m%d_%H%M%S')}.html"
  with open(report_filename, 'w') as f:
      f.write(report_html)
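The final save step can be isolated into a small helper, which makes the timestamped naming scheme easy to test. The function save_report and its directory and now parameters are hypothetical additions for illustration; only the wshawk_report_YYYYMMDD_HHMMSS.html pattern comes from the process described above.

```python
from datetime import datetime
from pathlib import Path


def save_report(report_html, directory=".", now=None):
    """Write the HTML report to a timestamped file and return its path."""
    now = now or datetime.now()
    filename = f"wshawk_report_{now.strftime('%Y%m%d_%H%M%S')}.html"
    path = Path(directory) / filename
    # An explicit encoding avoids platform-dependent defaults for non-ASCII payloads.
    path.write_text(report_html, encoding="utf-8")
    return path
```

Accepting the timestamp as an optional argument keeps the default behavior unchanged while allowing deterministic filenames in tests.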
Report Contents:
- Vulnerability findings with CVSS v3.1 scores
- Confidence levels (CRITICAL/HIGH/MEDIUM/LOW)
- Payload details and response snippets
- Server fingerprint information
- Scan statistics and duration
- Rate limiter performance metrics
- Screenshots for browser-verified XSS
- Remediation recommendations
For details about report format and structure, see Report Format and Output.
Sources: wshawk/scanner_v2.py L652-L673
Configuration Options
Scanner Flags and Settings
The WSHawkV2 class exposes several configuration options:
| Option | Type | Default | Location | Purpose |
| --- | --- | --- | --- | --- |
| use_headless_browser | bool | True | scanner_v2.py L53 | Enable Playwright browser verification for XSS |
| use_oast | bool | True | scanner_v2.py L57 | Enable OAST for blind XXE/SSRF detection |
| max_rps | int | 10 | Constructor param | Maximum requests per second for rate limiting |
Example Configuration:
import asyncio
from wshawk.scanner_v2 import WSHawkV2

async def main():
    scanner = WSHawkV2("ws://target.com", max_rps=5)
    scanner.use_headless_browser = False  # Disable browser verification
    scanner.use_oast = True  # Keep OAST enabled
    await scanner.run_intelligent_scan()

asyncio.run(main())
CLI Integration:
The CLI commands (wshawk, wshawk-advanced) map command-line flags to these options:
- --playwright flag sets use_headless_browser = True
- --no-oast flag sets use_oast = False
- --rate N flag sets max_rps = N
- --full flag enables all features
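A flag-to-option mapping of this kind can be sketched with argparse. The actual wshawk argument parser is not shown in this document, so everything below is an assumption apart from the four flag names: the positional url argument, the helper names build_parser and scanner_options, and the choice to let --full take precedence over --no-oast are all illustrative.

```python
import argparse


def build_parser():
    """Build a hypothetical parser that accepts the flags listed above."""
    parser = argparse.ArgumentParser(prog="wshawk-advanced")
    parser.add_argument("url", help="Target WebSocket URL (ws:// or wss://)")
    parser.add_argument("--playwright", action="store_true", help="Enable browser XSS verification")
    parser.add_argument("--no-oast", action="store_true", help="Disable OAST callbacks")
    parser.add_argument("--rate", type=int, default=10, metavar="N", help="Maximum requests per second")
    parser.add_argument("--full", action="store_true", help="Enable all features")
    return parser


def scanner_options(args):
    """Translate parsed flags into scanner attribute values."""
    return {
        "use_headless_browser": args.playwright or args.full,
        "use_oast": args.full or not args.no_oast,  # assumption: --full wins over --no-oast
        "max_rps": args.rate,
    }
```

For example, parsing the arguments ws://target.com --rate 5 --no-oast yields max_rps of 5 with both optional verifiers disabled.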
Sources: wshawk/scanner_v2.py L52-L58
Error Handling and Resilience
Exception Management
The scanner implements defensive error handling throughout:
Connection Failures (wshawk/scanner_v2.py L83-L85):
- Logs error and returns None
- Caller checks for None and exits gracefully
Message Receive Timeouts (wshawk/scanner_v2.py L101-L113, wshawk/scanner_v2.py L204-L205):
- Uses asyncio.wait_for(ws.recv(), timeout=X) with 1-2 second timeouts
- Catches asyncio.TimeoutError and continues to next payload
- Prevents scan from hanging on unresponsive servers
Test Method Exceptions (wshawk/scanner_v2.py L209-L211):
- Wraps test logic in try/except blocks
- Logs errors but continues with remaining tests
- Ensures one failing test doesn't abort entire scan
Resource Cleanup (wshawk/scanner_v2.py L618-L631):
- Wraps browser and OAST cleanup in try/except
- Logs cleanup status
- Ensures resources released even if errors occur
Learning Phase Fallback (wshawk/scanner_v2.py L115-L117, wshawk/scanner_v2.py L139-L141):
- If no messages received during learning phase, logs warning
- Falls back to basic payload injection without message structure intelligence
- Allows scan to proceed even without optimal intelligence
Sources: wshawk/scanner_v2.py L77-L680
Usage Examples
Basic Scan
import asyncio
from wshawk.scanner_v2 import WSHawkV2

async def main():
    scanner = WSHawkV2("ws://target.com:8080")
    results = await scanner.run_intelligent_scan()
    print(f"Found {len(results)} vulnerabilities")

asyncio.run(main())
Scan with Authentication
# Run inside an async function, as in the Basic Scan example
scanner = WSHawkV2(
    "wss://secure.example.com/ws",
    headers={"Authorization": "Bearer token123"},
    auth_sequence="auth_config.yaml"
)
await scanner.run_intelligent_scan()
Custom Rate Limiting
# Run inside an async function, as in the Basic Scan example
scanner = WSHawkV2("ws://target.com", max_rps=5)
scanner.use_headless_browser = True
scanner.use_oast = True
await scanner.run_intelligent_scan()
Accessing Results Programmatically
# Run inside an async function, as in the Basic Scan example
scanner = WSHawkV2("ws://target.com")
results = await scanner.run_intelligent_scan()

# Filter by severity
critical = [v for v in results if v['severity'] == 'CRITICAL']
high = [v for v in results if v['severity'] == 'HIGH']

# Access findings
for vuln in results:
    print(f"{vuln['type']}: {vuln['description']}")
    print(f"Payload: {vuln['payload']}")
    print(f"CVSS: {vuln.get('cvss_score', 'N/A')}")
For programmatic integration examples, see Python API Usage.
Sources: README.md L196-L209
Performance Characteristics
Scan Duration and Throughput
Typical scan performance metrics:
| Phase | Duration | Throughput |
| --- | --- | --- |
| Learning Phase | 5 seconds | Passive observation |
| SQL Injection | 10-15 seconds | ~100 payloads |
| XSS Testing | 10-15 seconds | ~100 payloads |
| Command Injection | 10-15 seconds | ~100 payloads |
| Path Traversal | 5-10 seconds | ~50 payloads |
| XXE Testing | 3-5 seconds | ~30 payloads |
| NoSQL Injection | 5-10 seconds | ~50 payloads |
| SSRF Testing | 2-5 seconds | ~4-8 targets |
| Session Tests | 10-20 seconds | 6 test scenarios |
| Total | 60-100 seconds | ~450-550 payloads |
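A rough way to reason about these durations is to multiply the payload count by the per-payload cost. The helper below is a back-of-the-envelope sketch, not a measurement: estimate_phase_seconds is a hypothetical name, the 50 ms delay is the fixed sleep between payloads, and avg_round_trip is an assumed average response time that will vary by target.

```python
def estimate_phase_seconds(payload_count, delay=0.05, avg_round_trip=0.05):
    """Estimate phase duration as payloads x (fixed delay + assumed round-trip time)."""
    return payload_count * (delay + avg_round_trip)
```

With 100 payloads and an assumed 50 ms round trip, the estimate is about 10 seconds, which sits at the low end of the range listed for the 100-payload phases; slower servers or receive timeouts push the figure higher.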
Rate Limiting Impact:
- Default rate: 10 requests/second
- 50ms delay between payloads (await asyncio.sleep(0.05))
- Adaptive rate limiting adjusts based on server response times
- Statistics available via rate_limiter.get_stats()
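For readers unfamiliar with the token bucket technique, the following sketch shows its basic mechanics. It is not the TokenBucketRateLimiter implementation: ToyTokenBucket is a hypothetical, synchronous, non-adaptive version with a try_acquire method, and the injectable clock parameter exists only to make the behavior easy to test.

```python
import time


class ToyTokenBucket:
    """Minimal token bucket: tokens refill continuously up to a fixed capacity."""

    def __init__(self, tokens_per_second, capacity=None, clock=time.monotonic):
        self.rate = float(tokens_per_second)
        self.capacity = float(capacity if capacity is not None else tokens_per_second)
        self.tokens = self.capacity  # start full so an initial burst is allowed
        self._clock = clock
        self._last = clock()

    def _refill(self):
        now = self._clock()
        elapsed = now - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self._last = now

    def try_acquire(self):
        """Consume one token if available; return whether the request may proceed."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A bucket allows short bursts up to its capacity while holding the long-run average at the configured rate. An adaptive variant would additionally raise or lower the rate based on observed server behavior.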
Optimization Strategies:
- Reduce payload counts in test methods (e.g., [:100] slice)
- Adjust max_rps parameter for faster/slower scans
- Disable browser verification to skip Playwright overhead
- Disable OAST if blind vulnerability detection not needed
Sources: wshawk/scanner_v2.py L150, wshawk/scanner_v2.py L545-L680
Summary
The WSHawkV2 scanner engine implements a sophisticated intelligence-driven testing workflow:
- Passive learning to understand application message structure and server technology
- Context-aware payload injection that adapts to detected formats
- Multi-layered verification combining pattern matching, context analysis, and browser/OAST verification
- Coordinated orchestration of seven vulnerability test types plus session security testing
- Professional reporting with CVSS scoring and actionable recommendations
The architecture prioritizes reducing false positives through rigorous verification while maintaining scan efficiency via adaptive rate limiting. The modular design enables extensibility through well-defined intelligence module interfaces.
For implementation details of the intelligence modules, see Intelligence Modules. For CLI usage patterns, see CLI Command Reference.