Scanner Engine (WSHawkV2)

Purpose and Scope

This document describes the WSHawkV2 class, which serves as the central orchestrator for WebSocket security scanning. It covers the scanner's initialization, the intelligence-driven learning phase, test execution workflow, and coordination with supporting modules.

For details about the intelligence modules that the scanner coordinates with, see Intelligence Modules. For information about the payload system used during testing, see Payload Management System. For descriptions of individual vulnerability tests, see Vulnerability Detection Modules.


Overview

The WSHawkV2 class (wshawk/scanner_v2.py L28-L680) is the primary entry point for executing security scans against WebSocket endpoints. It implements an intelligence-driven testing approach that adapts to the target application's message structure, server technology stack, and response patterns.

Core Responsibilities

| Responsibility | Implementation |
| --- | --- |
| Connection Management | Establishes and maintains WebSocket connections via the websockets library |
| Intelligence Gathering | Coordinates MessageIntelligence, ServerFingerprinter, and SessionStateMachine |
| Test Orchestration | Executes vulnerability tests in sequence with rate limiting |
| Verification | Integrates VulnerabilityVerifier, HeadlessBrowserXSSVerifier, and OASTProvider |
| Result Aggregation | Collects findings and generates HTML reports via EnhancedHTMLReporter |

Sources: wshawk/scanner_v2.py L28-L75


Class Structure and Initialization

WSHawkV2 Constructor

WSHawkV2(
    url: str,
    headers: Optional[Dict] = None,
    auth_sequence: Optional[str] = None,
    max_rps: int = 10
)

Parameters:

| Parameter | Type | Default | Purpose |
| --- | --- | --- | --- |
| url | str | Required | Target WebSocket URL (ws:// or wss://) |
| headers | Optional[Dict] | None | Additional HTTP headers for connection |
| auth_sequence | Optional[str] | None | Path to YAML file defining authentication sequence |
| max_rps | int | 10 | Maximum requests per second for rate limiting |

Initialization Workflow:

flowchart TD

Init["WSHawkV2.__init__()"]
Intel["Initialize Intelligence Modules"]
MI["self.message_intel = MessageIntelligence()"]
VV["self.verifier = VulnerabilityVerifier()"]
SF["self.fingerprinter = ServerFingerprinter()"]
SM["self.state_machine = SessionStateMachine()"]
RateLimit["Initialize Rate Limiting"]
RL["self.rate_limiter = TokenBucketRateLimiter(max_rps)"]
Optional["Initialize Optional Features"]
HB["self.use_headless_browser = True<br>self.headless_verifier = None"]
OAST["self.use_oast = True<br>self.oast_provider = None"]
Report["Initialize Reporting"]
Rep["self.reporter = EnhancedHTMLReporter()"]
Stats["Initialize Statistics"]
StatsInit["messages_sent = 0<br>messages_received = 0<br>start_time = None<br>end_time = None"]
Learning["Initialize Learning State"]
Learn["learning_complete = False<br>sample_messages = []"]
Auth["Load Auth Sequence (if provided)"]
AuthLoad["state_machine.load_sequence_from_yaml()"]

Init -.-> Intel
Intel -.-> MI
Intel -.-> VV
Intel -.-> SF
Intel -.-> SM
Init -.-> RateLimit
RateLimit -.-> RL
Init -.-> Optional
Optional -.-> HB
Optional -.-> OAST
Init -.-> Report
Report -.-> Rep
Init -.-> Stats
Stats -.-> StatsInit
Init -.-> Learning
Learning -.-> Learn
Init -.-> Auth
Auth -.-> AuthLoad

Key Initialization Components:

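The constructor sets up the following components (wshawk/scanner_v2.py L40-L50). The first six are instantiated immediately; the two optional verifiers are only flagged as enabled and are created on first use: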

  • MessageIntelligence: Analyzes message format (JSON, XML, binary) and identifies injectable fields
  • VulnerabilityVerifier: Performs context-aware verification of potential vulnerabilities
  • ServerFingerprinter: Identifies server technology stack (language, framework, database)
  • SessionStateMachine: Tracks connection state and authentication flow
  • TokenBucketRateLimiter: Implements adaptive rate limiting (tokens_per_second=max_rps, enable_adaptive=True)
  • EnhancedHTMLReporter: Generates professional HTML reports with CVSS scoring
  • HeadlessBrowserXSSVerifier and OASTProvider: Optional verification tools (lazily initialized)
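
To make the rate-limiting component more concrete, here is a minimal token-bucket sketch. It is not WSHawk's TokenBucketRateLimiter: the class name SimpleTokenBucket, its fields, and its counters are hypothetical, and the adaptive behaviour is left out entirely.

```python
import asyncio
import time


class SimpleTokenBucket:
    """Hypothetical stand-in, not the real TokenBucketRateLimiter."""

    def __init__(self, tokens_per_second: float, capacity: float = 1.0):
        self.rate = tokens_per_second
        self.capacity = capacity
        self.tokens = capacity
        self.last_refill = time.monotonic()
        self.total_requests = 0
        self.total_waits = 0

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last refill, up to capacity
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now

    async def acquire(self) -> None:
        self._refill()
        if self.tokens < 1.0:
            self.total_waits += 1
            # Sleep long enough for the missing fraction of a token to accrue
            await asyncio.sleep((1.0 - self.tokens) / self.rate)
            self._refill()
        self.tokens -= 1.0
        self.total_requests += 1


async def demo() -> SimpleTokenBucket:
    bucket = SimpleTokenBucket(tokens_per_second=50)
    for _ in range(5):
        await bucket.acquire()
    return bucket


bucket = asyncio.run(demo())
print(bucket.total_requests, bucket.total_waits)
```

With a capacity of one token, every request after the first has to wait for a refill, which is what spreads sends out to roughly the configured rate.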

Sources: wshawk/scanner_v2.py L28-L75


The Learning Phase

Purpose

The learning phase (wshawk/scanner_v2.py L87-L141) is a critical pre-testing step where the scanner passively observes WebSocket traffic to understand the application's communication patterns. This intelligence enables context-aware payload injection that adapts to the target's message structure.

Learning Phase Workflow

sequenceDiagram
  participant p1 as WSHawkV2
  participant p2 as WebSocket Connection
  participant p3 as MessageIntelligence
  participant p4 as ServerFingerprinter

  p1->>p1: "learning_phase(ws, duration=5)"
  p1->>p1: "start = time.monotonic()"
  loop "While time < duration"
    p1->>p2: "ws.recv() with timeout=1.0"
    p2-->>p1: "message"
    p1->>p1: "samples.append(message)"
    p1->>p1: "messages_received += 1"
    p1->>p4: "fingerprinter.add_response(message)"
    note over p1: "Log first 3 samples"
  end
  p1->>p3: "message_intel.learn_from_messages(samples)"
  p3-->>p1: "Format learned"
  p1->>p3: "get_format_info()"
  p3-->>p1: "format, injectable_fields"
  note over p1: "Logger.success(Detected format)"
  p1->>p4: "fingerprinter.fingerprint()"
  p4-->>p1: "language, framework, database"
  note over p1: "Logger.success(Server fingerprint)"
  p1->>p1: "learning_complete = True"

Learning Phase Implementation:

  1. Message Collection (wshawk/scanner_v2.py L94-L114):
     • Listens for messages with 1-second timeout per receive
     • Default collection duration: 5 seconds
     • Stores samples in self.sample_messages
     • Increments self.messages_received counter
  2. Format Detection (wshawk/scanner_v2.py L119-L129):
     • Invokes message_intel.learn_from_messages(samples)
     • Calls get_format_info() to extract detected format and injectable fields
     • Logs detected format (JSON, XML, Binary, Text)
     • Logs up to 5 injectable field names
  3. Server Fingerprinting (wshawk/scanner_v2.py L131-L136):
     • Accumulates responses via fingerprinter.add_response(message)
     • Calls fingerprinter.fingerprint() to identify technology stack
     • Logs detected language, framework, and database
     • Enables database-specific payload selection for SQL injection tests
  4. Completion (wshawk/scanner_v2.py L138-L141):
     • Sets self.learning_complete = True if samples collected
     • Falls back to basic payload injection if no messages received
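
The collection step can be sketched as a timed receive loop. This is an illustration under stated assumptions, not the scanner's code: FakeWebSocket and collect_samples are hypothetical names, and the short durations are chosen only so the demo finishes quickly.

```python
import asyncio
import time


class FakeWebSocket:
    """Hypothetical stand-in that yields queued messages, then stays silent."""

    def __init__(self, messages):
        self._queue = asyncio.Queue()
        for message in messages:
            self._queue.put_nowait(message)

    async def recv(self):
        return await self._queue.get()


async def collect_samples(ws, duration=5.0, recv_timeout=1.0):
    """Gather messages until `duration` seconds have elapsed."""
    samples = []
    start = time.monotonic()
    while time.monotonic() - start < duration:
        try:
            message = await asyncio.wait_for(ws.recv(), timeout=recv_timeout)
        except asyncio.TimeoutError:
            continue  # nothing arrived in this window; keep listening
        samples.append(message)
    return samples


async def demo():
    ws = FakeWebSocket(['{"action": "ping"}', '{"action": "chat", "text": "hi"}'])
    samples = await collect_samples(ws, duration=0.3, recv_timeout=0.1)
    learning_complete = bool(samples)  # mirrors the completion rule described above
    return samples, learning_complete


samples, learning_complete = asyncio.run(demo())
print(len(samples), learning_complete)
```

The per-receive timeout keeps a quiet server from blocking the loop, while the outer clock bounds the whole phase.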

Impact on Testing:

When learning_complete is True, vulnerability test methods inject payloads using message_intel.inject_payload_into_message() instead of sending raw payloads. This ensures payloads are properly embedded in the application's expected message structure.
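
As a rough illustration of structure-aware injection, the sketch below substitutes a payload into each top-level string field of a JSON template. The function name inject_into_json_fields and its fallback behaviour are assumptions for this example; the real inject_payload_into_message() may select fields differently.

```python
import json
from typing import List


def inject_into_json_fields(base_message: str, payload: str) -> List[str]:
    """Return one variant per top-level string field, with the payload substituted."""
    try:
        template = json.loads(base_message)
    except json.JSONDecodeError:
        return [payload]  # not JSON: fall back to the raw payload
    if not isinstance(template, dict):
        return [payload]
    variants = []
    for key, value in template.items():
        if isinstance(value, str):
            mutated = dict(template)  # shallow copy keeps the other fields intact
            mutated[key] = payload
            variants.append(json.dumps(mutated))
    return variants or [payload]


injected = inject_into_json_fields(
    '{"action": "search", "query": "shoes", "page": 1}',
    "' OR '1'='1",
)
for message in injected:
    print(message)
```

Each variant keeps the rest of the message valid, so the server is more likely to route it to the handler that actually processes the injected field.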

Sources: wshawk/scanner_v2.py L87-L141


Test Orchestration Architecture

Main Scan Execution Flow

The run_intelligent_scan() method (wshawk/scanner_v2.py L545-L680) orchestrates the complete testing workflow:

flowchart TD

Start["run_intelligent_scan()"]
Init["Initialize Scan"]
StartTime["self.start_time = datetime.now()"]
Banner["Logger.banner()"]
Connect["Connect to WebSocket"]
ConnectCall["ws = await self.connect()"]
CheckConn["Connection<br>Successful?"]
Learn["Learning Phase"]
LearnCall["await learning_phase(ws, duration=5)"]
Tests["Execute Vulnerability Tests"]
SQL["test_sql_injection_v2(ws)"]
XSS["test_xss_v2(ws)"]
CMD["test_command_injection_v2(ws)"]
Path["test_path_traversal_v2(ws)"]
XXE["test_xxe_v2(ws)"]
NoSQL["test_nosql_injection_v2(ws)"]
SSRF["test_ssrf_v2(ws)"]
CloseWS["Close WebSocket"]
CloseCall["await ws.close()"]
Session["Session Hijacking Tests"]
SessionCall["SessionHijackingTester.run_all_tests()"]
Cleanup["Cleanup Resources"]
CleanBrowser["headless_verifier.stop()"]
CleanOAST["oast_provider.stop()"]
Report["Generate Report"]
Summary["Calculate duration, stats"]
HTML["reporter.generate_report()"]
Save["Save to wshawk_report_TIMESTAMP.html"]
Return["Return Vulnerabilities"]

Start -.-> Init
Init -.-> StartTime
StartTime -.-> Banner
Banner -.-> Connect
Connect -.-> ConnectCall
ConnectCall -.-> CheckConn
CheckConn -.->|"Yes"| Learn
CheckConn -.->|"No"| Return
Learn -.-> LearnCall
LearnCall -.-> Tests
Tests -.-> SQL
SQL -.-> XSS
XSS -.-> CMD
CMD -.-> Path
Path -.-> XXE
XXE -.-> NoSQL
NoSQL -.-> SSRF
SSRF -.-> CloseWS
CloseWS -.-> CloseCall
CloseCall -.-> Session
Session -.-> SessionCall
SessionCall -.-> Cleanup
Cleanup -.-> CleanBrowser
CleanBrowser -.-> CleanOAST
CleanOAST -.-> Report
Report -.-> Summary
Summary -.-> HTML
HTML -.-> Save
Save -.-> Return

Execution Phases:

  1. Initialization (wshawk/scanner_v2.py L549-L553)
     • Records start_time for duration calculation
     • Displays banner via Logger.banner()
     • Logs target URL and scan parameters
  2. Connection (wshawk/scanner_v2.py L556-L561)
     • Calls await self.connect() to establish WebSocket connection
     • Returns None if connection fails
     • Updates state_machine to 'connected' state
  3. Learning Phase (wshawk/scanner_v2.py L564)
     • Executes 5-second passive observation
     • Builds intelligence about message format and server technology
  4. Vulnerability Testing (wshawk/scanner_v2.py L568-L587)
     • Executes seven test methods sequentially:
       • SQL Injection
       • Cross-Site Scripting (XSS)
       • Command Injection
       • Path Traversal
       • XML External Entity (XXE)
       • NoSQL Injection
       • Server-Side Request Forgery (SSRF)
     • Each test uses intelligence from learning phase
     • Print statements separate test output visually
  5. Session Security Testing (wshawk/scanner_v2.py L593-L616)
     • Closes main WebSocket connection
     • Instantiates SessionHijackingTester
     • Runs six session security tests
     • Appends session vulnerabilities to main results
  6. Resource Cleanup (wshawk/scanner_v2.py L618-L631)
     • Stops headless_verifier browser if initialized
     • Stops oast_provider server if running
     • Logs cleanup status
  7. Reporting (wshawk/scanner_v2.py L633-L678)
     • Calculates scan duration
     • Displays summary statistics (messages sent/received, vulnerability count)
     • Shows confidence breakdown (CRITICAL/HIGH/MEDIUM/LOW)
     • Generates HTML report with reporter.generate_report()
     • Saves to wshawk_report_YYYYMMDD_HHMMSS.html
     • Displays rate limiter statistics

Sources: wshawk/scanner_v2.py L545-L680


Vulnerability Test Method Pattern

Common Testing Pattern

All vulnerability test methods follow a consistent architectural pattern. Using SQL injection as an example:

flowchart TD

TestMethod["test_sql_injection_v2(ws)"]
Init["Initialize Test"]
LogStart["Logger.info('Testing SQL injection...')"]
LoadPayloads["payloads = WSPayloads.get_sql_injection()[:100]"]
Fingerprint["Server<br>Fingerprinted?"]
GetRecommended["recommended = fingerprinter.get_recommended_payloads()"]
UseDB["Use database-specific payloads"]
UseGeneric["Use generic payloads"]
BaseMsg["Get Base Message"]
GetBase["base_message = sample_messages[0]"]
Loop["For Each Payload"]
CheckLearning["learning_complete<br>and JSON format?"]
SmartInject["injected_messages = message_intel.inject_payload_into_message()"]
RawInject["injected_messages = [payload]"]
SendLoop["For Each Injected Message"]
Send["await ws.send(msg)"]
IncrementSent["messages_sent += 1"]
Receive["await asyncio.wait_for(ws.recv(), timeout=2.0)"]
IncrementRecv["messages_received += 1"]
AddFingerprint["fingerprinter.add_response(response)"]
Verify["Verify Vulnerability"]
VerifyCall["is_vuln, confidence, description = verifier.verify_sql_injection()"]
CheckVuln["is_vuln and<br>confidence != LOW?"]
LogVuln["Logger.vuln(description)"]
Append["vulnerabilities.append(finding)"]
Skip["Continue"]
RateLimit["await asyncio.sleep(0.05)"]
Return["Return Results"]

TestMethod -.-> Init
Init -.-> LogStart
LogStart -.-> LoadPayloads
LoadPayloads -.-> Fingerprint
Fingerprint -.->|"Yes"| GetRecommended
Fingerprint -.->|"No"| UseGeneric
GetRecommended -.-> UseDB
UseDB -.-> BaseMsg
UseGeneric -.-> BaseMsg
BaseMsg -.-> GetBase
GetBase -.-> Loop
Loop -.-> CheckLearning
CheckLearning -.->|"Yes"| SmartInject
CheckLearning -.->|"No"| RawInject
SmartInject -.-> SendLoop
RawInject -.-> SendLoop
SendLoop -.-> Send
Send -.-> IncrementSent
IncrementSent -.-> Receive
Receive -.-> IncrementRecv
IncrementRecv -.-> AddFingerprint
AddFingerprint -.-> Verify
Verify -.-> VerifyCall
VerifyCall -.-> CheckVuln
CheckVuln -.->|"Yes"| LogVuln
CheckVuln -.->|"No"| Skip
LogVuln -.-> Append
Append -.-> RateLimit
Skip -.-> RateLimit
RateLimit -.-> Loop
Loop -.-> Return

Pattern Components:

  1. Payload Selection (wshawk/scanner_v2.py L149-L158):
     • Load base payloads from WSPayloads class
     • Check fingerprinter.fingerprint() for technology detection
     • If database detected, prepend database-specific payloads via get_recommended_payloads()
  2. Intelligent Injection (wshawk/scanner_v2.py L161-L172):
     • Use sample_messages[0] as template for payload embedding
     • If learning_complete and format is JSON: call message_intel.inject_payload_into_message()
     • Otherwise: send raw payload strings
  3. Send/Receive Cycle (wshawk/scanner_v2.py L174-L182):
     • Send injected message via await ws.send(msg)
     • Increment messages_sent counter
     • Receive response with 2-second timeout
     • Increment messages_received counter
     • Add response to fingerprinter for continuous learning
  4. Verification (wshawk/scanner_v2.py L184-L188):
     • Call vulnerability-specific verifier method (e.g., verifier.verify_sql_injection())
     • Returns tuple: (is_vuln: bool, confidence: ConfidenceLevel, description: str)
     • Filter out LOW confidence findings to reduce false positives
  5. Result Recording (wshawk/scanner_v2.py L189-L202):
     • Log vulnerability with Logger.vuln()
     • Append structured finding to self.vulnerabilities list with fields:
       • type, severity, confidence, description, payload, response_snippet, recommendation
  6. Rate Limiting (wshawk/scanner_v2.py L207):
     • Sleep 0.05 seconds (50ms) between payloads
     • Prevents overwhelming target server
     • Complements TokenBucketRateLimiter for adaptive rate control
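
The verification and recording steps can be sketched as follows. The ConfidenceLevel enum here is a hypothetical mirror of the scanner's type, and SQL_ERROR_HINTS is an illustrative list only; the real VulnerabilityVerifier's detection rules are not shown in this document.

```python
from enum import Enum
from typing import Tuple


class ConfidenceLevel(Enum):  # hypothetical mirror of the scanner's enum
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Illustrative error fragments only, not the real verifier's signatures
SQL_ERROR_HINTS = ("sql syntax", "sqlite3.operationalerror", "unterminated quoted string")


def verify_sql_injection(response: str, payload: str) -> Tuple[bool, ConfidenceLevel, str]:
    """Return (is_vuln, confidence, description), the tuple shape described above."""
    lowered = response.lower()
    for hint in SQL_ERROR_HINTS:
        if hint in lowered:
            return True, ConfidenceLevel.HIGH, f"Database error leaked: {hint}"
    if payload in response:
        return True, ConfidenceLevel.LOW, "Payload reflected without a database error"
    return False, ConfidenceLevel.LOW, "No indicator found"


def record_if_significant(findings, response, payload):
    is_vuln, confidence, description = verify_sql_injection(response, payload)
    # LOW-confidence hits are dropped to limit false positives
    if is_vuln and confidence != ConfidenceLevel.LOW:
        findings.append({
            "type": "SQL Injection",
            "confidence": confidence.name,
            "description": description,
            "payload": payload,
            "response_snippet": response[:200],
        })


findings = []
record_if_significant(findings, "You have an error in your SQL syntax near ''", "'")
record_if_significant(findings, "echo: ' OR 1=1", "' OR 1=1")
print(findings)
```

Only the first response is recorded: the second merely reflects the payload, which this sketch rates as LOW and discards.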

Sources: wshawk/scanner_v2.py L143-L213


Advanced Verification Integration

Browser-Based XSS Verification

The scanner integrates optional browser-based verification for XSS payloads:

flowchart TD

XSSTest["test_xss_v2()"]
Pattern["Pattern-Based Detection"]
PatternVerify["verifier.verify_xss()"]
CheckConfidence["confidence == HIGH<br>and use_headless_browser?"]
BrowserInit["Initialize Browser (if needed)"]
InitBrowser["headless_verifier = HeadlessBrowserXSSVerifier()<br>await headless_verifier.start()"]
BrowserVerify["Browser Verification"]
VerifyExec["is_executed, evidence = await verify_xss_execution()"]
CheckExec["is_executed?"]
Upgrade["Upgrade to CRITICAL"]
UpgradeConf["confidence = ConfidenceLevel.CRITICAL<br>browser_verified = True"]
Record["Record Finding"]
Append["vulnerabilities.append()<br>with browser_verified flag"]

XSSTest -.-> Pattern
Pattern -.-> PatternVerify
PatternVerify -.-> CheckConfidence
CheckConfidence -.->|"Yes"| BrowserInit
CheckConfidence -.->|"No"| Record
BrowserInit -.-> InitBrowser
InitBrowser -.-> BrowserVerify
BrowserVerify -.-> VerifyExec
VerifyExec -.-> CheckExec
CheckExec -.->|"Yes"| Upgrade
CheckExec -.->|"No"| Record
Upgrade -.-> UpgradeConf
UpgradeConf -.-> Record
Record -.-> Append

Implementation Details (wshawk/scanner_v2.py L250-L271):

  • Only triggered for HIGH confidence XSS findings
  • Lazily initializes HeadlessBrowserXSSVerifier on first use
  • Calls verify_xss_execution(response, payload) to test actual JavaScript execution
  • If executed, upgrades confidence to CRITICAL and sets browser_verified flag
  • Screenshot evidence captured automatically by HeadlessBrowserXSSVerifier
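
The lazy initialization and confidence upgrade can be sketched like this. FakeBrowserVerifier and XSSChecker are hypothetical stand-ins, confidence levels are plain strings for brevity, and the "execution" check is a trivial substring test rather than a real browser run.

```python
import asyncio


class FakeBrowserVerifier:
    """Hypothetical stand-in for HeadlessBrowserXSSVerifier."""

    def __init__(self):
        self.started = False

    async def start(self):
        self.started = True

    async def verify_xss_execution(self, response, payload):
        # A real verifier would render the response in a browser
        executed = "<script>" in response
        return executed, ("alert dialog observed" if executed else "")


class XSSChecker:
    def __init__(self, use_headless_browser=True):
        self.use_headless_browser = use_headless_browser
        self.headless_verifier = None  # created on first use

    async def refine(self, confidence, response, payload):
        finding = {"confidence": confidence, "browser_verified": False}
        if confidence == "HIGH" and self.use_headless_browser:
            if self.headless_verifier is None:
                self.headless_verifier = FakeBrowserVerifier()
                await self.headless_verifier.start()
            is_executed, evidence = await self.headless_verifier.verify_xss_execution(response, payload)
            if is_executed:
                finding.update(confidence="CRITICAL", browser_verified=True, evidence=evidence)
        return finding


async def demo():
    checker = XSSChecker()
    payload = "<script>alert(1)</script>"
    upgraded = await checker.refine("HIGH", f"<div>{payload}</div>", payload)
    untouched = await checker.refine("MEDIUM", payload, payload)
    return upgraded, untouched


upgraded, untouched = asyncio.run(demo())
print(upgraded["confidence"], untouched["confidence"])
```

Deferring browser startup until the first HIGH finding means scans that never reach that point pay no browser cost.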

OAST Integration for Blind Vulnerabilities

The scanner uses Out-of-Band Application Security Testing (OAST) for XXE and SSRF:

OAST Workflow (wshawk/scanner_v2.py L409-L425):

  1. Check if use_oast is enabled
  2. Initialize OASTProvider if not already running
  3. Generate OAST-enabled payload with unique identifier: oast_provider.generate_payload('xxe', 'test{id}')
  4. Embed payload in message and send
  5. OAST server listens for DNS/HTTP callbacks
  6. If callback received, confirms blind vulnerability
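
The correlation idea behind steps 3 to 6 can be sketched as below. CallbackCorrelator and its methods are hypothetical and do not reflect OASTProvider's actual interface; no listener is started here, so match_callback() is simply fed the path a callback would have requested.

```python
import uuid


class CallbackCorrelator:
    """Hypothetical helper: ties out-of-band callbacks back to the payload that caused them."""

    def __init__(self, host="localhost", port=8888):
        self.base_url = f"http://{host}:{port}"
        self.pending = {}  # token -> test name

    def generate_xxe_payload(self, test_name):
        token = uuid.uuid4().hex  # unique identifier embedded in the callback URL
        self.pending[token] = test_name
        url = f"{self.base_url}/{token}"
        payload = (
            '<?xml version="1.0"?>'
            f'<!DOCTYPE r [<!ENTITY x SYSTEM "{url}">]>'
            "<r>&x;</r>"
        )
        return token, payload

    def match_callback(self, requested_path):
        """Return the test name for a callback path, or None if it is unknown."""
        return self.pending.get(requested_path.strip("/"))


correlator = CallbackCorrelator()
token, payload = correlator.generate_xxe_payload("xxe-test1")
print(payload)
print(correlator.match_callback("/" + token))
```

Because each payload carries its own token, a callback arriving later can be attributed to one specific injected message even when the target never echoes anything back over the WebSocket.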

The OAST provider runs on localhost:8888 by default (wshawk/scanner_v2.py L412).

Sources: wshawk/scanner_v2.py L215-L293, wshawk/scanner_v2.py L402-L456


Connection Management

WebSocket Connection Lifecycle

[State diagram. States: Disconnected, Connecting, Connected, Error, Learning, Testing, Closed. Transition labels: connect(), websockets.connect() success, Connection failed, learning_phase(), Learning complete, Execute test methods, ws.close(), Session complete, Return None, Scan complete.]

Connection Method (wshawk/scanner_v2.py L77-L85):

async def connect(self):
    """Establish WebSocket connection"""
    try:
        ws = await websockets.connect(self.url, additional_headers=self.headers)
        self.state_machine._update_state('connected')
        return ws
    except Exception as e:
        Logger.error(f"Connection failed: {e}")
        return None

  • Uses websockets.connect() with additional headers from self.headers
  • Updates SessionStateMachine to 'connected' state
  • Returns WebSocket object or None on failure
  • Connection failure causes run_intelligent_scan() to exit early

State Tracking:

The SessionStateMachine tracks connection state throughout the scan lifecycle, enabling authentication sequence replay and state-dependent testing logic.

Sources: wshawk/scanner_v2.py L77-L85


Statistics and Metrics

Runtime Statistics Collection

The scanner tracks several metrics during execution:

| Metric | Type | Purpose |
| --- | --- | --- |
| messages_sent | int | Total WebSocket messages sent to target |
| messages_received | int | Total WebSocket messages received from target |
| start_time | datetime | Scan start timestamp |
| end_time | datetime | Scan completion timestamp |
| vulnerabilities | List[Dict] | Aggregated vulnerability findings |
| sample_messages | List[str] | Messages collected during learning phase |
| traffic_logs | List | Detailed request/response pairs for reporting |

Duration Calculation (wshawk/scanner_v2.py L634-L636):

self.end_time = datetime.now()
duration = (self.end_time - self.start_time).total_seconds()

Rate Limiter Statistics (wshawk/scanner_v2.py L676-L678):

rate_stats = self.rate_limiter.get_stats()
Logger.info(f"Rate limiter: {rate_stats['total_requests']} requests, {rate_stats['total_waits']} waits")
Logger.info(f"  Current rate: {rate_stats['current_rate']}, Adaptive adjustments: {rate_stats['adaptive_adjustments']}")

These metrics are included in the final HTML report for audit and analysis purposes.

Sources: wshawk/scanner_v2.py L64-L75, wshawk/scanner_v2.py L634-L678


Integration with Intelligence Modules

Module Coordination Map

flowchart TD

Scanner["WSHawkV2"]
MI["MessageIntelligence<br>message_intel"]
SF["ServerFingerprinter<br>fingerprinter"]
VV["VulnerabilityVerifier<br>verifier"]
SM["SessionStateMachine<br>state_machine"]
RL["TokenBucketRateLimiter<br>rate_limiter"]
HB["HeadlessBrowserXSSVerifier<br>headless_verifier"]
OAST["OASTProvider<br>oast_provider"]
Rep["EnhancedHTMLReporter<br>reporter"]

Scanner -.->|"learn_from_messages()"| MI
Scanner -.->|"inject_payload_into_message()"| MI
Scanner -.->|"get_format_info()"| MI
Scanner -.->|"add_response()"| SF
Scanner -.->|"fingerprint()"| SF
Scanner -.->|"get_recommended_payloads()"| SF
Scanner -.->|"verify_sql_injection()"| VV
Scanner -.->|"verify_xss()"| VV
Scanner -.->|"verify_command_injection()"| VV
Scanner -.->|"verify_path_traversal()"| VV
Scanner -.->|"_update_state()"| SM
Scanner -.->|"load_sequence_from_yaml()"| SM
Scanner -.->|"acquire()"| RL
Scanner -.->|"get_stats()"| RL
Scanner -.->|"start()"| HB
Scanner -.->|"verify_xss_execution()"| HB
Scanner -.->|"stop()"| HB
Scanner -.->|"start()"| OAST
Scanner -.->|"generate_payload()"| OAST
Scanner -.->|"stop()"| OAST
Scanner -.->|"generate_report()"| Rep

subgraph Reporting ["Reporting"]
    Rep
end

subgraph subGraph2 ["Advanced Verification"]
    HB
    OAST
end

subgraph subGraph1 ["Rate Limiting"]
    RL
end

subgraph subGraph0 ["Intelligence Modules"]
    MI
    SF
    VV
    SM
end

Key Integration Points:

  1. MessageIntelligence (wshawk/scanner_v2.py L41):
     • Learning phase: learn_from_messages(samples)
     • Payload injection: inject_payload_into_message(base_message, payload)
     • Format query: get_format_info()
  2. ServerFingerprinter (wshawk/scanner_v2.py L43):
     • Response accumulation: add_response(message) (called continuously)
     • Fingerprint extraction: fingerprint() returns language/framework/database
     • Payload recommendations: get_recommended_payloads(fingerprint)
  3. VulnerabilityVerifier (wshawk/scanner_v2.py L42):
     • SQL verification: verify_sql_injection(response, payload)
     • XSS verification: verify_xss(response, payload)
     • Command injection: verify_command_injection(response, payload)
     • Path traversal: verify_path_traversal(response, payload)
     • Returns: (is_vuln: bool, confidence: ConfidenceLevel, description: str)
  4. TokenBucketRateLimiter (wshawk/scanner_v2.py L45-L49):
     • Configured with tokens_per_second=max_rps, enable_adaptive=True
     • Acquire token: await rate_limiter.acquire() (used in SSRF test)
     • Statistics: get_stats() returns request counts and adaptive adjustments
  5. EnhancedHTMLReporter (wshawk/scanner_v2.py L50):
     • Report generation: generate_report(vulnerabilities, scan_info, fingerprint_info)
     • Returns HTML string for file output

For detailed documentation of these modules, see Intelligence Modules.

Sources: wshawk/scanner_v2.py L14-L26, wshawk/scanner_v2.py L40-L50


Report Generation

HTML Report Structure

The scanner generates comprehensive HTML reports via the EnhancedHTMLReporter:

Report Generation Process (wshawk/scanner_v2.py L652-L673):

  1. Prepare Scan Information:

     scan_info = {
         'target': self.url,
         'duration': duration,
         'messages_sent': self.messages_sent,
         'messages_received': self.messages_received
     }

  2. Extract Fingerprint:

     fingerprint_info = self.fingerprinter.get_info()

  3. Generate HTML:

     report_html = self.reporter.generate_report(
         self.vulnerabilities,
         scan_info,
         fingerprint_info
     )

  4. Save to File:

     report_filename = f"wshawk_report_{datetime.now().strftime('%Y%m%d_%H%M%S')}.html"
     with open(report_filename, 'w') as f:
         f.write(report_html)

Report Contents:

  • Vulnerability findings with CVSS v3.1 scores
  • Confidence levels (CRITICAL/HIGH/MEDIUM/LOW)
  • Payload details and response snippets
  • Server fingerprint information
  • Scan statistics and duration
  • Rate limiter performance metrics
  • Screenshots for browser-verified XSS
  • Remediation recommendations

For details about report format and structure, see Report Format and Output.

Sources: wshawk/scanner_v2.py L652-L673


Configuration Options

Scanner Flags and Settings

The WSHawkV2 class exposes several configuration options:

| Option | Type | Default | Location | Purpose |
| --- | --- | --- | --- | --- |
| use_headless_browser | bool | True | scanner_v2.py L53 | Enable Playwright browser verification for XSS |
| use_oast | bool | True | scanner_v2.py L57 | Enable OAST for blind XXE/SSRF detection |
| max_rps | int | 10 | Constructor param | Maximum requests per second for rate limiting |

Example Configuration:

import asyncio
from wshawk.scanner_v2 import WSHawkV2

async def main():
    scanner = WSHawkV2("ws://target.com", max_rps=5)
    scanner.use_headless_browser = False  # Disable browser verification
    scanner.use_oast = True  # Keep OAST enabled (the default)
    await scanner.run_intelligent_scan()

asyncio.run(main())  # await is only valid inside a coroutine

CLI Integration:

The CLI commands (wshawk, wshawk-advanced) map command-line flags to these options:

  • --playwright flag sets use_headless_browser = True
  • --no-oast flag sets use_oast = False
  • --rate N flag sets max_rps = N
  • --full flag enables all features

Sources: wshawk/scanner_v2.py L52-L58, README.md L133-L141


Error Handling and Resilience

Exception Management

The scanner implements defensive error handling throughout:

Connection Failures (wshawk/scanner_v2.py L83-L85):

  • Logs error and returns None
  • Caller checks for None and exits gracefully

Message Receive Timeouts (wshawk/scanner_v2.py L101-L113, wshawk/scanner_v2.py L204-L205):

  • Uses asyncio.wait_for(ws.recv(), timeout=X) with 1-2 second timeouts
  • Catches asyncio.TimeoutError and continues to next payload
  • Prevents scan from hanging on unresponsive servers

Test Method Exceptions (wshawk/scanner_v2.py L209-L211):

  • Wraps test logic in try/except blocks
  • Logs errors but continues with remaining tests
  • Ensures one failing test doesn't abort entire scan
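
The isolation idea can be illustrated with a small runner. Note the difference from the source: the document says each test method wraps its own logic in try/except, whereas this sketch applies the same principle at the caller. The function run_tests_isolated and the three check coroutines are hypothetical.

```python
import asyncio


async def run_tests_isolated(tests):
    """Run coroutine functions in order; a failure in one does not stop the rest."""
    completed, errors = [], []
    for test in tests:
        try:
            await test()
            completed.append(test.__name__)
        except Exception as exc:
            # Record the failure and move on to the next check
            errors.append((test.__name__, str(exc)))
    return completed, errors


async def sql_check():
    return None


async def xss_check():
    raise ConnectionError("socket closed")


async def cmd_check():
    return None


completed, errors = asyncio.run(run_tests_isolated([sql_check, xss_check, cmd_check]))
print(completed)
print(errors)
```

The failing middle check is logged in errors while the checks before and after it still run to completion.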

Resource Cleanup (wshawk/scanner_v2.py L618-L631):

  • Wraps browser and OAST cleanup in try/except
  • Logs cleanup status
  • Ensures resources released even if errors occur

Learning Phase Fallback (wshawk/scanner_v2.py L115-L117, wshawk/scanner_v2.py L139-L141):

  • If no messages received during learning phase, logs warning
  • Falls back to basic payload injection without message structure intelligence
  • Allows scan to proceed even without optimal intelligence

Sources: wshawk/scanner_v2.py L77-L680


Usage Examples

Basic Scan

import asyncio
from wshawk.scanner_v2 import WSHawkV2

async def main():
    scanner = WSHawkV2("ws://target.com:8080")
    results = await scanner.run_intelligent_scan()
    print(f"Found {len(results)} vulnerabilities")

asyncio.run(main())

Scan with Authentication

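import asyncio
from wshawk.scanner_v2 import WSHawkV2

async def main():
    scanner = WSHawkV2(
        "wss://secure.example.com/ws",
        headers={"Authorization": "Bearer token123"},
        auth_sequence="auth_config.yaml",  # path to the YAML auth sequence
    )
    await scanner.run_intelligent_scan()

asyncio.run(main())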

Custom Rate Limiting

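import asyncio
from wshawk.scanner_v2 import WSHawkV2

async def main():
    scanner = WSHawkV2("ws://target.com", max_rps=5)  # cap at 5 requests/second
    scanner.use_headless_browser = True
    scanner.use_oast = True
    await scanner.run_intelligent_scan()

asyncio.run(main())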

Accessing Results Programmatically

import asyncio
from wshawk.scanner_v2 import WSHawkV2

async def main():
    scanner = WSHawkV2("ws://target.com")
    results = await scanner.run_intelligent_scan()

    # Filter by severity
    critical = [v for v in results if v['severity'] == 'CRITICAL']
    high = [v for v in results if v['severity'] == 'HIGH']
    print(f"{len(critical)} critical, {len(high)} high")

    # Access findings
    for vuln in results:
        print(f"{vuln['type']}: {vuln['description']}")
        print(f"Payload: {vuln['payload']}")
        print(f"CVSS: {vuln.get('cvss_score', 'N/A')}")

asyncio.run(main())

For programmatic integration examples, see Python API Usage.

Sources: README.md L196-L209


Performance Characteristics

Scan Duration and Throughput

Typical scan performance metrics:

| Phase | Duration | Throughput |
| --- | --- | --- |
| Learning Phase | 5 seconds | Passive observation |
| SQL Injection | 10-15 seconds | ~100 payloads |
| XSS Testing | 10-15 seconds | ~100 payloads |
| Command Injection | 10-15 seconds | ~100 payloads |
| Path Traversal | 5-10 seconds | ~50 payloads |
| XXE Testing | 3-5 seconds | ~30 payloads |
| NoSQL Injection | 5-10 seconds | ~50 payloads |
| SSRF Testing | 2-5 seconds | ~4-8 targets |
| Session Tests | 10-20 seconds | 6 test scenarios |
| Total | 60-100 seconds | ~450-550 payloads |

Rate Limiting Impact:

  • Default rate: 10 requests/second
  • 50ms delay between payloads (await asyncio.sleep(0.05))
  • Adaptive rate limiting adjusts based on server response times
  • Statistics available via rate_limiter.get_stats()
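
A back-of-the-envelope lower bound on phase duration follows from these two limits. The helper below is a sketch under two assumptions that this document does not confirm for every test: that each payload produces exactly one message, and that the max_rps cap applies to every send (rate_limiter.acquire() is only documented for the SSRF test). Server response time is ignored.

```python
def min_phase_seconds(payload_count, inter_payload_sleep=0.05, max_rps=10):
    """Lower bound on phase duration from the fixed sleep and the request-rate cap."""
    sleep_floor = payload_count * inter_payload_sleep  # time spent in asyncio.sleep
    rate_floor = payload_count / max_rps               # time implied by the rate cap
    return max(sleep_floor, rate_floor)


sql_floor = min_phase_seconds(100)            # 100 payloads at the defaults
fast_floor = min_phase_seconds(100, max_rps=50)
print(sql_floor, fast_floor)
```

Under these assumptions the rate cap dominates at the default settings, while raising max_rps leaves the 50ms sleep as the binding limit.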

Optimization Strategies:

  1. Reduce payload counts in test methods (e.g., [:100] slice)
  2. Adjust max_rps parameter for faster/slower scans
  3. Disable browser verification to skip Playwright overhead
  4. Disable OAST if blind vulnerability detection not needed

Sources: wshawk/scanner_v2.py L150, wshawk/scanner_v2.py L207, wshawk/scanner_v2.py L545-L680


Summary

The WSHawkV2 scanner engine implements a sophisticated intelligence-driven testing workflow:

  1. Passive learning to understand application message structure and server technology
  2. Context-aware payload injection that adapts to detected formats
  3. Multi-layered verification combining pattern matching, context analysis, and browser/OAST verification
  4. Coordinated orchestration of seven vulnerability test types plus session security testing
  5. Professional reporting with CVSS scoring and actionable recommendations

The architecture prioritizes reducing false positives through rigorous verification while maintaining scan efficiency via adaptive rate limiting. The modular design enables extensibility through well-defined intelligence module interfaces.

For implementation details of the intelligence modules, see Intelligence Modules. For CLI usage patterns, see CLI Command Reference.