Core Architecture
The following files were used as context for generating this wiki page:
- RELEASE_3.0.0.md
- RELEASE_SUMMARY.md
- docs/V3_COMPLETE_GUIDE.md
- requirements.txt
- wshawk/advanced_cli.py
- wshawk/scanner_v2.py
Purpose and Scope
This document provides a technical deep dive into WSHawk v3.0.0's internal architecture, covering the design patterns, component interactions, and implementation details of the core scanning engine and its supporting subsystems. This page focuses on the structural organization and data flow within the codebase.
For operational usage of the scanner, see Getting Started. For details on specific vulnerability detection techniques, see Offensive Testing. For deployment patterns, see Distribution and Deployment. For extending the system, see Plugin System.
Sources: RELEASE_SUMMARY.md:1-60, docs/V3_COMPLETE_GUIDE.md:1-444
Architectural Planes
WSHawk v3.0.0 is organized into four distinct architectural planes, each responsible for a specific operational concern. This separation allows independent evolution of each subsystem while maintaining clear interfaces between them.
| Plane | Primary Responsibility | Key Components | Failure Mode |
|-------|----------------------|----------------|--------------|
| Red Team Execution | Payload injection, vulnerability detection | WSHawkV2, MessageAnalyzer, VulnerabilityVerifier, PayloadEvolver | Scan produces no findings |
| Infrastructure Persistence | Data durability, scan history | ScanDatabase (SQLite), WAL mode, report exporters | Data loss on crash |
| Resilience Control | Network stability, error handling | ResilientSession, circuit breakers, rate limiters | Service degradation under load |
| Integration Collaboration | External platform communication | Jira, DefectDojo, webhook notifiers | Findings not delivered to external platforms (failure is logged; scan still completes) |
The planes are intentionally decoupled: a failure in the Integration Collaboration Plane does not prevent the Red Team Execution Plane from completing its scan. Similarly, the Resilience Control Plane wraps all network I/O regardless of which plane initiated the request.
Sources: RELEASE_SUMMARY.md:7-20, docs/V3_COMPLETE_GUIDE.md:111-131
WSHawkV2 Scanner Engine
Component Architecture
The WSHawkV2 class serves as the central orchestrator for all scanning operations. It initializes and coordinates analysis modules, manages the connection lifecycle, and aggregates results.
graph TB
subgraph "WSHawkV2 Core (scanner_v2.py:35-101)"
Scanner["WSHawkV2<br/>__init__()"]
Connect["connect()<br/>line 102-110"]
Learning["learning_phase()<br/>line 112-175"]
HeuristicScan["run_heuristic_scan()<br/>line 593-850"]
end
subgraph "Analysis Modules"
MA["MessageAnalyzer<br/>message_intelligence.py"]
VV["VulnerabilityVerifier<br/>vulnerability_verifier.py"]
SF["ServerFingerprinter<br/>server_fingerprint.py"]
SM["SessionStateMachine<br/>state_machine.py"]
RL["TokenBucketRateLimiter<br/>rate_limiter.py"]
end
subgraph "Smart Payload System"
CAG["ContextAwareGenerator<br/>smart_payloads/context_generator.py"]
PE["PayloadEvolver<br/>smart_payloads/payload_evolver.py"]
FL["FeedbackLoop<br/>smart_payloads/feedback_loop.py"]
end
subgraph "Static Payloads"
WSP["WSPayloads<br/>__main__.py"]
Files["22,000+ vectors<br/>payloads/*.txt"]
end
subgraph "Verification"
HB["HeadlessBrowserXSSVerifier<br/>headless_xss_verifier.py"]
OAST["OASTProvider<br/>oast_provider.py"]
SH["SessionHijackingTester<br/>session_hijacking_tester.py"]
end
Scanner --> MA
Scanner --> VV
Scanner --> SF
Scanner --> SM
Scanner --> RL
Scanner --> CAG
Scanner --> PE
Scanner --> FL
Scanner --> WSP
WSP --> Files
Scanner --> HB
Scanner --> OAST
Scanner --> SH
HeuristicScan --> Connect
HeuristicScan --> Learning
Learning --> MA
Learning --> SF
Initialization Sequence: The constructor wshawk/scanner_v2.py:40-100 instantiates all analysis modules, configures rate limiting via TokenBucketRateLimiter (line 62-66), and optionally enables smart payload generation (line 72-75). Configuration is loaded from WSHawkConfig if not explicitly provided (line 49-53).
Sources: wshawk/scanner_v2.py:35-101, docs/V3_COMPLETE_GUIDE.md:115-120
Scan Lifecycle
The core scanning workflow is implemented in the run_heuristic_scan() method wshawk/scanner_v2.py:593-850. The lifecycle consists of five distinct phases:
sequenceDiagram
participant Client as WSHawkV2
participant WS as WebSocket
participant MA as MessageAnalyzer
participant VV as VulnerabilityVerifier
participant PE as PayloadEvolver
participant SH as SessionHijackingTester
Client->>WS: connect() [line 102-110]
WS-->>Client: Connection established
Note over Client,MA: Phase 1: Learning (5s)
Client->>WS: listen for messages
WS-->>Client: sample messages
Client->>MA: learn_from_messages()
Client->>MA: get_format_info()
MA-->>Client: format: JSON/XML/etc
Note over Client,VV: Phase 2: Heuristic Testing
loop For each vulnerability type
Client->>Client: inject_payload_into_message()
Client->>WS: send(payload)
WS-->>Client: response
Client->>VV: verify_sql_injection/verify_xss()
VV-->>Client: confidence: HIGH/MEDIUM/LOW
end
Note over Client,PE: Phase 3: Smart Evolution (if enabled)
Client->>PE: evolve(count=30)
PE-->>Client: mutated payloads
loop For each evolved payload
Client->>WS: send(evolved_payload)
WS-->>Client: response
Client->>VV: verify_*()
Client->>PE: update_fitness()
end
Note over Client,SH: Phase 4: Session Security
Client->>SH: run_all_tests()
SH-->>Client: session vulnerabilities
Note over Client: Phase 5: Reporting
Client->>Client: generate_report()
Client->>Client: export_formats()
Client->>Client: trigger_integrations()
Phase 1 - Learning (line 112-175): The scanner establishes a baseline by collecting 5-10 seconds of server messages. The MessageAnalyzer.learn_from_messages() detects the message format (JSON, XML, protobuf) and identifies injectable fields. The ServerFingerprinter analyzes response patterns to identify the technology stack.
Phase 2 - Heuristic Testing (line 616-635): For each vulnerability type (SQLi, XSS, Command Injection, etc.), the scanner injects payloads into detected message fields. The VulnerabilityVerifier examines responses using multiple heuristics to assign confidence levels. High-confidence XSS findings trigger Playwright browser verification (line 294-314).
Phase 3 - Smart Evolution (line 637-703): If use_smart_payloads is enabled, the PayloadEvolver generates mutated variants of successful payloads using genetic algorithms. The FeedbackLoop analyzes response characteristics (timing, size, patterns) to guide evolution.
Phase 4 - Session Security (line 709-732): The SessionHijackingTester validates authentication controls through six specialized tests.
Phase 5 - Reporting (line 750-850): Results are aggregated, CVSS scores calculated, and reports generated in multiple formats (HTML, JSON, CSV, SARIF). Configured integrations (Jira, DefectDojo, webhooks) are triggered.
Sources: wshawk/scanner_v2.py:593-850, docs/V3_COMPLETE_GUIDE.md:176-201
Analysis and Verification Subsystems
MessageAnalyzer
The MessageAnalyzer module wshawk/message_intelligence.py performs structural analysis of WebSocket messages to enable context-aware payload injection.
Format Detection: The learn_from_messages() method analyzes a corpus of sample messages to detect:
- JSON structures with nested objects and arrays
- XML documents with DTD/schema references
- Protobuf binary serialization patterns
- Plain text or custom formats
Field Extraction: Once the format is identified, get_format_info() returns a list of injectable_fields that represent potential injection points. For JSON messages, this includes all string-valued keys. For XML, this includes text nodes and attribute values.
Intelligent Injection: The inject_payload_into_message() method wshawk/scanner_v2.py:201-204 takes a base message and a payload, returning a list of mutated messages with the payload injected into each identified field. This allows a single payload to be tested across multiple injection points without manual intervention.
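The per-field injection described above can be sketched for the JSON case as follows. This is a minimal illustration, not the code in scanner_v2.py: the helper name inject_into_string_fields and its traversal strategy are assumptions, and only string-valued fields are treated as injection points.

```python
import copy
import json
from typing import Any, List


def inject_into_string_fields(base_message: str, payload: str) -> List[str]:
    """Return one mutated JSON message per string-valued field (hypothetical helper)."""
    original = json.loads(base_message)
    mutated: List[str] = []

    def walk(node: Any, path: list) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, path + [key])
        elif isinstance(node, list):
            for index, value in enumerate(node):
                walk(value, path + [index])
        elif isinstance(node, str):
            if not path:
                # The whole message is a bare JSON string.
                mutated.append(json.dumps(payload))
                return
            # Copy the message and overwrite exactly one field with the payload.
            clone = copy.deepcopy(original)
            target = clone
            for step in path[:-1]:
                target = target[step]
            target[path[-1]] = payload
            mutated.append(json.dumps(clone))

    walk(original, [])
    return mutated
```

Each returned message differs from the base message in exactly one field, so a response anomaly can be attributed to a single injection point.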
Sources: wshawk/scanner_v2.py:145-165
VulnerabilityVerifier
The VulnerabilityVerifier module wshawk/vulnerability_verifier.py implements multi-heuristic detection for 11+ vulnerability types. Each verification method returns a tuple of (is_vulnerable: bool, confidence: ConfidenceLevel, description: str).
Confidence Scoring: The system uses an enum-based confidence system:
- CRITICAL: Browser-verified execution (XSS) or OAST callback received
- HIGH: Multiple strong indicators (e.g., SQL error + timing anomaly)
- MEDIUM: Single strong indicator or multiple weak indicators
- LOW: Weak indicators that may be false positives
SQL Injection Detection (line 217-245): The verify_sql_injection() method checks for:
- Database error messages (syntax errors, constraint violations)
- Timing anomalies for blind SQLi payloads
- Successful UNION-based injection patterns
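A rough sketch of how the error and timing heuristics could combine into the (is_vulnerable, confidence, description) tuple is shown below. The function name verify_sql_injection_sketch, the pattern list, and the 80% timing threshold are illustrative assumptions, not WSHawk's actual rules; UNION-based detection is omitted for brevity.

```python
import re
from enum import Enum
from typing import Tuple


class ConfidenceLevel(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Illustrative error signatures only; not the scanner's real pattern list.
SQL_ERROR_PATTERNS = [
    re.compile(r"you have an error in your sql syntax", re.I),
    re.compile(r"unterminated quoted string", re.I),
    re.compile(r"constraint violation", re.I),
]


def verify_sql_injection_sketch(
    response: str,
    elapsed_s: float,
    baseline_s: float,
    delay_payload_s: float = 0.0,
) -> Tuple[bool, ConfidenceLevel, str]:
    """Combine a database-error check with a blind-SQLi timing check."""
    has_error = any(p.search(response) for p in SQL_ERROR_PATTERNS)
    # A payload that requests N seconds of delay should add roughly N seconds.
    timing_hit = delay_payload_s > 0 and (elapsed_s - baseline_s) >= 0.8 * delay_payload_s

    if has_error and timing_hit:
        return True, ConfidenceLevel.HIGH, "SQL error plus timing anomaly"
    if has_error:
        return True, ConfidenceLevel.MEDIUM, "database error message in response"
    if timing_hit:
        return True, ConfidenceLevel.MEDIUM, "response delayed in line with payload"
    return False, ConfidenceLevel.LOW, "no SQL injection indicators"
```

Two independent strong indicators raise the result to HIGH, while a single indicator stays at MEDIUM, matching the confidence definitions given earlier.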
XSS Detection (line 286-331): The verify_xss() method:
- Checks if the payload is reflected in the response
- Analyzes the reflection context (HTML, JavaScript, attribute)
- Determines if special characters are properly escaped
- Triggers Playwright verification for high-confidence candidates (line 294-314)
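The reflection and escaping checks can be approximated with a few string comparisons. The sketch below is an assumption about the general approach, not the verify_xss() implementation; classify_reflection is a hypothetical name, and real context analysis (HTML vs. JavaScript vs. attribute) needs a parser rather than substring tests.

```python
import html


def classify_reflection(payload: str, response: str) -> str:
    """Classify how a payload is echoed back (hypothetical helper)."""
    if payload in response:
        # Raw reflection: special characters such as < and > survived intact.
        # Note: a payload without special characters also lands here.
        return "reflected_unescaped"
    if html.escape(payload) in response:
        # The server echoed the payload but HTML-encoded it first.
        return "reflected_escaped"
    return "not_reflected"
```

Only the "reflected_unescaped" case would be a candidate for the more expensive browser-based verification step.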
Sources: wshawk/scanner_v2.py:177-256, wshawk/scanner_v2.py:258-341
ServerFingerprinter
The ServerFingerprinter wshawk/server_fingerprint.py identifies the server technology stack by analyzing response patterns, error messages, and timing characteristics. This enables the scanner to prioritize relevant payloads.
Detection Categories:
- Language: Python, Node.js, Java, PHP, Ruby, Go
- Framework: Express, Flask, Django, Spring, Socket.io
- Database: MySQL, PostgreSQL, MongoDB, Redis
Payload Recommendations: The get_recommended_payloads() method returns technology-specific attack vectors. For example, if Node.js/MongoDB is detected, it prioritizes NoSQL injection over traditional SQLi wshawk/scanner_v2.py:189-192.
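As a small illustration of fingerprint-driven prioritization, the following sketch reorders attack categories based on the detected database. The function recommend_categories, the fingerprint dictionary shape, and the category names are all hypothetical; the real get_recommended_payloads() may return payload strings rather than category names.

```python
from typing import Dict, List


def recommend_categories(fingerprint: Dict[str, str]) -> List[str]:
    """Order attack categories by relevance to the detected stack (hypothetical)."""
    database = fingerprint.get("database", "").lower()
    common = ["xss", "command_injection"]
    if database == "mongodb":
        # Document stores are tested for NoSQL injection before classic SQLi.
        return ["nosql_injection"] + common + ["sql_injection"]
    return ["sql_injection"] + common + ["nosql_injection"]
```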
Sources: wshawk/scanner_v2.py:166-171, wshawk/scanner_v2.py:353-358
Smart Payload Evolution System
The Smart Payload Evolution (SPE) system represents a departure from traditional static fuzzing by introducing adaptive, context-aware payload generation.
Component Interaction
graph LR
subgraph "Input"
Samples["Sample Messages<br/>learning_phase()"]
Static["Static Payloads<br/>WSPayloads.get_*()"]
end
subgraph "Context Learning"
CAG["ContextAwareGenerator<br/>learn_from_message()"]
Context["context dict<br/>format, fields, types"]
end
subgraph "Feedback System"
FL["FeedbackLoop<br/>analyze_response()"]
Baseline["baseline_metrics"]
Signals["ResponseSignal<br/>ERROR/ANOMALY/NORMAL"]
end
subgraph "Evolution Engine"
PE["PayloadEvolver<br/>population[]<br/>fitness_scores{}"]
Mutate["mutate()<br/>crossover()<br/>inject()"]
end
subgraph "Output"
Evolved["Evolved Payloads<br/>evolve(count=30)"]
CtxGen["Context Payloads<br/>generate_payloads()"]
end
Samples --> CAG
CAG --> Context
Static --> PE
Context --> CAG
CAG --> CtxGen
PE --> Mutate
Mutate --> Evolved
Evolved --> Scanner["Scanner<br/>test & verify"]
Scanner --> FL
FL --> Signals
Signals --> PE
PE --> |"update_fitness()"|PE
Sources: wshawk/scanner_v2.py:71-75, wshawk/scanner_v2.py:637-703
ContextAwareGenerator
The ContextAwareGenerator wshawk/smart_payloads/context_generator.py analyzes the target application's message structure to generate type-appropriate payloads.
Learning Phase: During learn_from_message() wshawk/scanner_v2.py:158-163, the generator:
- Parses message structure to extract field names and types
- Builds a schema of expected data formats
- Identifies fields that accept user-controlled input
Payload Generation: The generate_payloads(category, count) method wshawk/scanner_v2.py:645 creates payloads that:
- Match the detected message format (valid JSON/XML structure)
- Inject attack vectors into identified fields
- Maintain syntactic validity to bypass basic input validation
Sources: wshawk/scanner_v2.py:158-163, wshawk/scanner_v2.py:645-646
PayloadEvolver
The PayloadEvolver wshawk/smart_payloads/payload_evolver.py implements genetic algorithms to breed successful attack payloads. The system maintains a population of candidate payloads, each with an associated fitness score.
Initialization: The evolver is seeded with successful payloads discovered during the heuristic scan via seed() wshawk/scanner_v2.py:233.
Evolution Process: The evolve(count) method wshawk/scanner_v2.py:640 generates new payloads through:
- Selection: High-fitness payloads are selected for breeding
- Crossover: Two parent payloads are combined to create offspring
- Mutation: Random character substitutions, encoding changes, and structure modifications
- Injection: Novel injection contexts derived from message structure analysis
Fitness Feedback: When evolved payloads trigger vulnerabilities, update_fitness() wshawk/scanner_v2.py:234, 319, 683 increases their breeding priority, causing similar payloads to proliferate in future generations.
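The selection, crossover, mutation, and fitness-feedback steps can be sketched with a deliberately small evolver. TinyEvolver is a hypothetical stand-in: the method names mirror those mentioned above, but the mutation operators, weighting scheme, and data structures are assumptions rather than the contents of payload_evolver.py.

```python
import random
from typing import Dict, List


class TinyEvolver:
    """Hypothetical, simplified stand-in for a genetic payload evolver."""

    def __init__(self, seed: int = 0) -> None:
        self.rng = random.Random(seed)
        self.fitness: Dict[str, float] = {}

    def seed(self, payload: str, fitness: float = 1.0) -> None:
        """Add a known-good payload to the population."""
        self.fitness[payload] = fitness

    def update_fitness(self, payload: str, score: float) -> None:
        """Reward a payload so it is selected more often as a parent."""
        self.fitness[payload] = self.fitness.get(payload, 0.0) + score

    def _select(self) -> str:
        # Fitness-proportional (roulette wheel) selection.
        payloads = list(self.fitness)
        weights = [max(self.fitness[p], 0.01) for p in payloads]
        return self.rng.choices(payloads, weights=weights, k=1)[0]

    def _crossover(self, a: str, b: str) -> str:
        # Head of one parent joined to the tail of the other.
        return a[: self.rng.randint(0, len(a))] + b[self.rng.randint(0, len(b)):]

    def _mutate(self, payload: str) -> str:
        choice = self.rng.randrange(3)
        if choice == 0:
            return payload.swapcase()            # case variation
        if choice == 1:
            return payload.replace(" ", "/**/")  # SQL comment in place of spaces
        return payload.replace("<", "%3C")       # URL-encode angle brackets

    def evolve(self, count: int) -> List[str]:
        if not self.fitness:
            return []
        return [
            self._mutate(self._crossover(self._select(), self._select()))
            for _ in range(count)
        ]
```

Because selection is weighted by fitness, payloads rewarded through update_fitness() contribute more material to later generations.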
Sources: wshawk/scanner_v2.py:637-703
FeedbackLoop
The FeedbackLoop wshawk/smart_payloads/feedback_loop.py provides real-time classification of server responses to guide the evolution engine.
Baseline Establishment: During the learning phase, establish_baseline() wshawk/scanner_v2.py:161 records normal response characteristics (timing, size, structure).
Response Analysis: The analyze_response() method wshawk/scanner_v2.py:223-225, 670-672 compares each response against the baseline to detect:
- ERROR signals: Stack traces, error codes, unexpected status changes
- ANOMALY signals: Significant timing deviations, response size changes
- NORMAL signals: Responses within expected parameters
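A baseline-relative classifier along these lines might look like the sketch below. The class name BaselineClassifier, the keyword list, and the thresholds (3x mean latency plus 0.5 s, 50% size deviation) are illustrative assumptions; only the signal names and method names come from the description above.

```python
import statistics
from enum import Enum
from typing import List


class ResponseSignal(Enum):
    ERROR = "error"
    ANOMALY = "anomaly"
    NORMAL = "normal"


class BaselineClassifier:
    """Hypothetical sketch of baseline-relative response classification."""

    def __init__(self) -> None:
        self.times: List[float] = []
        self.sizes: List[int] = []

    def establish_baseline(self, response: str, elapsed_s: float) -> None:
        """Record one normal response observed during the learning phase."""
        self.times.append(elapsed_s)
        self.sizes.append(len(response))

    def analyze_response(self, response: str, elapsed_s: float) -> ResponseSignal:
        lowered = response.lower()
        if "traceback" in lowered or "exception" in lowered:
            return ResponseSignal.ERROR
        if not self.times:
            return ResponseSignal.NORMAL  # nothing to compare against yet
        mean_time = statistics.mean(self.times)
        mean_size = statistics.mean(self.sizes)
        slow = elapsed_s > 3 * mean_time + 0.5
        resized = abs(len(response) - mean_size) > max(50, 0.5 * mean_size)
        return ResponseSignal.ANOMALY if slow or resized else ResponseSignal.NORMAL
```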
Priority Categories: The get_priority_categories() method wshawk/scanner_v2.py:643 returns attack categories ranked by their success rate, allowing the scanner to focus on the most promising attack vectors.
Sources: wshawk/scanner_v2.py:161, wshawk/scanner_v2.py:223-225, wshawk/scanner_v2.py:643-646
Resilience Control Layer
The Resilience Control Plane wraps all network operations to handle unstable targets, rate limiting, and service failures gracefully.
ResilientSession Architecture
graph TB
subgraph "Application Layer"
Scanner["WSHawkV2<br/>Network operations"]
Integrations["Jira/DefectDojo/Webhooks<br/>API calls"]
OAST["OASTProvider<br/>HTTP requests"]
end
subgraph "ResilientSession Wrapper"
RS["ResilientSession<br/>execute_with_retry()"]
Classify["Error Classifier<br/>Transient vs Permanent"]
Backoff["ExponentialBackoff<br/>wait = base * 2^attempt + jitter"]
CB["CircuitBreaker<br/>CLOSED/OPEN/HALF_OPEN"]
end
subgraph "Rate Limiting"
TBR["TokenBucketRateLimiter<br/>acquire()"]
Adaptive["Adaptive Rate Control<br/>Server health monitoring"]
end
subgraph "Network"
HTTP["aiohttp.ClientSession"]
WS["websockets.connect()"]
end
Scanner --> RS
Integrations --> RS
OAST --> RS
RS --> Classify
Classify --> |"Transient error"|Backoff
Classify --> |"Permanent error"|Fail["Fail fast"]
Backoff --> |"Retry"|RS
RS --> CB
CB --> |"threshold exceeded"|Open["Block requests<br/>60s cooldown"]
CB --> |"recovery test"|HalfOpen["Allow 1 request"]
Scanner --> TBR
TBR --> Adaptive
Adaptive --> |"429 response"|SlowDown["Reduce rate"]
RS --> HTTP
RS --> WS
Sources: RELEASE_SUMMARY.md:9-14, docs/V3_COMPLETE_GUIDE.md:134-173
Error Classification
The ResilientSession distinguishes between transient errors (retry) and permanent errors (fail fast):
Transient Errors:
- HTTP 429 (Too Many Requests)
- HTTP 503 (Service Unavailable)
- Network timeouts
- Connection refused (server may be restarting)
Permanent Errors:
- HTTP 401/403 (Authentication/Authorization failures)
- HTTP 404 (Resource not found)
- Invalid SSL certificates
- Protocol errors
Implementation Pattern (from V3_COMPLETE_GUIDE.md:139-160):
    if self.is_transient(error):
        attempt += 1
        delay = self.calculate_backoff(attempt)
        await asyncio.sleep(delay)
    else:
        self.circuit_breaker.record_failure()
        raise error
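For a self-contained view of the same pattern, the following sketch wraps an arbitrary coroutine in a retry loop that classifies errors first. HTTPStatusError, the max_attempts parameter, and the delay values are assumptions made for illustration; the circuit breaker interaction is left out to keep the example short.

```python
import asyncio
import random

TRANSIENT_STATUSES = {429, 503}


class HTTPStatusError(Exception):
    """Hypothetical exception carrying an HTTP status code."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def is_transient(error: Exception) -> bool:
    """Return True for errors that are worth retrying."""
    if isinstance(error, HTTPStatusError):
        return error.status in TRANSIENT_STATUSES
    return isinstance(error, (asyncio.TimeoutError, ConnectionRefusedError))


async def execute_with_retry(operation, max_attempts: int = 4, base_delay: float = 0.01):
    """Await operation(), retrying transient failures with backoff and jitter."""
    attempt = 0
    while True:
        try:
            return await operation()
        except Exception as error:
            attempt += 1
            # Permanent errors, or an exhausted retry budget, fail fast.
            if not is_transient(error) or attempt >= max_attempts:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```

A 404 or an SSL failure propagates on the first attempt, while a 503 is retried until the attempt budget runs out.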
Sources: docs/V3_COMPLETE_GUIDE.md:139-160
Circuit Breaker State Machine
The circuit breaker protects downstream services from cascading failures. It implements a three-state machine:
| State | Behavior | Transition |
|-------|----------|------------|
| CLOSED | All requests pass through normally | After N failures → OPEN |
| OPEN | All requests fail immediately | After 60s cooldown → HALF_OPEN |
| HALF_OPEN | Single test request allowed | Success → CLOSED, Failure → OPEN |
State Transitions: When the failure threshold is exceeded (e.g., 5 consecutive failures), the circuit opens to prevent further damage. After a cooldown period, a single canary request is allowed. If it succeeds, normal operation resumes. If it fails, the circuit re-opens for another cooldown cycle.
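The three-state machine can be expressed compactly. The sketch below is a generic circuit breaker written for illustration; the class layout, the injectable clock, and the method names other than record_failure() are assumptions and may not match the ResilientSession internals.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitBreaker:
    """Generic three-state circuit breaker (illustrative sketch)."""

    def __init__(self, failure_threshold: int = 5, cooldown_s: float = 60.0,
                 clock=time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.state = State.CLOSED
        self.failures = 0
        self.opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state is State.OPEN:
            if self.clock() - self.opened_at >= self.cooldown_s:
                self.state = State.HALF_OPEN  # let one canary request through
                return True
            return False
        if self.state is State.HALF_OPEN:
            return False  # the canary is already in flight
        return True

    def record_success(self) -> None:
        self.failures = 0
        self.state = State.CLOSED

    def record_failure(self) -> None:
        self.failures += 1
        # A failed canary, or too many consecutive failures, opens the circuit.
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = self.clock()
```

Passing the clock in as a parameter makes the cooldown testable without sleeping for 60 seconds.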
Sources: RELEASE_SUMMARY.md:13, docs/V3_COMPLETE_GUIDE.md:166-169
Exponential Backoff
The backoff algorithm calculates wait times using the formula:
wait_time = min(max_delay, base_delay * 2^attempt) + random_jitter
Jitter: Random jitter (typically 0-1 seconds) prevents the "thundering herd" problem where multiple clients retry simultaneously after a failure, overwhelming the recovering server.
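The formula translates directly into a small function. The default values below (1 s base, 60 s cap, up to 1 s of jitter) are assumptions chosen to match the typical figures mentioned here, not confirmed WSHawk defaults.

```python
import random


def calculate_backoff(attempt: int, base_delay: float = 1.0, max_delay: float = 60.0,
                      max_jitter: float = 1.0, rng=random) -> float:
    """Return min(max_delay, base_delay * 2^attempt) plus uniform random jitter."""
    capped = min(max_delay, base_delay * (2 ** attempt))
    return capped + rng.uniform(0.0, max_jitter)
```

With these defaults, attempts 0 through 5 wait roughly 1, 2, 4, 8, 16, and 32 seconds, and every later attempt is capped at 60 seconds, each plus up to one second of jitter.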
Adaptive Rate Limiting: The TokenBucketRateLimiter wshawk/scanner_v2.py:62-66 dynamically adjusts the request rate based on server health. When it detects 429 responses or timeouts, it reduces the token generation rate.
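A token bucket with a simple rate-reduction hook illustrates the idea. This is a synchronous sketch under stated assumptions: AdaptiveTokenBucket, try_acquire(), and on_throttled() are hypothetical names, and the halving factor is arbitrary; the real TokenBucketRateLimiter exposes acquire() and may adapt differently.

```python
import time


class AdaptiveTokenBucket:
    """Token bucket whose refill rate drops when the server signals overload."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.clock = clock
        self.last = clock()

    def _refill(self) -> None:
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self) -> bool:
        """Consume one token if available; the caller waits and retries otherwise."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def on_throttled(self, factor: float = 0.5, floor: float = 0.1) -> None:
        """Call after a 429 or timeout to slow future requests."""
        self._refill()  # settle tokens earned at the old rate first
        self.rate = max(floor, self.rate * factor)
```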
Sources: RELEASE_SUMMARY.md:12, docs/V3_COMPLETE_GUIDE.md:163-164, wshawk/scanner_v2.py:62-66
Persistence Layer
WSHawk v3.0.0 implements a "zero-loss persistence" architecture where all scan data is durably stored to survive crashes and power failures.
SQLite Database Schema
The persistent storage is implemented using SQLite with Write-Ahead Logging (WAL) mode. The database file is located at ~/.wshawk/scans.db by default.
Core Tables:
- scans: Scan metadata (target URL, start/end time, statistics)
- vulnerabilities: Individual findings with CVSS scores and payloads
- traffic_logs: Every WebSocket frame sent and received
- sessions: Web dashboard user sessions
- api_keys: Programmatic access tokens
WAL Mode Benefits:
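- Crash Recovery: Transactions committed to the WAL file survive a crash and are recovered when the database is next opened; uncommitted transactions are discarded
- Concurrent Reads: Multiple processes can read while one writes, because readers and the single writer do not block each other
- Performance: Writes are appended sequentially to the WAL file and merged into the main database file at checkpoints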
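Enabling WAL mode takes a single pragma with Python's standard sqlite3 module. The sketch below uses a temporary directory and a one-column-pair scans table purely for demonstration; the helper names and the table layout are assumptions and do not reflect the ScanDatabase schema.

```python
import os
import sqlite3
import tempfile


def open_wal_database(path: str) -> sqlite3.Connection:
    """Open a SQLite database and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"WAL mode not available, got {mode!r}")
    return conn


def demo():
    """Create a throwaway database, insert one row, and report mode and row count."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_wal_database(os.path.join(tmp, "scans.db"))
        conn.execute("CREATE TABLE scans (id INTEGER PRIMARY KEY, target TEXT)")
        conn.execute("INSERT INTO scans (target) VALUES (?)", ("ws://example.test",))
        conn.commit()
        mode = conn.execute("PRAGMA journal_mode;").fetchone()[0]
        count = conn.execute("SELECT COUNT(*) FROM scans").fetchone()[0]
        conn.close()
        return mode, count
```

The journal mode is persistent: once a database file has been switched to WAL, later connections open it in WAL mode without repeating the pragma.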
Database Access: The ScanDatabase class wshawk/web/database.py provides an abstraction layer for all database operations. The web dashboard and CLI both use this interface.
Sources: RELEASE_SUMMARY.md:16-19, docs/V3_COMPLETE_GUIDE.md:122-125, wshawk/scanner_v2.py:17
Report Exporter
The ReportExporter wshawk/report_exporter.py generates reports in multiple formats from the same vulnerability dataset.
Format Support:
- HTML: Visual reports with CVSS badges, syntax highlighting, and remediation steps wshawk/enhanced_reporter.py
- JSON: Machine-readable format for SIEM integration wshawk/scanner_v2.py:796-802
- CSV: Spreadsheet-compatible tabular data
- SARIF: Static Analysis Results Interchange Format for GitHub Security tab
Export Workflow: After scan completion wshawk/scanner_v2.py:796-805, the scanner iterates through configured formats and calls exporter.export() for each. The SARIF format includes precise code locations and remediation guidance for CI/CD integration.
Sources: wshawk/scanner_v2.py:68, wshawk/scanner_v2.py:796-805, RELEASE_SUMMARY.md:52-56
Component Initialization and Configuration
Configuration System
The WSHawkConfig class wshawk/config.py implements hierarchical configuration with multiple sources:
- Default values (embedded in code)
- wshawk.yaml in current directory or ~/.wshawk/
- Environment variables (prefixed with WSHAWK_)
- CLI flags (highest priority)
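Layered configuration of this kind is commonly implemented as a sequence of deep merges, lowest priority first. The sketch below assumes the sources are applied in the order listed (defaults, YAML file, environment, CLI flags); deep_merge and build_config are hypothetical helpers, not WSHawkConfig methods.

```python
from typing import Any, Dict, Mapping


def deep_merge(base: Mapping[str, Any], override: Mapping[str, Any]) -> Dict[str, Any]:
    """Return base updated with override, merging nested mappings key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def build_config(defaults, yaml_values, env_values, cli_values) -> Dict[str, Any]:
    """Apply each layer over the previous one; later layers win."""
    config: Dict[str, Any] = dict(defaults)
    for layer in (yaml_values, env_values, cli_values):
        config = deep_merge(config, layer)
    return config
```

A CLI flag that sets only scanner.features.playwright leaves sibling keys such as scanner.rate_limit at whatever value the lower layers supplied.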
Secret Resolution: Configuration values can reference external sources:
- env:VAR_NAME - Read from environment variable
- file:path/to/secret - Read from file
This allows sensitive credentials (API keys, passwords) to be stored outside the configuration file.
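Resolving these prefixes amounts to a small dispatch on the value's leading token. The function below is a sketch of that idea; resolve_secret is a hypothetical name, and details such as whitespace stripping and the error raised for a missing variable are assumptions.

```python
import os
from pathlib import Path


def resolve_secret(value: str) -> str:
    """Expand env: and file: references; return other values unchanged."""
    if value.startswith("env:"):
        name = value[len("env:"):]
        try:
            return os.environ[name]
        except KeyError:
            raise KeyError(f"environment variable {name!r} is not set") from None
    if value.startswith("file:"):
        path = Path(value[len("file:"):]).expanduser()
        # Strip the trailing newline most editors add to secret files.
        return path.read_text(encoding="utf-8").strip()
    return value
```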
Configuration Loading: The scanner initializes configuration at startup wshawk/scanner_v2.py:48-53. If no config is provided, it calls WSHawkConfig.load() which searches standard locations.
Sources: wshawk/scanner_v2.py:48-53, docs/V3_COMPLETE_GUIDE.md:299-302
Initialization Sequence
The complete initialization flow when creating a WSHawkV2 instance:
sequenceDiagram
participant App as Application
participant Config as WSHawkConfig
participant Scanner as WSHawkV2.__init__
participant Modules as Analysis Modules
App->>Config: load()
Config->>Config: search for wshawk.yaml
Config->>Config: resolve environment variables
Config->>Config: apply secret resolution
Config-->>App: config object
App->>Scanner: WSHawkV2(url, config=config)
Scanner->>Modules: MessageAnalyzer()
Scanner->>Modules: VulnerabilityVerifier()
Scanner->>Modules: ServerFingerprinter()
Scanner->>Modules: SessionStateMachine()
Scanner->>Scanner: get rate_limit from config
Scanner->>Modules: TokenBucketRateLimiter(rate_limit)
Scanner->>Scanner: check config.scanner.features
alt smart_payloads enabled
Scanner->>Modules: ContextAwareGenerator()
Scanner->>Modules: PayloadEvolver(population_size=100)
Scanner->>Modules: FeedbackLoop()
end
Scanner->>Modules: EnhancedHTMLReporter()
Scanner->>Modules: ReportExporter()
Scanner->>Modules: BinaryMessageHandler()
Scanner-->>App: scanner instance ready
Configuration Overrides: The advanced CLI wshawk/advanced_cli.py:86-97 demonstrates how CLI flags override configuration values. For example, --playwright sets config.scanner.features.playwright to True, which is then read by the scanner wshawk/advanced_cli.py:180-181.
Sources: wshawk/scanner_v2.py:40-100, wshawk/advanced_cli.py:86-97
Module Dependencies
Key dependencies and their roles:
| Dependency | Version | Purpose |
|------------|---------|---------|
| websockets | ≥12.0 | Core WebSocket protocol implementation |
| aiohttp | ≥3.9.0 | HTTP client for integrations and OAST |
| playwright | ≥1.40.0 | Headless browser for XSS verification |
| pyyaml | ≥6.0.1 | Configuration file parsing |
| flask | ≥3.0.0 | Web management dashboard |
| numpy, scipy | ≥1.26.0, ≥1.11.0 | Genetic algorithm operations in PayloadEvolver |
| msgpack, cbor2 | ≥1.0.7, ≥5.6.1 | Binary message format analysis |
Optional Dependencies: Playwright is optional but recommended. If not installed, browser-based XSS verification is skipped wshawk/scanner_v2.py:294-309.
Sources: requirements.txt:1-25
Integration Points
The Integration Collaboration Plane provides extensibility through well-defined interfaces:
External Platform Integrations
Jira Integration: The JiraIntegration class wshawk/integrations/jira_connector.py automatically creates tickets for high/critical findings. Configuration is loaded from integrations.jira.* settings wshawk/scanner_v2.py:823-834. The integration is fault-tolerant: failures are logged but don't block scan completion.
DefectDojo Integration: The DefectDojoIntegration wshawk/integrations/defectdojo.py pushes findings to the vulnerability management platform. It automatically creates engagements if they don't exist wshawk/scanner_v2.py:810-820.
Webhook Notifications: The WebhookNotifier wshawk/integrations/webhook.py sends real-time alerts to Slack, Discord, or Microsoft Teams with rich formatting and CVSS severity badges wshawk/scanner_v2.py:837-845.
Integration Trigger: All integrations are triggered after report generation wshawk/scanner_v2.py:807-846 in a try-except block to ensure failures don't crash the scanner.
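The fault-isolation behavior can be sketched as a loop that catches and logs each integration's failure independently. Isolating integrations one at a time is an assumption of this sketch (the scanner may use a single surrounding try-except), and trigger_integrations with its callable-per-integration signature is hypothetical.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Dict, List

logger = logging.getLogger("wshawk.sketch")


async def trigger_integrations(
    integrations: Dict[str, Callable[[List[dict]], Awaitable[None]]],
    findings: List[dict],
) -> Dict[str, bool]:
    """Push findings to every integration; log failures instead of raising."""
    delivered: Dict[str, bool] = {}
    for name, push in integrations.items():
        try:
            await push(findings)
            delivered[name] = True
        except Exception:
            # One failing platform must not stop the others or the scan.
            logger.exception("integration %s failed; continuing", name)
            delivered[name] = False
    return delivered
```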
Sources: wshawk/scanner_v2.py:807-846, RELEASE_SUMMARY.md:30-34
CLI Entry Points
The system provides four distinct CLI entry points, each mapped to a specific use case:
| Command | Entry Point | Purpose |
|---------|-------------|---------|
| wshawk | pyproject.toml → wshawk.cli:main | Quick scan with default settings |
| wshawk-interactive | pyproject.toml → wshawk.interactive_cli:main | Menu-driven interface for beginners |
| wshawk-advanced | pyproject.toml → wshawk.advanced_cli:cli | Full feature access with flags |
| wshawk-defensive | pyproject.toml → wshawk.defensive_mode:main | Blue team validation tests |
CLI Argument Parsing: The advanced CLI wshawk/advanced_cli.py:12-84 demonstrates the comprehensive flag system. Flags are organized into argument groups (Integrations, Smart Payloads, Web GUI) for clarity.
Sources: wshawk/advanced_cli.py:12-84, RELEASE_SUMMARY.md:44-48
Module Relationships
The following diagram maps the complete module dependency graph, showing how code entities relate to each other:
graph TB
subgraph "Entry Points"
CLI["wshawk.cli:main<br/>__main__.py"]
AdvCLI["wshawk.advanced_cli:cli<br/>advanced_cli.py"]
IntCLI["wshawk.interactive_cli:main"]
DefCLI["wshawk.defensive_mode:main"]
WebApp["wshawk.web.app:run_web"]
end
subgraph "Core Engine"
V2["WSHawkV2<br/>scanner_v2.py:35"]
Legacy["WSScanner (Legacy)<br/>__main__.py"]
WSP["WSPayloads<br/>__main__.py"]
end
subgraph "Analysis"
MA["MessageAnalyzer<br/>message_intelligence.py"]
VV["VulnerabilityVerifier<br/>vulnerability_verifier.py"]
SF["ServerFingerprinter<br/>server_fingerprint.py"]
SM["SessionStateMachine<br/>state_machine.py"]
end
subgraph "Smart Payloads"
CAG["ContextAwareGenerator<br/>smart_payloads/context_generator.py"]
PE["PayloadEvolver<br/>smart_payloads/payload_evolver.py"]
FL["FeedbackLoop<br/>smart_payloads/feedback_loop.py"]
end
subgraph "Verification"
HBV["HeadlessBrowserXSSVerifier<br/>headless_xss_verifier.py"]
OAST["OASTProvider<br/>oast_provider.py"]
SHT["SessionHijackingTester<br/>session_hijacking_tester.py"]
end
subgraph "Reporting"
EHR["EnhancedHTMLReporter<br/>enhanced_reporter.py"]
RE["ReportExporter<br/>report_exporter.py"]
end
subgraph "Infrastructure"
Config["WSHawkConfig<br/>config.py"]
RSession["ResilientSession<br/>resilient_session.py"]
RL["TokenBucketRateLimiter<br/>rate_limiter.py"]
DB["ScanDatabase<br/>web/database.py"]
end
subgraph "Integrations"
JiraInt["JiraIntegration<br/>integrations/jira_connector.py"]
DDInt["DefectDojoIntegration<br/>integrations/defectdojo.py"]
WHInt["WebhookNotifier<br/>integrations/webhook.py"]
end
CLI --> V2
AdvCLI --> V2
AdvCLI --> WebApp
IntCLI --> Legacy
DefCLI --> DefMode["DefensiveValidator"]
V2 --> Config
V2 --> MA
V2 --> VV
V2 --> SF
V2 --> SM
V2 --> RL
V2 --> WSP
V2 --> CAG
V2 --> PE
V2 --> FL
V2 --> HBV
V2 --> OAST
V2 --> SHT
V2 --> EHR
V2 --> RE
V2 --> JiraInt
V2 --> DDInt
V2 --> WHInt
JiraInt --> RSession
DDInt --> RSession
WHInt --> RSession
OAST --> RSession
WebApp --> DB
WebApp --> Config
V2 --> DB
Key Observations:
- Central Role of WSHawkV2: The WSHawkV2 class wshawk/scanner_v2.py:35 is the primary integration point, instantiating and coordinating all subsystems.
- ResilientSession as Infrastructure: The ResilientSession wraps network I/O for all external communications (integrations, OAST), providing a consistent error handling interface.
- Shared Database: Both the web dashboard and the scanner use ScanDatabase for persistence, enabling the web UI to display scan history from CLI runs.
- Configuration Cascade: The WSHawkConfig is loaded once at startup and passed to all components, ensuring consistent behavior across modules.
- Optional Dependencies: Verification modules (Playwright, OAST) are conditionally initialized based on configuration, allowing the scanner to operate in resource-constrained environments.
Sources: wshawk/scanner_v2.py:1-100, wshawk/advanced_cli.py:1-300