Core Architecture

Purpose and Scope

This document provides a technical deep dive into WSHawk v3.0.0's internal architecture, covering the design patterns, component interactions, and implementation details of the core scanning engine and its supporting subsystems. This page focuses on the structural organization and data flow within the codebase.

For operational usage of the scanner, see Getting Started. For details on specific vulnerability detection techniques, see Offensive Testing. For deployment patterns, see Distribution and Deployment. For extending the system, see Plugin System.

Sources: RELEASE_SUMMARY.md:1-60, docs/V3_COMPLETE_GUIDE.md:1-444


Architectural Planes

WSHawk v3.0.0 is organized into four distinct architectural planes, each responsible for a specific operational concern. This separation allows independent evolution of each subsystem while maintaining clear interfaces between them.

| Plane | Primary Responsibility | Key Components | Failure Mode |
|-------|------------------------|----------------|--------------|
| Red Team Execution | Payload injection, vulnerability detection | WSHawkV2, MessageAnalyzer, VulnerabilityVerifier, PayloadEvolver | Scan produces no findings |
| Infrastructure Persistence | Data durability, scan history | ScanDatabase (SQLite), WAL mode, report exporters | Data loss on crash |
| Resilience Control | Network stability, error handling | ResilientSession, circuit breakers, rate limiters | Service degradation under load |
| Integration Collaboration | External platform communication | Jira, DefectDojo, webhook notifiers | Integration failures block scan completion |

The planes are intentionally decoupled: a failure in the Integration Collaboration Plane does not prevent the Red Team Execution Plane from completing its scan. Similarly, the Resilience Control Plane wraps all network I/O regardless of which plane initiated the request.

Sources: RELEASE_SUMMARY.md:7-20, docs/V3_COMPLETE_GUIDE.md:111-131


WSHawkV2 Scanner Engine

Component Architecture

The WSHawkV2 class serves as the central orchestrator for all scanning operations. It initializes and coordinates analysis modules, manages the connection lifecycle, and aggregates results.

graph TB
    subgraph "WSHawkV2 Core (scanner_v2.py:35-101)"
        Scanner["WSHawkV2<br/>__init__()"]
        Connect["connect()<br/>line 102-110"]
        Learning["learning_phase()<br/>line 112-175"]
        HeuristicScan["run_heuristic_scan()<br/>line 593-850"]
    end
    
    subgraph "Analysis Modules"
        MA["MessageAnalyzer<br/>message_intelligence.py"]
        VV["VulnerabilityVerifier<br/>vulnerability_verifier.py"]
        SF["ServerFingerprinter<br/>server_fingerprint.py"]
        SM["SessionStateMachine<br/>state_machine.py"]
        RL["TokenBucketRateLimiter<br/>rate_limiter.py"]
    end
    
    subgraph "Smart Payload System"
        CAG["ContextAwareGenerator<br/>smart_payloads/context_generator.py"]
        PE["PayloadEvolver<br/>smart_payloads/payload_evolver.py"]
        FL["FeedbackLoop<br/>smart_payloads/feedback_loop.py"]
    end
    
    subgraph "Static Payloads"
        WSP["WSPayloads<br/>__main__.py"]
        Files["22,000+ vectors<br/>payloads/*.txt"]
    end
    
    subgraph "Verification"
        HB["HeadlessBrowserXSSVerifier<br/>headless_xss_verifier.py"]
        OAST["OASTProvider<br/>oast_provider.py"]
        SH["SessionHijackingTester<br/>session_hijacking_tester.py"]
    end
    
    Scanner --> MA
    Scanner --> VV
    Scanner --> SF
    Scanner --> SM
    Scanner --> RL
    
    Scanner --> CAG
    Scanner --> PE
    Scanner --> FL
    
    Scanner --> WSP
    WSP --> Files
    
    Scanner --> HB
    Scanner --> OAST
    Scanner --> SH
    
    HeuristicScan --> Connect
    HeuristicScan --> Learning
    Learning --> MA
    Learning --> SF

Initialization Sequence: The constructor (wshawk/scanner_v2.py:40-100) instantiates all analysis modules, configures rate limiting via TokenBucketRateLimiter (lines 62-66), and optionally enables smart payload generation (lines 72-75). Configuration is loaded from WSHawkConfig if not explicitly provided (lines 49-53).

Sources: wshawk/scanner_v2.py:35-101, docs/V3_COMPLETE_GUIDE.md:115-120

Scan Lifecycle

The core scanning workflow is implemented in the run_heuristic_scan() method (wshawk/scanner_v2.py:593-850). The lifecycle consists of five distinct phases:

sequenceDiagram
    participant Client as WSHawkV2
    participant WS as WebSocket
    participant MA as MessageAnalyzer
    participant VV as VulnerabilityVerifier
    participant PE as PayloadEvolver
    participant SH as SessionHijackingTester
    
    Client->>WS: connect() [line 102-110]
    WS-->>Client: Connection established
    
    Note over Client,MA: Phase 1: Learning (5s)
    Client->>WS: listen for messages
    WS-->>Client: sample messages
    Client->>MA: learn_from_messages()
    Client->>MA: get_format_info()
    MA-->>Client: format: JSON/XML/etc
    
    Note over Client,VV: Phase 2: Heuristic Testing
    loop For each vulnerability type
        Client->>Client: inject_payload_into_message()
        Client->>WS: send(payload)
        WS-->>Client: response
        Client->>VV: verify_sql_injection/verify_xss()
        VV-->>Client: confidence: HIGH/MEDIUM/LOW
    end
    
    Note over Client,PE: Phase 3: Smart Evolution (if enabled)
    Client->>PE: evolve(count=30)
    PE-->>Client: mutated payloads
    loop For each evolved payload
        Client->>WS: send(evolved_payload)
        WS-->>Client: response
        Client->>VV: verify_*()
        Client->>PE: update_fitness()
    end
    
    Note over Client,SH: Phase 4: Session Security
    Client->>SH: run_all_tests()
    SH-->>Client: session vulnerabilities
    
    Note over Client: Phase 5: Reporting
    Client->>Client: generate_report()
    Client->>Client: export_formats()
    Client->>Client: trigger_integrations()

Phase 1 - Learning (line 112-175): The scanner establishes a baseline by collecting 5-10 seconds of server messages. The MessageAnalyzer.learn_from_messages() detects the message format (JSON, XML, protobuf) and identifies injectable fields. The ServerFingerprinter analyzes response patterns to identify the technology stack.

Phase 2 - Heuristic Testing (line 616-635): For each vulnerability type (SQLi, XSS, Command Injection, etc.), the scanner injects payloads into detected message fields. The VulnerabilityVerifier examines responses using multiple heuristics to assign confidence levels. High-confidence XSS findings trigger Playwright browser verification (line 294-314).

Phase 3 - Smart Evolution (line 637-703): If use_smart_payloads is enabled, the PayloadEvolver generates mutated variants of successful payloads using genetic algorithms. The FeedbackLoop analyzes response characteristics (timing, size, patterns) to guide evolution.

Phase 4 - Session Security (line 709-732): The SessionHijackingTester validates authentication controls through six specialized tests.

Phase 5 - Reporting (line 750-850): Results are aggregated, CVSS scores calculated, and reports generated in multiple formats (HTML, JSON, CSV, SARIF). Configured integrations (Jira, DefectDojo, webhooks) are triggered.

Sources: wshawk/scanner_v2.py:593-850, docs/V3_COMPLETE_GUIDE.md:176-201


Analysis and Verification Subsystems

MessageAnalyzer

The MessageAnalyzer module (wshawk/message_intelligence.py) performs structural analysis of WebSocket messages to enable context-aware payload injection.

Format Detection: The learn_from_messages() method analyzes a corpus of sample messages to detect:

  • JSON structures with nested objects and arrays
  • XML documents with DTD/schema references
  • Protobuf binary serialization patterns
  • Plain text or custom formats

Field Extraction: Once the format is identified, get_format_info() returns a list of injectable_fields that represent potential injection points. For JSON messages, this includes all string-valued keys. For XML, this includes text nodes and attribute values.

Intelligent Injection: The inject_payload_into_message() method (wshawk/scanner_v2.py:201-204) takes a base message and a payload, returning a list of mutated messages with the payload injected into each identified field. This allows a single payload to be tested across multiple injection points without manual intervention.
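
The per-field injection described above can be sketched as follows. This is a minimal illustration, not the scanner's implementation: it assumes flat JSON objects, treats only top-level string values as injection points, and falls back to the raw payload for anything else. The function body is hypothetical; only the method name comes from the source.

```python
import json
from typing import List


def inject_payload_into_message(base_message: str, payload: str) -> List[str]:
    """Return one mutated copy of base_message per top-level string field.

    Hypothetical stand-in: handles flat JSON objects only and falls back
    to sending the raw payload when the message is not a JSON object.
    """
    try:
        parsed = json.loads(base_message)
    except json.JSONDecodeError:
        return [payload]
    if not isinstance(parsed, dict):
        return [payload]

    mutated = []
    for key, value in parsed.items():
        if isinstance(value, str):  # string-valued keys are injection points
            candidate = dict(parsed)  # shallow copy keeps the other fields intact
            candidate[key] = payload
            mutated.append(json.dumps(candidate))
    return mutated or [payload]


variants = inject_payload_into_message(
    '{"action": "search", "query": "shoes", "page": 1}', "' OR 1=1--"
)
```

Here `variants` holds two messages, one with the payload in `action` and one with it in `query`; the numeric `page` field is left untouched in both.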

Sources: wshawk/scanner_v2.py:145-165

VulnerabilityVerifier

The VulnerabilityVerifier module (wshawk/vulnerability_verifier.py) implements multi-heuristic detection for 11+ vulnerability types. Each verification method returns a tuple of (is_vulnerable: bool, confidence: ConfidenceLevel, description: str).

Confidence Scoring: The system uses an enum-based confidence system:

  • CRITICAL: Browser-verified execution (XSS) or OAST callback received
  • HIGH: Multiple strong indicators (e.g., SQL error + timing anomaly)
  • MEDIUM: Single strong indicator or multiple weak indicators
  • LOW: Weak indicators that may be false positives

SQL Injection Detection (line 217-245): The verify_sql_injection() method checks for:

  • Database error messages (syntax errors, constraint violations)
  • Timing anomalies for blind SQLi payloads
  • Successful UNION-based injection patterns
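
As a rough sketch of how the error-message and timing heuristics could combine into the (is_vulnerable, confidence, description) tuple, consider the following. The enum values, the error patterns, and the 4-second delay threshold are all assumptions made for illustration, and UNION-based detection is omitted.

```python
import re
from enum import Enum
from typing import Tuple


class ConfidenceLevel(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Illustrative error fragments only; a real verifier would carry many more.
SQL_ERROR_PATTERNS = [
    r"you have an error in your sql syntax",  # MySQL
    r"unterminated quoted string",            # PostgreSQL
    r"ora-\d{5}",                             # Oracle
]


def verify_sql_injection(
    response: str, elapsed: float, baseline: float, delay_threshold: float = 4.0
) -> Tuple[bool, ConfidenceLevel, str]:
    indicators = []
    if any(re.search(p, response, re.IGNORECASE) for p in SQL_ERROR_PATTERNS):
        indicators.append("database error message")
    if elapsed - baseline >= delay_threshold:  # assumed threshold for blind SQLi
        indicators.append("timing anomaly")

    if len(indicators) >= 2:
        return True, ConfidenceLevel.HIGH, " + ".join(indicators)
    if indicators:
        return True, ConfidenceLevel.MEDIUM, indicators[0]
    return False, ConfidenceLevel.LOW, "no SQL injection indicators"
```

Two independent indicators map to HIGH and a single one to MEDIUM, mirroring the confidence definitions listed earlier in this section.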

XSS Detection (line 286-331): The verify_xss() method:

  1. Checks if the payload is reflected in the response
  2. Analyzes the reflection context (HTML, JavaScript, attribute)
  3. Determines if special characters are properly escaped
  4. Triggers Playwright verification for high-confidence candidates (line 294-314)
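
Steps 1 and 3 above (reflection and escaping) can be approximated with a few lines of standard-library code. The helper name `classify_reflection` and its return labels are hypothetical; context analysis and browser verification are out of scope for this sketch.

```python
import html


def classify_reflection(payload: str, response: str) -> str:
    """Report whether a payload came back verbatim, HTML-escaped, or not at all."""
    if payload in response:
        return "reflected_raw"        # special characters survived: candidate XSS
    if html.escape(payload) in response:
        return "reflected_escaped"    # server encoded <, >, &, and quotes
    return "not_reflected"


probe = "<script>alert(1)</script>"
raw_case = classify_reflection(probe, "<div>" + probe + "</div>")
escaped_case = classify_reflection(probe, "<div>" + html.escape(probe) + "</div>")
```

Only the `reflected_raw` case would be worth forwarding to a heavier check such as headless-browser verification.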

Sources: wshawk/scanner_v2.py:177-256, wshawk/scanner_v2.py:258-341

ServerFingerprinter

The ServerFingerprinter (wshawk/server_fingerprint.py) identifies the server technology stack by analyzing response patterns, error messages, and timing characteristics. This enables the scanner to prioritize relevant payloads.

Detection Categories:

  • Language: Python, Node.js, Java, PHP, Ruby, Go
  • Framework: Express, Flask, Django, Spring, Socket.io
  • Database: MySQL, PostgreSQL, MongoDB, Redis

Payload Recommendations: The get_recommended_payloads() method returns technology-specific attack vectors. For example, if Node.js/MongoDB is detected, it prioritizes NoSQL injection over traditional SQLi (wshawk/scanner_v2.py:189-192).

Sources: wshawk/scanner_v2.py:166-171, wshawk/scanner_v2.py:353-358


Smart Payload Evolution System

The Smart Payload Evolution (SPE) system departs from traditional static fuzzing by generating payloads adaptively, using the learned message context and server feedback.

Component Interaction

graph LR
    subgraph "Input"
        Samples["Sample Messages<br/>learning_phase()"]
        Static["Static Payloads<br/>WSPayloads.get_*()"]
    end
    
    subgraph "Context Learning"
        CAG["ContextAwareGenerator<br/>learn_from_message()"]
        Context["context dict<br/>format, fields, types"]
    end
    
    subgraph "Feedback System"
        FL["FeedbackLoop<br/>analyze_response()"]
        Baseline["baseline_metrics"]
        Signals["ResponseSignal<br/>ERROR/ANOMALY/NORMAL"]
    end
    
    subgraph "Evolution Engine"
        PE["PayloadEvolver<br/>population[]<br/>fitness_scores{}"]
        Mutate["mutate()<br/>crossover()<br/>inject()"]
    end
    
    subgraph "Output"
        Evolved["Evolved Payloads<br/>evolve(count=30)"]
        CtxGen["Context Payloads<br/>generate_payloads()"]
    end
    
    Samples --> CAG
    CAG --> Context
    
    Static --> PE
    Context --> CAG
    CAG --> CtxGen
    
    PE --> Mutate
    Mutate --> Evolved
    
    Evolved --> Scanner["Scanner<br/>test & verify"]
    Scanner --> FL
    FL --> Signals
    Signals --> PE
    PE --> |"update_fitness()"|PE

Sources: wshawk/scanner_v2.py:71-75, wshawk/scanner_v2.py:637-703

ContextAwareGenerator

The ContextAwareGenerator (wshawk/smart_payloads/context_generator.py) analyzes the target application's message structure to generate type-appropriate payloads.

Learning Phase: During learn_from_message() (wshawk/scanner_v2.py:158-163), the generator:

  1. Parses message structure to extract field names and types
  2. Builds a schema of expected data formats
  3. Identifies fields that accept user-controlled input

Payload Generation: The generate_payloads(category, count) method (wshawk/scanner_v2.py:645) creates payloads that:

  • Match the detected message format (valid JSON/XML structure)
  • Inject attack vectors into identified fields
  • Maintain syntactic validity to bypass basic input validation

Sources: wshawk/scanner_v2.py:158-163, wshawk/scanner_v2.py:645-646

PayloadEvolver

The PayloadEvolver (wshawk/smart_payloads/payload_evolver.py) uses a genetic algorithm to breed successful attack payloads. The system maintains a population of candidate payloads, each with an associated fitness score.

Initialization: The evolver is seeded with successful payloads discovered during the heuristic scan via seed() (wshawk/scanner_v2.py:233).

Evolution Process: The evolve(count) method (wshawk/scanner_v2.py:640) generates new payloads through:

  1. Selection: High-fitness payloads are selected for breeding
  2. Crossover: Two parent payloads are combined to create offspring
  3. Mutation: Random character substitutions, encoding changes, and structure modifications
  4. Injection: Novel injection contexts derived from message structure analysis

Fitness Feedback: When evolved payloads trigger vulnerabilities, update_fitness() (wshawk/scanner_v2.py:234, 319, 683) increases their breeding priority, causing similar payloads to proliferate in future generations.
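
The selection, crossover, mutation, and fitness-update steps can be illustrated with a deliberately small genetic loop. `TinyEvolver` is a hypothetical stand-in, not the PayloadEvolver class: it uses fitness-proportional selection, single-point crossover, and a single case-flip mutation, and it assumes every seeded payload has positive fitness. The injection step is not modeled.

```python
import random
from typing import Dict, List


class TinyEvolver:
    """Hypothetical, simplified stand-in for PayloadEvolver."""

    def __init__(self, rng: random.Random) -> None:
        self.rng = rng
        self.fitness: Dict[str, float] = {}

    def seed(self, payload: str, fitness: float = 1.0) -> None:
        self.fitness[payload] = fitness

    def update_fitness(self, payload: str, reward: float) -> None:
        self.fitness[payload] = self.fitness.get(payload, 0.0) + reward

    def _select(self) -> str:
        # Fitness-proportional (roulette-wheel) selection.
        payloads = list(self.fitness)
        weights = [self.fitness[p] for p in payloads]
        return self.rng.choices(payloads, weights=weights, k=1)[0]

    def _crossover(self, a: str, b: str) -> str:
        # Single-point crossover: head of one parent, tail of the other.
        return a[: self.rng.randint(0, len(a))] + b[self.rng.randint(0, len(b)):]

    def _mutate(self, payload: str) -> str:
        # One illustrative mutation: flip the case of a random character.
        if not payload:
            return payload
        i = self.rng.randrange(len(payload))
        return payload[:i] + payload[i].swapcase() + payload[i + 1:]

    def evolve(self, count: int) -> List[str]:
        return [
            self._mutate(self._crossover(self._select(), self._select()))
            for _ in range(count)
        ]


evolver = TinyEvolver(random.Random(0))
evolver.seed("<script>alert(1)</script>")
evolver.seed("<img src=x onerror=alert(1)>")
evolver.update_fitness("<img src=x onerror=alert(1)>", reward=2.0)
offspring = evolver.evolve(count=5)
```

After the reward, the `<img>` payload carries three times the weight of the `<script>` payload, so it is chosen as a parent more often in later generations.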

Sources: wshawk/scanner_v2.py:637-703

FeedbackLoop

The FeedbackLoop (wshawk/smart_payloads/feedback_loop.py) provides real-time classification of server responses to guide the evolution engine.

Baseline Establishment: During the learning phase, establish_baseline() (wshawk/scanner_v2.py:161) records normal response characteristics (timing, size, structure).

Response Analysis: The analyze_response() method (wshawk/scanner_v2.py:223-225, 670-672) compares each response against the baseline to detect:

  • ERROR signals: Stack traces, error codes, unexpected status changes
  • ANOMALY signals: Significant timing deviations, response size changes
  • NORMAL signals: Responses within expected parameters

Priority Categories: The get_priority_categories() method (wshawk/scanner_v2.py:643) returns attack categories ranked by their success rate, allowing the scanner to focus on the most promising attack vectors.
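
A baseline-versus-response comparison along these lines might look like the sketch below. `TinyFeedbackLoop`, the error markers, and the numeric thresholds are assumptions chosen for illustration; the source does not state how deviations are measured.

```python
import statistics
from enum import Enum
from typing import List, Tuple


class ResponseSignal(Enum):
    NORMAL = "normal"
    ANOMALY = "anomaly"
    ERROR = "error"


# Illustrative markers; a real classifier would use a richer set.
ERROR_MARKERS = ("traceback", "exception", "stack trace", "syntax error")


class TinyFeedbackLoop:
    """Hypothetical, simplified stand-in for FeedbackLoop."""

    def __init__(self) -> None:
        self.mean_time = 0.0
        self.mean_size = 0.0

    def establish_baseline(self, samples: List[Tuple[str, float]]) -> None:
        # Each sample is (response body, elapsed seconds).
        self.mean_time = statistics.mean(t for _, t in samples)
        self.mean_size = statistics.mean(len(body) for body, _ in samples)

    def analyze_response(self, body: str, elapsed: float) -> ResponseSignal:
        lowered = body.lower()
        if any(marker in lowered for marker in ERROR_MARKERS):
            return ResponseSignal.ERROR
        # Assumed thresholds: 3x baseline time plus 1s, or a size shift above 2x.
        too_slow = elapsed > 3 * self.mean_time + 1.0
        size_shift = abs(len(body) - self.mean_size) > 2 * max(self.mean_size, 1.0)
        if too_slow or size_shift:
            return ResponseSignal.ANOMALY
        return ResponseSignal.NORMAL


loop = TinyFeedbackLoop()
loop.establish_baseline([('{"ok":true}', 0.05), ('{"ok":true}', 0.07)])
normal = loop.analyze_response('{"ok":true}', 0.06)
slow = loop.analyze_response('{"ok":true}', 6.0)
crashed = loop.analyze_response("Traceback (most recent call last): ...", 0.05)
```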

Sources: wshawk/scanner_v2.py:161, wshawk/scanner_v2.py:223-225, wshawk/scanner_v2.py:643-646


Resilience Control Layer

The Resilience Control Plane wraps all network operations to handle unstable targets, rate limiting, and service failures gracefully.

ResilientSession Architecture

graph TB
    subgraph "Application Layer"
        Scanner["WSHawkV2<br/>Network operations"]
        Integrations["Jira/DefectDojo/Webhooks<br/>API calls"]
        OAST["OASTProvider<br/>HTTP requests"]
    end
    
    subgraph "ResilientSession Wrapper"
        RS["ResilientSession<br/>execute_with_retry()"]
        Classify["Error Classifier<br/>Transient vs Permanent"]
        Backoff["ExponentialBackoff<br/>wait = base * 2^attempt + jitter"]
        CB["CircuitBreaker<br/>CLOSED/OPEN/HALF_OPEN"]
    end
    
    subgraph "Rate Limiting"
        TBR["TokenBucketRateLimiter<br/>acquire()"]
        Adaptive["Adaptive Rate Control<br/>Server health monitoring"]
    end
    
    subgraph "Network"
        HTTP["aiohttp.ClientSession"]
        WS["websockets.connect()"]
    end
    
    Scanner --> RS
    Integrations --> RS
    OAST --> RS
    
    RS --> Classify
    Classify --> |"Transient error"|Backoff
    Classify --> |"Permanent error"|Fail["Fail fast"]
    Backoff --> |"Retry"|RS
    
    RS --> CB
    CB --> |"threshold exceeded"|Open["Block requests<br/>60s cooldown"]
    CB --> |"recovery test"|HalfOpen["Allow 1 request"]
    
    Scanner --> TBR
    TBR --> Adaptive
    Adaptive --> |"429 response"|SlowDown["Reduce rate"]
    
    RS --> HTTP
    RS --> WS

Sources: RELEASE_SUMMARY.md:9-14, docs/V3_COMPLETE_GUIDE.md:134-173

Error Classification

The ResilientSession distinguishes between transient errors (retry) and permanent errors (fail fast):

Transient Errors:

  • HTTP 429 (Too Many Requests)
  • HTTP 503 (Service Unavailable)
  • Network timeouts
  • Connection refused (server may be restarting)

Permanent Errors:

  • HTTP 401/403 (Authentication/Authorization failures)
  • HTTP 404 (Resource not found)
  • Invalid SSL certificates
  • Protocol errors

Implementation Pattern (from V3_COMPLETE_GUIDE.md:139-160):

if self.is_transient(error):
    attempt += 1
    delay = self.calculate_backoff(attempt)
    await asyncio.sleep(delay)
else:
    self.circuit_breaker.record_failure()
    raise error
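
For context, the fragment above could sit inside a retry loop such as the one sketched here. The `HTTPStatusError` exception, the status sets, and the function signatures are hypothetical, and the circuit-breaker call is left out to keep the example self-contained.

```python
import asyncio
import random

TRANSIENT_STATUSES = {429, 503}


class HTTPStatusError(Exception):
    """Hypothetical exception carrying an HTTP status code."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def is_transient(error: Exception) -> bool:
    if isinstance(error, HTTPStatusError):
        return error.status in TRANSIENT_STATUSES
    # Timeouts and refused connections are treated as retryable.
    return isinstance(error, (asyncio.TimeoutError, ConnectionRefusedError))


async def execute_with_retry(operation, max_attempts: int = 4, base_delay: float = 0.01):
    for attempt in range(max_attempts):
        try:
            return await operation()
        except Exception as error:
            last_attempt = attempt == max_attempts - 1
            if not is_transient(error) or last_attempt:
                raise  # permanent error, or retries exhausted: fail fast
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            await asyncio.sleep(delay)


calls = {"n": 0}


async def flaky():
    # Fails twice with a transient 503, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise HTTPStatusError(503)
    return "ok"


result = asyncio.run(execute_with_retry(flaky))
```

The short `base_delay` keeps the demonstration fast; a production value would normally be on the order of seconds.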

Sources: docs/V3_COMPLETE_GUIDE.md:139-160

Circuit Breaker State Machine

The circuit breaker protects downstream services from cascading failures. It implements a three-state machine:

| State | Behavior | Transition |
|-------|----------|------------|
| CLOSED | All requests pass through normally | After N failures → OPEN |
| OPEN | All requests fail immediately | After 60s cooldown → HALF_OPEN |
| HALF_OPEN | Single test request allowed | Success → CLOSED, Failure → OPEN |

State Transitions: When the failure threshold is exceeded (e.g., 5 consecutive failures), the circuit opens to prevent further damage. After a cooldown period, a single canary request is allowed. If it succeeds, normal operation resumes. If it fails, the circuit re-opens for another cooldown cycle.
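
The three-state machine can be expressed compactly. The class below is a sketch under assumptions: the method names, the injectable `clock`, and the default threshold of 5 are illustrative and are not taken from the WSHawk source.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitBreaker:
    def __init__(self, failure_threshold=5, cooldown=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.clock = clock  # injectable so tests need not wait 60 seconds
        self.state = State.CLOSED
        self.failures = 0
        self.opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state is State.CLOSED:
            return True
        if self.state is State.OPEN and self.clock() - self.opened_at >= self.cooldown:
            self.state = State.HALF_OPEN  # admit exactly one canary request
            return True
        return False  # OPEN within cooldown, or a canary is already in flight

    def record_success(self) -> None:
        self.failures = 0
        self.state = State.CLOSED

    def record_failure(self) -> None:
        self.failures += 1
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = self.clock()
```

A caller would check `allow_request()` before each network operation and report the outcome with `record_success()` or `record_failure()`.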

Sources: RELEASE_SUMMARY.md:13, docs/V3_COMPLETE_GUIDE.md:166-169

Exponential Backoff

The backoff algorithm calculates wait times using the formula:

wait_time = min(max_delay, base_delay * 2^attempt) + random_jitter

Jitter: Random jitter (typically 0-1 seconds) prevents the "thundering herd" problem where multiple clients retry simultaneously after a failure, overwhelming the recovering server.
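
The formula translates directly into a small function. The parameter defaults (1-second base, 30-second cap, up to 1 second of jitter) are assumptions for the example.

```python
import random


def calculate_backoff(attempt, base_delay=1.0, max_delay=30.0, max_jitter=1.0, rng=random):
    """wait_time = min(max_delay, base_delay * 2^attempt) + random_jitter"""
    capped = min(max_delay, base_delay * (2 ** attempt))
    return capped + rng.uniform(0.0, max_jitter)


# With jitter disabled the schedule is deterministic and shows the cap taking effect.
schedule = [calculate_backoff(n, max_jitter=0.0) for n in range(7)]
# → [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

Because the jitter is added after the cap, the longest possible wait is `max_delay + max_jitter`.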

Adaptive Rate Limiting: The TokenBucketRateLimiter (wshawk/scanner_v2.py:62-66) dynamically adjusts the request rate based on server health. When it detects 429 responses or timeouts, it reduces the token generation rate.
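
A token bucket with an adaptive refill rate can be sketched as follows. This is not the TokenBucketRateLimiter class: `try_acquire()` is a non-blocking simplification, and `on_throttled()`, the halving factor, and the minimum rate are hypothetical.

```python
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def _refill(self) -> None:
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_acquire(self) -> bool:
        """Take one token if available; never blocks."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def on_throttled(self, factor=0.5, min_rate=0.1) -> None:
        """Call after a 429 or timeout to slow token generation."""
        self._refill()  # settle tokens at the old rate before changing it
        self.rate = max(min_rate, self.rate * factor)
```

Refilling before changing the rate ensures that time already elapsed is credited at the old rate and only future time at the reduced one.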

Sources: RELEASE_SUMMARY.md:12, docs/V3_COMPLETE_GUIDE.md:163-164, wshawk/scanner_v2.py:62-66


Persistence Layer

WSHawk v3.0.0 implements a "zero-loss persistence" architecture where all scan data is durably stored to survive crashes and power failures.

SQLite Database Schema

The persistent storage is implemented using SQLite with Write-Ahead Logging (WAL) mode. The database file is located at ~/.wshawk/scans.db by default.

Core Tables:

  • scans: Scan metadata (target URL, start/end time, statistics)
  • vulnerabilities: Individual findings with CVSS scores and payloads
  • traffic_logs: Every WebSocket frame sent and received
  • sessions: Web dashboard user sessions
  • api_keys: Programmatic access tokens

WAL Mode Benefits:

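  1. Crash Recovery: Committed transactions are recovered from the WAL file when the database is reopened after a crash; uncommitted changes are discarded
  2. Concurrent Reads: Multiple processes can read while one writes
  3. Performance: Writes are appended sequentially to the WAL file and transferred to the main database in batches at checkpoints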
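
Enabling WAL mode takes a single pragma in Python's built-in sqlite3 module. The sketch below uses a temporary file in place of ~/.wshawk/scans.db, and the `scans` column names are assumptions; the actual schema is not shown in the source.

```python
import os
import sqlite3
import tempfile

# Temporary stand-in for ~/.wshawk/scans.db (WAL needs a file-backed database).
db_path = os.path.join(tempfile.mkdtemp(), "scans.db")

conn = sqlite3.connect(db_path)
journal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# Hypothetical columns for illustration only.
conn.execute(
    "CREATE TABLE IF NOT EXISTS scans ("
    "id INTEGER PRIMARY KEY, target_url TEXT NOT NULL, started_at TEXT)"
)
with conn:  # commits on success, rolls back if the block raises
    conn.execute(
        "INSERT INTO scans (target_url, started_at) VALUES (?, datetime('now'))",
        ("ws://localhost:8765",),
    )

# A second connection can read while the writer connection stays open.
reader = sqlite3.connect(db_path)
row_count = reader.execute("SELECT COUNT(*) FROM scans").fetchone()[0]
reader.close()
conn.close()
```

The pragma returns the journal mode actually in effect, so checking that `journal_mode` equals `"wal"` confirms the switch succeeded.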

Database Access: The ScanDatabase class (wshawk/web/database.py) provides an abstraction layer for all database operations. The web dashboard and CLI both use this interface.

Sources: RELEASE_SUMMARY.md:16-19, docs/V3_COMPLETE_GUIDE.md:122-125, wshawk/scanner_v2.py:17

Report Exporter

The ReportExporter (wshawk/report_exporter.py) generates reports in multiple formats from the same vulnerability dataset.

Format Support:

  • HTML: Visual reports with CVSS badges, syntax highlighting, and remediation steps (wshawk/enhanced_reporter.py)
  • JSON: Machine-readable format for SIEM integration (wshawk/scanner_v2.py:796-802)
  • CSV: Spreadsheet-compatible tabular data
  • SARIF: Static Analysis Results Interchange Format for GitHub Security tab

Export Workflow: After scan completion (wshawk/scanner_v2.py:796-805), the scanner iterates through configured formats and calls exporter.export() for each. The SARIF format includes precise code locations and remediation guidance for CI/CD integration.
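
The iterate-and-export pattern can be illustrated for the two simplest formats. The `export` function, its signature, and the finding fields are hypothetical; the real exporter.export() interface is not shown in the source, and HTML and SARIF are omitted.

```python
import csv
import io
import json
from typing import Dict, List


def export(findings: List[Dict[str, object]], fmt: str) -> str:
    """Render the same findings list in the requested format."""
    if fmt == "json":
        return json.dumps({"vulnerabilities": findings}, indent=2)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=["type", "severity", "cvss"])
        writer.writeheader()
        writer.writerows(findings)
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


findings = [{"type": "XSS", "severity": "HIGH", "cvss": 7.5}]
reports = {fmt: export(findings, fmt) for fmt in ("json", "csv")}
```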

Sources: wshawk/scanner_v2.py:68, wshawk/scanner_v2.py:796-805, RELEASE_SUMMARY.md:52-56


Component Initialization and Configuration

Configuration System

The WSHawkConfig class (wshawk/config.py) implements hierarchical configuration with multiple sources, listed here from lowest to highest priority:

  1. Default values (embedded in code)
  2. wshawk.yaml in current directory or ~/.wshawk/
  3. Environment variables (prefixed with WSHAWK_)
  4. CLI flags (highest priority)
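
One common way to realize this kind of layering is a recursive merge applied from the lowest-priority source to the highest. The sketch below assumes each source has already been parsed into a nested dict; the `rate_limit` key and the layer contents are illustrative, and how WSHAWK_ variables map onto keys is not specified in the source.

```python
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where values in override win over base, recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Layers ordered from lowest to highest priority (contents are made up).
defaults = {"scanner": {"rate_limit": 10, "features": {"playwright": False}}}
yaml_layer = {"scanner": {"rate_limit": 5}}
env_layer = {"scanner": {"rate_limit": 20}}
cli_layer = {"scanner": {"features": {"playwright": True}}}

config: Dict[str, Any] = {}
for layer in (defaults, yaml_layer, env_layer, cli_layer):
    config = deep_merge(config, layer)
```

Because `deep_merge` builds new dictionaries instead of mutating its inputs, the `defaults` layer stays unchanged and can be reused.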

Secret Resolution: Configuration values can reference external sources:

  • env:VAR_NAME - Read from environment variable
  • file:path/to/secret - Read from file

This allows sensitive credentials (API keys, passwords) to be stored outside the configuration file.
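
A resolver for the two prefixes might look like this. `resolve_secret` is a hypothetical helper name, and the choices to strip trailing whitespace from file contents and to raise `KeyError` for a missing variable are assumptions.

```python
import os
import tempfile
from pathlib import Path


def resolve_secret(value: str, environ=os.environ) -> str:
    """Resolve env:VAR_NAME and file:path references; return literals unchanged."""
    if value.startswith("env:"):
        name = value[len("env:"):]
        if name not in environ:
            raise KeyError(f"environment variable {name!r} is not set")
        return environ[name]
    if value.startswith("file:"):
        path = Path(value[len("file:"):]).expanduser()
        return path.read_text().strip()  # drop the trailing newline most editors add
    return value


from_env = resolve_secret("env:JIRA_TOKEN", {"JIRA_TOKEN": "s3cret"})

with tempfile.TemporaryDirectory() as tmp:
    secret_file = Path(tmp) / "api_key"
    secret_file.write_text("abc123\n")
    from_file = resolve_secret(f"file:{secret_file}")
```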

Configuration Loading: The scanner initializes configuration at startup (wshawk/scanner_v2.py:48-53). If no config is provided, it calls WSHawkConfig.load(), which searches standard locations.

Sources: wshawk/scanner_v2.py:48-53, docs/V3_COMPLETE_GUIDE.md:299-302

Initialization Sequence

The complete initialization flow when creating a WSHawkV2 instance:

sequenceDiagram
    participant App as Application
    participant Config as WSHawkConfig
    participant Scanner as WSHawkV2.__init__
    participant Modules as Analysis Modules
    
    App->>Config: load()
    Config->>Config: search for wshawk.yaml
    Config->>Config: resolve environment variables
    Config->>Config: apply secret resolution
    Config-->>App: config object
    
    App->>Scanner: WSHawkV2(url, config=config)
    
    Scanner->>Modules: MessageAnalyzer()
    Scanner->>Modules: VulnerabilityVerifier()
    Scanner->>Modules: ServerFingerprinter()
    Scanner->>Modules: SessionStateMachine()
    
    Scanner->>Scanner: get rate_limit from config
    Scanner->>Modules: TokenBucketRateLimiter(rate_limit)
    
    Scanner->>Scanner: check config.scanner.features
    
    alt smart_payloads enabled
        Scanner->>Modules: ContextAwareGenerator()
        Scanner->>Modules: PayloadEvolver(population_size=100)
        Scanner->>Modules: FeedbackLoop()
    end
    
    Scanner->>Modules: EnhancedHTMLReporter()
    Scanner->>Modules: ReportExporter()
    Scanner->>Modules: BinaryMessageHandler()
    
    Scanner-->>App: scanner instance ready

Configuration Overrides: The advanced CLI (wshawk/advanced_cli.py:86-97) demonstrates how CLI flags override configuration values. For example, --playwright sets config.scanner.features.playwright to True, which is then read by the scanner (wshawk/advanced_cli.py:180-181).
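
The override pattern can be sketched with argparse and a plain nested dict. This assumes an argparse-style parser and a dict-shaped config, neither of which is confirmed by the source; only the `--playwright` flag and the `scanner.features.playwright` key are taken from the text above.

```python
import argparse
from typing import Any, Dict, List


def apply_cli_overrides(config: Dict[str, Any], argv: List[str]) -> Dict[str, Any]:
    parser = argparse.ArgumentParser(prog="wshawk-advanced")
    parser.add_argument("url", help="Target WebSocket URL")
    parser.add_argument("--playwright", action="store_true",
                        help="Enable browser-based XSS verification")
    args = parser.parse_args(argv)

    features = config.setdefault("scanner", {}).setdefault("features", {})
    if args.playwright:  # override only when the flag was actually given
        features["playwright"] = True
    return config


with_flag = apply_cli_overrides(
    {"scanner": {"features": {"playwright": False}}},
    ["ws://localhost:8765", "--playwright"],
)
without_flag = apply_cli_overrides(
    {"scanner": {"features": {"playwright": False}}},
    ["ws://localhost:8765"],
)
```

Leaving the value untouched when the flag is absent preserves whatever the lower-priority sources (file or environment) had set.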

Sources: wshawk/scanner_v2.py:40-100, wshawk/advanced_cli.py:86-97

Module Dependencies

Key dependencies and their roles:

| Dependency | Version | Purpose |
|------------|---------|---------|
| websockets | ≥12.0 | Core WebSocket protocol implementation |
| aiohttp | ≥3.9.0 | HTTP client for integrations and OAST |
| playwright | ≥1.40.0 | Headless browser for XSS verification |
| pyyaml | ≥6.0.1 | Configuration file parsing |
| flask | ≥3.0.0 | Web management dashboard |
| numpy, scipy | ≥1.26.0, ≥1.11.0 | Genetic algorithm operations in PayloadEvolver |
| msgpack, cbor2 | ≥1.0.7, ≥5.6.1 | Binary message format analysis |

Optional Dependencies: Playwright is optional but recommended. If not installed, browser-based XSS verification is skipped (wshawk/scanner_v2.py:294-309).

Sources: requirements.txt:1-25


Integration Points

The Integration Collaboration Plane provides extensibility through well-defined interfaces:

External Platform Integrations

Jira Integration: The JiraIntegration class (wshawk/integrations/jira_connector.py) automatically creates tickets for high/critical findings. Configuration is loaded from integrations.jira.* settings (wshawk/scanner_v2.py:823-834). The integration is fault-tolerant: failures are logged but don't block scan completion.

DefectDojo Integration: The DefectDojoIntegration (wshawk/integrations/defectdojo.py) pushes findings to the vulnerability management platform. It automatically creates engagements if they don't exist (wshawk/scanner_v2.py:810-820).

Webhook Notifications: The WebhookNotifier (wshawk/integrations/webhook.py) sends real-time alerts to Slack, Discord, or Microsoft Teams with rich formatting and CVSS severity badges (wshawk/scanner_v2.py:837-845).

Integration Trigger: All integrations are triggered after report generation (wshawk/scanner_v2.py:807-846) in a try-except block to ensure failures don't crash the scanner.
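
The fault-tolerant trigger pattern can be sketched as below. `trigger_integrations` and the callable-based interface are assumptions for illustration; the actual integration classes expose their own methods, which are not shown in the source.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("wshawk.integrations")


def trigger_integrations(
    findings: List[dict],
    integrations: Dict[str, Callable[[List[dict]], None]],
) -> Dict[str, bool]:
    """Call every integration; log failures instead of propagating them."""
    results = {}
    for name, push in integrations.items():
        try:
            push(findings)
            results[name] = True
        except Exception:
            # One broken integration must not stop the others or the scan.
            logger.exception("Integration %s failed; continuing", name)
            results[name] = False
    return results


def broken_jira(findings):
    raise ConnectionError("Jira unreachable")


delivered: List[dict] = []
outcome = trigger_integrations(
    [{"type": "XSS", "severity": "HIGH"}],
    {"jira": broken_jira, "webhook": delivered.extend},
)
```

Wrapping each integration separately, as opposed to one try block around all of them, lets the webhook still fire after Jira fails.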

Sources: wshawk/scanner_v2.py:807-846, RELEASE_SUMMARY.md:30-34

CLI Entry Points

The system provides four distinct CLI entry points, each mapped to a specific use case:

| Command | Entry Point | Purpose |
|---------|-------------|---------|
| wshawk | wshawk.cli:main (pyproject.toml) | Quick scan with default settings |
| wshawk-interactive | wshawk.interactive_cli:main (pyproject.toml) | Menu-driven interface for beginners |
| wshawk-advanced | wshawk.advanced_cli:cli (pyproject.toml) | Full feature access with flags |
| wshawk-defensive | wshawk.defensive_mode:main (pyproject.toml) | Blue team validation tests |

CLI Argument Parsing: The advanced CLI (wshawk/advanced_cli.py:12-84) demonstrates the comprehensive flag system. Flags are organized into argument groups (Integrations, Smart Payloads, Web GUI) for clarity.

Sources: wshawk/advanced_cli.py:12-84, RELEASE_SUMMARY.md:44-48


Module Relationships

The following diagram maps the complete module dependency graph, showing how code entities relate to each other:

graph TB
    subgraph "Entry Points"
        CLI["wshawk.cli:main<br/>__main__.py"]
        AdvCLI["wshawk.advanced_cli:cli<br/>advanced_cli.py"]
        IntCLI["wshawk.interactive_cli:main"]
        DefCLI["wshawk.defensive_mode:main"]
        WebApp["wshawk.web.app:run_web"]
    end
    
    subgraph "Core Engine"
        V2["WSHawkV2<br/>scanner_v2.py:35"]
        Legacy["WSScanner (Legacy)<br/>__main__.py"]
        WSP["WSPayloads<br/>__main__.py"]
    end
    
    subgraph "Analysis"
        MA["MessageAnalyzer<br/>message_intelligence.py"]
        VV["VulnerabilityVerifier<br/>vulnerability_verifier.py"]
        SF["ServerFingerprinter<br/>server_fingerprint.py"]
        SM["SessionStateMachine<br/>state_machine.py"]
    end
    
    subgraph "Smart Payloads"
        CAG["ContextAwareGenerator<br/>smart_payloads/context_generator.py"]
        PE["PayloadEvolver<br/>smart_payloads/payload_evolver.py"]
        FL["FeedbackLoop<br/>smart_payloads/feedback_loop.py"]
    end
    
    subgraph "Verification"
        HBV["HeadlessBrowserXSSVerifier<br/>headless_xss_verifier.py"]
        OAST["OASTProvider<br/>oast_provider.py"]
        SHT["SessionHijackingTester<br/>session_hijacking_tester.py"]
    end
    
    subgraph "Reporting"
        EHR["EnhancedHTMLReporter<br/>enhanced_reporter.py"]
        RE["ReportExporter<br/>report_exporter.py"]
    end
    
    subgraph "Infrastructure"
        Config["WSHawkConfig<br/>config.py"]
        RSession["ResilientSession<br/>resilient_session.py"]
        RL["TokenBucketRateLimiter<br/>rate_limiter.py"]
        DB["ScanDatabase<br/>web/database.py"]
    end
    
    subgraph "Integrations"
        JiraInt["JiraIntegration<br/>integrations/jira_connector.py"]
        DDInt["DefectDojoIntegration<br/>integrations/defectdojo.py"]
        WHInt["WebhookNotifier<br/>integrations/webhook.py"]
    end
    
    CLI --> V2
    AdvCLI --> V2
    AdvCLI --> WebApp
    IntCLI --> Legacy
    DefCLI --> DefMode["DefensiveValidator"]
    
    V2 --> Config
    V2 --> MA
    V2 --> VV
    V2 --> SF
    V2 --> SM
    V2 --> RL
    V2 --> WSP
    
    V2 --> CAG
    V2 --> PE
    V2 --> FL
    
    V2 --> HBV
    V2 --> OAST
    V2 --> SHT
    
    V2 --> EHR
    V2 --> RE
    
    V2 --> JiraInt
    V2 --> DDInt
    V2 --> WHInt
    
    JiraInt --> RSession
    DDInt --> RSession
    WHInt --> RSession
    OAST --> RSession
    
    WebApp --> DB
    WebApp --> Config
    V2 --> DB

Key Observations:

  1. Central Role of WSHawkV2: The WSHawkV2 class (wshawk/scanner_v2.py:35) is the primary integration point, instantiating and coordinating all subsystems.

  2. ResilientSession as Infrastructure: The ResilientSession wraps network I/O for all external communications (integrations, OAST), providing a consistent error handling interface.

  3. Shared Database: Both the web dashboard and the scanner use ScanDatabase for persistence, enabling the web UI to display scan history from CLI runs.

  4. Configuration Cascade: The WSHawkConfig is loaded once at startup and passed to all components, ensuring consistent behavior across modules.

  5. Optional Dependencies: Verification modules (Playwright, OAST) are conditionally initialized based on configuration, allowing the scanner to operate in resource-constrained environments.

Sources: wshawk/scanner_v2.py:1-100, wshawk/advanced_cli.py:1-300