Analysis and Verification Modules

Purpose and Scope

This document covers the three core analysis modules that form the intelligence layer of WSHawk's scanning engine: MessageAnalyzer, VulnerabilityVerifier, and ServerFingerprinter. These modules work together to understand message formats, verify potential vulnerabilities with low false-positive rates, and adapt testing strategies based on detected server technologies.

For information about the overall scanner orchestration, see WSHawkV2 Scanner Engine. For payload mutation and WAF evasion strategies, see Payload Mutation and WAF Evasion. For browser-based XSS verification specifically, see Playwright XSS Verification. For out-of-band vulnerability detection, see OAST Blind Vulnerability Detection.


Module Architecture Overview

The three analysis modules operate during different phases of the scanning lifecycle and provide complementary intelligence:

graph TB
    subgraph "WSHawkV2 Scanner"
        Scanner["WSHawkV2<br/>(scanner_v2.py)"]
    end
    
    subgraph "Analysis Modules"
        MA["MessageAnalyzer<br/>(message_intelligence.py)"]
        VV["VulnerabilityVerifier<br/>(vulnerability_verifier.py)"]
        SF["ServerFingerprinter<br/>(server_fingerprint.py)"]
    end
    
    subgraph "Module Outputs"
        Format["MessageFormat<br/>JSON/XML/BINARY/TEXT"]
        Confidence["ConfidenceLevel<br/>CRITICAL/HIGH/MEDIUM/LOW"]
        Fingerprint["ServerFingerprint<br/>language/framework/database"]
    end
    
    subgraph "Scanner Operations"
        Learn["Learning Phase<br/>observe traffic"]
        Inject["Payload Injection<br/>inject into fields"]
        Verify["Response Verification<br/>detect vulnerabilities"]
        Adapt["Adaptive Testing<br/>server-specific payloads"]
    end
    
    Scanner --> MA
    Scanner --> VV
    Scanner --> SF
    
    MA --> Format
    VV --> Confidence
    SF --> Fingerprint
    
    Learn --> MA
    Learn --> SF
    MA --> Inject
    VV --> Verify
    SF --> Adapt
    
    Inject --> Verify

Sources: wshawk/scanner_v2.py:1-678


MessageAnalyzer

Purpose and Capabilities

The MessageAnalyzer module detects the structure and format of WebSocket messages during the learning phase, then injects payloads into the fields it discovered. This removes the need to configure message structure by hand.

Key Responsibilities:

  • Detect message format (JSON, XML, binary, plaintext)
  • Identify injectable fields within structured messages
  • Generate injected variants of base messages with payloads
  • Learn patterns from observed traffic

Module Initialization and Usage

The MessageAnalyzer is instantiated during scanner initialization:

wshawk/scanner_v2.py:41

self.message_analyzer = MessageAnalyzer()

Learning Phase Integration

During the learning phase, the analyzer builds understanding of the message structure:

sequenceDiagram
    participant Scanner as WSHawkV2
    participant MA as MessageAnalyzer
    participant SF as ServerFingerprinter
    participant WS as WebSocket
    
    Scanner->>WS: connect()
    Scanner->>WS: recv() x N messages
    loop Each Message
        Scanner->>MA: (implicit) observe message
        Scanner->>SF: add_response(message)
    end
    
    Scanner->>MA: learn_from_messages(samples)
    MA->>MA: detect format
    MA->>MA: identify injectable fields
    MA-->>Scanner: format_info
    
    Scanner->>MA: get_format_info()
    MA-->>Scanner: {format, injectable_fields}
    
    Scanner->>SF: fingerprint()
    SF-->>Scanner: ServerFingerprint

Sources: wshawk/scanner_v2.py:87-141

The scanner calls learn_from_messages() with collected samples:

wshawk/scanner_v2.py:120-129

Message Format Detection

The module exposes a MessageFormat enum with detected formats:

| Format | Description | Typical Indicators |
|--------|-------------|-------------------|
| JSON | JSON-structured messages | Valid JSON parsing, contains objects/arrays |
| XML | XML-structured messages | XML tags, declaration headers |
| BINARY | Binary protocol messages | Non-UTF8 bytes, protocol buffers |
| TEXT | Plain text messages | Simple strings without structure |
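
The indicators in the table can be sketched as a small classifier. This is a minimal illustration under stated assumptions, not WSHawk's implementation: the `detect_format` helper and the string values of the enum members are hypothetical, and only the four format names come from this page.

```python
import json
import xml.etree.ElementTree as ET
from enum import Enum


class MessageFormat(Enum):
    # Member names follow the table above; the values are assumptions.
    JSON = "json"
    XML = "xml"
    BINARY = "binary"
    TEXT = "text"


def detect_format(message):
    """Classify one WebSocket frame (hypothetical helper)."""
    if isinstance(message, (bytes, bytearray)):
        try:
            message = bytes(message).decode("utf-8")
        except UnicodeDecodeError:
            # Bytes that are not valid UTF-8 are treated as a binary protocol.
            return MessageFormat.BINARY
    text = message.strip()
    if text.startswith(("{", "[")):
        try:
            json.loads(text)
            return MessageFormat.JSON
        except ValueError:
            pass  # looked like JSON but did not parse; fall through
    if text.startswith("<"):
        try:
            ET.fromstring(text)
            return MessageFormat.XML
        except ET.ParseError:
            pass  # looked like XML but did not parse; fall through
    return MessageFormat.TEXT
```

Anything that fails both parsers falls back to TEXT, which matches the "other" branch of the injection strategy described later on this page.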

Format information retrieval:

wshawk/scanner_v2.py:125-129

The scanner logs the detected format and the injectable fields so the operator can see what was learned.

Payload Injection

Once the format is learned, the analyzer generates injected message variants:

wshawk/scanner_v2.py:166-169

Injection Strategy by Format:

graph LR
    Payload["Input Payload"]
    
    subgraph "MessageAnalyzer.inject_payload_into_message()"
        Format{Detected Format}
        
        JSON["JSON Injection<br/>inject into each field"]
        XML["XML Injection<br/>inject into elements/attributes"]
        Text["Text Injection<br/>append/prepend payload"]
    end
    
    Output["List of Injected Messages"]
    
    Payload --> Format
    Format -->|MessageFormat.JSON| JSON
    Format -->|MessageFormat.XML| XML
    Format -->|other| Text
    
    JSON --> Output
    XML --> Output
    Text --> Output

Example Usage in SQL Injection Testing:

wshawk/scanner_v2.py:166-170

If the base message is:

{"action": "search", "query": "test", "limit": 10}

The analyzer generates injected variants like:

{"action": "' OR 1=1--", "query": "test", "limit": 10}
{"action": "search", "query": "' OR 1=1--", "limit": 10}
{"action": "search", "query": "test", "limit": "' OR 1=1--"}
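
The variants above can be reproduced with a few lines of Python. This is a sketch that assumes a flat JSON object and replaces one top-level field at a time; the function name `inject_into_json_fields` is hypothetical, and the real `inject_payload_into_message()` may handle nesting and types differently.

```python
import json


def inject_into_json_fields(base_message, payload):
    """Return one injected variant per top-level field (illustrative sketch)."""
    base = json.loads(base_message)
    variants = []
    for field in base:
        mutated = dict(base)  # shallow copy so the other fields keep their values
        mutated[field] = payload
        variants.append(json.dumps(mutated))
    return variants


base = '{"action": "search", "query": "test", "limit": 10}'
for variant in inject_into_json_fields(base, "' OR 1=1--"):
    print(variant)
```

Note that the numeric `limit` field becomes a string in its variant, as in the third example above.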

Sources: wshawk/scanner_v2.py:143-210, wshawk/scanner_v2.py:212-290, wshawk/scanner_v2.py:292-356


VulnerabilityVerifier

Purpose and Confidence Levels

The VulnerabilityVerifier module performs heuristic analysis on server responses to determine if a vulnerability exists and assess confidence level. This reduces false positives by going beyond simple reflection-based detection.

Confidence Levels:

| Level | Enum Value | Meaning |
|-------|-----------|---------|
| CRITICAL | ConfidenceLevel.CRITICAL | Browser-verified execution (XSS) or confirmed exploitation |
| HIGH | ConfidenceLevel.HIGH | Strong indicators, high probability of vulnerability |
| MEDIUM | ConfidenceLevel.MEDIUM | Moderate indicators, likely vulnerable |
| LOW | ConfidenceLevel.LOW | Weak indicators, possible false positive |

Module Initialization

wshawk/scanner_v2.py:42

self.verifier = VulnerabilityVerifier()

Verification Methods

The VulnerabilityVerifier provides specialized verification methods for each vulnerability type:

graph TB
    VV["VulnerabilityVerifier<br/>(vulnerability_verifier.py)"]
    
    subgraph "Verification Methods"
        SQL["verify_sql_injection()<br/>SQL error patterns<br/>timing attacks"]
        XSS["verify_xss()<br/>context analysis<br/>payload reflection<br/>encoding detection"]
        CMD["verify_command_injection()<br/>command output patterns<br/>timing-based detection"]
        Path["verify_path_traversal()<br/>file content patterns<br/>path indicators"]
    end
    
    subgraph "Return Values"
        Bool["is_vulnerable: bool"]
        Conf["confidence: ConfidenceLevel"]
        Desc["description: str"]
    end
    
    VV --> SQL
    VV --> XSS
    VV --> CMD
    VV --> Path
    
    SQL --> Bool
    SQL --> Conf
    SQL --> Desc
    
    XSS --> Bool
    XSS --> Conf
    XSS --> Desc
    
    CMD --> Bool
    CMD --> Conf
    CMD --> Desc
    
    Path --> Bool
    Path --> Conf
    Path --> Desc

Sources: wshawk/scanner_v2.py:182-185, wshawk/scanner_v2.py:241-243, wshawk/scanner_v2.py:329-331, wshawk/scanner_v2.py:375

SQL Injection Verification

The verifier analyzes responses for SQL error patterns and anomalies:

wshawk/scanner_v2.py:182-199

Verification Logic:

  1. Checks for database error messages (MySQL, PostgreSQL, MSSQL, Oracle)
  2. Analyzes query structure disruption
  3. Detects timing-based anomalies
  4. Returns confidence based on indicator strength
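
A rough sketch of how steps 1, 3, and 4 might combine is shown below. The function `check_sql_indicators`, its parameters, the enum values, and the pattern list are all assumptions made for illustration; the actual `verify_sql_injection()` signature is not shown on this page. The error strings are common database error fragments.

```python
import re
from enum import Enum


class ConfidenceLevel(Enum):
    # Names follow the table above; the numeric values are assumptions.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Common error fragments per database (illustrative, not exhaustive).
SQL_ERROR_PATTERNS = {
    "MySQL": re.compile(r"you have an error in your sql syntax", re.IGNORECASE),
    "PostgreSQL": re.compile(r"syntax error at or near", re.IGNORECASE),
    "MSSQL": re.compile(r"unclosed quotation mark after the character string", re.IGNORECASE),
    "Oracle": re.compile(r"ORA-\d{5}", re.IGNORECASE),
}


def check_sql_indicators(response, elapsed=0.0, expected_delay=None):
    """Return (is_vulnerable, confidence, description) for one response."""
    for database, pattern in SQL_ERROR_PATTERNS.items():
        if pattern.search(response):
            return True, ConfidenceLevel.HIGH, f"{database} error message in response"
    # A delay at least as long as the injected sleep is a weaker, timing-based signal.
    if expected_delay is not None and elapsed >= expected_delay:
        return True, ConfidenceLevel.MEDIUM, f"response delayed by {elapsed:.1f}s"
    return False, ConfidenceLevel.LOW, "no SQL injection indicators"
```

An error message is rated above a timing anomaly here because network jitter can produce delays on its own; that ranking is a design choice of this sketch.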

XSS Verification

XSS verification performs context-aware analysis:

wshawk/scanner_v2.py:241-280

Multi-Stage Verification:

sequenceDiagram
    participant Test as test_xss_v2()
    participant VV as VulnerabilityVerifier
    participant HV as HeadlessBrowserXSSVerifier
    
    Test->>VV: verify_xss(response, payload)
    VV->>VV: check reflection
    VV->>VV: analyze context
    VV->>VV: detect encoding
    VV-->>Test: (is_vuln, confidence, description)
    
    alt confidence == HIGH
        Test->>HV: verify_xss_execution(response, payload)
        HV->>HV: render in browser
        HV->>HV: detect script execution
        HV-->>Test: (is_executed, evidence)
        
        alt is_executed
            Test->>Test: upgrade to CRITICAL
            Test->>Test: mark browser_verified=True
        end
    end

Sources: wshawk/scanner_v2.py:248-268

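Findings rated HIGH are escalated to browser verification (see Playwright XSS Verification). If the browser confirms that the payload executed, the finding is upgraded to CRITICAL and marked browser_verified=True.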
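
The reflection, context, and encoding checks, together with the escalation decision, could look roughly like the sketch below. The helper names, the marker list, and the enum values are hypothetical; only the `(is_vuln, confidence, description)` return shape and the HIGH-to-browser rule come from the diagram above.

```python
import html
from enum import Enum


class ConfidenceLevel(Enum):
    # Names follow this page; the values are assumptions.
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


# Substrings suggesting the payload could run script if rendered (illustrative list).
EXECUTABLE_MARKERS = ("<script", "onerror=", "onload=", "javascript:")


def check_xss_reflection(response, payload):
    """Return (is_vulnerable, confidence, description) for one response."""
    if payload in response:
        if any(marker in payload.lower() for marker in EXECUTABLE_MARKERS):
            return True, ConfidenceLevel.HIGH, "executable payload reflected unencoded"
        return True, ConfidenceLevel.LOW, "payload reflected, no executable markers"
    if html.escape(payload) in response:
        # The server echoed the payload but neutralised the markup.
        return False, ConfidenceLevel.LOW, "payload reflected but HTML-encoded"
    return False, ConfidenceLevel.LOW, "payload not reflected"


def should_escalate_to_browser(confidence, use_headless_browser):
    """Only HIGH findings go to the browser, and only when the feature is enabled."""
    return use_headless_browser and confidence is ConfidenceLevel.HIGH
```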

Command Injection Verification

Command injection verification detects execution indicators:

wshawk/scanner_v2.py:329-346

Detection Patterns:

  • Command output strings (e.g., uid=, gid=, groups=)
  • System file contents (e.g., /etc/passwd patterns)
  • Timing-based detection for sleep commands
  • Error messages from shell interpreters
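
The patterns listed above can be expressed as a small table of regular expressions plus a timing check. This is an illustrative sketch; the pattern set and the `find_command_indicators` helper are assumptions, not the verifier's actual rules.

```python
import re

# Illustrative patterns for the indicator types listed above.
CMD_OUTPUT_PATTERNS = {
    "id output": re.compile(r"uid=\d+\([^)]*\)\s+gid=\d+"),
    "passwd entry": re.compile(r"root:[^:]*:0:0:"),
    "shell error": re.compile(r"(?:sh|bash): .*not found"),
}


def find_command_indicators(response, elapsed=0.0, sleep_seconds=None):
    """Return the names of all indicators present in a response."""
    hits = [name for name, pattern in CMD_OUTPUT_PATTERNS.items()
            if pattern.search(response)]
    # Timing check applies only when the payload contained a sleep command.
    if sleep_seconds is not None and elapsed >= sleep_seconds:
        hits.append("timing delay")
    return hits
```

Returning every matching indicator, rather than stopping at the first, lets a caller raise confidence when several independent signals agree.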

Path Traversal Verification

Path traversal verification identifies file access:

wshawk/scanner_v2.py:375-388

Verification Indicators:

  • Unix system file patterns (root:x:0:0, /etc/passwd format)
  • Windows file patterns (Windows Registry keys, system paths)
  • Directory listing patterns
  • File existence errors
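
One way to turn these indicators into a graded result is to weigh them by how hard they are to produce by accident. The sketch below is an assumption about how such grading could work: several passwd-style lines rate higher than a single one, and a bare file error rates lowest. The function name and the thresholds are hypothetical.

```python
import re

# A line shaped like an /etc/passwd entry: name:password:uid:gid:...
PASSWD_LINE = re.compile(r"^[A-Za-z_][\w.-]*:[^:\n]*:\d+:\d+:", re.MULTILINE)
FILE_ERROR = re.compile(r"no such file or directory|permission denied", re.IGNORECASE)


def rate_path_traversal(response):
    """Return "HIGH", "MEDIUM", "LOW", or None for one response (sketch)."""
    passwd_lines = len(PASSWD_LINE.findall(response))
    if passwd_lines >= 2:
        return "HIGH"    # multiple entries: very likely real file contents
    if passwd_lines == 1:
        return "MEDIUM"  # a single entry could be an echoed string
    if FILE_ERROR.search(response):
        return "LOW"     # the path reached the filesystem, but nothing was read
    return None
```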

Sources: wshawk/scanner_v2.py:358-397


ServerFingerprinter

Purpose and Capabilities

The ServerFingerprinter module identifies the underlying technology stack by analyzing server responses, error messages, and behavior patterns. This enables adaptive testing with server-specific payloads.

Detection Capabilities:

  • Programming language (Python, Node.js, PHP, Java, Ruby, Go)
  • Web framework (Django, Express, Flask, Spring, Rails)
  • Database system (MySQL, PostgreSQL, MongoDB, MSSQL, Oracle)

Module Initialization

wshawk/scanner_v2.py:43

self.fingerprinter = ServerFingerprinter()

Fingerprinting Process

The fingerprinter accumulates evidence during the learning phase:

graph LR
    subgraph "Evidence Collection"
        Resp["Server Responses"]
        Errors["Error Messages"]
        Headers["Response Patterns"]
    end
    
    subgraph "ServerFingerprinter"
        Add["add_response()"]
        Analyze["Pattern Matching"]
        Score["Confidence Scoring"]
    end
    
    subgraph "Output"
        FP["ServerFingerprint Object"]
        Lang["language: str"]
        FW["framework: str"]
        DB["database: str"]
    end
    
    Resp --> Add
    Errors --> Add
    Headers --> Add
    
    Add --> Analyze
    Analyze --> Score
    Score --> FP
    
    FP --> Lang
    FP --> FW
    FP --> DB

Learning Phase Integration:

wshawk/scanner_v2.py:105-136

Each received message is added to the fingerprinter for analysis, then fingerprint() is called to generate conclusions.
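
The accumulate-then-conclude pattern can be sketched as follows. The class `EvidenceFingerprinter`, the `Fingerprint` dataclass, and the signature table are hypothetical stand-ins; only the `add_response()` and `fingerprint()` method names and the language/framework/database fields come from this page.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# (category, technology) -> lowercase substrings counted as evidence (illustrative).
SIGNATURES = {
    ("language", "Python"): ("traceback (most recent call last)",),
    ("language", "Node.js"): ("at object.<anonymous>", "node_modules"),
    ("framework", "Django"): ("django",),
    ("framework", "Express"): ("express",),
    ("database", "MySQL"): ("mysql", "you have an error in your sql syntax"),
    ("database", "PostgreSQL"): ("postgresql", "psycopg2"),
}


@dataclass
class Fingerprint:
    language: Optional[str] = None
    framework: Optional[str] = None
    database: Optional[str] = None


class EvidenceFingerprinter:
    def __init__(self):
        self.scores = Counter()

    def add_response(self, response):
        """Add one point per signature substring found in the response."""
        lowered = response.lower()
        for key, needles in SIGNATURES.items():
            self.scores[key] += sum(needle in lowered for needle in needles)

    def fingerprint(self):
        """Pick the highest-scoring technology in each category, if any scored."""
        result = Fingerprint()
        for category in ("language", "framework", "database"):
            candidates = {name: score for (cat, name), score in self.scores.items()
                          if cat == category and score > 0}
            if candidates:
                setattr(result, category, max(candidates, key=candidates.get))
        return result
```

A category with no evidence stays `None`, which lets the scanner fall back to generic payloads.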

Recommended Payload Generation

Once the server is fingerprinted, the scanner requests server-specific payloads:

wshawk/scanner_v2.py:153-158

SQL Injection Example:

If fingerprint.database == "MySQL", recommended payloads might include:

  • MySQL-specific functions: SLEEP(5), BENCHMARK()
  • MySQL comment syntax: -- , #
  • MySQL type-conversion functions: CAST(), CONVERT()
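
Payload selection keyed on the detected database can be as simple as a lookup with a generic fallback. The table contents and the `recommended_sql_payloads` helper below are illustrative assumptions, not the output of `get_recommended_payloads()`.

```python
# Payloads that work across most SQL dialects (illustrative).
GENERIC_SQL = ["' OR 1=1--", "'"]

# Database-specific additions, tried before the generic ones (illustrative).
DB_PAYLOADS = {
    "MySQL": ["' AND SLEEP(5)-- ", "' OR 1=1#"],
    "PostgreSQL": ["'; SELECT pg_sleep(5)--"],
}


def recommended_sql_payloads(database):
    """Return database-specific payloads first, then the generic set."""
    return DB_PAYLOADS.get(database, []) + GENERIC_SQL


print(recommended_sql_payloads("MySQL"))
```

The trailing space in the first MySQL payload is deliberate: MySQL treats `--` as a comment only when it is followed by whitespace.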

Command Injection Example:

wshawk/scanner_v2.py:302-307

If fingerprint.language == "Node.js", recommended payloads might include:

  • Node-specific syntax: require('child_process').exec()
  • JavaScript eval patterns
  • Node.js error patterns

Fingerprint Information Export

The fingerprint data is included in reports:

wshawk/scanner_v2.py:657-663

Sources: wshawk/scanner_v2.py:131-136, wshawk/scanner_v2.py:153-158, wshawk/scanner_v2.py:302-307, wshawk/scanner_v2.py:657-663


Integration with Scanner

Data Flow During Scanning

The three modules work together throughout the scanning lifecycle:

graph TB
    subgraph "Phase 1: Learning"
        Connect["Connect to WebSocket"]
        Observe["Observe Traffic<br/>5 seconds"]
        Learn["MessageAnalyzer<br/>learn_from_messages()"]
        Finger["ServerFingerprinter<br/>fingerprint()"]
    end
    
    subgraph "Phase 2: Testing"
        GetBase["Get Base Message<br/>from samples"]
        Inject["MessageAnalyzer<br/>inject_payload_into_message()"]
        GetRec["ServerFingerprinter<br/>get_recommended_payloads()"]
        Send["Send Injected Messages"]
        Receive["Receive Responses"]
    end
    
    subgraph "Phase 3: Verification"
        Verify["VulnerabilityVerifier<br/>verify_*()"]
        Assess["Assess Confidence"]
        Browser["Browser Verification<br/>(if HIGH)"]
        Record["Record Vulnerability"]
    end
    
    Connect --> Observe
    Observe --> Learn
    Observe --> Finger
    
    Learn --> GetBase
    Finger --> GetRec
    GetBase --> Inject
    GetRec --> Inject
    Inject --> Send
    Send --> Receive
    
    Receive --> Verify
    Verify --> Assess
    Assess --> Browser
    Assess --> Record

Sources: wshawk/scanner_v2.py:542-677

Module State and Lifecycle

Each module maintains state throughout the scan:

| Module | State Tracked | Lifecycle |
|--------|--------------|-----------|
| MessageAnalyzer | detected_format, learned field names | Initialized once, learns during learning phase, used throughout testing |
| VulnerabilityVerifier | Stateless (pure verification logic) | Initialized once, called repeatedly during testing |
| ServerFingerprinter | Accumulated response patterns, confidence scores | Initialized once, accumulates evidence during learning, queried during testing |

Example: SQL Injection Test Flow

Complete flow showing module interaction:

wshawk/scanner_v2.py:143-210

  1. Get base message from self.sample_messages[0] (collected by the scanner during the learning phase)
  2. Get recommended payloads from ServerFingerprinter if database detected
  3. Inject payloads using MessageAnalyzer.inject_payload_into_message()
  4. Send and receive messages via WebSocket
  5. Verify responses using VulnerabilityVerifier.verify_sql_injection()
  6. Record vulnerabilities if confidence level is MEDIUM or higher
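
Steps 4 to 6 can be condensed into a short loop. This is a synchronous sketch under stated assumptions: the transport and the verifier are passed in as plain callables so the example runs without a WebSocket, and `run_injection_test` and the finding dictionary are hypothetical. The MEDIUM threshold is the one stated in step 6.

```python
from enum import IntEnum


class ConfidenceLevel(IntEnum):
    # IntEnum so levels can be compared; the numeric values are assumptions.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def run_injection_test(injected_messages, send_and_receive, verify):
    """Send each injected message and keep findings rated MEDIUM or higher."""
    findings = []
    for message in injected_messages:
        response = send_and_receive(message)
        is_vulnerable, confidence, description = verify(response, message)
        if is_vulnerable and confidence >= ConfidenceLevel.MEDIUM:
            findings.append({
                "message": message,
                "confidence": confidence.name,
                "description": description,
            })
    return findings
```

Filtering on the confidence level at this point keeps LOW-rated matches out of the report without discarding the verifier's result.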

Module Configuration Flags

The scanner exposes configuration for module behavior:

wshawk/scanner_v2.py:52-62

  • learning_complete: Flag indicating if learning phase succeeded
  • use_headless_browser: Enable browser-based XSS verification (escalation from VulnerabilityVerifier)
  • use_oast: Enable out-of-band testing for blind vulnerabilities

Sources: wshawk/scanner_v2.py:28-76, wshawk/scanner_v2.py:143-210


Summary

The analysis and verification modules form the intelligence layer of WSHawk:

| Module | Primary Function | Key Output | Integration Point |
|--------|------------------|------------|-------------------|
| MessageAnalyzer | Understand message structure and inject payloads | MessageFormat, injected messages | Learning phase, all vulnerability tests |
| VulnerabilityVerifier | Heuristically verify vulnerabilities from responses | ConfidenceLevel, descriptions | All vulnerability tests |
| ServerFingerprinter | Identify server technology stack | ServerFingerprint, recommended payloads | Learning phase, adaptive payload selection |

These modules enable WSHawk to:

  • Operate without configuration: Automatically learns message formats
  • Reduce false positives: Multi-factor verification with confidence levels
  • Adapt to targets: Server-specific payload recommendations
  • Escalate verification: Integration with browser and OAST verification for high-confidence findings

Sources: wshawk/scanner_v2.py:1-678, CHANGELOG.md:1-101