Analysis and Verification Modules
The following files were used as context for generating this wiki page:
- .github/workflows/ghcr-publish.yml
- README.md
- requirements.txt
- wshawk/main.py
- wshawk/advanced_cli.py
- wshawk/scanner_v2.py
Purpose and Scope
This document covers the three core analysis modules that form the intelligence layer of WSHawk's scanning engine: MessageAnalyzer, VulnerabilityVerifier, and ServerFingerprinter. These modules work together to understand message formats, verify potential vulnerabilities with low false-positive rates, and adapt testing strategies based on detected server technologies.
For information about the overall scanner orchestration, see WSHawkV2 Scanner Engine. For payload mutation and WAF evasion strategies, see Payload Mutation and WAF Evasion. For browser-based XSS verification specifically, see Playwright XSS Verification. For out-of-band vulnerability detection, see OAST Blind Vulnerability Detection.
Module Architecture Overview
The three analysis modules operate during different phases of the scanning lifecycle and provide complementary intelligence:
graph TB
subgraph "WSHawkV2 Scanner"
Scanner["WSHawkV2<br/>(scanner_v2.py)"]
end
subgraph "Analysis Modules"
MA["MessageAnalyzer<br/>(message_intelligence.py)"]
VV["VulnerabilityVerifier<br/>(vulnerability_verifier.py)"]
SF["ServerFingerprinter<br/>(server_fingerprint.py)"]
end
subgraph "Module Outputs"
Format["MessageFormat<br/>JSON/XML/BINARY/TEXT"]
Confidence["ConfidenceLevel<br/>CRITICAL/HIGH/MEDIUM/LOW"]
Fingerprint["ServerFingerprint<br/>language/framework/database"]
end
subgraph "Scanner Operations"
Learn["Learning Phase<br/>observe traffic"]
Inject["Payload Injection<br/>inject into fields"]
Verify["Response Verification<br/>detect vulnerabilities"]
Adapt["Adaptive Testing<br/>server-specific payloads"]
end
Scanner --> MA
Scanner --> VV
Scanner --> SF
MA --> Format
VV --> Confidence
SF --> Fingerprint
Learn --> MA
Learn --> SF
MA --> Inject
VV --> Verify
SF --> Adapt
Inject --> Verify
Sources: wshawk/scanner_v2.py:1-678
MessageAnalyzer
Purpose and Capabilities
The MessageAnalyzer module detects the structure and format of WebSocket messages during the learning phase, then injects payloads into the fields it discovered. This removes the need to configure message structure manually.
Key Responsibilities:
- Detect message format (JSON, XML, binary, plaintext)
- Identify injectable fields within structured messages
- Generate injected variants of base messages with payloads
- Learn patterns from observed traffic
Module Initialization and Usage
The MessageAnalyzer is instantiated during scanner initialization:
self.message_analyzer = MessageAnalyzer()
Learning Phase Integration
During the learning phase, the analyzer builds understanding of the message structure:
sequenceDiagram
participant Scanner as WSHawkV2
participant MA as MessageAnalyzer
participant SF as ServerFingerprinter
participant WS as WebSocket
Scanner->>WS: connect()
Scanner->>WS: recv() x N messages
loop Each Message
Scanner->>MA: (implicit) observe message
Scanner->>SF: add_response(message)
end
Scanner->>MA: learn_from_messages(samples)
MA->>MA: detect format
MA->>MA: identify injectable fields
MA-->>Scanner: format_info
Scanner->>MA: get_format_info()
MA-->>Scanner: {format, injectable_fields}
Scanner->>SF: fingerprint()
SF-->>Scanner: ServerFingerprint
Sources: wshawk/scanner_v2.py:87-141
Once enough samples are collected, the scanner calls learn_from_messages() with them.
Message Format Detection
The module exposes a MessageFormat enum with detected formats:
| Format | Description | Typical Indicators |
|--------|-------------|-------------------|
| JSON | JSON-structured messages | Valid JSON parsing, contains objects/arrays |
| XML | XML-structured messages | XML tags, declaration headers |
| BINARY | Binary protocol messages | Non-UTF8 bytes, protocol buffers |
| TEXT | Plain text messages | Simple strings without structure |
Format information retrieval: after learning, the scanner calls get_format_info() and logs the detected format and injectable fields for transparency.
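The indicators in the table above can be turned into a simple classifier. The following is a minimal sketch only: detect_format and this local MessageFormat enum are hypothetical stand-ins written for illustration, and the order of the checks is an assumption rather than WSHawk's actual logic.

```python
import json
import xml.etree.ElementTree as ET
from enum import Enum


class MessageFormat(Enum):
    # Names follow the table above; the values are illustrative.
    JSON = "json"
    XML = "xml"
    BINARY = "binary"
    TEXT = "text"


def detect_format(message):
    """Hypothetical helper: classify a single WebSocket message."""
    if isinstance(message, (bytes, bytearray)):
        try:
            message = bytes(message).decode("utf-8")
        except UnicodeDecodeError:
            return MessageFormat.BINARY  # non-UTF-8 bytes
    text = message.strip()
    if text[:1] in ("{", "["):
        try:
            json.loads(text)
            return MessageFormat.JSON
        except ValueError:  # json.JSONDecodeError is a ValueError
            pass
    if text.startswith("<"):
        try:
            ET.fromstring(text)
            return MessageFormat.XML
        except ET.ParseError:
            pass
    # Anything that parses as neither JSON nor XML is treated as plain text.
    return MessageFormat.TEXT
```

Trying the strict parsers first and falling back to TEXT keeps malformed structured messages from being misclassified.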
Payload Injection
Once the format is learned, the analyzer generates injected message variants through inject_payload_into_message().
Injection Strategy by Format:
graph LR
Payload["Input Payload"]
subgraph "MessageAnalyzer.inject_payload_into_message()"
Format{Detected Format}
JSON["JSON Injection<br/>inject into each field"]
XML["XML Injection<br/>inject into elements/attributes"]
Text["Text Injection<br/>append/prepend payload"]
end
Output["List of Injected Messages"]
Payload --> Format
Format -->|MessageFormat.JSON| JSON
Format -->|MessageFormat.XML| XML
Format -->|other| Text
JSON --> Output
XML --> Output
Text --> Output
Example Usage in SQL Injection Testing:
If the base message is:
{"action": "search", "query": "test", "limit": 10}
The analyzer generates injected variants like:
{"action": "' OR 1=1--", "query": "test", "limit": 10}
{"action": "search", "query": "' OR 1=1--", "limit": 10}
{"action": "search", "query": "test", "limit": "' OR 1=1--"}
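The variants above follow a one-field-at-a-time pattern. A minimal sketch of that pattern for flat JSON objects is shown here; inject_into_json_fields is a hypothetical helper, not the signature of inject_payload_into_message(), and handling of nested fields is left out.

```python
import copy
import json


def inject_into_json_fields(base_message, payload):
    """Hypothetical helper: return one injected variant per top-level field."""
    base = json.loads(base_message)
    if not isinstance(base, dict):
        return []  # this sketch only handles JSON objects
    variants = []
    for field in base:
        mutated = copy.deepcopy(base)
        mutated[field] = payload  # replace one field, keep the others intact
        variants.append(json.dumps(mutated))
    return variants
```

Changing a single field per variant makes it possible to attribute a suspicious response to the exact field that triggered it.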
Sources: wshawk/scanner_v2.py:143-210, wshawk/scanner_v2.py:212-290, wshawk/scanner_v2.py:292-356
VulnerabilityVerifier
Purpose and Confidence Levels
The VulnerabilityVerifier module performs heuristic analysis on server responses to determine if a vulnerability exists and assess confidence level. This reduces false positives by going beyond simple reflection-based detection.
Confidence Levels:
| Level | Enum Value | Meaning |
|-------|-----------|---------|
| CRITICAL | ConfidenceLevel.CRITICAL | Browser-verified execution (XSS) or confirmed exploitation |
| HIGH | ConfidenceLevel.HIGH | Strong indicators, high probability of vulnerability |
| MEDIUM | ConfidenceLevel.MEDIUM | Moderate indicators, likely vulnerable |
| LOW | ConfidenceLevel.LOW | Weak indicators, possible false positive |
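Because the levels form an ordered scale, a threshold check such as "MEDIUM or higher" is easiest to express with a comparable enum. The sketch below assumes integer ranks; the numeric values and the should_report helper are illustrative assumptions and may not match how WSHawk defines ConfidenceLevel.

```python
from enum import IntEnum


class ConfidenceLevel(IntEnum):
    # Assumed ranks: a higher number means stronger evidence.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def should_report(confidence, threshold=ConfidenceLevel.MEDIUM):
    """Hypothetical helper: keep findings at or above the threshold."""
    return confidence >= threshold
```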
Module Initialization
self.verifier = VulnerabilityVerifier()
Verification Methods
The VulnerabilityVerifier provides specialized verification methods for each vulnerability type:
graph TB
VV["VulnerabilityVerifier<br/>(vulnerability_verifier.py)"]
subgraph "Verification Methods"
SQL["verify_sql_injection()<br/>SQL error patterns<br/>timing attacks"]
XSS["verify_xss()<br/>context analysis<br/>payload reflection<br/>encoding detection"]
CMD["verify_command_injection()<br/>command output patterns<br/>timing-based detection"]
Path["verify_path_traversal()<br/>file content patterns<br/>path indicators"]
end
subgraph "Return Values"
Bool["is_vulnerable: bool"]
Conf["confidence: ConfidenceLevel"]
Desc["description: str"]
end
VV --> SQL
VV --> XSS
VV --> CMD
VV --> Path
SQL --> Bool
SQL --> Conf
SQL --> Desc
XSS --> Bool
XSS --> Conf
XSS --> Desc
CMD --> Bool
CMD --> Conf
CMD --> Desc
Path --> Bool
Path --> Conf
Path --> Desc
Sources: wshawk/scanner_v2.py:182-185, wshawk/scanner_v2.py:241-243, wshawk/scanner_v2.py:329-331, wshawk/scanner_v2.py:375
SQL Injection Verification
The verifier analyzes responses for SQL error patterns and anomalies.
Verification Logic:
- Checks for database error messages (MySQL, PostgreSQL, MSSQL, Oracle)
- Analyzes query structure disruption
- Detects timing-based anomalies
- Returns confidence based on indicator strength
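A minimal sketch of this kind of check follows. It is not the body of verify_sql_injection(): the function name, the elapsed and delay parameters, the error signatures, and the mapping from indicator to confidence are all assumptions chosen for illustration.

```python
import re
from enum import Enum


class ConfidenceLevel(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


# Illustrative signatures of well-known database error messages.
SQL_ERROR_PATTERNS = {
    "MySQL": re.compile(r"you have an error in your sql syntax", re.I),
    "PostgreSQL": re.compile(r"syntax error at or near", re.I),
    "MSSQL": re.compile(r"unclosed quotation mark after the character string", re.I),
    "Oracle": re.compile(r"ORA-\d{5}"),
}


def verify_sql_injection_sketch(response, payload, elapsed=0.0, delay=None):
    """Hypothetical verifier returning (is_vulnerable, confidence, description)."""
    for database, pattern in SQL_ERROR_PATTERNS.items():
        if pattern.search(response):
            return True, ConfidenceLevel.HIGH, f"{database} error message in response"
    # A time-based payload that slowed the response is a weaker signal.
    if delay is not None and elapsed >= delay:
        return True, ConfidenceLevel.MEDIUM, f"response took {elapsed:.1f}s after a time-based payload"
    if payload in response:
        return False, ConfidenceLevel.LOW, "payload reflected without database error indicators"
    return False, ConfidenceLevel.LOW, "no SQL injection indicators found"
```

Reflection alone is deliberately not treated as a finding here, which is the sense in which the verifier goes beyond reflection-based detection.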
XSS Verification
XSS verification performs context-aware analysis.
Multi-Stage Verification:
sequenceDiagram
participant Test as test_xss_v2()
participant VV as VulnerabilityVerifier
participant HV as HeadlessBrowserXSSVerifier
Test->>VV: verify_xss(response, payload)
VV->>VV: check reflection
VV->>VV: analyze context
VV->>VV: detect encoding
VV-->>Test: (is_vuln, confidence, description)
alt confidence == HIGH
Test->>HV: verify_xss_execution(response, payload)
HV->>HV: render in browser
HV->>HV: detect script execution
HV-->>Test: (is_executed, evidence)
alt is_executed
Test->>Test: upgrade to CRITICAL
Test->>Test: mark browser_verified=True
end
end
Sources: wshawk/scanner_v2.py:248-268
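The scanner escalates HIGH-confidence findings to browser verification; if the browser confirms script execution, the finding is upgraded to CRITICAL and marked as browser-verified (see Playwright XSS Verification).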
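The escalation step in the sequence diagram can be sketched as follows. The function name, the dictionary layout, and the browser_check callable are hypothetical; in the real scanner this role is played by HeadlessBrowserXSSVerifier, whose exact interface is not reproduced here.

```python
from enum import Enum


class ConfidenceLevel(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


def escalate_xss_finding(is_vuln, confidence, browser_check=None):
    """Hypothetical helper: upgrade a HIGH finding if a browser confirms execution.

    browser_check is an optional callable returning (is_executed, evidence).
    """
    finding = {
        "vulnerable": is_vuln,
        "confidence": confidence,
        "browser_verified": False,
    }
    # Only HIGH findings are worth the cost of rendering in a browser.
    if is_vuln and confidence is ConfidenceLevel.HIGH and browser_check is not None:
        is_executed, evidence = browser_check()
        if is_executed:
            finding["confidence"] = ConfidenceLevel.CRITICAL
            finding["browser_verified"] = True
            finding["evidence"] = evidence
    return finding
```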
Command Injection Verification
Command injection verification detects execution indicators:
Detection Patterns:
- Command output strings (e.g., uid=, gid=, groups=)
- System file contents (e.g., /etc/passwd patterns)
- Timing-based detection for sleep commands
- Error messages from shell interpreters
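These patterns could be checked with a handful of regular expressions. The sketch below is an assumption about how such matching might look, not WSHawk's implementation; the regexes, the function name, and the elapsed and sleep_seconds parameters are illustrative.

```python
import re

# Output of the Unix `id` command, e.g. "uid=33(www-data) gid=33(www-data) ..."
ID_OUTPUT = re.compile(r"uid=\d+\([^)]*\)\s+gid=\d+\([^)]*\)")
# A passwd-style line: name:password:uid:gid:...
PASSWD_LINE = re.compile(r"^[a-z_][a-z0-9_-]*:[^:\n]*:\d+:\d+:", re.MULTILINE)
# Typical shell complaints such as "sh: 1: foo: not found"
SHELL_ERROR = re.compile(r"\b(?:sh|bash): .*(?:not found|syntax error)", re.IGNORECASE)


def command_injection_indicators(response, elapsed=0.0, sleep_seconds=None):
    """Hypothetical helper: list the execution indicators found in a response."""
    indicators = []
    if ID_OUTPUT.search(response):
        indicators.append("id command output")
    if PASSWD_LINE.search(response):
        indicators.append("/etc/passwd content")
    if SHELL_ERROR.search(response):
        indicators.append("shell error message")
    # A sleep payload that delayed the reply by at least the requested time.
    if sleep_seconds is not None and elapsed >= sleep_seconds:
        indicators.append("timing delay")
    return indicators
```

Returning the list of matched indicators, rather than a single boolean, leaves room to derive a confidence level from how many and which indicators fired.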
Path Traversal Verification
Path traversal verification identifies file access:
Verification Indicators:
- Unix system file patterns (root:x:0:0, /etc/passwd format)
- Windows file patterns (Windows Registry keys, system paths)
- Directory listing patterns
- File existence errors
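A sketch of how file content could be weighted above error messages is shown below. Everything in it is assumed for illustration: the function name, the "strong" and "weak" labels, and in particular the win.ini markers, which stand in for the Windows patterns mentioned above and are not taken from WSHawk.

```python
import re

# The root entry of /etc/passwd, e.g. "root:x:0:0:root:/root:/bin/bash"
PASSWD_ROOT = re.compile(r"^root:[^:\n]*:0:0:", re.MULTILINE)
# Assumed markers of a Windows win.ini file (illustrative only).
WIN_INI_MARKERS = ("; for 16-bit app support", "[extensions]")
# Errors that suggest the server tried to open the requested path.
FILE_ERRORS = ("No such file or directory", "Permission denied")


def path_traversal_evidence(response):
    """Hypothetical helper returning (strength, reason)."""
    if PASSWD_ROOT.search(response):
        return "strong", "Unix passwd entry for root"
    if any(marker in response for marker in WIN_INI_MARKERS):
        return "strong", "win.ini content"
    for error in FILE_ERRORS:
        if error in response:
            # The path reached the filesystem, but no content leaked.
            return "weak", f"file access error: {error}"
    return None, "no file access indicators"
```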
Sources: wshawk/scanner_v2.py:358-397
ServerFingerprinter
Purpose and Capabilities
The ServerFingerprinter module identifies the underlying technology stack by analyzing server responses, error messages, and behavior patterns. This enables adaptive testing with server-specific payloads.
Detection Capabilities:
- Programming language (Python, Node.js, PHP, Java, Ruby, Go)
- Web framework (Django, Express, Flask, Spring, Rails)
- Database system (MySQL, PostgreSQL, MongoDB, MSSQL, Oracle)
Module Initialization
self.fingerprinter = ServerFingerprinter()
Fingerprinting Process
The fingerprinter accumulates evidence during the learning phase:
graph LR
subgraph "Evidence Collection"
Resp["Server Responses"]
Errors["Error Messages"]
Headers["Response Patterns"]
end
subgraph "ServerFingerprinter"
Add["add_response()"]
Analyze["Pattern Matching"]
Score["Confidence Scoring"]
end
subgraph "Output"
FP["ServerFingerprint Object"]
Lang["language: str"]
FW["framework: str"]
DB["database: str"]
end
Resp --> Add
Errors --> Add
Headers --> Add
Add --> Analyze
Analyze --> Score
Score --> FP
FP --> Lang
FP --> FW
FP --> DB
Learning Phase Integration:
Each received message is added to the fingerprinter for analysis, then fingerprint() is called to generate conclusions.
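The accumulate-then-conclude pattern can be sketched with a counter per category. The class below is a simplified stand-in: the signature table, the scoring by match count, and the use of None for undetected categories are assumptions, and only the method names add_response() and fingerprint() come from the scanner.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerFingerprint:
    language: Optional[str] = None
    framework: Optional[str] = None
    database: Optional[str] = None


# Illustrative signatures only; a real table would be far larger.
SIGNATURES = {
    "language": {
        "Python": r"Traceback \(most recent call last\)",
        "Node.js": r"at .+ \(.+\.js:\d+:\d+\)",
        "PHP": r"PHP (?:Warning|Fatal error)",
    },
    "framework": {"Django": r"django\.", "Express": r"node_modules/express"},
    "database": {
        "MySQL": r"MySQL",
        "PostgreSQL": r"PostgreSQL|psycopg2",
        "MongoDB": r"MongoError",
    },
}


class FingerprinterSketch:
    def __init__(self):
        self.scores = {category: Counter() for category in SIGNATURES}

    def add_response(self, response):
        """Record every signature that matches this response."""
        for category, patterns in SIGNATURES.items():
            for name, pattern in patterns.items():
                if re.search(pattern, response):
                    self.scores[category][name] += 1

    def fingerprint(self):
        """Pick the most frequently matched candidate per category."""
        best = {}
        for category, counter in self.scores.items():
            best[category] = counter.most_common(1)[0][0] if counter else None
        return ServerFingerprint(**best)
```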
Recommended Payload Generation
Once the server is fingerprinted, the scanner requests server-specific payloads through get_recommended_payloads().
SQL Injection Example:
If fingerprint.database == "MySQL", recommended payloads might include:
- MySQL-specific functions: SLEEP(5), BENCHMARK()
- MySQL comment syntax: --, #
- MySQL error messages: CAST(), CONVERT()
Command Injection Example:
If fingerprint.language == "Node.js", recommended payloads might include:
- Node-specific syntax: require('child_process').exec()
- JavaScript eval patterns
- Node.js error patterns
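One way to picture the selection is a lookup keyed on the detected database with a generic fallback. The sketch below is illustrative: recommended_sql_payloads and both payload tables are hypothetical, and the real get_recommended_payloads() may choose and order payloads differently.

```python
# Payloads that apply regardless of the backend (illustrative).
GENERIC_SQL_PAYLOADS = ["'", "' OR 1=1--"]

# Time-based payloads using each database's own delay primitive.
DB_SPECIFIC_SQL_PAYLOADS = {
    "MySQL": ["' AND SLEEP(5)-- ", "' OR 1=1#"],
    "PostgreSQL": ["'; SELECT pg_sleep(5)--"],
    "MSSQL": ["'; WAITFOR DELAY '0:0:5'--"],
}


def recommended_sql_payloads(database):
    """Hypothetical helper: database-specific payloads first, then generic ones."""
    specific = DB_SPECIFIC_SQL_PAYLOADS.get(database, [])
    return specific + GENERIC_SQL_PAYLOADS
```

Falling back to the generic list when the database is unknown means a failed fingerprint narrows the payload set but never empties it.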
Fingerprint Information Export
The fingerprint data is included in reports.
Sources: wshawk/scanner_v2.py:131-136, wshawk/scanner_v2.py:153-158, wshawk/scanner_v2.py:302-307, wshawk/scanner_v2.py:657-663
Integration with Scanner
Data Flow During Scanning
The three modules work together throughout the scanning lifecycle:
graph TB
subgraph "Phase 1: Learning"
Connect["Connect to WebSocket"]
Observe["Observe Traffic<br/>5 seconds"]
Learn["MessageAnalyzer<br/>learn_from_messages()"]
Finger["ServerFingerprinter<br/>fingerprint()"]
end
subgraph "Phase 2: Testing"
GetBase["Get Base Message<br/>from samples"]
Inject["MessageAnalyzer<br/>inject_payload_into_message()"]
GetRec["ServerFingerprinter<br/>get_recommended_payloads()"]
Send["Send Injected Messages"]
Receive["Receive Responses"]
end
subgraph "Phase 3: Verification"
Verify["VulnerabilityVerifier<br/>verify_*()"]
Assess["Assess Confidence"]
Browser["Browser Verification<br/>(if HIGH)"]
Record["Record Vulnerability"]
end
Connect --> Observe
Observe --> Learn
Observe --> Finger
Learn --> GetBase
Finger --> GetRec
GetBase --> Inject
GetRec --> Inject
Inject --> Send
Send --> Receive
Receive --> Verify
Verify --> Assess
Assess --> Browser
Assess --> Record
Sources: wshawk/scanner_v2.py:542-677
Module State and Lifecycle
Each module maintains state throughout the scan:
| Module | State Tracked | Lifecycle |
|--------|--------------|-----------|
| MessageAnalyzer | detected_format, learned field names | Initialized once, learns during learning phase, used throughout testing |
| VulnerabilityVerifier | Stateless (pure verification logic) | Initialized once, called repeatedly during testing |
| ServerFingerprinter | Accumulated response patterns, confidence scores | Initialized once, accumulates evidence during learning, queried during testing |
Example: SQL Injection Test Flow
Complete flow showing module interaction:
- Get base message from self.sample_messages[0] (populated by MessageAnalyzer during learning)
- Get recommended payloads from ServerFingerprinter if database detected
- Inject payloads using MessageAnalyzer.inject_payload_into_message()
- Send and receive messages via WebSocket
- Verify responses using VulnerabilityVerifier.verify_sql_injection()
- Record vulnerabilities if confidence level is MEDIUM or higher
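The steps above can be sketched as a single loop. This is a simplified, synchronous illustration under several assumptions: run_sql_test is a hypothetical function, the three collaborators are passed in as plain callables, and confidence is represented as an integer rank where 2 stands for MEDIUM. The real scanner works over an asynchronous WebSocket connection.

```python
def run_sql_test(base_message, payloads, send, inject, verify, min_confidence=2):
    """Hypothetical orchestration of the inject, send, verify, record cycle.

    send(message) -> response text
    inject(base_message, payload) -> list of injected messages
    verify(response, payload) -> (is_vulnerable, confidence_rank, description)
    """
    findings = []
    for payload in payloads:
        for message in inject(base_message, payload):
            response = send(message)
            is_vuln, confidence, description = verify(response, payload)
            # Record only findings at or above the threshold (MEDIUM here).
            if is_vuln and confidence >= min_confidence:
                findings.append(
                    {
                        "payload": payload,
                        "message": message,
                        "confidence": confidence,
                        "description": description,
                    }
                )
    return findings
```

Passing the collaborators in as callables keeps the loop testable without a live WebSocket target.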
Module Configuration Flags
The scanner exposes configuration for module behavior:
- learning_complete: Flag indicating if learning phase succeeded
- use_headless_browser: Enable browser-based XSS verification (escalation from VulnerabilityVerifier)
- use_oast: Enable out-of-band testing for blind vulnerabilities
Sources: wshawk/scanner_v2.py:28-76, wshawk/scanner_v2.py:143-210
Summary
The analysis and verification modules form the intelligence layer of WSHawk:
| Module | Primary Function | Key Output | Integration Point |
|--------|------------------|------------|-------------------|
| MessageAnalyzer | Understand message structure and inject payloads | MessageFormat, injected messages | Learning phase, all vulnerability tests |
| VulnerabilityVerifier | Heuristically verify vulnerabilities from responses | ConfidenceLevel, descriptions | All vulnerability tests |
| ServerFingerprinter | Identify server technology stack | ServerFingerprint, recommended payloads | Learning phase, adaptive payload selection |
These modules enable WSHawk to:
- Operate without configuration: Automatically learns message formats
- Reduce false positives: Multi-factor verification with confidence levels
- Adapt to targets: Server-specific payload recommendations
- Escalate verification: Integration with browser and OAST verification for high-confidence findings
Sources: wshawk/scanner_v2.py:1-678, CHANGELOG.md:1-101