Payload Management System
Purpose and Scope
The Payload Management System is WSHawk's foundational component responsible for managing, loading, and mutating the 22,000+ attack vectors used in vulnerability detection. This system provides a centralized repository of payloads organized by attack type, supports multiple file formats, and implements an intelligent mutation engine for WAF bypass. This document covers payload file organization, loading mechanisms, mutation strategies, and integration with testing modules.
For information about how payloads are used in specific vulnerability tests, see Vulnerability Detection Modules. For guidance on adding new payload collections, see Adding Extensions.
Sources: High-level system diagrams
Payload Collection Overview
WSHawk maintains a comprehensive collection of over 22,000 attack vectors organized in the payloads/ directory. This collection supports both offensive testing (SQL injection, XSS, XXE, SSRF, NoSQL, path traversal, command injection) and defensive validation (DNS exfiltration, CSWSH).
Payload Organization Structure
Sources: pyproject.toml L51
Payload Statistics
| Attack Type | Approximate Vector Count | Primary Format |
| --- | --- | --- |
| SQL Injection | ~8,000 | .txt, .json |
| XSS (Cross-Site Scripting) | ~6,000 | .txt, .json |
| XXE (XML External Entity) | ~2,000 | .txt, .json |
| SSRF (Server-Side Request Forgery) | ~1,500 | .txt |
| NoSQL Injection | ~1,500 | .txt, .json |
| Path Traversal | ~1,500 | .txt |
| Command Injection | ~1,500 | .txt |
| CSWSH (Malicious Origins) | 216+ | .txt |
| Total | 22,000+ | Mixed |
Sources: High-level system diagrams
Payload File Formats
WSHawk supports two payload file formats to accommodate different payload complexity levels and context-awareness requirements.
Text-Based Payloads (.txt)
Text-based payload files contain one attack vector per line. This format is optimized for simple, context-independent payloads.
File Naming Convention: Descriptive names indicating attack type (e.g., sql_injection.txt, xss_payloads.txt, malicious_origins.txt)
Structure:
payload_string_1
payload_string_2
payload_string_3
...
Example Use Case: The malicious_origins.txt file used in CSWSH testing contains 216+ malicious origin headers for cross-site WebSocket hijacking validation.
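The one-payload-per-line format can be read with a few lines of Python. The following is a minimal sketch rather than WSHawk's actual loader: the standalone function, its signature, and the choice to skip blank lines are all assumptions made for illustration.

```python
from pathlib import Path
from typing import List


def load_txt_payloads(path: Path) -> List[str]:
    """Read a one-payload-per-line file into a list of strings.

    Illustrative only: the real WSPayloads loader may behave differently.
    """
    payloads: List[str] = []
    with path.open(encoding="utf-8", errors="replace") as handle:
        for line in handle:
            # Strip only the line terminator. Leading or trailing spaces can be
            # a meaningful part of an attack string, so they are preserved.
            payload = line.rstrip("\r\n")
            if payload:  # assumption: blank lines carry no payload
                payloads.append(payload)
    return payloads
```

Stripping only the line terminator, instead of calling `strip()`, keeps payloads that rely on surrounding whitespace intact.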
Sources: MANIFEST.in L5
JSON-Based Payloads (.json)
JSON-based payload files support structured, context-aware payloads with metadata. These files are organized in subdirectories under payloads/ and enable advanced features like encoding variations and injection context specifications.
File Organization: Stored in subdirectories (payloads/*/*.json)
Schema Example:
Advantages:
- Context-specific payload selection based on message format intelligence
- Multiple encoding variations for WAF bypass
- Metadata for vulnerability verification confidence scoring
- Categorization via tags for targeted testing
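To make the advantages above concrete, here is a hedged sketch of how a structured payload file might be parsed and filtered. The schema shown is hypothetical: the field names `value`, `context`, `encodings`, `confidence`, and `tags` are invented to mirror the listed features and are not taken from WSHawk's payload files.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical file content; WSHawk's real schema may use different keys.
SAMPLE = """
[
  {"value": "' OR '1'='1",
   "context": "json_string",
   "encodings": {"url": "%27%20OR%20%271%27%3D%271"},
   "confidence": 0.9,
   "tags": ["sqli", "boolean"]},
  {"value": "<script>alert(1)</script>", "tags": ["xss"]}
]
"""


@dataclass
class StructuredPayload:
    value: str                                               # raw attack string
    context: str = "any"                                     # injection context
    encodings: Dict[str, str] = field(default_factory=dict)  # pre-encoded variants
    confidence: float = 0.5                                  # verification weight
    tags: List[str] = field(default_factory=list)            # categorization


def parse_json_payloads(text: str) -> List[StructuredPayload]:
    """Parse and minimally validate a JSON array of payload objects."""
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("expected a JSON array of payload objects")
    payloads = []
    for entry in entries:
        if not isinstance(entry, dict) or "value" not in entry:
            raise ValueError(f"payload entry missing 'value': {entry!r}")
        payloads.append(StructuredPayload(
            value=entry["value"],
            context=entry.get("context", "any"),
            encodings=dict(entry.get("encodings", {})),
            confidence=float(entry.get("confidence", 0.5)),
            tags=list(entry.get("tags", [])),
        ))
    return payloads


def select_by_tag(payloads: List[StructuredPayload], tag: str) -> List[StructuredPayload]:
    """Return only the payloads carrying the given tag."""
    return [p for p in payloads if tag in p.tags]
```

With this shape, a module interested only in XSS could call `select_by_tag(parse_json_payloads(SAMPLE), "xss")`.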
Sources: pyproject.toml L51
Payload Loading and Management
WSPayloads Class
The WSPayloads class serves as the central payload repository, responsible for discovering, loading, and providing access to all payload collections.
Loading Workflow:
- Initialization: The WSPayloads constructor discovers payload files using package data paths defined in pyproject.toml and setup.py
- Text File Loading: The load_txt_payloads() method reads .txt files, splits by newline, and stores the results in categorized dictionaries
- JSON File Loading: The load_json_payloads() method parses .json files, validates schema, and builds structured payload objects
- Caching: All payloads are cached in memory for fast access during scanning
- Access Methods: Modules retrieve payloads via attack-type-specific getter methods (e.g., get_sql_payloads(), get_xss_payloads())
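The load-once, serve-from-memory pattern described in this workflow can be sketched as follows. `PayloadRepository` is a hypothetical stand-in for WSPayloads, and the convention that a file's stem (for example `sql_injection`) names its category is an assumption, not something confirmed by the source.

```python
from pathlib import Path
from typing import Dict, List


class PayloadRepository:
    """Hypothetical stand-in for WSPayloads: load once, then serve from memory."""

    def __init__(self, root: Path) -> None:
        self._cache: Dict[str, List[str]] = {}
        for txt_file in sorted(root.glob("*.txt")):
            text = txt_file.read_text(encoding="utf-8", errors="replace")
            # Assumption: the file stem names the attack category.
            self._cache[txt_file.stem] = [line for line in text.split("\n") if line]

    def get(self, category: str) -> List[str]:
        # Return a copy so callers cannot modify the cached list.
        return list(self._cache.get(category, []))

    def get_sql_payloads(self) -> List[str]:
        # Attack-type-specific getter in the style the workflow describes.
        return self.get("sql_injection")
```

Because all file I/O happens in the constructor, later calls such as `repo.get_sql_payloads()` touch only the in-memory dictionary.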
Sources: pyproject.toml L51, High-level system diagrams
External File Loading
WSHawk's payload system supports external payload files, enabling users to supplement the built-in collection with custom attack vectors or organization-specific payloads.
Integration Points:
- Custom payload files can be placed in the payloads/ directory following naming conventions
- The WSPayloads class automatically discovers and loads files matching *.txt and **/*.json patterns
- No code modifications are required, because the file discovery mechanism is pattern-based
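Pattern-based discovery of this kind is straightforward with `pathlib`. The sketch below assumes a hypothetical helper name and return shape; only the two glob patterns come from the text above.

```python
from pathlib import Path
from typing import Dict, List


def discover_payload_files(root: Path) -> Dict[str, List[Path]]:
    """Find payload files under root using the two documented patterns."""
    return {
        # Top-level text files only: "*.txt" does not descend into subdirectories.
        "txt": sorted(root.glob("*.txt")),
        # "**/*.json" matches JSON files in root and in any nested directory.
        "json": sorted(root.glob("**/*.json")),
    }
```

Under this scheme, dropping a new `custom_sqli.txt` into the directory is enough for it to appear in the next discovery pass.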
Sources: MANIFEST.in L5-L6, High-level system diagrams
Intelligent Mutation Engine
The mutation engine transforms base payloads into variants designed to bypass Web Application Firewalls (WAFs) and input validation filters. This system implements eight distinct evasion strategies.
Mutation Architecture
Sources: High-level system diagrams, pyproject.toml L28
BaseMutator Abstract Class
The BaseMutator class in wshawk/mutators/ defines the mutation interface. All mutation strategies inherit from this base class and implement specific transformation logic.
Key Methods:
- mutate(payload: str) -> List[str]: Transforms a single payload into multiple variants
- is_applicable(context: dict) -> bool: Determines if mutation applies to current message context
- get_priority() -> int: Returns mutation priority for ordering
Extension Point: New mutation strategies are added by:
- Creating a new class inheriting from BaseMutator in wshawk/mutators/
- Implementing required abstract methods
- Registering the mutator in wshawk/mutators/__init__.py
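The interface and extension steps above might look like the following sketch. The three method signatures are taken from the list of key methods; everything else is assumed for illustration, including the `AlternatingCaseMutator` class, the `"format"` context key, the priority value, and the `REGISTERED_MUTATORS` list standing in for whatever registration mechanism `wshawk/mutators/__init__.py` actually uses.

```python
from abc import ABC, abstractmethod
from typing import List


class BaseMutator(ABC):
    """Sketch of the mutation interface described above."""

    @abstractmethod
    def mutate(self, payload: str) -> List[str]:
        """Transform a single payload into multiple variants."""

    @abstractmethod
    def is_applicable(self, context: dict) -> bool:
        """Decide whether this mutation applies to the message context."""

    @abstractmethod
    def get_priority(self) -> int:
        """Return the priority used for ordering mutators."""


class AlternatingCaseMutator(BaseMutator):
    """Hypothetical example strategy: alternate character casing."""

    def mutate(self, payload: str) -> List[str]:
        upper_first = "".join(
            ch.upper() if i % 2 == 0 else ch.lower() for i, ch in enumerate(payload)
        )
        lower_first = "".join(
            ch.lower() if i % 2 == 0 else ch.upper() for i, ch in enumerate(payload)
        )
        variants: List[str] = []
        for candidate in (upper_first, lower_first):
            # Skip no-op results and duplicates.
            if candidate != payload and candidate not in variants:
                variants.append(candidate)
        return variants

    def is_applicable(self, context: dict) -> bool:
        # Assumption: case changes are pointless for binary frames.
        return context.get("format") != "binary"

    def get_priority(self) -> int:
        return 10  # arbitrary illustrative value


# Hypothetical registry; the real registration mechanism is not documented here.
REGISTERED_MUTATORS: List[BaseMutator] = [AlternatingCaseMutator()]
```

Calling `AlternatingCaseMutator().mutate("SELECT")` yields `["SeLeCt", "sElEcT"]`, while instantiating `BaseMutator` directly raises `TypeError` because its methods are abstract.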
Sources: High-level system diagrams
WAF Bypass Strategies
The mutation engine implements eight primary WAF bypass techniques:
| Strategy | Technique | Example Transformation |
| --- | --- | --- |
| Case Variation | Alternate character casing | SELECT → SeLeCt, sELEct |
| Encoding Mutations | Apply URL/Base64/Unicode/HTML encoding | <script> → %3Cscript%3E, \u003Cscript\u003E |
| Whitespace Injection | Insert tabs, newlines, non-breaking spaces | SELECT FROM → SELECT/**/FROM, SELECT\tFROM |
| Comment Insertion | Embed SQL/JavaScript comments | SELECT → SE/**/LECT, SEL--\nECT |
| String Concatenation | Break strings into concatenated parts | 'UNION' → 'UN'+'ION', CHAR(85,78,73,79,78) |
| Double Encoding | Apply encoding twice | < → %3C → %253C |
| Null Byte Injection | Insert null bytes as separators | admin → admin%00, SELECT%00FROM |
| Obfuscation | Use hex/octal/Unicode escapes | alert → \x61\x6C\x65\x72\x74, \141\154\145\162\164 |
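Several rows of the table reduce to short string transformations. The functions below are an illustrative sketch using only the Python standard library; their names and parameters are assumptions and do not reflect WSHawk's mutator code.

```python
from urllib.parse import quote


def url_encode(payload: str) -> str:
    """Percent-encode every reserved character (Encoding Mutations row)."""
    return quote(payload, safe="")


def double_url_encode(payload: str) -> str:
    """Apply percent-encoding twice (Double Encoding row)."""
    return quote(quote(payload, safe=""), safe="")


def insert_comment(keyword: str, position: int = 2) -> str:
    """Split a keyword with an inline SQL comment (Comment Insertion row)."""
    return keyword[:position] + "/**/" + keyword[position:]


def append_null_byte(payload: str) -> str:
    """Append a percent-encoded null byte (Null Byte Injection row)."""
    return payload + "%00"


def hex_escape(payload: str) -> str:
    """Rewrite each character as a \\xNN escape (Obfuscation row).

    Only meaningful for code points below 256, such as ASCII identifiers.
    """
    return "".join(f"\\x{ord(ch):02X}" for ch in payload)
```

For example, `url_encode("<script>")` returns `%3Cscript%3E`, `double_url_encode("<")` returns `%253C`, and `insert_comment("SELECT")` returns `SE/**/LECT`, matching the table's examples.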
Sources: High-level system diagrams, pyproject.toml L28
Context-Aware Mutation Selection
The mutation engine leverages intelligence modules to select applicable mutations:
MessageIntelligence Integration:
- Detects message format (JSON/XML/Binary/Text)
- Identifies injectable fields and contexts
- Filters mutations based on format (e.g., XML comment syntax for XML messages)
ServerFingerprinter Integration:
- Determines backend technology stack (database type, framework, language)
- Prioritizes database-specific mutations (e.g., MySQL /*! */ comments, PostgreSQL $$ quoting)
- Adjusts encoding strategies based on server-side parsing logic
Workflow:
- Scanner captures message format and server fingerprint during learning phase (see Scanner Engine)
- Intelligence modules provide context dictionary to mutation engine
- Each mutator's is_applicable() method evaluates relevance
- Only applicable mutations are applied, reducing noise and improving efficiency
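The workflow above amounts to filtering mutators by context and then applying them in priority order. The sketch below makes several assumptions: the context keys `"format"` and `"database"`, the two example mutators, and the choice to run higher priorities first are all illustrative, since the source does not specify them.

```python
from typing import Iterable, List, Protocol


class Mutator(Protocol):
    def mutate(self, payload: str) -> List[str]: ...
    def is_applicable(self, context: dict) -> bool: ...
    def get_priority(self) -> int: ...


class MySQLVersionedComment:
    """Hypothetical mutator: wrap the payload in a MySQL /*! */ comment."""

    def mutate(self, payload: str) -> List[str]:
        return [f"/*!{payload}*/"]

    def is_applicable(self, context: dict) -> bool:
        return context.get("database") == "mysql"

    def get_priority(self) -> int:
        return 20


class XMLCommentSplit:
    """Hypothetical mutator: split the payload with an empty XML comment."""

    def mutate(self, payload: str) -> List[str]:
        mid = len(payload) // 2
        return [payload[:mid] + "<!---->" + payload[mid:]]

    def is_applicable(self, context: dict) -> bool:
        return context.get("format") == "xml"

    def get_priority(self) -> int:
        return 10


def select_mutators(mutators: Iterable[Mutator], context: dict) -> List[Mutator]:
    """Keep applicable mutators, highest priority first (ordering is assumed)."""
    applicable = [m for m in mutators if m.is_applicable(context)]
    return sorted(applicable, key=lambda m: m.get_priority(), reverse=True)


def apply_mutations(payload: str, mutators: Iterable[Mutator], context: dict) -> List[str]:
    """Generate variants from every mutator that matches the context."""
    variants: List[str] = []
    for mutator in select_mutators(mutators, context):
        variants.extend(mutator.mutate(payload))
    return variants
```

With a context of `{"format": "json", "database": "mysql"}`, only the MySQL mutator runs, so `apply_mutations("SELECT 1", ...)` produces `["/*!SELECT 1*/"]` and the XML-specific variant is never generated.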
Sources: High-level system diagrams
Integration with Testing Modules
Payload Flow Architecture
Sources: High-level system diagrams
Offensive Testing Integration
Vulnerability detection modules consume mutated payloads through a standardized interface:
Injection Workflow:
- Payload Request: Vulnerability module requests payloads from WSPayloads by attack type
- Mutation: Mutation engine generates variants based on context intelligence
- Batch Preparation: Mutated payloads are batched for rate-limited injection
- Injection: Payloads are injected into identified injectable fields in WebSocket messages
- Verification: Responses are analyzed for vulnerability indicators (see Vulnerability Detection Modules)
Example Flow for SQL Injection:
- SQLInjectionTest requests SQL payloads from WSPayloads.get_sql_payloads()
- Mutation engine applies database-specific mutations based on ServerFingerprinter results
- Mutated payloads are injected into JSON fields identified by MessageIntelligence
- Responses are parsed for SQL error signatures, time-based delays, or data exfiltration patterns
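The injection step in this flow can be illustrated without any network code. The sketch below assumes a captured JSON message with injectable top-level fields; the function name, the sample message, and the restriction to top-level keys are simplifications for illustration, not WSHawk's implementation.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def build_injected_messages(
    template: Dict[str, Any],
    injectable_fields: Iterable[str],
    payloads: List[str],
) -> Iterator[str]:
    """Yield one serialized message per (field, payload) pair.

    Only top-level fields are handled; nested paths are out of scope here.
    """
    for field_name in injectable_fields:
        if field_name not in template:
            raise KeyError(f"field {field_name!r} not present in template")
        for payload in payloads:
            # Build a new dict so the captured template is never modified.
            message = {**template, field_name: payload}
            yield json.dumps(message)
```

Each yielded string is ready to send as a WebSocket text frame, and all fields other than the targeted one keep their original values so the server still accepts the message shape.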
Sources: High-level system diagrams
Defensive Testing Integration
Defensive validation modules consume raw payloads without mutation:
DNS Exfiltration Prevention Test:
- Loads XXE and SSRF payloads from WSPayloads
- Injects payloads with OAST callbacks to detect egress filtering gaps
- Does not mutate payloads, so the test exercises baseline security controls
CSWSH (Cross-Site WebSocket Hijacking) Test:
- Loads 216+ malicious origins from malicious_origins.txt
- Tests WebSocket Origin header validation without mutation
- Evaluates origin-based access control effectiveness
Sources: High-level system diagrams
Package Distribution
The payload collection is distributed as part of the WSHawk package through setuptools' package data mechanism.
Configuration (MANIFEST.in directives):
recursive-include payloads *.txt
global-include payloads/*.txt
Installation Behavior:
- pip install wshawk downloads and installs all payload files
- Payload files are accessible via pkg_resources or importlib.resources APIs
- Docker images include payloads in the container filesystem
- GitHub releases bundle payloads in source distributions
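Reading a bundled payload file through `importlib.resources` might look like the sketch below, which requires Python 3.9 or later for `resources.files()`. The helper name is hypothetical, and the package name and relative path passed to it are assumptions: where the payloads directory sits inside the installed distribution is not confirmed by the source.

```python
from importlib import resources
from typing import List


def read_packaged_payloads(package: str, relative_path: str) -> List[str]:
    """Read a one-payload-per-line resource shipped inside an installed package."""
    resource = resources.files(package)
    # Walk the path one segment at a time so this also works for
    # non-filesystem resource containers such as zip archives.
    for part in relative_path.split("/"):
        resource = resource / part
    text = resource.read_text(encoding="utf-8")
    return [line.rstrip("\r") for line in text.split("\n") if line.rstrip("\r")]
```

A call might look like `read_packaged_payloads("wshawk", "payloads/malicious_origins.txt")`, with both arguments adjusted to the actual package layout.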
Sources: pyproject.toml L50-L51
Performance Considerations
Memory Management
Payload Caching: The WSPayloads class caches all payloads in memory upon initialization to minimize file I/O during scanning. With 22,000+ payloads averaging ~50 bytes each, the raw payload data is approximately 1.1 MB (22,000 × 50 bytes). Python's per-string object overhead adds to this figure, but the total remains an acceptable overhead.
Mutation Efficiency: Mutation generation is lazy—variants are computed on-demand during testing rather than pre-generating all possible mutations. This reduces memory usage from potentially hundreds of megabytes to negligible levels.
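Lazy generation of this kind maps naturally onto a Python generator. The following is a minimal sketch under assumed names (`iter_variants`, plain callables as transforms); it shows the on-demand pattern, not WSHawk's engine.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator


def iter_variants(
    payloads: Iterable[str],
    transforms: Iterable[Callable[[str], str]],
) -> Iterator[str]:
    """Yield each payload followed by its variants, one at a time.

    Nothing is computed until the consumer asks for the next item, so the
    full cross product of payloads and transforms is never held in memory.
    """
    transform_list = list(transforms)
    for payload in payloads:
        yield payload
        for transform in transform_list:
            variant = transform(payload)
            if variant != payload:  # skip transforms that change nothing
                yield variant


# Consume only the first four items; later variants are never generated.
first_four = list(islice(iter_variants(["select", "Union"], [str.upper, str.lower]), 4))
```

Here `first_four` is `["select", "SELECT", "Union", "UNION"]`: the lowercase form of `select` is skipped as a no-op, and the lowercase form of `Union` is never computed because the consumer stopped early.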
Loading Optimization
Startup Time: Payload file loading occurs once during scanner initialization (see Scanner Engine). Total loading time is typically < 500ms for the full collection.
Parallelization: Text and JSON file loading can be parallelized using Python's asyncio or concurrent.futures for improved startup performance in Docker containers.
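As one possible shape for such parallel loading, the sketch below uses a thread pool from `concurrent.futures`, which suits I/O-bound file reads. The function names, the worker count, and the use of file stems as dictionary keys are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, Iterable, List


def _read_lines(path: Path) -> List[str]:
    """Read one payload file, dropping empty lines."""
    text = path.read_text(encoding="utf-8", errors="replace")
    return [line for line in text.split("\n") if line]


def load_in_parallel(paths: Iterable[Path], max_workers: int = 4) -> Dict[str, List[str]]:
    """Load several payload files concurrently, keyed by file stem."""
    path_list = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order, so results align with path_list.
        results = list(pool.map(_read_lines, path_list))
    return {path.stem: lines for path, lines in zip(path_list, results)}
```

Whether this beats sequential loading depends on file count and storage latency, so it is worth measuring before adopting it.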
Sources: High-level system diagrams
Summary
The Payload Management System is WSHawk's centralized attack vector repository, providing:
- 22,000+ Attack Vectors organized by type in the payloads/ directory
- Dual Format Support via .txt (simple) and .json (structured) files
- Intelligent Mutation Engine with eight WAF bypass strategies
- Context-Aware Selection leveraging message format and server fingerprinting intelligence
- Seamless Integration with both offensive and defensive testing modules
- Extensibility through external payload file support and custom mutators
This architecture ensures WSHawk maintains a comprehensive, up-to-date payload collection while enabling advanced bypass techniques and context-specific testing strategies.
Sources: pyproject.toml L50-L51, High-level system diagrams