Payload Management System

Purpose and Scope

The Payload Management System is WSHawk's foundational component responsible for managing, loading, and mutating the 22,000+ attack vectors used in vulnerability detection. This system provides a centralized repository of payloads organized by attack type, supports multiple file formats, and implements an intelligent mutation engine for WAF bypass. This document covers payload file organization, loading mechanisms, mutation strategies, and integration with testing modules.

For information about how payloads are used in specific vulnerability tests, see Vulnerability Detection Modules. For guidance on adding new payload collections, see Adding Extensions.

Sources: High-level system diagrams


Payload Collection Overview

WSHawk maintains a comprehensive collection of over 22,000 attack vectors organized in the payloads/ directory. This collection supports both offensive testing (SQL injection, XSS, XXE, SSRF, NoSQL, path traversal, command injection) and defensive validation (DNS exfiltration, CSWSH).

Payload Organization Structure

Sources: pyproject.toml L51, setup.py L52-L53, MANIFEST.in L5

Payload Statistics

| Attack Type | Approximate Vector Count | Primary Format |
| --- | --- | --- |
| SQL Injection | ~8,000 | .txt, .json |
| XSS (Cross-Site Scripting) | ~6,000 | .txt, .json |
| XXE (XML External Entity) | ~2,000 | .txt, .json |
| SSRF (Server-Side Request Forgery) | ~1,500 | .txt |
| NoSQL Injection | ~1,500 | .txt, .json |
| Path Traversal | ~1,500 | .txt |
| Command Injection | ~1,500 | .txt |
| CSWSH (Malicious Origins) | 216+ | .txt |
| Total | 22,000+ | Mixed |

Sources: High-level system diagrams


Payload File Formats

WSHawk supports two payload file formats to accommodate different payload complexity levels and context-awareness requirements.

Text-Based Payloads (.txt)

Text-based payload files contain one attack vector per line. This format is optimized for simple, context-independent payloads.

File Naming Convention: Descriptive names indicating attack type (e.g., sql_injection.txt, xss_payloads.txt, malicious_origins.txt)

Structure:

payload_string_1
payload_string_2
payload_string_3
...

Example Use Case: The malicious_origins.txt file used in CSWSH testing contains 216+ malicious origin headers for cross-site WebSocket hijacking validation.

Sources: MANIFEST.in L5, pyproject.toml L51

JSON-Based Payloads (.json)

JSON-based payload files support structured, context-aware payloads with metadata. These files are organized in subdirectories under payloads/ and enable advanced features like encoding variations and injection context specifications.

File Organization: Stored in subdirectories (payloads/*/*.json)

Schema Example:

Advantages:

  • Context-specific payload selection based on message format intelligence
  • Multiple encoding variations for WAF bypass
  • Metadata for vulnerability verification confidence scoring
  • Categorization via tags for targeted testing

Sources: pyproject.toml L51, setup.py L52-L53


Payload Loading and Management

WSPayloads Class

The WSPayloads class serves as the central payload repository, responsible for discovering, loading, and providing access to all payload collections.

Loading Workflow:

  1. Initialization: WSPayloads constructor discovers payload files using package data paths defined in pyproject.toml and setup.py
  2. Text File Loading: load_txt_payloads() method reads .txt files, splits by newline, and stores in categorized dictionaries
  3. JSON File Loading: load_json_payloads() method parses .json files, validates schema, and builds structured payload objects
  4. Caching: All payloads are cached in memory for fast access during scanning
  5. Access Methods: Modules retrieve payloads via attack-type-specific getter methods (e.g., get_sql_payloads(), get_xss_payloads())
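The workflow above can be sketched roughly as follows. This is an illustration under stated assumptions, not the WSPayloads implementation: `PayloadStore`, its method names, and the rule that a category is named after the file stem are all hypothetical, and the only schema check performed is that a JSON file holds an array.

```python
import json
from pathlib import Path
from typing import Dict, List


class PayloadStore:
    """Hypothetical stand-in for a payload repository; not the WSPayloads API."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)
        self._text: Dict[str, List[str]] = {}   # in-memory cache of .txt payloads
        self._json: Dict[str, list] = {}        # in-memory cache of .json payloads
        self._load_txt()
        self._load_json()

    def _load_txt(self) -> None:
        # One payload per line; blank lines are skipped.
        for path in sorted(self.root.glob("*.txt")):
            lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
            self._text[path.stem] = [line for line in lines if line.strip()]

    def _load_json(self) -> None:
        # Structured payloads are discovered recursively below the root.
        for path in sorted(self.root.glob("**/*.json")):
            data = json.loads(path.read_text(encoding="utf-8"))
            if not isinstance(data, list):
                raise ValueError(f"{path} must contain a JSON array")
            key = path.relative_to(self.root).with_suffix("").as_posix()
            self._json[key] = data

    def get(self, category: str) -> List[str]:
        """Return a copy of the cached text payloads for a category."""
        return list(self._text.get(category, []))

    def get_structured(self, key: str) -> list:
        """Return a copy of the cached structured payloads for a relative path key."""
        return list(self._json.get(key, []))
```

Returning copies from the getters keeps callers from mutating the shared cache by accident, at the cost of one list allocation per call.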

Sources: pyproject.toml L51, setup.py L50-L54, High-level system diagrams

External File Loading

WSHawk's payload system supports external payload files, enabling users to supplement the built-in collection with custom attack vectors or organization-specific payloads.

Integration Points:

  • Custom payload files can be placed in the payloads/ directory following naming conventions
  • The WSPayloads class automatically discovers and loads files matching *.txt and **/*.json patterns
  • No code modifications required—the file discovery mechanism is pattern-based

Sources: MANIFEST.in L5-L6, High-level system diagrams


Intelligent Mutation Engine

The mutation engine transforms base payloads into variants designed to bypass Web Application Firewalls (WAFs) and input validation filters. This system implements eight distinct evasion strategies.

Mutation Architecture

Sources: High-level system diagrams, pyproject.toml L28

BaseMutator Abstract Class

The BaseMutator class in wshawk/mutators/ defines the mutation interface. All mutation strategies inherit from this base class and implement specific transformation logic.

Key Methods:

  • mutate(payload: str) -> List[str]: Transforms a single payload into multiple variants
  • is_applicable(context: dict) -> bool: Determines if mutation applies to current message context
  • get_priority() -> int: Returns mutation priority for ordering

Extension Point: New mutation strategies are added by:

  1. Creating a new class inheriting from BaseMutator in wshawk/mutators/
  2. Implementing required abstract methods
  3. Registering the mutator in wshawk/mutators/__init__.py
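A minimal sketch of the interface and the three extension steps, assuming the method signatures listed above. The `MUTATORS` registry, the `register` decorator, and `CaseVariationMutator` are hypothetical names used for illustration; the actual registration code in wshawk/mutators/__init__.py may look different.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class BaseMutator(ABC):
    """Interface sketch built from the three methods listed above."""

    @abstractmethod
    def mutate(self, payload: str) -> List[str]:
        """Transform one payload into several variants."""

    @abstractmethod
    def is_applicable(self, context: dict) -> bool:
        """Decide whether this mutation fits the current message context."""

    @abstractmethod
    def get_priority(self) -> int:
        """Return the priority used for ordering mutators."""


# Hypothetical registry; the real registration mechanism may differ.
MUTATORS: Dict[str, Type[BaseMutator]] = {}


def register(cls: Type[BaseMutator]) -> Type[BaseMutator]:
    MUTATORS[cls.__name__] = cls
    return cls


@register
class CaseVariationMutator(BaseMutator):
    def mutate(self, payload: str) -> List[str]:
        # Two alternating-case variants, starting with upper and lower case.
        upper_first = "".join(
            c.upper() if i % 2 == 0 else c.lower() for i, c in enumerate(payload)
        )
        lower_first = "".join(
            c.lower() if i % 2 == 0 else c.upper() for i, c in enumerate(payload)
        )
        return [upper_first, lower_first]

    def is_applicable(self, context: dict) -> bool:
        # "format" is an assumed context key.
        return context.get("format") != "binary"

    def get_priority(self) -> int:
        return 10


print(CaseVariationMutator().mutate("SELECT"))  # ['SeLeCt', 'sElEcT']
```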

Sources: High-level system diagrams

WAF Bypass Strategies

The mutation engine implements eight primary WAF bypass techniques:

| Strategy | Technique | Example Transformation |
| --- | --- | --- |
| Case Variation | Alternate character casing | SELECT → SeLeCt, sELEct |
| Encoding Mutations | Apply URL/Base64/Unicode/HTML encoding | <script> → %3Cscript%3E, \u003Cscript\u003E |
| Whitespace Injection | Insert tabs, newlines, non-breaking spaces | SELECT FROM → SELECT/**/FROM, SELECT\tFROM |
| Comment Insertion | Embed SQL/JavaScript comments | SELECT → SE/**/LECT, SEL--\nECT |
| String Concatenation | Break strings into concatenated parts | 'UNION' → 'UN'+'ION', CHAR(85,78,73,79,78) |
| Double Encoding | Apply encoding twice | < → %3C → %253C |
| Null Byte Injection | Insert null bytes as separators | admin → admin%00, SELECT%00FROM |
| Obfuscation | Use hex/octal/Unicode escapes | alert → \x61\x6C\x65\x72\x74, \141\154\145\162\164 |
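Several of the transformations in the table are mechanical enough to reproduce with the standard library. The helpers below are a sketch with hypothetical function names, not WSHawk's mutators; `hex_escape` assumes every character has a code point below 256.

```python
from urllib.parse import quote


def url_encode(payload: str) -> str:
    """Percent-encode every reserved character, including '/'."""
    return quote(payload, safe="")


def double_url_encode(payload: str) -> str:
    """Encode twice, so the '%' of the first pass becomes '%25'."""
    return quote(quote(payload, safe=""), safe="")


def hex_escape(payload: str) -> str:
    """Render each character as a \\xNN escape (code points < 256 assumed)."""
    return "".join(f"\\x{ord(ch):02X}" for ch in payload)


def octal_escape(payload: str) -> str:
    """Render each character as a backslash followed by its octal code."""
    return "".join(f"\\{ord(ch):o}" for ch in payload)


def char_function(payload: str) -> str:
    """Rewrite a string as a SQL CHAR(...) call of decimal code points."""
    return "CHAR(" + ",".join(str(ord(ch)) for ch in payload) + ")"


print(url_encode("<script>"))      # %3Cscript%3E
print(double_url_encode("<"))      # %253C
print(hex_escape("alert"))         # \x61\x6C\x65\x72\x74
print(octal_escape("alert"))       # \141\154\145\162\164
print(char_function("UNION"))      # CHAR(85,78,73,79,78)
```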

Sources: High-level system diagrams, pyproject.toml L28

Context-Aware Mutation Selection

The mutation engine leverages intelligence modules to select applicable mutations:

MessageIntelligence Integration:

  • Detects message format (JSON/XML/Binary/Text)
  • Identifies injectable fields and contexts
  • Filters mutations based on format (e.g., XML comment syntax for XML messages)

ServerFingerprinter Integration:

  • Determines backend technology stack (database type, framework, language)
  • Prioritizes database-specific mutations (e.g., MySQL /*! */ comments, PostgreSQL $$ quoting)
  • Adjusts encoding strategies based on server-side parsing logic

Workflow:

  1. Scanner captures message format and server fingerprint during learning phase (see Scanner Engine)
  2. Intelligence modules provide context dictionary to mutation engine
  3. Each mutator's is_applicable() method evaluates relevance
  4. Only applicable mutations are applied, reducing noise and improving efficiency
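The selection step in this workflow can be sketched as a filter followed by a sort. The context keys (`format`, `database`), the `MutatorSpec` structure, the sample mutators, and the choice to run higher priorities first are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class MutatorSpec:
    """Hypothetical mutator description; not WSHawk's BaseMutator."""
    name: str
    priority: int
    formats: FrozenSet[str]
    databases: FrozenSet[str]          # empty set means "any backend"
    transform: Callable[[str], str]

    def is_applicable(self, context: dict) -> bool:
        if context.get("format") not in self.formats:
            return False
        return not self.databases or context.get("database") in self.databases


def select_mutators(mutators: List[MutatorSpec], context: dict) -> List[MutatorSpec]:
    """Keep applicable mutators, highest priority first (ordering is an assumption)."""
    applicable = [m for m in mutators if m.is_applicable(context)]
    return sorted(applicable, key=lambda m: m.priority, reverse=True)


MUTATORS = [
    MutatorSpec("mysql_versioned_comment", 20, frozenset({"json", "text"}),
                frozenset({"mysql"}), lambda p: p.replace("SELECT", "/*!SELECT*/")),
    MutatorSpec("xml_comment", 10, frozenset({"xml"}), frozenset(),
                lambda p: p.replace(" ", "<!-- -->")),
    MutatorSpec("case_variation", 5, frozenset({"json", "text", "xml"}),
                frozenset(), str.swapcase),
]

context = {"format": "json", "database": "mysql"}   # hypothetical context keys
chosen = select_mutators(MUTATORS, context)
print([m.name for m in chosen])   # ['mysql_versioned_comment', 'case_variation']
```

With a PostgreSQL fingerprint in the same context, only `case_variation` would remain, which is the noise reduction the last step describes.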

Sources: High-level system diagrams


Integration with Testing Modules

Payload Flow Architecture

Sources: High-level system diagrams

Offensive Testing Integration

Vulnerability detection modules consume mutated payloads through a standardized interface:

Injection Workflow:

  1. Payload Request: Vulnerability module requests payloads from WSPayloads by attack type
  2. Mutation: Mutation engine generates variants based on context intelligence
  3. Batch Preparation: Mutated payloads are batched for rate-limited injection
  4. Injection: Payloads are injected into identified injectable fields in WebSocket messages
  5. Verification: Responses are analyzed for vulnerability indicators (see Vulnerability Detection Modules)

Example Flow for SQL Injection:

  • SQLInjectionTest requests SQL payloads from WSPayloads.get_sql_payloads()
  • Mutation engine applies database-specific mutations based on ServerFingerprinter results
  • Mutated payloads injected into JSON fields identified by MessageIntelligence
  • Responses parsed for SQL error signatures, time-based delays, or data exfiltration patterns
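The batching, injection, and error-signature steps can be illustrated without any network code. The helper names below are hypothetical, the message template is invented, and the two signatures are a small illustrative subset of common MySQL and PostgreSQL error text rather than WSHawk's detection list.

```python
import json
from itertools import islice
from typing import Iterable, Iterator, List


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` payloads."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def inject(template: str, field: str, payload: str) -> str:
    """Return a copy of a JSON message with one field replaced by the payload."""
    message = json.loads(template)
    message[field] = payload
    return json.dumps(message)


# Illustrative subset only; real detection would use a much larger list.
SQL_ERROR_SIGNATURES = (
    "you have an error in your sql syntax",   # MySQL
    "unterminated quoted string",             # PostgreSQL
)


def looks_like_sql_error(response: str) -> bool:
    lowered = response.lower()
    return any(signature in lowered for signature in SQL_ERROR_SIGNATURES)


template = '{"action": "search", "query": "shoes"}'   # invented message
for batch in batched(["' OR 1=1--", "'", "1; SELECT 1"], size=2):
    frames = [inject(template, "query", p) for p in batch]
    print(len(frames), frames[0])
```

A scanner would send each batch, wait according to its rate limit, and pass every response through a check such as `looks_like_sql_error`.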

Sources: High-level system diagrams

Defensive Testing Integration

Defensive validation modules consume raw payloads without mutation:

DNS Exfiltration Prevention Test:

  • Loads XXE and SSRF payloads from WSPayloads
  • Injects payloads with OAST callbacks to detect egress filtering gaps
  • Does not mutate payloads—tests baseline security controls

CSWSH (Cross-Site WebSocket Hijacking) Test:

  • Loads 216+ malicious origins from malicious_origins.txt
  • Tests WebSocket Origin header validation without mutation
  • Evaluates origin-based access control effectiveness

Sources: High-level system diagrams


Package Distribution

The payload collection is distributed as part of the WSHawk package through setuptools' package data mechanism.

Configuration:

pyproject.toml L50-L51

setup.py L50-L54

MANIFEST.in L5-L6

recursive-include payloads *.txt
global-include payloads/*.txt

Installation Behavior:

  • pip install wshawk downloads and installs all payload files
  • Payload files are accessible via importlib.resources or the legacy pkg_resources API, which setuptools has deprecated
  • Docker images include payloads in the container filesystem
  • GitHub releases bundle payloads in source distributions
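As a sketch of resource access after installation, the helper below reads a packaged text file through `importlib.resources.files` (Python 3.9+). The function name is hypothetical, and the package name and resource path in the usage comment are assumptions; where the payloads actually land depends on the package-data configuration.

```python
from importlib.resources import files
from typing import List


def read_packaged_payloads(package: str, resource: str) -> List[str]:
    """Read a packaged text resource and return its non-empty lines."""
    text = files(package).joinpath(resource).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line.strip()]


# Assumed usage; package name and path are not confirmed by the source:
# origins = read_packaged_payloads("wshawk", "payloads/malicious_origins.txt")
```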

Sources: pyproject.toml L50-L51, setup.py L50-L54, MANIFEST.in L5-L6


Performance Considerations

Memory Management

Payload Caching: The WSPayloads class caches all payloads in memory upon initialization to minimize file I/O during scanning. With 22,000+ payloads averaging ~50 bytes each, the raw payload data comes to approximately 1.1 MB (22,000 × 50 bytes). Python's per-string object overhead adds to that figure, but the total remains an acceptable overhead.

Mutation Efficiency: Mutation generation is lazy—variants are computed on-demand during testing rather than pre-generating all possible mutations. This reduces memory usage from potentially hundreds of megabytes to negligible levels.
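Lazy generation of this kind is commonly written as a Python generator. The sketch below is not WSHawk's engine; `lazy_variants` and the sample transforms are hypothetical, and the point is only that a transform runs when its variant is requested.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def lazy_variants(
    payloads: Iterable[str], transforms: List[Callable[[str], str]]
) -> Iterator[str]:
    """Yield each base payload followed by its variants, computed on demand."""
    for payload in payloads:
        yield payload
        for transform in transforms:
            yield transform(payload)


calls: List[str] = []


def traced_upper(payload: str) -> str:
    calls.append(payload)          # record when the transform actually runs
    return payload.upper()


stream = lazy_variants(["select", "union", "drop"], [traced_upper])
first_two = list(islice(stream, 2))
print(first_two)   # ['select', 'SELECT']
print(calls)       # ['select']  (later payloads have not been touched yet)
```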

Loading Optimization

Startup Time: Payload file loading occurs once during scanner initialization (see Scanner Engine). Total loading time is typically < 500ms for the full collection.

Parallelization: Text and JSON file loading can be parallelized with Python's concurrent.futures, or with asyncio by offloading the blocking file reads to an executor, for improved startup performance in Docker containers.
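A minimal sketch of parallel loading with a thread pool, assuming text payload files sit directly under one directory. `load_parallel` and `read_lines` are hypothetical names, and whether this beats sequential loading depends on the storage and on how many files there are.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, List


def read_lines(path: Path) -> List[str]:
    """Read one payload file and return its non-empty lines."""
    text = path.read_text(encoding="utf-8", errors="replace")
    return [line for line in text.splitlines() if line.strip()]


def load_parallel(root: Path, max_workers: int = 4) -> Dict[str, List[str]]:
    """Load every *.txt file under root concurrently, keyed by file stem."""
    paths = sorted(Path(root).glob("*.txt"))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with paths.
        results = pool.map(read_lines, paths)
        return {path.stem: lines for path, lines in zip(paths, results)}
```

Threads suit this case because the work is dominated by blocking file reads rather than CPU-bound parsing.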

Sources: High-level system diagrams


Summary

The Payload Management System is WSHawk's centralized attack vector repository, providing:

  • 22,000+ Attack Vectors organized by type in payloads/ directory
  • Dual Format Support via .txt (simple) and .json (structured) files
  • Intelligent Mutation Engine with eight WAF bypass strategies
  • Context-Aware Selection leveraging message format and server fingerprinting intelligence
  • Seamless Integration with both offensive and defensive testing modules
  • Extensibility through external payload file support and custom mutators

This architecture keeps WSHawk's payload collection in one central place while enabling advanced bypass techniques and context-specific testing strategies.

Sources: pyproject.toml L50-L51, setup.py L50-L54, MANIFEST.in L5-L6, High-level system diagrams