Project Structure and Architecture

Purpose and Scope

This document details the internal organization of the WSHawk codebase, including directory structure, module layout, package architecture, and code organization patterns. It is intended for developers who need to navigate the source code, understand component relationships, and make structural contributions.

For setting up a development environment, see Development Environment Setup. For adding specific features like new vulnerability modules or mutators, see Adding Extensions.


Package Structure Overview

WSHawk is organized as a standard Python package. A single top-level package directory (wshawk/) contains all production code, including the mutators/ subpackage and the bundled payloads/ data; tests, examples, documentation, and configuration files sit alongside it at the repository root.

Repository Root Layout

wshawk/                          # Repository root
├── wshawk/                      # Main Python package (all production code)
│   ├── __init__.py              # Package initialization
│   ├── __main__.py              # Primary CLI entry point (wshawk command)
│   ├── interactive.py           # Interactive CLI entry point
│   ├── advanced_cli.py          # Advanced CLI entry point
│   ├── defensive_cli.py         # Defensive validation CLI entry point
│   ├── scanner_v2.py            # Core scanner engine (WSHawkV2 class)
│   ├── message_intelligence.py  # Message format detection
│   ├── server_fingerprinter.py  # Server technology fingerprinting
│   ├── vulnerability_verifier.py # Vulnerability verification logic
│   ├── session_hijacking.py     # Session security tests
│   ├── defensive_validation.py  # Defensive test orchestrator
│   ├── dns_exfiltration.py      # DNS exfiltration prevention test
│   ├── bot_detection.py         # Bot detection validation test
│   ├── cswsh_test.py            # Cross-Site WebSocket Hijacking test
│   ├── wss_security.py          # WSS/TLS security validation
│   ├── cvss_scorer.py           # CVSS v3.1 scoring engine
│   ├── report_generator.py      # HTML report generation
│   ├── logger.py                # Centralized logging
│   ├── mutators/                # Payload mutation strategies
│   │   ├── __init__.py          # Mutator factory (create_default_mutators)
│   │   ├── base_mutator.py      # BaseMutator abstract class
│   │   ├── case_mutation.py     # Case manipulation mutator
│   │   ├── encoding_mutation.py # Encoding-based mutator
│   │   ├── comment_mutation.py  # SQL/XML comment injection
│   │   ├── concat_mutation.py   # String concatenation mutator
│   │   ├── unicode_mutation.py  # Unicode/normalization mutator
│   │   └── ...                  # Additional mutator implementations
│   └── payloads/                # Payload data files
│       ├── sqli_payloads.txt    # SQL injection vectors
│       ├── xss_payloads.txt     # XSS attack vectors
│       ├── xxe_payloads.json    # XXE test payloads
│       ├── ssrf_payloads.txt    # SSRF test vectors
│       ├── nosql_payloads.txt   # NoSQL injection payloads
│       ├── path_traversal.txt   # Path traversal vectors
│       ├── cmd_injection.txt    # Command injection vectors
│       └── malicious_origins.txt # CSWSH origin test cases
├── tests/                       # Test suite
│   └── test_modules_quick.py    # Quick module tests
├── examples/                    # Usage examples
├── docs/                        # Documentation
├── pyproject.toml               # Modern Python package metadata
├── setup.py                     # Legacy setuptools configuration
├── requirements.txt             # Development dependencies
├── Dockerfile                   # Docker image definition
├── docker-compose.yml           # Docker Compose configuration
├── README.md                    # Primary documentation
├── CONTRIBUTING.md              # Contribution guidelines
├── SECURITY.md                  # Security policy
├── CODE_OF_CONDUCT.md           # Community standards
└── CHANGELOG.md                 # Version history

Sources: pyproject.toml L1-L52, setup.py L1-L63, CONTRIBUTING.md L95-L114


Entry Points and CLI Command Architecture

WSHawk exposes four CLI commands alongside a Python API. The commands are defined as setuptools console-script entry points, and each one maps to a specific function in the package.

Entry Point Mapping

| Command | Module Path | Function | Purpose |
| --- | --- | --- | --- |
| wshawk | wshawk.__main__ | cli() | Standard quick scan with all features |
| wshawk-interactive | wshawk.interactive | cli() | Menu-driven interactive testing mode |
| wshawk-advanced | wshawk.advanced_cli | cli() | Advanced mode with granular control flags |
| wshawk-defensive | wshawk.defensive_cli | cli() | Defensive validation for blue teams |
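
Console-script entry points are declared as "name = module:function" strings. The sketch below shows how the four mappings in the table resolve to a module path and a callable name. The dictionary is reconstructed from the table rather than copied from pyproject.toml, and resolve_entry_point is a hypothetical helper introduced only for illustration.

```python
from typing import Dict, Tuple

# Entry-point specs reconstructed from the table above (an assumption, not
# copied from pyproject.toml). Format: "<module path>:<function name>".
CONSOLE_SCRIPTS: Dict[str, str] = {
    "wshawk": "wshawk.__main__:cli",
    "wshawk-interactive": "wshawk.interactive:cli",
    "wshawk-advanced": "wshawk.advanced_cli:cli",
    "wshawk-defensive": "wshawk.defensive_cli:cli",
}


def resolve_entry_point(command: str) -> Tuple[str, str]:
    """Split a console-script spec into (module path, function name)."""
    module_path, _, function_name = CONSOLE_SCRIPTS[command].partition(":")
    return module_path, function_name


if __name__ == "__main__":
    for name in CONSOLE_SCRIPTS:
        module_path, function_name = resolve_entry_point(name)
        print(f"{name} -> {module_path}.{function_name}()")
```

When pip installs the package, it generates a small launcher script per command that imports the named module and calls the named function, which is why every CLI module exposes a cli() callable.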

Sources: pyproject.toml L41-L45, setup.py L41-L47


Core Module Organization

The wshawk/ package contains approximately 15-20 Python modules organized by functional responsibility. Modules are flat (no deep nesting except for mutators/ and payloads/), promoting discoverability.

Module Categories

1. Scanner Engine and Orchestration

| Module | Primary Class/Function | Responsibility |
| --- | --- | --- |
| scanner_v2.py | WSHawkV2 | Core scanner engine, test orchestration, learning phase |
| __main__.py | cli(), WSPayloads | Primary CLI, payload loading, quick scan workflow |
| interactive.py | cli() | Interactive menu system, user-guided testing |
| advanced_cli.py | cli() | Advanced CLI argument parsing, feature flags |
| defensive_cli.py | cli() | Defensive validation CLI orchestration |

2. Intelligence and Context Analysis

| Module | Primary Class/Function | Responsibility |
| --- | --- | --- |
| message_intelligence.py | MessageIntelligence | Message format detection (JSON/XML/binary), field extraction |
| server_fingerprinter.py | ServerFingerprinter | Technology stack detection (database, framework, language) |
| vulnerability_verifier.py | VulnerabilityVerifier | Multi-layer vulnerability verification, false positive reduction |
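
To make the format-detection responsibility concrete, here is a minimal sketch of how a message could be classified as JSON, XML, binary, or plain text. It is not the MessageIntelligence implementation; detect_format and its return labels are hypothetical names chosen for this example.

```python
import json
import xml.etree.ElementTree as ET
from typing import Union


def detect_format(message: Union[str, bytes]) -> str:
    """Classify a message as 'json', 'xml', 'binary', or 'text' (hypothetical helper)."""
    if isinstance(message, bytes):
        # Bytes that decode cleanly as UTF-8 are treated as a text frame.
        try:
            message = message.decode("utf-8")
        except UnicodeDecodeError:
            return "binary"

    stripped = message.strip()

    if stripped.startswith(("{", "[")):
        try:
            json.loads(stripped)
            return "json"
        except json.JSONDecodeError:
            pass  # looked like JSON but did not parse; fall through

    if stripped.startswith("<"):
        # For untrusted input a hardened parser may be preferable to the
        # standard-library one used in this sketch.
        try:
            ET.fromstring(stripped)
            return "xml"
        except ET.ParseError:
            pass

    return "text"
```

Knowing the format up front lets a scanner inject payloads into individual fields instead of replacing the whole message, which is the reason format detection and field extraction are grouped in the same module.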

3. Offensive Testing Modules

| Module | Primary Class/Function | Responsibility |
| --- | --- | --- |
| session_hijacking.py | SessionHijackingTester | 6 session security tests (reuse, spoofing, impersonation, etc.) |
| (Inferred) sql_injection.py | SQL injection detection | SQL injection testing across database types |
| (Inferred) xss_detection.py | XSS detection | XSS testing with browser verification |
| (Inferred) xxe_testing.py | XXE testing | XML External Entity testing with OAST |
| (Inferred) ssrf_testing.py | SSRF detection | Server-Side Request Forgery testing |

4. Defensive Validation Modules

| Module | Primary Class/Function | Responsibility |
| --- | --- | --- |
| defensive_validation.py | DefensiveValidator | Defensive test orchestrator, result aggregation |
| dns_exfiltration.py | DNSExfiltrationTest | DNS exfiltration prevention validation |
| bot_detection.py | BotDetectionTest | Bot detection mechanism validation |
| cswsh_test.py | CSWSHTest | Cross-Site WebSocket Hijacking testing (216+ origins) |
| wss_security.py | WSSSecurityTest | WSS/TLS protocol security validation |

5. Output and Reporting

| Module | Primary Class/Function | Responsibility |
| --- | --- | --- |
| cvss_scorer.py | CVSSScorer | CVSS v3.1 score calculation, severity classification |
| report_generator.py | ReportGenerator | HTML report generation, screenshot embedding |
| logger.py | Module-level functions | Centralized logging with colored output |
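
Severity classification in CVSS v3.1 follows the specification's qualitative rating scale (None, Low, Medium, High, Critical). The function below is an illustrative sketch of that mapping only; severity_from_score is a hypothetical name and does not describe the CVSSScorer interface.

```python
def severity_from_score(score: float) -> str:
    """Map a CVSS v3.1 base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:      # 0.1 - 3.9
        return "Low"
    if score < 7.0:      # 4.0 - 6.9
        return "Medium"
    if score < 9.0:      # 7.0 - 8.9
        return "High"
    return "Critical"    # 9.0 - 10.0
```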

Sources: CONTRIBUTING.md L95-L114


Mutator System Architecture

The mutation system follows an extensible strategy pattern with a base class and concrete implementations. All mutators reside in wshawk/mutators/.

Mutator Base Class Contract

The BaseMutator abstract class defines the interface that all mutators must implement:
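
A minimal sketch of such a contract follows, assuming a single abstract mutate() method that takes a payload string and returns a list of variants. The actual signature in base_mutator.py may differ, and UpperCaseMutator is a hypothetical subclass used only to show the contract being satisfied.

```python
from abc import ABC, abstractmethod
from typing import List


class BaseMutator(ABC):
    """Sketch of the mutator contract; the real signature may differ."""

    @abstractmethod
    def mutate(self, payload: str) -> List[str]:
        """Return zero or more mutated variants of the payload."""


class UpperCaseMutator(BaseMutator):
    """Hypothetical concrete mutator used only for illustration."""

    def mutate(self, payload: str) -> List[str]:
        return [payload.upper()]
```

Because mutate() is abstract, Python refuses to instantiate BaseMutator directly or any subclass that forgets to implement it, so an incomplete mutator fails at construction time instead of midway through a scan.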

Mutator Registration

The factory function create_default_mutators() in wshawk/mutators/__init__.py instantiates and returns all registered mutators. To add a new mutator:

  1. Create wshawk/mutators/new_mutator.py inheriting from BaseMutator
  2. Implement the mutate() method with mutation logic
  3. Add instantiation in create_default_mutators() function
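
The three steps above can be sketched as follows. Everything except the names BaseMutator, mutate(), and create_default_mutators() is hypothetical: the stand-in base class, both example mutators, the list return type of the factory, and the mutate_all helper are assumptions made so the example runs on its own.

```python
from abc import ABC, abstractmethod
from typing import List


class BaseMutator(ABC):
    """Stand-in for the base class in wshawk/mutators/base_mutator.py."""

    @abstractmethod
    def mutate(self, payload: str) -> List[str]:
        ...


class CaseSwapMutator(BaseMutator):
    """Stand-in for a mutator the factory already returns."""

    def mutate(self, payload: str) -> List[str]:
        return [payload.swapcase()]


class WhitespaceMutator(BaseMutator):
    """Steps 1 and 2: a new (hypothetical) mutator that swaps spaces for tabs."""

    def mutate(self, payload: str) -> List[str]:
        return [payload.replace(" ", "\t")]


def create_default_mutators() -> List[BaseMutator]:
    """Step 3: the factory instantiates every mutator, including the new one."""
    return [CaseSwapMutator(), WhitespaceMutator()]


def mutate_all(payload: str) -> List[str]:
    """Collect the variants produced by each registered mutator."""
    variants: List[str] = []
    for mutator in create_default_mutators():
        variants.extend(mutator.mutate(payload))
    return variants
```

Callers that obtain mutators only through the factory pick up the new strategy without any further changes, which is the main benefit of centralizing registration in one function.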

Sources: CONTRIBUTING.md L102-L108


Payload File Organization

Payload files are stored in wshawk/payloads/ and shipped as package data, so they are installed inside the package itself rather than distributed as separate files. This keeps the payloads accessible when WSHawk is installed via pip.

Payload File Types

| File Pattern | Format | Usage | Example |
| --- | --- | --- | --- |
| *.txt | Plain text, one payload per line | Simple vector lists | sqli_payloads.txt, xss_payloads.txt |
| *.json | Structured JSON | Complex payloads with metadata | xxe_payloads.json |
| **/*.json | Nested JSON files | Categorized payloads | sqli/mysql_specific.json |

Package Data Configuration

Both pyproject.toml and setup.py declare payload files as package data, ensuring they are included in distribution packages:

pyproject.toml:

setup.py:
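
As a rough sketch, a package-data declaration in setup.py could take the shape below; the equivalent pyproject.toml declaration lives under the [tool.setuptools.package-data] table. The glob patterns mirror the file-pattern table above and, like the packages list and the SETUP_KWARGS name, are assumptions rather than the repository's actual values.

```python
# Hypothetical setup.py excerpt. The glob patterns and package list are
# assumptions derived from the tables in this document.
PACKAGE_DATA = {
    "wshawk": [
        "payloads/*.txt",      # plain-text payload lists
        "payloads/*.json",     # structured payloads
        "payloads/**/*.json",  # nested files; recursive globs need a newer setuptools
    ],
}

SETUP_KWARGS = {
    "name": "wshawk",
    "packages": ["wshawk", "wshawk.mutators"],
    "package_data": PACKAGE_DATA,
}

# A real setup.py would finish with:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```

The keys of package_data are package names and the patterns are resolved relative to each package directory, which is why the paths start at payloads/ and not at the repository root.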

Payload Loading Pattern

Modules typically load payloads using importlib.resources or pkg_resources to access package data at runtime:
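
A minimal sketch of this pattern using importlib.resources is shown below. The helper names parse_payload_lines and load_text_payloads are hypothetical, and the example assumes the one-payload-per-line text format described in the table above.

```python
import importlib.resources
from typing import List


def parse_payload_lines(text: str) -> List[str]:
    """Split a plain-text payload file into payloads, skipping blank lines.

    Leading and trailing spaces are kept because they can be significant
    inside an attack string.
    """
    return [line for line in text.splitlines() if line.strip()]


def load_text_payloads(package: str, filename: str) -> List[str]:
    """Read a payload file that ships inside an installed package.

    importlib.resources.files() requires Python 3.9 or later; on Python 3.8
    the importlib_resources backport or pkg_resources would be needed.
    """
    resource = importlib.resources.files(package).joinpath(filename)
    return parse_payload_lines(resource.read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Assumes WSHawk is installed and the file name matches the tree above.
    print(len(load_text_payloads("wshawk.payloads", "sqli_payloads.txt")))
```

Going through importlib.resources instead of building paths from __file__ keeps the loader working when the package is installed in a zip archive or another non-filesystem location.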

Sources: pyproject.toml L50-L51, setup.py L50-L55, CONTRIBUTING.md L109-L114


Dependency Architecture

WSHawk has a minimal production dependency footprint with only 4 core dependencies, plus optional development dependencies.

Production Dependencies

| Dependency | Version | Purpose | Used By |
| --- | --- | --- | --- |
| websockets | ≥12.0 | WebSocket client/server implementation | All scanning modules |
| playwright | ≥1.40.0 | Headless browser automation | XSS verification, bot detection |
| aiohttp | ≥3.9.0 | Async HTTP client | OAST integration, HTTP-based tests |
| PyYAML | ≥6.0 | YAML configuration parsing | Authentication sequences, config files |

Development Dependencies

| Dependency | Version | Purpose |
| --- | --- | --- |
| pytest | ≥7.4.0 | Test framework |
| pytest-asyncio | ≥0.21.0 | Async test support |
| colorama | ≥0.4.6 | Terminal color support (optional) |

Python Version Support

WSHawk supports Python 3.8 through 3.13, as declared in both package metadata files. The classifier declarations list explicit support for 3.8, 3.9, 3.10, 3.11, 3.12, and 3.13.

Sources: pyproject.toml L13-L34, setup.py L8-L40, requirements.txt L1-L20


Architectural Patterns

WSHawk employs several design patterns to maintain code quality and extensibility:

1. Strategy Pattern (Mutators)

The BaseMutator abstract class with concrete implementations (CaseMutator, EncodingMutator, etc.) allows runtime selection and composition of mutation strategies. The factory pattern (create_default_mutators()) centralizes mutator instantiation.

2. Facade Pattern (Scanner Engine)

WSHawkV2 acts as a facade over multiple subsystems (intelligence modules, vulnerability tests, session hijacking, reporting), providing a unified interface for all testing operations. CLI commands interact only with the scanner facade, not with individual subsystems.

3. Template Method Pattern (Vulnerability Tests)

Each vulnerability detection module follows a common workflow:

  1. Load payloads from payloads/ directory
  2. Apply mutations via mutator chain
  3. Inject into WebSocket messages
  4. Analyze responses for vulnerability indicators
  5. Calculate CVSS scores
  6. Generate findings

This pattern is implemented consistently across SQL injection, XSS, XXE, SSRF, and other modules.
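
The six-step workflow can be sketched as a single function that receives each step as a parameter. This is an illustration under simplifying assumptions, not the structure of any WSHawk module: the Finding dataclass, run_vulnerability_test, and fake_send are hypothetical names, the transport is a synchronous callable where a real scanner would use an asynchronous WebSocket connection, each mutator is applied to the original payload instead of being chained, and the CVSS score is passed in rather than calculated.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Finding:
    payload: str       # the variant that triggered the indicator
    evidence: str      # the indicator string found in the response
    cvss_score: float


def run_vulnerability_test(
    payloads: Iterable[str],                     # step 1: loaded payloads
    mutators: List[Callable[[str], List[str]]],  # step 2: mutation functions
    send: Callable[[str], str],                  # step 3: inject and get response
    indicators: List[str],                       # step 4: strings that signal a hit
    cvss_score: float,                           # step 5: score for this finding type
) -> List[Finding]:
    findings: List[Finding] = []
    for payload in payloads:
        variants = [payload]
        for mutate in mutators:
            variants.extend(mutate(payload))
        for variant in variants:
            response = send(variant)
            hit = next((i for i in indicators if i in response), None)
            if hit is not None:
                # step 6: record a finding
                findings.append(Finding(variant, hit, cvss_score))
    return findings


def fake_send(message: str) -> str:
    """Stand-in for a WebSocket round trip that 'fails' on a single quote."""
    return "SQL syntax error near ..." if "'" in message else "ok"
```

For example, calling run_vulnerability_test(["' or 1=1--", "hello"], [lambda p: [p.upper()]], fake_send, ["SQL syntax"], 9.8) yields two findings, one for the original payload and one for its upper-cased variant, while "hello" produces none.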

4. Module-Level Separation of Concerns

Each module has a single, well-defined responsibility:

  • message_intelligence.py → Format detection only
  • server_fingerprinter.py → Technology fingerprinting only
  • cvss_scorer.py → CVSS score calculation only

This flat structure (minimal nesting) reduces coupling and improves testability.

5. Package Data Pattern

By bundling payloads as package data rather than external files, WSHawk ensures installation atomicity—a single pip install provides a fully functional scanner without requiring separate payload downloads or configuration.

Sources: CONTRIBUTING.md L42-L114, pyproject.toml L50-L51


Testing Structure

The test suite is located in the tests/ directory, separate from the production package:

tests/
└── test_modules_quick.py    # Quick module validation tests

Tests are executed via pytest, for example by running pytest tests/ from the repository root.

The test file test_modules_quick.py validates core module functionality without requiring external dependencies or live WebSocket servers, enabling fast CI/CD validation.

Sources: CONTRIBUTING.md L50-L61, requirements.txt L14-L16


Configuration Files

WSHawk uses a dual configuration approach for broad compatibility:

pyproject.toml (Modern)

  • Purpose: Primary package metadata using PEP 517/518 standards
  • Build Backend: setuptools.build_meta
  • Advantages: Modern, declarative, standardized

Key sections:

  • [build-system] → Build dependencies
  • [project] → Package metadata, dependencies, entry points
  • [tool.setuptools.package-data] → Package data inclusion
  • [project.urls] → Project links

setup.py (Legacy)

  • Purpose: Backward compatibility with older pip versions
  • Build Backend: Direct setuptools invocation
  • Advantages: Broad compatibility with pip <19.0

setup.py is functionally equivalent to pyproject.toml: both declare the same entry points, dependencies, and package data. The duplication is intended to keep WSHawk installable across pip versions from 18.x to 24.x.

Sources: pyproject.toml L1-L52, setup.py L1-L63


Summary: Navigating the Codebase

To navigate WSHawk's codebase effectively:

  1. Entry Points → Start at wshawk/__main__.py, wshawk/interactive.py, wshawk/advanced_cli.py, or wshawk/defensive_cli.py depending on the CLI command
  2. Core Logic → Examine scanner_v2.py (WSHawkV2 class) for orchestration logic
  3. Intelligence → Review message_intelligence.py, server_fingerprinter.py, and vulnerability_verifier.py for context-aware testing
  4. Offensive Tests → Explore individual vulnerability modules (SQL, XSS, XXE, etc.) and session_hijacking.py
  5. Defensive Tests → See defensive_validation.py and its four sub-tests (DNS, bot, CSWSH, WSS)
  6. Mutations → Navigate to wshawk/mutators/ for WAF bypass strategies
  7. Payloads → Check wshawk/payloads/ for attack vector collections
  8. Output → Review cvss_scorer.py and report_generator.py for result processing

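The flat module structure (a single wshawk/ package directory) and recurring naming conventions (*_mutation.py for mutators, *_cli.py for the advanced and defensive CLI entry points) make the codebase straightforward to navigate. Not every module follows a suffix convention, so the directory tree above remains the most reliable index.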

Sources: CONTRIBUTING.md L1-L155, pyproject.toml L1-L52, setup.py L1-L63