Configuration System

Purpose and Scope

This document describes WSHawk's hierarchical configuration system, which provides a unified interface for managing scanner settings, integration credentials, and operational parameters. The configuration system supports YAML-based configuration files (wshawk.yaml), environment variable resolution, and runtime overrides through the CLI.

For information about integration-specific setup, see Jira Integration, DefectDojo Integration, and Webhook Notifications. For CLI flag documentation, see CLI Command Reference and Advanced CLI Options.


Configuration Architecture

Overview

WSHawk's configuration system is built around the WSHawkConfig class, which manages a hierarchical key-value store with support for:

  • Hierarchical YAML files with dot-notation key access (scanner.rate_limit)
  • Environment variable resolution using the env: prefix
  • File-based secret resolution using the file: prefix
  • Runtime overrides via CLI flags
  • Default fallback values when keys are missing

Configuration Loading Flow

flowchart TD
    CLI["CLI Entry Point<br/>(wshawk, wshawk-advanced)"]
    LoadConfig["WSHawkConfig.load()"]
    CheckFile{"wshawk.yaml<br/>exists?"}
    ParseYAML["Parse YAML<br/>with PyYAML"]
    ApplyDefaults["Apply Default Values"]
    ResolveSecrets["Resolve Secrets<br/>(env:, file: prefixes)"]
    CLIOverride["Apply CLI Flag Overrides<br/>(config.set())"]
    Scanner["WSHawkV2 Scanner<br/>(config.get())"]
    WebApp["Web Dashboard<br/>(config.get())"]
    
    CLI --> LoadConfig
    LoadConfig --> CheckFile
    CheckFile -->|Yes| ParseYAML
    CheckFile -->|No| ApplyDefaults
    ParseYAML --> ResolveSecrets
    ResolveSecrets --> ApplyDefaults
    ApplyDefaults --> CLIOverride
    CLIOverride --> Scanner
    CLIOverride --> WebApp
    
    style LoadConfig fill:#f9f9f9
    style ResolveSecrets fill:#f9f9f9

Sources: wshawk/scanner_v2.py:48-53, wshawk/advanced_cli.py:86-97


Configuration File Format

wshawk.yaml Structure

The configuration file uses YAML format with a hierarchical structure organized into functional sections:

# Scanner Configuration
scanner:
  rate_limit: 10              # Requests per second
  timeout: 5                  # Connection timeout in seconds
  features:
    playwright: false         # Browser-based XSS verification
    oast: true               # Out-of-band vulnerability detection
    binary_analysis: false   # Binary WebSocket message analysis
    smart_payloads: false    # Adaptive payload evolution

# Web Dashboard Configuration
web:
  host: "0.0.0.0"
  port: 5000
  auth:
    enabled: true
    password: "env:WSHAWK_WEB_PASSWORD"  # Resolved from environment
  database: "sqlite:///~/.wshawk/scans.db"

# Reporting Configuration
reporting:
  output_dir: "."
  formats:
    - html
    - json
  include_screenshots: true
  include_traffic_logs: true

# Integration Configuration
integrations:
  jira:
    enabled: false
    url: "https://company.atlassian.net"
    email: "security@company.com"
    api_token: "env:JIRA_API_TOKEN"
    project: "SEC"
    
  defectdojo:
    enabled: false
    url: "https://defectdojo.company.com"
    api_key: "env:DEFECTDOJO_API_KEY"
    product_id: 1
    
  webhooks:
    - url: "env:SLACK_WEBHOOK_URL"
      platform: "slack"
      enabled: true

Sources: README.md:137-150, wshawk/advanced_cli.py:86-97


Generating Configuration Template

WSHawk provides a command to generate a starter configuration template:

python3 -m wshawk.config --generate

This creates wshawk.yaml.example in the current directory. Rename it to wshawk.yaml and customize the values.

Sources: README.md:139-144


WSHawkConfig Class API

Class Methods and Usage

The WSHawkConfig class provides the primary interface for accessing configuration values throughout the codebase.

Configuration API Methods

classDiagram
    class WSHawkConfig {
        +load() WSHawkConfig$
        +get(key, default) Any
        +set(key, value) void
        +resolve_secret(value) str
        -_config dict
        -_load_yaml() dict
        -_apply_defaults() void
    }
    
    class WSHawkV2 {
        +config WSHawkConfig
        +__init__(url, config)
    }
    
    class AdvancedCLI {
        +main() async
    }
    
    class WebApp {
        +run_web(host, port, auth_enabled)
    }
    
    WSHawkV2 --> WSHawkConfig : uses
    AdvancedCLI --> WSHawkConfig : loads & overrides
    WebApp --> WSHawkConfig : reads settings

Sources: wshawk/scanner_v2.py:48-56, wshawk/advanced_cli.py:86-97


Loading Configuration

The configuration is loaded at application startup using the load() class method:

from wshawk.config import WSHawkConfig

# Load configuration from wshawk.yaml (if exists) with defaults
config = WSHawkConfig.load()

Loading Behavior:

  1. Searches for wshawk.yaml in the current directory
  2. If not found, searches in ~/.wshawk/wshawk.yaml
  3. If not found, uses hardcoded defaults
  4. Resolves all secrets (environment variables, file paths)
  5. Applies default values for missing keys
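
The file search in steps 1 to 3 can be sketched as follows. This is a minimal illustration, not WSHawk's actual loader: the function name `find_config_file` and its parameters are hypothetical, and only the two search locations come from the list above.

```python
from pathlib import Path
from typing import Optional


def find_config_file(cwd: Path, home: Path) -> Optional[Path]:
    """Return the first wshawk.yaml found, or None to signal 'use defaults'."""
    candidates = [
        cwd / "wshawk.yaml",               # step 1: current directory
        home / ".wshawk" / "wshawk.yaml",  # step 2: per-user directory
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # step 3: caller falls back to hardcoded defaults


if __name__ == "__main__":
    print(find_config_file(Path.cwd(), Path.home()))
```

Passing the directories in as arguments, rather than reading them inside the function, keeps the search order easy to test.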

Sources: wshawk/scanner_v2.py:48-52, wshawk/advanced_cli.py:87-88


Reading Configuration Values

Configuration values are accessed using dot-notation keys with optional default values:

# Get value with default fallback
rate_limit = config.get('scanner.rate_limit', 10)
output_dir = config.get('reporting.output_dir', '.')
formats = config.get('reporting.formats', ['json'])

# Nested keys
playwright_enabled = config.get('scanner.features.playwright', False)
web_host = config.get('web.host', '127.0.0.1')
web_port = config.get('web.port', 5000)

# Integration settings
jira_enabled = config.get('integrations.jira.enabled', False)
jira_url = config.get('integrations.jira.url')

Key Resolution Rules:

  • Keys use dot-notation to traverse the YAML hierarchy
  • If a key is missing, the default value is returned
  • If no default is provided and key is missing, returns None
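
To make the rules above concrete, here is a small sketch of dot-notation lookup over a nested dictionary. The helper name `get_by_path` is hypothetical and the real `WSHawkConfig.get()` may be implemented differently; the sketch only mirrors the three rules listed.

```python
from typing import Any


def get_by_path(data: dict, key: str, default: Any = None) -> Any:
    """Walk a nested dict using a dot-separated key."""
    node: Any = data
    for part in key.split("."):
        # Stop as soon as the path leaves the dict structure or a key is absent
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


if __name__ == "__main__":
    sample = {"scanner": {"rate_limit": 10, "features": {"oast": True}}}
    print(get_by_path(sample, "scanner.features.oast", False))
    print(get_by_path(sample, "web.port", 5000))
```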

Sources: wshawk/scanner_v2.py:55-56, wshawk/scanner_v2.py:768-796, wshawk/advanced_cli.py:106-110


Setting Configuration Values

Runtime configuration overrides are applied using the set() method:

# Override configuration at runtime (typically from CLI flags)
config.set('web.host', '0.0.0.0')
config.set('web.port', 8080)
config.set('scanner.rate_limit', 50)
config.set('scanner.features.playwright', True)

This is commonly used to apply CLI flag values over the YAML configuration.
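
As an illustration of how a dot-notation override can be written into a nested structure, the sketch below creates intermediate sections on demand. The helper name `set_by_path` is hypothetical, and the behaviour of creating missing sections is an assumption; the source does not say how `WSHawkConfig.set()` handles keys whose parent section is absent.

```python
from typing import Any


def set_by_path(data: dict, key: str, value: Any) -> None:
    """Assign value at a dot-separated key, creating parent dicts as needed."""
    parts = key.split(".")
    node = data
    for part in parts[:-1]:
        child = node.get(part)
        if not isinstance(child, dict):
            # Assumption: a missing or non-dict parent is replaced by a new section
            child = {}
            node[part] = child
        node = child
    node[parts[-1]] = value


if __name__ == "__main__":
    cfg = {"web": {"host": "127.0.0.1", "port": 5000}}
    set_by_path(cfg, "web.port", 8080)
    set_by_path(cfg, "scanner.features.playwright", True)
    print(cfg)
```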

Sources: wshawk/advanced_cli.py:91-97


Secret Resolution

Environment Variable Resolution

Sensitive values like API tokens and passwords should not be hardcoded in wshawk.yaml. Instead, use the env: prefix to resolve values from environment variables at runtime:

integrations:
  jira:
    api_token: "env:JIRA_API_TOKEN"
  defectdojo:
    api_key: "env:DEFECTDOJO_API_KEY"
    
web:
  auth:
    password: "env:WSHAWK_WEB_PASSWORD"

Resolution Process:

flowchart LR
    ConfigValue["Config Value:<br/>'env:JIRA_API_TOKEN'"]
    CheckPrefix{"Starts with<br/>'env:'?"}
    ExtractName["Extract var name:<br/>'JIRA_API_TOKEN'"]
    GetEnv["os.environ.get()"]
    CheckExists{"Exists?"}
    ReturnValue["Return resolved value"]
    ReturnOriginal["Return original string"]
    
    ConfigValue --> CheckPrefix
    CheckPrefix -->|Yes| ExtractName
    CheckPrefix -->|No| ReturnOriginal
    ExtractName --> GetEnv
    GetEnv --> CheckExists
    CheckExists -->|Yes| ReturnValue
    CheckExists -->|No| ReturnOriginal
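
The flow above can be sketched in a few lines of Python. The function name `resolve_env_reference` and the optional `environ` parameter are hypothetical; only the `env:` prefix, the use of `os.environ.get()`, and the fall-back to the original string come from the diagram.

```python
import os
from typing import Any, Mapping, Optional

ENV_PREFIX = "env:"


def resolve_env_reference(value: Any, environ: Optional[Mapping[str, str]] = None) -> Any:
    """Resolve 'env:NAME' from the environment; return other values unchanged."""
    environ = os.environ if environ is None else environ
    if not isinstance(value, str) or not value.startswith(ENV_PREFIX):
        return value  # not a reference
    name = value[len(ENV_PREFIX):]
    resolved = environ.get(name)
    # Per the diagram, an unset variable leaves the original string in place
    return value if resolved is None else resolved


if __name__ == "__main__":
    print(resolve_env_reference("env:JIRA_API_TOKEN", {"JIRA_API_TOKEN": "example"}))
```

Because an unset variable leaves the literal `env:` string in place, callers may want to check for the prefix before sending the value to an external service.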

Sources: README.md:144-149


File-Based Secret Resolution

For secrets stored in files (e.g., Docker secrets, Kubernetes secrets), use the file: prefix:

integrations:
  jira:
    api_token: "file:/run/secrets/jira_token"
  defectdojo:
    api_key: "file:/var/secrets/defectdojo_key"

The configuration loader reads the file contents and uses them as the configuration value.
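
A minimal sketch of this behaviour follows. The function name `resolve_file_reference` is hypothetical, and two details are assumptions rather than documented behaviour: surrounding whitespace is stripped (secret files often end with a newline), and an unreadable file leaves the original string unchanged, mirroring the `env:` fallback.

```python
from pathlib import Path
from typing import Any

FILE_PREFIX = "file:"


def resolve_file_reference(value: Any) -> Any:
    """Resolve 'file:/path' to the file's contents; return other values unchanged."""
    if not isinstance(value, str) or not value.startswith(FILE_PREFIX):
        return value
    path = Path(value[len(FILE_PREFIX):])
    try:
        # Assumption: strip the trailing newline that secret files usually carry
        return path.read_text(encoding="utf-8").strip()
    except OSError:
        # Assumption: keep the literal reference if the file cannot be read
        return value
```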

Sources: README.md:143-149


Configuration Hierarchy and Precedence

WSHawk uses a multi-layered configuration system with the following precedence (highest to lowest):

Configuration Precedence Order

| Priority | Source | Scope | Example |
|----------|--------|-------|---------|
| 1 | CLI Flags | Single run | --rate 50, --port 8080 |
| 2 | Environment Variables | Process/container | WSHAWK_WEB_PASSWORD=secret |
| 3 | wshawk.yaml (current dir) | Project-specific | ./wshawk.yaml |
| 4 | wshawk.yaml (home dir) | User-specific | ~/.wshawk/wshawk.yaml |
| 5 | Hardcoded Defaults | Built-in | rate_limit=10, port=5000 |

Configuration Resolution Flow:

flowchart TD
    Request["Component requests config value<br/>(config.get('scanner.rate_limit'))"]
    
    CLI{"CLI Override<br/>exists?"}
    YAML{"wshawk.yaml<br/>value exists?"}
    EnvVar{"Environment<br/>variable exists?"}
    Default["Use Default Value<br/>(from method parameter)"]
    
    Return["Return resolved value"]
    
    Request --> CLI
    CLI -->|Yes| Return
    CLI -->|No| YAML
    YAML -->|Yes| EnvVar
    YAML -->|No| Default
    EnvVar -->|Needs resolution| Return
    EnvVar -->|Direct value| Return
    Default --> Return
    
    style Request fill:#f9f9f9
    style Return fill:#f9f9f9
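
One way to picture layered precedence is as a series of deep merges, where each later layer overrides the earlier ones. The sketch below is illustrative only: `deep_merge` and `layer_config` are hypothetical names, and it simplifies to three layers (defaults, one YAML file, CLI overrides). How WSHawk combines its layers internally is not shown in the cited sources.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where override's values win, merging nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def layer_config(*layers: dict) -> dict:
    """Merge layers from lowest to highest priority."""
    result: dict = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


if __name__ == "__main__":
    defaults = {"scanner": {"rate_limit": 10, "timeout": 5}, "web": {"port": 5000}}
    yaml_values = {"scanner": {"rate_limit": 25}}
    cli_overrides = {"web": {"port": 8080}}
    print(layer_config(defaults, yaml_values, cli_overrides))
```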

Sources: wshawk/advanced_cli.py:86-97, wshawk/scanner_v2.py:48-56


Configuration Schema Reference

Scanner Configuration

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| scanner.rate_limit | int | 10 | Maximum requests per second |
| scanner.timeout | int | 5 | Connection timeout in seconds |
| scanner.features.playwright | bool | false | Enable browser-based XSS verification |
| scanner.features.oast | bool | true | Enable Out-of-Band vulnerability detection |
| scanner.features.binary_analysis | bool | false | Enable binary message analysis |
| scanner.features.smart_payloads | bool | false | Enable adaptive payload evolution |


Web Dashboard Configuration

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| web.host | string | "127.0.0.1" | Dashboard bind address |
| web.port | int | 5000 | Dashboard port |
| web.auth.enabled | bool | true | Enable password authentication |
| web.auth.password | string | - | Dashboard password (use env: prefix) |
| web.database | string | "~/.wshawk/scans.db" | SQLite database path |


Reporting Configuration

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| reporting.output_dir | string | "." | Directory for report files |
| reporting.formats | list | ["html", "json"] | Export formats (html, json, csv, sarif) |
| reporting.include_screenshots | bool | true | Include XSS verification screenshots |
| reporting.include_traffic_logs | bool | true | Include raw WebSocket traffic |


Integration Configuration

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| integrations.jira.enabled | bool | false | Enable Jira integration |
| integrations.jira.url | string | - | Jira instance URL |
| integrations.jira.api_token | string | - | Jira API token (use env: prefix) |
| integrations.jira.project | string | "SEC" | Default project key |
| integrations.defectdojo.enabled | bool | false | Enable DefectDojo integration |
| integrations.defectdojo.url | string | - | DefectDojo instance URL |
| integrations.defectdojo.api_key | string | - | DefectDojo API key (use env: prefix) |
| integrations.webhooks | list | [] | Webhook configurations |
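
The Type column in these schema tables lends itself to a quick sanity check on a parsed configuration. The sketch below is not part of WSHawk; `EXPECTED_TYPES`, `lookup`, and `find_type_errors` are hypothetical names, and only a handful of keys from the tables are included. It treats missing keys as acceptable, since defaults cover them.

```python
from typing import Any, List

# A subset of the schema tables, mapped to Python types (illustrative only)
EXPECTED_TYPES = {
    "scanner.rate_limit": int,
    "scanner.timeout": int,
    "scanner.features.playwright": bool,
    "web.host": str,
    "web.port": int,
    "reporting.formats": list,
    "integrations.jira.enabled": bool,
}


def lookup(data: Any, key: str) -> Any:
    """Return the value at a dot-separated key, or None if absent."""
    node = data
    for part in key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def find_type_errors(data: dict) -> List[str]:
    """List keys whose values do not match the expected type."""
    errors = []
    for key, expected in EXPECTED_TYPES.items():
        value = lookup(data, key)
        if value is None:
            continue  # missing keys fall back to defaults
        # bool is a subclass of int in Python, so reject True/False for int keys
        is_bool_for_int = expected is int and isinstance(value, bool)
        if not isinstance(value, expected) or is_bool_for_int:
            errors.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return errors
```

A common mistake this catches is quoting a number in YAML (`rate_limit: "10"`), which parses as a string rather than an integer.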

Sources: wshawk/scanner_v2.py:808-848


Configuration Usage Patterns

Pattern 1: Scanner Initialization with Config

from wshawk.scanner_v2 import WSHawkV2
from wshawk.config import WSHawkConfig

# Load global configuration
config = WSHawkConfig.load()

# Pass to scanner (scanner respects config settings)
scanner = WSHawkV2("ws://target.com", config=config)

# Scanner automatically reads:
# - rate_limit from config.get('scanner.rate_limit', 10)
# - output_dir from config.get('reporting.output_dir', '.')
# - formats from config.get('reporting.formats', ['json'])

Sources: wshawk/scanner_v2.py:40-56


Pattern 2: CLI Flag Overrides

from wshawk.config import WSHawkConfig
from wshawk.scanner_v2 import WSHawkV2

# Load base configuration
config = WSHawkConfig.load()

# Apply CLI overrides (from argparse)
if args.host:
    config.set('web.host', args.host)
if args.port:
    config.set('web.port', args.port)
if args.rate:
    config.set('scanner.rate_limit', args.rate)
if args.playwright:
    config.set('scanner.features.playwright', True)

# Now all components use overridden values
scanner = WSHawkV2(url, config=config)
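
The `args` object above comes from argparse. The following sketch shows one way the flags could be declared and turned into dot-notation overrides. The parser definition is hypothetical (the real one in wshawk/advanced_cli.py is not reproduced here), and `collect_overrides` is an illustrative helper. It tests `is not None` instead of truthiness so that an explicit falsy value such as `--port 0` is not silently dropped.

```python
import argparse
from typing import Any, Dict


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering the flags used in this pattern."""
    parser = argparse.ArgumentParser(prog="wshawk-advanced")
    parser.add_argument("--host", default=None)
    parser.add_argument("--port", type=int, default=None)
    parser.add_argument("--rate", type=int, default=None)
    parser.add_argument("--playwright", action="store_true")
    return parser


def collect_overrides(args: argparse.Namespace) -> Dict[str, Any]:
    """Map only the flags the user actually passed to config keys."""
    overrides: Dict[str, Any] = {}
    if args.host is not None:
        overrides["web.host"] = args.host
    if args.port is not None:
        overrides["web.port"] = args.port
    if args.rate is not None:
        overrides["scanner.rate_limit"] = args.rate
    if args.playwright:
        overrides["scanner.features.playwright"] = True
    return overrides


if __name__ == "__main__":
    parsed = build_parser().parse_args(["--rate", "20", "--playwright"])
    print(collect_overrides(parsed))
```

Each entry of the returned mapping can then be passed to `config.set(key, value)` before the scanner is constructed.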

Sources: wshawk/advanced_cli.py:86-97


Pattern 3: Integration Credential Resolution

# In wshawk.yaml:
# integrations:
#   jira:
#     api_token: "env:JIRA_API_TOKEN"

# At runtime:
from wshawk.config import WSHawkConfig

config = WSHawkConfig.load()

# Value is automatically resolved from environment
jira_token = config.get('integrations.jira.api_token')
# If the JIRA_API_TOKEN env var is set, jira_token contains its value.
# Otherwise, per the resolution flow described earlier, the literal string
# "env:JIRA_API_TOKEN" comes back, so check for the prefix before using it.

Sources: README.md:144-149


Pattern 4: Web Dashboard Configuration

from wshawk.config import WSHawkConfig

config = WSHawkConfig.load()

# Read web settings
host = config.get('web.host', '127.0.0.1')
port = config.get('web.port', 5000)
auth_enabled = config.get('web.auth.enabled', True)
auth_password = config.get('web.auth.password')  # Resolved from env var

# Database path handling
db_path = config.get('web.database')
if db_path and db_path.startswith('sqlite:///'):
    clean_db_path = db_path.replace('sqlite:///', '')
else:
    clean_db_path = None

# Launch dashboard with resolved configuration
run_web(host=host, port=port, auth_enabled=auth_enabled, 
        auth_password=auth_password, db_path=clean_db_path)
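
The database path handling above can be factored into a small helper. This sketch uses a hypothetical function name, `sqlite_url_to_path`, and adds one assumption that the snippet above does not make: a leading `~` is expanded to the user's home directory, since the example configuration uses `sqlite:///~/.wshawk/scans.db`. Whether WSHawk expands `~` elsewhere is not stated in the source.

```python
import os
from typing import Optional

SQLITE_PREFIX = "sqlite:///"


def sqlite_url_to_path(db_url: Optional[str]) -> Optional[str]:
    """Strip the sqlite:/// prefix and expand '~'; return None for other values."""
    if not db_url or not db_url.startswith(SQLITE_PREFIX):
        return None
    # Slice off only the leading prefix instead of replacing every occurrence
    raw_path = db_url[len(SQLITE_PREFIX):]
    return os.path.expanduser(raw_path)


if __name__ == "__main__":
    print(sqlite_url_to_path("sqlite:///~/.wshawk/scans.db"))
```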

Sources: wshawk/advanced_cli.py:106-123


Configuration in Different Modes

Quick Scan Mode

wshawk ws://target.com

Uses default configuration with no customization required.


Advanced CLI Mode

wshawk-advanced ws://target.com --smart-payloads --playwright --rate 20

  • Loads wshawk.yaml if present
  • Applies CLI flag overrides for smart_payloads, playwright, and rate
  • Scanner respects all configuration values

Sources: wshawk/advanced_cli.py:1-299


Web Dashboard Mode

export WSHAWK_WEB_PASSWORD='secure-password'
wshawk --web --port 8080

  • Loads configuration for web dashboard settings
  • Resolves password from environment variable
  • Applies port override from CLI flag

Sources: README.md:117-127, wshawk/advanced_cli.py:100-124


Docker Deployment

# Use an absolute host path for the bind mount; older Docker versions reject relative paths
docker run -e WSHAWK_WEB_PASSWORD='secure' \
           -v "$(pwd)/wshawk.yaml:/app/wshawk.yaml" \
           rothackers/wshawk --web

  • Mounts custom configuration file into container
  • Environment variables override YAML values
  • All integrations work via environment-resolved secrets

Sources: README.md:64-78


Best Practices

Security Best Practices

  1. Never commit secrets to wshawk.yaml

    # BAD - Secret in config file
    jira:
      api_token: "actual-secret-token-123"
    
    # GOOD - Environment variable reference
    jira:
      api_token: "env:JIRA_API_TOKEN"
    
  2. Use file-based secrets in orchestration environments

    # For Kubernetes/Docker Swarm secrets
    jira:
      api_token: "file:/run/secrets/jira_token"
    
  3. Set restrictive permissions on configuration files

    chmod 600 ~/.wshawk/wshawk.yaml
    

Sources: README.md:122-127


Configuration Organization

  1. Project-specific settings → ./wshawk.yaml (version controlled, no secrets)
  2. User-specific settings → ~/.wshawk/wshawk.yaml (personal defaults)
  3. Secrets → Environment variables or secret files
  4. Runtime overrides → CLI flags

This separation ensures configuration is portable and secure.


Multi-Environment Configuration

For different environments (dev, staging, production), use environment-specific configuration files:

# Development
cp wshawk.dev.yaml wshawk.yaml
wshawk-advanced ws://dev.target.com

# Production
cp wshawk.prod.yaml wshawk.yaml
export WSHAWK_WEB_PASSWORD='prod-password'
wshawk --web

Alternatively, keep a single wshawk.yaml and let its env: references pick up different values in each environment.


Troubleshooting

Configuration Not Loading

Symptom: Scanner uses default values instead of YAML configuration

Causes:

  1. wshawk.yaml not in current directory or ~/.wshawk/
  2. YAML syntax errors
  3. Incorrect key names (dot-notation)

Solution:

# Verify file location
ls -la wshawk.yaml

# Validate YAML syntax
python3 -c "import yaml; yaml.safe_load(open('wshawk.yaml'))"

# Check key names match schema
grep -r "config.get(" wshawk/

Secret Resolution Failures

Symptom: Configuration values are literal strings like "env:JIRA_TOKEN" instead of resolved values

Causes:

  1. Environment variable not set
  2. Incorrect env: prefix syntax
  3. Typo in variable name

Solution:

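# Verify the environment variable is set (without printing the secret)
[ -n "$JIRA_API_TOKEN" ] && echo "JIRA_API_TOKEN is set" || echo "JIRA_API_TOKEN is NOT set"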

# Check for typos in wshawk.yaml
grep "env:" wshawk.yaml

# Set missing variables
export JIRA_API_TOKEN='your-token-here'
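
To locate every unresolved reference at once, a loaded configuration can be walked recursively. The sketch below is a hypothetical diagnostic helper (`find_unresolved` is not part of WSHawk) that operates on a plain nested dict or list, such as the result of parsing wshawk.yaml. It assumes that any string still starting with `env:` or `file:` after loading was not resolved.

```python
from typing import Any, List, Tuple

REFERENCE_PREFIXES = ("env:", "file:")


def find_unresolved(data: Any, prefix: str = "") -> List[Tuple[str, str]]:
    """Return (key path, literal value) pairs for values that still look like references."""
    unresolved: List[Tuple[str, str]] = []
    if isinstance(data, dict):
        for key, value in data.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            unresolved += find_unresolved(value, path)
    elif isinstance(data, list):
        for index, item in enumerate(data):
            unresolved += find_unresolved(item, f"{prefix}[{index}]")
    elif isinstance(data, str) and data.startswith(REFERENCE_PREFIXES):
        unresolved.append((prefix, data))
    return unresolved


if __name__ == "__main__":
    cfg = {"integrations": {"jira": {"api_token": "env:JIRA_API_TOKEN"}}}
    for path, literal in find_unresolved(cfg):
        print(f"{path} is unresolved: {literal}")
```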

CLI Override Not Working

Symptom: CLI flags don't override YAML values

Cause: CLI flag applied after component initialization

Solution: Ensure config.set() is called before passing config to scanner:

# Correct order
config = WSHawkConfig.load()
config.set('scanner.rate_limit', args.rate)  # Override BEFORE scanner init
scanner = WSHawkV2(url, config=config)

Sources: wshawk/advanced_cli.py:86-172


Summary

The WSHawk configuration system provides:

  • Hierarchical YAML configuration with wshawk.yaml
  • Secret resolution via env: and file: prefixes
  • Multi-layer precedence (CLI → YAML → defaults)
  • Dot-notation key access (config.get('scanner.rate_limit'))
  • Runtime overrides via config.set()
  • Secure credential management without hardcoded secrets

This design ensures WSHawk can be deployed in diverse environments (local development, Docker, Kubernetes, CI/CD) while maintaining security and flexibility.

Sources: README.md:137-150, wshawk/scanner_v2.py:40-56, wshawk/advanced_cli.py:86-123