Building from Source

This document provides detailed instructions for building WSHawk from source code, including Python package builds and Docker image builds. This is intended for developers who need to modify WSHawk, contribute patches, or create custom builds. For standard installation methods, see Installation Methods.

Prerequisites

Python Environment

WSHawk requires Python 3.8 or later. The build system supports the following Python versions:

| Python Version | Support Status | Build System |
|----------------|----------------|--------------|
| 3.8 - 3.13 | Production/Stable | setuptools + wheel |
| < 3.8 | Not Supported | N/A |

Sources: pyproject.toml:13, setup.py:39
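
A build script can enforce this minimum before invoking setuptools. The following is a minimal sketch; the helper name `is_supported_python` is hypothetical and not part of WSHawk.

```python
import sys

# Minimum interpreter version stated above (requires-python >= 3.8).
MINIMUM_PYTHON = (3, 8)


def is_supported_python(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the given (major, minor, ...) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "not supported"
    print(f"Python {current} is {status} for building WSHawk")
```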

System Dependencies

The following system packages are required during the build process:

# Debian/Ubuntu
apt-get install -y gcc ca-certificates

# RHEL/CentOS
yum install -y gcc ca-certificates

# Alpine
apk add gcc musl-dev ca-certificates

The gcc compiler is needed when pip has to compile native extensions for dependencies such as websockets, for example when no prebuilt wheel is available for the platform. The ca-certificates package is required for TLS/SSL connectivity.

Sources: Dockerfile:10-12, Dockerfile:29-31

Build Tools

Install Python build dependencies:

pip install --upgrade pip setuptools wheel

The build system is defined in pyproject.toml:1-3 and requires:

  • setuptools>=61.0
  • wheel

Sources: pyproject.toml:1-3


Cloning the Repository

git clone https://github.com/regaan/wshawk.git
cd wshawk

The repository structure includes:

wshawk/
├── wshawk/              # Main package directory
│   ├── __init__.py
│   ├── __main__.py
│   ├── scanner_v2.py
│   ├── payloads/        # Payload collections
│   └── web/             # Web dashboard assets
├── setup.py             # Traditional setuptools config
├── pyproject.toml       # Modern PEP 517 config
├── MANIFEST.in          # Package data specification
├── Dockerfile           # Container build definition
└── README.md

Sources: setup.py:24, MANIFEST.in:1-6


Build Methods

Method 1: Editable Development Install

For active development, install WSHawk in editable mode so that code changes are immediately reflected:

pip install -e .

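This links the installed package to the source directory rather than copying files (setuptools typically does this through a .pth file or an import hook, not literal symlinks). The entry points defined in pyproject.toml:42-46 become available: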

  • wshawk → wshawk.__main__:cli
  • wshawk-interactive → wshawk.interactive:cli
  • wshawk-advanced → wshawk.advanced_cli:cli
  • wshawk-defensive → wshawk.defensive_cli:cli

Sources: pyproject.toml:42-46, setup.py:41-47
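
To confirm which console scripts an install actually registered, the package metadata can be read with the standard library. This is a sketch only; `console_scripts` is a hypothetical helper name, and it returns an empty mapping when the distribution is not installed.

```python
from importlib import metadata


def console_scripts(dist_name):
    """Map console script names to 'module:function' targets for one distribution.

    Returns an empty dict when the distribution is not installed.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return {}
    return {
        ep.name: ep.value
        for ep in dist.entry_points
        if ep.group == "console_scripts"
    }


if __name__ == "__main__":
    scripts = console_scripts("wshawk")
    if not scripts:
        print("wshawk is not installed in this environment")
    for name, target in sorted(scripts.items()):
        print(f"{name} -> {target}")
```

After an editable install, the output would be expected to list the four commands above with their `module:function` targets.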

Method 2: Build Wheel Package

To create a distributable wheel package:

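# Build the wheel (and an sdist) in the dist/ directory; requires: pip install build
python -m build

# Or using setup.py directly (direct invocation is deprecated by setuptools)
python setup.py bdist_wheel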

This creates dist/wshawk-3.0.0-py3-none-any.whl. Install it with:

pip install dist/wshawk-3.0.0-py3-none-any.whl

The wheel includes all package data specified in pyproject.toml:51-57:

[tool.setuptools.package-data]
"wshawk" = [
    "payloads/*.txt",
    "payloads/**/*.json",
    "web/templates/*.html",
    "web/static/*"
]

Sources: pyproject.toml:5-7, pyproject.toml:51-57
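
Because a wheel is a zip archive, its contents can be inspected to check that the package data made it in. The sketch below assumes the data lives under `wshawk/payloads/` and `wshawk/web/` as described above; `wheel_data_files` is a hypothetical helper name.

```python
import zipfile


def wheel_data_files(wheel_path, package="wshawk", data_dirs=("payloads", "web")):
    """Return the archive members that live under the package's data directories."""
    prefixes = tuple(f"{package}/{d}/" for d in data_dirs)
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name
            for name in wheel.namelist()
            # Skip directory entries; keep only real files under the data dirs.
            if name.startswith(prefixes) and not name.endswith("/")
        )
```

Calling `wheel_data_files("dist/wshawk-3.0.0-py3-none-any.whl")` after a build should return a non-empty list; an empty list suggests the package-data patterns did not match anything.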

Method 3: Source Distribution

Create a source distribution (.tar.gz):

python setup.py sdist

This creates dist/wshawk-3.0.0.tar.gz and includes files specified in MANIFEST.in:1-6:

README.md
LICENSE
requirements.txt
wshawk/payloads/*
wshawk/web/templates/*
wshawk/web/static/*

Sources: MANIFEST.in:1-6, setup.py:49
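
A similar check works for the source distribution, which is a gzip-compressed tar archive with a single root folder. This is a rough sketch; `missing_from_sdist` is a hypothetical helper, and the expected file list is taken from the top-level entries shown above.

```python
import tarfile

# Top-level files listed in the MANIFEST.in excerpt above.
EXPECTED_FILES = ("README.md", "LICENSE", "requirements.txt")


def missing_from_sdist(sdist_path, expected=EXPECTED_FILES):
    """Return expected files that are absent from the sdist's root folder."""
    with tarfile.open(sdist_path, "r:gz") as archive:
        names = archive.getnames()
    # An sdist stores everything under one root folder such as "wshawk-3.0.0/".
    relative = {name.split("/", 1)[1] for name in names if "/" in name}
    return [item for item in expected if item not in relative]
```

An empty result from `missing_from_sdist("dist/wshawk-3.0.0.tar.gz")` indicates that the top-level files were packaged.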


Build Process Diagram

graph TB
    subgraph "Source Repository"
        Git["GitHub Repository<br/>regaan/wshawk"]
        Setup["setup.py<br/>setuptools config"]
        Pyproject["pyproject.toml<br/>PEP 517 config"]
        Manifest["MANIFEST.in<br/>package data spec"]
    end
    
    subgraph "Build Configuration"
        BuildSys["build-system<br/>setuptools.build_meta"]
        Deps["dependencies<br/>websockets>=12.0<br/>playwright>=1.40.0<br/>aiohttp>=3.9.0<br/>PyYAML>=6.0<br/>flask>=3.0.0"]
        PkgData["package_data<br/>payloads/*.txt<br/>web/templates/*.html<br/>web/static/*"]
        EntryPts["entry_points<br/>wshawk<br/>wshawk-interactive<br/>wshawk-advanced<br/>wshawk-defensive"]
    end
    
    subgraph "Build Process"
        Editable["pip install -e .<br/>Editable Install<br/>Links to source"]
        Wheel["python -m build<br/>Wheel Build<br/>wshawk-3.0.0-py3-none-any.whl"]
        Sdist["python setup.py sdist<br/>Source Distribution<br/>wshawk-3.0.0.tar.gz"]
    end
    
    subgraph "Package Artifacts"
        DevInstall["Development Install<br/>Immediate code changes<br/>CLI commands available"]
        WheelFile["dist/wshawk-3.0.0-py3-none-any.whl<br/>Binary distribution<br/>Fast install"]
        TarFile["dist/wshawk-3.0.0.tar.gz<br/>Source archive<br/>PyPI upload"]
    end
    
    Git --> Setup
    Git --> Pyproject
    Git --> Manifest
    
    Pyproject --> BuildSys
    Pyproject --> Deps
    Pyproject --> PkgData
    Pyproject --> EntryPts
    Setup --> Deps
    Setup --> EntryPts
    Manifest --> PkgData
    
    BuildSys --> Editable
    BuildSys --> Wheel
    BuildSys --> Sdist
    
    Editable --> DevInstall
    Wheel --> WheelFile
    Sdist --> TarFile

Sources: pyproject.toml:1-58, setup.py:1-65, MANIFEST.in:1-6


Dependency Resolution

The build process resolves dependencies from pyproject.toml:29-35:

dependencies = [
    "websockets>=12.0",
    "playwright>=1.40.0",
    "aiohttp>=3.9.0",
    "PyYAML>=6.0",
    "flask>=3.0.0",
]

Core Dependencies

| Package | Version | Purpose |
|---------|---------|---------|
| websockets | >=12.0 | WebSocket client protocol |
| playwright | >=1.40.0 | Headless browser for XSS verification |
| aiohttp | >=3.9.0 | Async HTTP for OAST integration |
| PyYAML | >=6.0 | Configuration file parsing |
| flask | >=3.0.0 | Web dashboard backend |

Sources: pyproject.toml:29-35, setup.py:9-14
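
The minimum versions above can be checked against the current environment with `importlib.metadata`. The sketch below is deliberately simplified: it compares only the leading numeric release segment and ignores pre-release tags, so it is not a full PEP 440 comparison. All helper names are hypothetical.

```python
import re
from importlib import metadata

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "websockets": "12.0",
    "playwright": "1.40.0",
    "aiohttp": "3.9.0",
    "PyYAML": "6.0",
    "flask": "3.0.0",
}


def release_tuple(version):
    """Parse the leading numeric release segment, e.g. '3.9.0rc1' -> (3, 9, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()


def meets_minimum(found, minimum):
    """Compare two version strings by their numeric release segments."""
    a, b = release_tuple(found), release_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that "6" and "6.0" compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def unmet_requirements(minimums=MINIMUMS, lookup=installed_version):
    """Return {package: (installed, minimum)} for missing or too-old packages."""
    problems = {}
    for name, minimum in minimums.items():
        found = lookup(name)
        if found is None or not meets_minimum(found, minimum):
            problems[name] = (found, minimum)
    return problems
```

Calling `unmet_requirements()` with no arguments inspects the active environment; an empty dict means every listed dependency satisfies its minimum under this simplified comparison.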

Playwright Browser Installation

After installation, Playwright browsers must be installed separately:

# Install Chromium for XSS verification
playwright install chromium

# Install all browsers (optional)
playwright install

This is referenced in the Dockerfile but not automated during pip install. The HeadlessBrowserXSSVerifier class in the scanner requires this step for browser-based verification.

Sources: Dockerfile:28-31


Docker Image Build

Multi-Stage Build Process

The Dockerfile:1-66 implements a two-stage build:

graph LR
    subgraph "Stage 1: Builder"
        Base1["python:3.11-slim<br/>Base Image"]
        BuildDeps["apt-get install<br/>gcc"]
        CopySource["COPY setup.py pyproject.toml<br/>COPY wshawk/"]
        PipInstall["pip install .<br/>Build & Install"]
    end
    
    subgraph "Stage 2: Runtime"
        Base2["python:3.11-slim<br/>Fresh Base"]
        RuntimeDeps["apt-get install<br/>ca-certificates"]
        CopyArtifacts["COPY --from=builder<br/>/usr/local/lib/python3.11/site-packages<br/>/usr/local/bin/wshawk*"]
        NonRoot["useradd wshawk:1000<br/>Non-root security"]
    end
    
    subgraph "Final Image"
        Labels["OCI Labels<br/>maintainer, version<br/>source, description"]
        EntryPoint["ENTRYPOINT wshawk<br/>CMD --help"]
        HealthCheck["HEALTHCHECK<br/>wshawk --help"]
    end
    
    Base1 --> BuildDeps
    BuildDeps --> CopySource
    CopySource --> PipInstall
    
    Base2 --> RuntimeDeps
    RuntimeDeps --> CopyArtifacts
    PipInstall --> CopyArtifacts
    CopyArtifacts --> NonRoot
    
    NonRoot --> Labels
    Labels --> EntryPoint
    EntryPoint --> HealthCheck

Sources: Dockerfile:1-66

Building the Image

# Build with default tag
docker build -t wshawk:3.0.0 .

# Build with cache disabled
docker build --no-cache -t wshawk:3.0.0 .

# Multi-architecture build (requires buildx; add --push or another output option to keep the result)
docker buildx build --platform linux/amd64,linux/arm64 -t wshawk:3.0.0 .

The multi-stage build reduces final image size by excluding build dependencies like gcc from the runtime image.

Sources: Dockerfile:4-21, Dockerfile:23-41

Image Metadata

The Dockerfile includes labels from the OCI (Open Container Initiative) image annotation namespace for registry integration:

LABEL org.opencontainers.image.source="https://github.com/regaan/wshawk"
LABEL org.opencontainers.image.description="Professional WebSocket security scanner..."
LABEL org.opencontainers.image.licenses="MIT"
LABEL org.opencontainers.image.version="3.0.0"

These labels enable GitHub Container Registry (GHCR) integration and metadata display in Docker Hub.

Sources: Dockerfile:55-65

Security Hardening

The Dockerfile implements security best practices:

  1. Non-Root User: Dockerfile:38-41

    RUN useradd -m -u 1000 wshawk && \
        chown -R wshawk:wshawk /app
    USER wshawk
    
  2. Minimal Base Image: Uses python:3.11-slim instead of the full Python image

  3. Layer Optimization: Multi-stage build excludes build dependencies from runtime

  4. Health Check: Dockerfile:48-49

    HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
        CMD wshawk --help || exit 1
    

Sources: Dockerfile:23, Dockerfile:38-41, Dockerfile:48-49


Package Data Inclusion

Payload Files

The build system includes all payload files using glob patterns in pyproject.toml:52-57:

wshawk/payloads/*.txt          → Static text payloads
wshawk/payloads/**/*.json      → JSON payload collections (nested)

These files are referenced by the WSPayloads class during scanning operations.

Sources: pyproject.toml:52-54, MANIFEST.in:4

Web Dashboard Assets

The Flask-based web dashboard requires template and static files:

wshawk/web/templates/*.html    → Jinja2 templates
wshawk/web/static/*            → CSS, JavaScript, images

The include_package_data=True directive in setup.py:49 ensures these are bundled.

Sources: pyproject.toml:55-56, setup.py:50-56, MANIFEST.in:5-6

Verification

After installation, verify package data is present:

python -c "import os, wshawk; p = os.path.join(os.path.dirname(wshawk.__file__), 'payloads'); print(p, os.path.isdir(p))"

This should print the path to the installed payloads directory followed by True. The command avoids pkg_resources, which is deprecated.

Sources: pyproject.toml:51-57
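
For a more thorough check, the package-data patterns can be evaluated against the installed package directory. This is a sketch under the assumption that `pathlib` globbing approximates how setuptools expands the same patterns; `count_data_files` is a hypothetical helper.

```python
from pathlib import Path

# Glob patterns copied from [tool.setuptools.package-data] in this document.
DATA_PATTERNS = (
    "payloads/*.txt",
    "payloads/**/*.json",
    "web/templates/*.html",
    "web/static/*",
)


def count_data_files(package_dir, patterns=DATA_PATTERNS):
    """Count files matching each pattern, relative to the package directory."""
    root = Path(package_dir)
    return {
        pattern: sum(1 for match in root.glob(pattern) if match.is_file())
        for pattern in patterns
    }
```

With WSHawk installed, `count_data_files(Path(wshawk.__file__).parent)` (after `import wshawk`) reports a count per pattern; a count of zero points to data that was not bundled.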


Code Entity Mapping

graph TB
    subgraph "Build Configuration Files"
        PyprojectToml["pyproject.toml"]
        SetupPy["setup.py"]
        ManifestIn["MANIFEST.in"]
        DockerFile["Dockerfile"]
    end
    
    subgraph "Python Package Metadata"
        ProjectName["project.name<br/>'wshawk'"]
        ProjectVersion["project.version<br/>'3.0.0'"]
        RequiresPython["requires-python<br/>'>=3.8'"]
        Dependencies["dependencies[]<br/>websockets, playwright,<br/>aiohttp, PyYAML, flask"]
    end
    
    subgraph "Entry Points"
        EP1["wshawk → wshawk.__main__:cli"]
        EP2["wshawk-interactive → wshawk.interactive:cli"]
        EP3["wshawk-advanced → wshawk.advanced_cli:cli"]
        EP4["wshawk-defensive → wshawk.defensive_cli:cli"]
    end
    
    subgraph "Package Structure"
        MainPkg["wshawk/"]
        MainInit["wshawk/__init__.py"]
        MainMain["wshawk/__main__.py"]
        ScannerV2["wshawk/scanner_v2.py"]
        PayloadsDir["wshawk/payloads/"]
        WebDir["wshawk/web/"]
    end
    
    subgraph "Build Artifacts"
        WheelArtifact["dist/wshawk-3.0.0-py3-none-any.whl"]
        SdistArtifact["dist/wshawk-3.0.0.tar.gz"]
        DockerImage["wshawk:3.0.0<br/>Docker Image"]
    end
    
    subgraph "Installed Files"
        SitePackages["/usr/local/lib/python3.11/site-packages/wshawk/"]
        BinScripts["/usr/local/bin/wshawk*"]
        PayloadsInstalled["site-packages/wshawk/payloads/"]
        WebInstalled["site-packages/wshawk/web/"]
    end
    
    PyprojectToml --> ProjectName
    PyprojectToml --> ProjectVersion
    PyprojectToml --> RequiresPython
    PyprojectToml --> Dependencies
    PyprojectToml --> EP1
    PyprojectToml --> EP2
    PyprojectToml --> EP3
    PyprojectToml --> EP4
    
    SetupPy --> ProjectName
    SetupPy --> Dependencies
    SetupPy --> EP1
    
    ManifestIn --> PayloadsDir
    ManifestIn --> WebDir
    
    MainPkg --> MainInit
    MainPkg --> MainMain
    MainPkg --> ScannerV2
    MainPkg --> PayloadsDir
    MainPkg --> WebDir
    
    EP1 --> MainMain
    
    ProjectName --> WheelArtifact
    ProjectVersion --> WheelArtifact
    ProjectName --> SdistArtifact
    
    DockerFile --> DockerImage
    SetupPy --> DockerImage
    
    WheelArtifact --> SitePackages
    WheelArtifact --> BinScripts
    SitePackages --> PayloadsInstalled
    SitePackages --> WebInstalled

Sources: pyproject.toml:1-58, setup.py:1-65, MANIFEST.in:1-6, Dockerfile:1-66


Verification Steps

1. Verify Installation

# Check version
wshawk --version

# Verify CLI entry points
which wshawk
which wshawk-interactive
which wshawk-advanced
which wshawk-defensive

All four entry points from pyproject.toml:42-46 should be available.

Sources: pyproject.toml:42-46
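
The same check can be scripted in a portable way with `shutil.which`, which is useful on systems without the `which` command. A minimal sketch, with `missing_commands` as a hypothetical helper name:

```python
import shutil

# Command names taken from the entry points listed in this document.
EXPECTED_COMMANDS = (
    "wshawk",
    "wshawk-interactive",
    "wshawk-advanced",
    "wshawk-defensive",
)


def missing_commands(commands=EXPECTED_COMMANDS, which=shutil.which):
    """Return the commands that cannot be found on PATH."""
    return [name for name in commands if which(name) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All WSHawk entry points are available")
```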

2. Verify Package Data

# Check payloads directory
python -c "from wshawk.payloads import WSPayloads; print(WSPayloads().get_xss_payloads()[:5])"

# Verify web templates exist (prints the path and whether it is a directory)
python -c "import os, wshawk; p = os.path.join(os.path.dirname(wshawk.__file__), 'web', 'templates'); print(p, os.path.isdir(p))"

Sources: pyproject.toml:52-57

3. Verify Dependencies

# List installed dependencies
pip show wshawk

# Verify specific versions from the installed package metadata
python -c "from importlib.metadata import version; print(version('websockets'))"
python -c "from importlib.metadata import version; print(version('playwright'))"

Versions should match the constraints in pyproject.toml:29-35.

Sources: pyproject.toml:29-35

4. Verify Docker Image

# Check image exists
docker images | grep wshawk

# Verify entry point
docker run --rm wshawk:3.0.0 --help

# Verify non-root user (override the wshawk ENTRYPOINT so whoami runs directly)
docker run --rm --entrypoint whoami wshawk:3.0.0  # Should output: wshawk

# Inspect the health check configuration
docker inspect --format='{{.Config.Healthcheck}}' wshawk:3.0.0

Sources: Dockerfile:38-41, Dockerfile:48-49, Dockerfile:52-53
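
The user and health check settings can also be verified programmatically from the JSON that `docker inspect` prints. The sketch below parses that JSON; `image_security_summary` is a hypothetical helper, and its root check is a simplification that does not cover forms such as `0:0`.

```python
import json


def image_security_summary(inspect_output):
    """Extract the runtime user and health check command from `docker inspect` JSON."""
    # `docker inspect` prints a JSON array with one object per inspected image.
    config = json.loads(inspect_output)[0].get("Config", {})
    healthcheck = config.get("Healthcheck") or {}
    user = config.get("User", "")
    return {
        "user": user,
        # An empty User field means the container runs as root by default.
        "runs_as_root": user in ("", "root", "0"),
        "healthcheck": healthcheck.get("Test", []),
    }
```

The input can be obtained with `subprocess.run(["docker", "inspect", "wshawk:3.0.0"], capture_output=True, text=True, check=True).stdout`. For the image described here, the summary would be expected to show the `wshawk` user and a health check that runs `wshawk --help`.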


Troubleshooting

Missing Payload Files

Symptom: FileNotFoundError when scanning

Solution: Ensure MANIFEST.in includes payload directories:

grep -r "payloads" MANIFEST.in

The MANIFEST.in:4 line should read: recursive-include wshawk/payloads *

Sources: MANIFEST.in:4

GCC Build Errors

Symptom: error: command 'gcc' failed during pip install

Solution: Install system compiler:

# Debian/Ubuntu
apt-get install build-essential python3-dev

# Alpine
apk add gcc musl-dev python3-dev

Sources: Dockerfile:10-12

Playwright Browser Missing

Symptom: Error: Browser is not installed when using --playwright

Solution: Install browsers post-installation:

playwright install chromium

This is separate from the pip install process and must be run explicitly.

Sources: Dockerfile:28-31

Docker Build Fails on ARM

Symptom: Build fails on Apple Silicon or ARM servers

Solution: Use buildx for multi-arch builds:

docker buildx create --use
docker buildx build --platform linux/arm64 -t wshawk:3.0.0 .

Sources: Dockerfile:1-66


Build Optimization

Reducing Wheel Size

Exclude test files and documentation from the wheel:

# In setup.py
packages=find_packages(exclude=["tests", "tests.*", "examples", "examples.*", "docs"])

This is already implemented in setup.py:24.

Sources: setup.py:24

Docker Image Size

The multi-stage build already optimizes image size. Additional optimizations:

# Use alpine base (smaller but requires additional build steps)
FROM python:3.11-alpine

# Squash layers (experimental; requires experimental daemon features and the legacy builder)
docker build --squash -t wshawk:3.0.0 .

Sources: Dockerfile:4, Dockerfile:23

Caching Dependencies

Speed up repeated builds by caching pip downloads:

# Local cache directory
pip install --cache-dir=/tmp/pip-cache -e .

# Docker build argument (only takes effect if the Dockerfile declares ARG PIP_CACHE_DIR)
docker build --build-arg PIP_CACHE_DIR=/tmp/pip-cache .

CI/CD Integration

For automated builds in CI/CD pipelines, see CI/CD Integration. The build process integrates with:

  • GitHub Actions: .github/workflows/docker-build.yml
  • GitLab CI: .gitlab-ci.yml
  • Generic CI: python -m build && pip install dist/*.whl

The pyproject.toml:1-3 build system configuration is compatible with PEP 517 build frontends such as pip and build.

Sources: pyproject.toml:1-3