Rate Limiting and Session State

Purpose and Scope

This document describes two critical infrastructure components of the WSHawk scanner: the TokenBucketRateLimiter and the SessionStateMachine. The rate limiter controls request throughput, with optional adaptive adjustment, so that scans neither overload the target server nor trip rate-based detection. The session state machine tracks the WebSocket connection lifecycle and manages authentication sequences loaded from YAML configuration files.

For information about the payload mutation and WAF evasion strategies that work alongside rate limiting, see Payload Mutation and WAF Evasion. For authentication configuration details, see Configuration and Authentication.

TokenBucketRateLimiter

Overview

The TokenBucketRateLimiter implements a token bucket algorithm to control the rate of payload injection during vulnerability scanning. It supports both fixed-rate and adaptive rate limiting modes, automatically adjusting throughput based on server response behavior.

Sources: wshawk/scanner_v2.py:19, wshawk/scanner_v2.py:45-49

Architecture and Design

The rate limiter is initialized as part of the WSHawkV2 scanner with three key parameters:

| Parameter | Description | Default Value |
|-----------|-------------|---------------|
| tokens_per_second | Base request rate (RPS) | Configurable via max_rps parameter |
| bucket_size | Maximum burst capacity | max_rps * 2 |
| enable_adaptive | Enable adaptive rate adjustment | True |
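
To make the role of tokens_per_second and bucket_size concrete, here is a minimal token bucket sketch. It is not the WSHawk implementation: the class name SketchTokenBucket, its attributes, and the refill strategy are assumptions chosen for illustration. Only the acquire() coroutine and the counter names total_requests and total_waits mirror names used on this page.

```python
import asyncio
import time


class SketchTokenBucket:
    """Illustrative token bucket (hypothetical; not the WSHawk class)."""

    def __init__(self, tokens_per_second: float, bucket_size: float):
        self.rate = tokens_per_second
        self.capacity = bucket_size
        self.tokens = bucket_size              # start with a full bucket
        self.last_refill = time.monotonic()
        self.total_requests = 0
        self.total_waits = 0

    def _refill(self) -> None:
        # Credit tokens for the elapsed time, never exceeding the capacity.
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now

    async def acquire(self) -> None:
        self.total_requests += 1
        self._refill()
        if self.tokens < 1:
            self.total_waits += 1
            while self.tokens < 1:
                # Sleep roughly until the missing fraction of a token accrues.
                await asyncio.sleep((1 - self.tokens) / self.rate)
                self._refill()
        self.tokens -= 1


async def demo() -> SketchTokenBucket:
    limiter = SketchTokenBucket(tokens_per_second=20, bucket_size=2)
    for _ in range(3):   # two calls use the initial burst, the third must wait
        await limiter.acquire()
    return limiter


limiter = asyncio.run(demo())
print(limiter.total_requests, limiter.total_waits)
```

With a bucket of two tokens, the first two calls return immediately and the third waits for a refill, which is the burst behavior that bucket_size controls.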

graph TB
    subgraph "Token Bucket Algorithm"
        Bucket["Token Bucket<br/>(bucket_size capacity)"]
        Refill["Token Refill Process<br/>(tokens_per_second rate)"]
        Request["Incoming Request<br/>await acquire()"]
    end
    
    subgraph "Adaptive Control System"
        Monitor["Response Monitor<br/>(timeouts, errors)"]
        Adjuster["Rate Adjuster<br/>(increase/decrease RPS)"]
        Stats["Statistics Tracker<br/>(total_requests, total_waits)"]
    end
    
    Refill -->|"Adds tokens periodically"| Bucket
    Request -->|"Consumes 1 token"| Bucket
    Bucket -->|"Token available"| Allow["Request Allowed"]
    Bucket -->|"No tokens"| Wait["Request Waits"]
    
    Allow --> Monitor
    Wait --> Monitor
    Monitor -->|"High error rate"| Adjuster
    Adjuster -->|"Adjusts rate"| Refill
    
    Request --> Stats
    Wait --> Stats
    
    style Bucket fill:#f9f9f9
    style Adjuster fill:#e8e8e8

Sources: wshawk/scanner_v2.py:45-49

Adaptive Rate Control

The adaptive feature monitors server responses and automatically reduces request rate when detecting signs of server stress:

Timeout Detection: Frequent asyncio.TimeoutError exceptions trigger rate reduction
Error Patterns: HTTP 429 (Too Many Requests) or connection failures
Rate Adjustments: Tracked in statistics as adaptive_adjustments

When enabled, adaptive mode aims to keep throughput as high as the target tolerates without overwhelming the server or triggering rate-based detection mechanisms.
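
One way such a feedback loop could work is sketched below. The policy shown (halve the rate when at least 30% of a ten-request window fails, recover by 10% after a clean window) is an assumption made for illustration, as are the class name AdaptiveRateSketch and every parameter. The source does not state the actual thresholds or adjustment factors.

```python
from collections import deque


class AdaptiveRateSketch:
    """Hypothetical controller: backs off on errors, recovers on sustained success."""

    def __init__(self, base_rate: float, min_rate: float = 1.0,
                 window: int = 10, error_threshold: float = 0.3):
        self.base_rate = base_rate
        self.current_rate = base_rate
        self.min_rate = min_rate
        self.error_threshold = error_threshold
        self.results = deque(maxlen=window)    # True = success, False = timeout/error
        self.adaptive_adjustments = 0

    def record(self, success: bool) -> None:
        self.results.append(success)
        if len(self.results) < self.results.maxlen:
            return                              # wait for a full window before deciding
        error_ratio = self.results.count(False) / len(self.results)
        if error_ratio >= self.error_threshold:
            new_rate = max(self.min_rate, self.current_rate / 2)     # back off sharply
        elif error_ratio == 0:
            new_rate = min(self.base_rate, self.current_rate * 1.1)  # recover slowly
        else:
            return                              # mixed results: keep observing
        if new_rate != self.current_rate:
            self.current_rate = new_rate
            self.adaptive_adjustments += 1
        self.results.clear()                    # start a fresh observation window


controller = AdaptiveRateSketch(base_rate=10.0)
for outcome in [True] * 6 + [False] * 4:       # 40% of the window failed
    controller.record(outcome)
print(controller.current_rate, controller.adaptive_adjustments)
```

Backing off quickly and recovering slowly is a common choice for this kind of control loop because it favors target stability over raw throughput.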

Sources: wshawk/scanner_v2.py:48

Integration with Scanner

The rate limiter is integrated at the scanner initialization level and used throughout vulnerability testing modules:

flowchart LR
    subgraph "WSHawkV2 Initialization"
        Init["__init__()<br/>scanner_v2.py:33-49"]
        RLCreate["TokenBucketRateLimiter<br/>tokens_per_second=max_rps<br/>enable_adaptive=True"]
    end
    
    subgraph "Vulnerability Testing"
        SQL["test_sql_injection_v2()"]
        XSS["test_xss_v2()"]
        CMD["test_command_injection_v2()"]
        SSRF["test_ssrf_v2()"]
        Other["Other test methods"]
    end
    
    subgraph "Rate Limiting Mechanism"
        Acquire["await rate_limiter.acquire()"]
        Send["await ws.send(payload)"]
        Sleep["await asyncio.sleep(0.05)"]
    end
    
    Init --> RLCreate
    RLCreate --> SQL
    RLCreate --> XSS
    RLCreate --> CMD
    RLCreate --> SSRF
    RLCreate --> Other
    
    SQL --> Acquire
    SSRF --> Acquire
    Acquire --> Send
    Send --> Sleep
    
    style RLCreate fill:#f0f0f0
    style Acquire fill:#e0e0e0

Example Usage in SSRF Testing:

The SSRF test explicitly calls acquire() before each request to enforce rate limiting:

wshawk/scanner_v2.py:509

await self.rate_limiter.acquire()
msg = json.dumps({"action": "fetch_url", "url": target})
await ws.send(msg)

Implicit Rate Limiting:

Most test methods use implicit rate limiting through asyncio.sleep() calls after each request:

wshawk/scanner_v2.py:204, wshawk/scanner_v2.py:285, wshawk/scanner_v2.py:351
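
The explicit pattern can be sketched as a small send loop that gates every request on the limiter. This is an illustration under stated assumptions: FakeWebSocket, NoopLimiter, and send_payloads are hypothetical stand-ins so the example runs offline, and the timeout handling is a guess at how a caller might treat a silent server. By contrast, a fixed asyncio.sleep(0.05) after each request bounds a sequential loop at roughly 20 requests per second regardless of the configured rate.

```python
import asyncio
import json


class FakeWebSocket:
    """Stand-in for a real connection so the sketch runs offline."""

    def __init__(self):
        self.sent = []

    async def send(self, message: str) -> None:
        self.sent.append(message)

    async def recv(self) -> str:
        return json.dumps({"status": "ok"})


class NoopLimiter:
    """Placeholder exposing the same acquire() coroutine as the rate limiter."""

    def __init__(self):
        self.acquired = 0

    async def acquire(self) -> None:
        self.acquired += 1


async def send_payloads(ws, limiter, targets, timeout: float = 2.0):
    responses = []
    for target in targets:
        await limiter.acquire()                    # explicit gate before each send
        await ws.send(json.dumps({"action": "fetch_url", "url": target}))
        try:
            responses.append(await asyncio.wait_for(ws.recv(), timeout=timeout))
        except asyncio.TimeoutError:
            responses.append(None)                 # record a timeout as "no response"
    return responses


ws, limiter = FakeWebSocket(), NoopLimiter()
results = asyncio.run(
    send_payloads(ws, limiter, ["http://127.0.0.1/", "http://localhost/"])
)
print(limiter.acquired, len(ws.sent))
```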

Sources: wshawk/scanner_v2.py:45-49, wshawk/scanner_v2.py:509, wshawk/scanner_v2.py:204, wshawk/scanner_v2.py:285

Statistics and Monitoring

After scan completion, the scanner retrieves and displays rate limiter statistics:

graph LR
    Scanner["WSHawkV2.run_heuristic_scan()"]
    GetStats["rate_limiter.get_stats()"]
    StatsDict["Statistics Dictionary"]
    Display["Logger.info() Output"]
    
    Scanner --> GetStats
    GetStats --> StatsDict
    StatsDict --> Display
    
    subgraph "Statistics Keys"
        TR["total_requests<br/>(int)"]
        TW["total_waits<br/>(int)"]
        CR["current_rate<br/>(float)"]
        AA["adaptive_adjustments<br/>(int)"]
    end
    
    StatsDict --> TR
    StatsDict --> TW
    StatsDict --> CR
    StatsDict --> AA
    
    style StatsDict fill:#f5f5f5

Statistics Output:

wshawk/scanner_v2.py:672-675

rate_stats = self.rate_limiter.get_stats()
Logger.info(f"Rate limiter: {rate_stats['total_requests']} requests, {rate_stats['total_waits']} waits")
Logger.info(f"  Current rate: {rate_stats['current_rate']}, Adaptive adjustments: {rate_stats['adaptive_adjustments']}")

The statistics provide visibility into:

Total Requests: Number of payloads the scanner attempted to send
Total Waits: Number of times the limiter made a request wait for a token
Current Rate: Current request rate in requests per second
Adaptive Adjustments: Number of automatic rate changes
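
As a small illustration of how these four keys might be interpreted together, the helper below derives the share of throttled requests. The function summarize_rate_stats and the numbers in example_stats are made up for the example; they are not output from a real scan.

```python
def summarize_rate_stats(stats: dict) -> str:
    """Condense a get_stats()-style dictionary into a single line."""
    requests = stats["total_requests"]
    # Fraction of requests that had to wait for a token (guard against zero).
    wait_ratio = stats["total_waits"] / requests if requests else 0.0
    return (f"{requests} requests, {wait_ratio:.0%} throttled, "
            f"{stats['current_rate']:.1f} req/s, "
            f"{stats['adaptive_adjustments']} adjustments")


# Illustrative values only.
example_stats = {
    "total_requests": 200,
    "total_waits": 50,
    "current_rate": 7.5,
    "adaptive_adjustments": 2,
}
summary = summarize_rate_stats(example_stats)
print(summary)  # 200 requests, 25% throttled, 7.5 req/s, 2 adjustments
```

A high throttled share with no adaptive adjustments suggests the configured rate, rather than server stress, is the limiting factor.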

Sources: wshawk/scanner_v2.py:672-675

SessionStateMachine

Overview

The SessionStateMachine tracks WebSocket connection state transitions and manages authentication sequences. It provides a formal state model for connection lifecycle management and supports loading multi-step authentication flows from YAML configuration files.

Sources: wshawk/scanner_v2.py:18, wshawk/scanner_v2.py:44

State Transitions

The state machine tracks the following connection states:

stateDiagram-v2
    [*] --> Initialized: WSHawkV2.__init__()
    Initialized --> Connected: connect() success
    Connected --> Learning: learning_phase() start
    Learning --> Testing: learning complete
    Testing --> Testing: vulnerability tests loop
    Testing --> Closed: ws.close()
    Closed --> [*]
    
    Initialized --> Failed: connect() error
    Failed --> [*]
    
    note right of Connected
        State updated via:
        state_machine._update_state('connected')
    end note
    
    note right of Testing
        All test methods execute
        while in this state
    end note

State Update Example:

The scanner updates the state machine when establishing a connection:

wshawk/scanner_v2.py:77-85

async def connect(self):
    """Establish WebSocket connection"""
    try:
        ws = await websockets.connect(self.url, additional_headers=self.headers)
        self.state_machine._update_state('connected')
        return ws
    except Exception as e:
        Logger.error(f"Connection failed: {e}")
        return None
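
The lifecycle in the state diagram can be modeled as a transition table. The sketch below is an assumption about how such tracking could be structured, not the SessionStateMachine source: the class StateSketch, the ALLOWED table, and the validation step are hypothetical, and only the state names 'connected' and 'learning' appear verbatim in the scanner calls shown on this page.

```python
# Allowed transitions, taken from the state diagram (lowercase names are assumed).
ALLOWED = {
    "initialized": {"connected", "failed"},
    "connected": {"learning"},
    "learning": {"testing"},
    "testing": {"testing", "closed"},
    "closed": set(),
    "failed": set(),
}


class StateSketch:
    """Hypothetical tracker that rejects transitions the diagram does not allow."""

    def __init__(self):
        self.state = "initialized"
        self.history = ["initialized"]

    def update(self, new_state: str) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        self.history.append(new_state)


machine = StateSketch()
for step in ["connected", "learning", "testing", "testing", "closed"]:
    machine.update(step)
print(machine.history)
```

Whether the real _update_state() validates transitions or only records them is not stated in the source; validation is shown here because it makes out-of-order calls easy to detect.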

Sources: wshawk/scanner_v2.py:44, wshawk/scanner_v2.py:77-85

YAML-Based Authentication Sequences

The session state machine supports loading complex authentication sequences from YAML files. This enables testing WebSocket endpoints that require multi-step authentication flows before vulnerability testing can begin.

graph TB
    subgraph "Authentication Configuration"
        YAMLFile["auth_sequence.yaml<br/>(User-provided)"]
        Parser["PyYAML Parser"]
        Sequence["Authentication Steps<br/>(ordered list)"]
    end
    
    subgraph "SessionStateMachine"
        Loader["load_sequence_from_yaml()<br/>scanner_v2.py:62"]
        StateTracker["Internal State Tracking"]
        Executor["Execute auth steps<br/>before testing"]
    end
    
    subgraph "WSHawkV2 Scanner"
        Init["__init__() with<br/>auth_sequence parameter"]
        Connect["connect()"]
        Learning["learning_phase()"]
        Testing["test_*_v2() methods"]
    end
    
    YAMLFile --> Parser
    Parser --> Sequence
    Sequence --> Loader
    
    Init -->|"if auth_sequence"| Loader
    Loader --> StateTracker
    StateTracker --> Executor
    
    Executor --> Connect
    Connect --> Learning
    Learning --> Testing
    
    style Loader fill:#e8e8e8
    style YAMLFile fill:#f0f0f0

Code Integration:

The authentication sequence is loaded during scanner initialization if provided:

wshawk/scanner_v2.py:60-62

# Load auth sequence if provided
if auth_sequence:
    self.state_machine.load_sequence_from_yaml(auth_sequence)

This allows the scanner to be instantiated with custom authentication:

scanner = WSHawkV2(
    url="wss://example.com/secure-ws",
    auth_sequence="auth_config.yaml"
)

Sources: wshawk/scanner_v2.py:34, wshawk/scanner_v2.py:60-62

Authentication Flow Example

A typical authentication sequence loaded from YAML might include:

| Step | Action | Purpose |
|------|--------|---------|
| 1 | Send {"action": "login", "user": "...", "pass": "..."} | Initial authentication |
| 2 | Await {"status": "authenticated", "token": "..."} | Token receipt |
| 3 | Send {"action": "subscribe", "channel": "..."} | Channel subscription |
| 4 | Await {"status": "subscribed"} | Confirmation |

The state machine ensures these steps complete successfully before vulnerability testing begins, maintaining the authenticated session throughout the scan.
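
The four steps above can be sketched as a send/expect loop. Everything in this example is an assumption: the YAML layout in the comment is hypothetical (the real WSHawk schema may differ), and ScriptedSocket and run_auth_sequence are illustrative names. To stay free of third-party dependencies, the sketch starts from the structure that yaml.safe_load() would produce for that layout rather than parsing a file.

```python
import asyncio
import json

# Hypothetical YAML layout (the real WSHawk schema may differ):
#   steps:
#     - send: {action: login, user: alice, pass: secret}
#     - expect: {status: authenticated}
#     - send: {action: subscribe, channel: updates}
#     - expect: {status: subscribed}
# Parsed with yaml.safe_load(), the "steps" list would look like this:
AUTH_STEPS = [
    {"send": {"action": "login", "user": "alice", "pass": "secret"}},
    {"expect": {"status": "authenticated"}},
    {"send": {"action": "subscribe", "channel": "updates"}},
    {"expect": {"status": "subscribed"}},
]


class ScriptedSocket:
    """Offline stand-in that answers each send with a canned reply."""

    REPLIES = {
        "login": {"status": "authenticated", "token": "t-123"},
        "subscribe": {"status": "subscribed"},
    }

    def __init__(self):
        self.pending = []

    async def send(self, message: str) -> None:
        action = json.loads(message)["action"]
        self.pending.append(json.dumps(self.REPLIES[action]))

    async def recv(self) -> str:
        return self.pending.pop(0)


async def run_auth_sequence(ws, steps) -> bool:
    for step in steps:
        if "send" in step:
            await ws.send(json.dumps(step["send"]))
        elif "expect" in step:
            reply = json.loads(await ws.recv())
            # Every expected key must match; extra keys such as a token are ignored.
            if any(reply.get(k) != v for k, v in step["expect"].items()):
                return False
    return True


authenticated = asyncio.run(run_auth_sequence(ScriptedSocket(), AUTH_STEPS))
print(authenticated)
```

Returning False on the first mismatch lets a caller stop before vulnerability testing when the session never reached the authenticated state.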

Sources: wshawk/scanner_v2.py:60-62

Interaction Between Systems

Coordinated Operation

The rate limiter and session state machine operate independently but coordinate through the scanner to provide robust, controlled vulnerability testing:

sequenceDiagram
    participant CLI as "CLI/API Caller"
    participant Scanner as "WSHawkV2"
    participant StateMachine as "SessionStateMachine"
    participant RateLimiter as "TokenBucketRateLimiter"
    participant WS as "WebSocket"
    
    CLI->>Scanner: "__init__(url, auth_sequence, max_rps)"
    Scanner->>StateMachine: "create instance"
    Scanner->>RateLimiter: "create instance(max_rps)"
    
    opt Authentication Required
        Scanner->>StateMachine: "load_sequence_from_yaml()"
        StateMachine-->>Scanner: "auth steps loaded"
    end
    
    Scanner->>WS: "connect()"
    WS-->>Scanner: "connection established"
    Scanner->>StateMachine: "_update_state('connected')"
    
    Scanner->>Scanner: "learning_phase()"
    Scanner->>StateMachine: "_update_state('learning')"
    
    loop For Each Payload
        Scanner->>RateLimiter: "acquire()"
        RateLimiter-->>Scanner: "token granted"
        Scanner->>WS: "send(payload)"
        WS-->>Scanner: "response"
        
        opt Adaptive Mode Enabled
            RateLimiter->>RateLimiter: "adjust rate if needed"
        end
    end
    
    Scanner->>Scanner: "get_stats()"
    Scanner->>CLI: "return results + statistics"

Key Coordination Points:

Initialization: Both systems created with scanner-specific parameters
Authentication: State machine loads sequences; rate limiter inactive during auth
Testing Phase: Rate limiter enforces throughput while state machine tracks connection state
Adaptive Adjustments: Rate limiter responds to server behavior independently of state machine
Statistics: Both systems provide metrics for scan summary

Sources: wshawk/scanner_v2.py:33-49, wshawk/scanner_v2.py:77-85, wshawk/scanner_v2.py:542-677

Initialization in Scanner Context

The complete initialization sequence showing both systems:

graph TB
    subgraph "WSHawkV2.__init__()"
        Params["Parameters:<br/>url, headers, auth_sequence, max_rps"]
        
        subgraph "Core State Management"
            SM["SessionStateMachine()"]
            RL["TokenBucketRateLimiter(<br/>tokens_per_second=max_rps,<br/>bucket_size=max_rps*2,<br/>enable_adaptive=True)"]
        end
        
        subgraph "Analysis Modules"
            MA["MessageAnalyzer()"]
            VV["VulnerabilityVerifier()"]
            FP["ServerFingerprinter()"]
        end
        
        subgraph "Verification Modules"
            HV["HeadlessBrowserXSSVerifier<br/>(conditional)"]
            OAST["OASTProvider<br/>(conditional)"]
        end
        
        subgraph "Reporting"
            Reporter["EnhancedHTMLReporter()"]
        end
        
        AuthCheck{"auth_sequence<br/>provided?"}
        LoadAuth["state_machine.load_sequence_from_yaml()"]
    end
    
    Params --> SM
    Params --> RL
    Params --> MA
    Params --> VV
    Params --> FP
    Params --> Reporter
    
    Params --> AuthCheck
    AuthCheck -->|Yes| LoadAuth
    LoadAuth --> SM
    
    style SM fill:#e8e8e8
    style RL fill:#e8e8e8
    style Params fill:#f5f5f5

Sources: wshawk/scanner_v2.py:33-75

Statistics Collection

Both systems contribute to the final scan report:

| System | Statistics Provided |
|--------|---------------------|
| TokenBucketRateLimiter | total_requests, total_waits, current_rate, adaptive_adjustments |
| SessionStateMachine | Current state, authentication status (implicit in scan success) |
| WSHawkV2 Scanner | messages_sent, messages_received, scan duration, vulnerabilities found |

The combined statistics provide a complete picture of scan behavior and performance:

wshawk/scanner_v2.py:631-675

Sources: wshawk/scanner_v2.py:631-675, wshawk/scanner_v2.py:672-675

Summary

The TokenBucketRateLimiter and SessionStateMachine provide critical infrastructure for controlled, stateful vulnerability scanning:

Rate Limiting: Prevents server overload with adaptive throughput control
Session Management: Tracks connection lifecycle and authentication state
Coordinated Operation: Independent but complementary systems working through the scanner
Observability: Comprehensive statistics for scan behavior analysis

These systems enable WSHawk to test aggressively while keeping the target server stable and reducing the chance of detection based on request-rate anomalies.

Sources: wshawk/scanner_v2.py:1-678, CHANGELOG.md:87