Report Formats and Export

Name: WSHawk
Author: Regaan

This document describes WSHawk's reporting and export capabilities, including the structure and generation of HTML, JSON, CSV, and SARIF formats. For configuration of external integrations that consume these reports (Jira, DefectDojo, webhooks), see Jira Integration, DefectDojo Integration, and Webhook Notifications. For the internal database storage of scan data, see Infrastructure Persistence Layer. For advanced report customization and CVSS scoring details, see Report Format and Output.

Report Generation Overview

WSHawk generates security assessment reports in multiple formats to serve different audiences and integration requirements. All reports contain the same core vulnerability data but are structured differently for their intended use case.

Report Generation Flow

graph TB
    Scanner["WSHawkV2.run_heuristic_scan()"]
    Vulns[("vulnerabilities[]<br/>List of findings")]
    
    Scanner --> Vulns
    
    Vulns --> GenReport["generate_report()"]
    GenReport --> JSONStructure["JSON Report Structure<br/>{scan_info, vulnerabilities, summary}"]
    
    JSONStructure --> HTMLGen["generate_html_report()"]
    JSONStructure --> CSVGen["generate_csv_report()"]
    JSONStructure --> SARIFGen["generate_sarif_report()"]
    JSONStructure --> RawJSON["save_json_report()"]
    
    HTMLGen --> HTMLFile["wshawk_report_YYYYMMDD_HHMMSS.html"]
    CSVGen --> CSVFile["wshawk_report_YYYYMMDD_HHMMSS.csv"]
    SARIFGen --> SARIFFile["wshawk_report_YYYYMMDD_HHMMSS.sarif"]
    RawJSON --> JSONFile["wshawk_report_YYYYMMDD_HHMMSS.json"]
    
    HTMLFile --> Human["Human Analysts"]
    JSONFile --> SIEM["SIEM/SOC Platforms"]
    CSVFile --> Spreadsheet["Excel/Data Analysis"]
    SARIFFile --> GitHub["GitHub Security Tab<br/>GitLab Security Dashboard"]

Sources: wshawk/main.py:903-1000, README.md:176-186, RELEASE_SUMMARY.md:52-56

Core Report Data Structure

All report formats derive from a common JSON data structure generated by the generate_report() method.

Report Schema

| Field | Type | Description |
|-------|------|-------------|
| scan_info | Object | Metadata about the scan execution |
| scan_info.target | String | WebSocket URL that was scanned |
| scan_info.start_time | ISO8601 | Scan initiation timestamp |
| scan_info.end_time | ISO8601 | Scan completion timestamp |
| scan_info.duration | Float | Total scan time in seconds |
| scan_info.scanner | String | Tool identifier ("WSHawk by Regaan") |
| scan_info.messages_sent | Integer | Total WebSocket messages transmitted |
| scan_info.messages_received | Integer | Total responses received |
| vulnerabilities | Array | List of vulnerability findings |
| summary | Object | Aggregated statistics |
| summary.total | Integer | Total number of findings |
| summary.critical | Integer | Count of CRITICAL severity findings |
| summary.high | Integer | Count of HIGH severity findings |
| summary.medium | Integer | Count of MEDIUM severity findings |

Vulnerability Object Structure

Each vulnerability in the vulnerabilities array contains:

| Field | Type | Description |
|-------|------|-------------|
| type | String | Vulnerability classification (e.g., "SQL Injection") |
| severity | Enum | CRITICAL, HIGH, MEDIUM, or LOW |
| description | String | Human-readable explanation of the finding |
| payload | String | Attack vector that triggered the vulnerability |
| response | String (Optional) | Server response snippet (truncated to 200 chars) |
| recommendation | String | Remediation guidance |
| cvss_score | Float (Optional) | CVSS v3.1 score (0.0-10.0) |
| cvss_vector | String (Optional) | CVSS v3.1 vector string |
| confidence | String (Optional) | Detection confidence: LOW, MEDIUM, HIGH |
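
The two tables above can be read as a recipe for assembling the report dictionary. The following is a minimal sketch of that assembly, not the actual generate_report() implementation: the function name build_report and its parameters are hypothetical, and the summary deliberately mirrors the schema table, which lists no count for LOW findings.

```python
from collections import Counter
from datetime import datetime, timedelta


def build_report(target, vulnerabilities, start, end, sent=0, received=0):
    """Assemble a dict shaped like the report schema (hypothetical helper)."""
    # Count findings per severity, tolerating lower-case input.
    counts = Counter(v["severity"].upper() for v in vulnerabilities)
    return {
        "scan_info": {
            "target": target,
            "start_time": start.isoformat(),
            "end_time": end.isoformat(),
            "duration": (end - start).total_seconds(),
            "scanner": "WSHawk by Regaan",
            "messages_sent": sent,
            "messages_received": received,
        },
        "vulnerabilities": vulnerabilities,
        "summary": {
            # LOW findings count toward the total; the schema table
            # defines no separate summary field for them.
            "total": len(vulnerabilities),
            "critical": counts["CRITICAL"],
            "high": counts["HIGH"],
            "medium": counts["MEDIUM"],
        },
    }
```

Because every other format derives from this structure, a consumer that validates the dictionary once can reuse the same checks for HTML, CSV, and SARIF output.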

Sources: wshawk/main.py:903-927

HTML Report Format

HTML reports are the primary output for human analysts, featuring styled presentation with embedded screenshots and interactive elements.

HTML Report Features

graph LR
    HTMLReport["wshawk_report_*.html"]
    
    HTMLReport --> Header["Header Section<br/>Scanner branding<br/>Scan metadata"]
    HTMLReport --> Executive["Executive Summary<br/>Vulnerability counts<br/>Severity distribution"]
    HTMLReport --> Findings["Detailed Findings<br/>Per-vulnerability cards"]
    HTMLReport --> Traffic["Traffic Logs<br/>Request/Response pairs"]
    HTMLReport --> Fingerprint["Server Fingerprint<br/>Technology stack"]
    HTMLReport --> Footer["Footer<br/>Timestamp<br/>Legal disclaimer"]
    
    Findings --> VulnCard["Vulnerability Card"]
    VulnCard --> CVSSBadge["CVSS Score Badge"]
    VulnCard --> Payload["Payload Code Block"]
    VulnCard --> Evidence["Evidence Section"]
    VulnCard --> Remediation["Remediation Steps"]
    
    Evidence --> Screenshot["Screenshot (if XSS)"]
    Evidence --> ResponseSnippet["Response Snippet"]

HTML Generation Implementation

The generate_html_report() method constructs a self-contained HTML document with inline CSS.

Key Components:

1. CSS Styling (wshawk/main.py:938-970)
   - Responsive design with flexbox layout
   - Color-coded severity badges (CRITICAL=red, HIGH=orange, MEDIUM=yellow)
   - Syntax-highlighted code blocks for payloads
   - Print-friendly media queries
2. Dynamic Content Rendering (wshawk/main.py:971-1000)
   - Iterates over self.vulnerabilities to generate finding cards
   - Truncates long responses to prevent DOM bloat
   - HTML-escapes user-controlled data to prevent self-XSS
   - Embeds screenshots as base64 data URIs (when available)
3. File Naming Convention
   - Pattern: wshawk_report_YYYYMMDD_HHMMSS.html
   - Example: wshawk_report_20250115_143022.html
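
The rendering rules listed above (escape user-controlled data, truncate long responses, color-code the severity badge) can be sketched as follows. This is an illustrative sketch rather than the code in wshawk/main.py: render_card, the CSS class names, and the gray fallback color are assumptions; the 200-character limit is taken from the vulnerability schema.

```python
import html

# Badge colors from the list above; "gray" for other severities is an assumption.
SEVERITY_COLORS = {"CRITICAL": "red", "HIGH": "orange", "MEDIUM": "yellow"}
MAX_RESPONSE_CHARS = 200


def render_card(vuln):
    """Render one finding as an HTML fragment (hypothetical helper)."""
    severity = vuln.get("severity", "").upper()
    color = SEVERITY_COLORS.get(severity, "gray")

    # Truncate before escaping so the limit applies to the raw response.
    response = vuln.get("response") or ""
    if len(response) > MAX_RESPONSE_CHARS:
        response = response[:MAX_RESPONSE_CHARS] + "..."

    # Everything that originated from the target or the payload list is escaped,
    # otherwise an XSS payload would execute inside the report itself.
    title = html.escape(vuln.get("type", ""))
    payload = html.escape(vuln.get("payload", ""))
    snippet = html.escape(response)

    return (
        '<div class="vuln-card">'
        f'<span class="badge" style="background:{color}">{html.escape(severity)}</span>'
        f"<h3>{title}</h3>"
        f"<pre><code>{payload}</code></pre>"
        f"<p>{snippet}</p>"
        "</div>"
    )
```

Escaping matters more here than in most reports, since the payloads being displayed are attack strings by design.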

Sources: wshawk/main.py:928-1000, README.md:176-186

JSON Report Format

JSON reports provide machine-readable structured data for programmatic consumption.

JSON Export Example

{
  "scan_info": {
    "target": "ws://example.com/chat",
    "start_time": "2025-01-15T14:30:22.123456",
    "end_time": "2025-01-15T14:45:18.987654",
    "duration": 896.864,
    "scanner": "WSHawk by Regaan",
    "messages_sent": 15432,
    "messages_received": 14201
  },
  "vulnerabilities": [
    {
      "type": "SQL Injection",
      "severity": "CRITICAL",
      "description": "SQL injection vulnerability in WebSocket message",
      "payload": "' OR '1'='1",
      "response": "mysql_error: You have an error in your SQL syntax...",
      "recommendation": "Use parameterized queries and input validation",
      "cvss_score": 9.8,
      "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"
    }
  ],
  "summary": {
    "total": 12,
    "critical": 3,
    "high": 5,
    "medium": 4
  }
}

JSON Use Cases

- SIEM Ingestion: Parse findings into security information and event management platforms
- API Integration: Consumed by the Web Dashboard REST API (docs/V3_COMPLETE_GUIDE.md:318-330)
- Automated Triage: Programmatic analysis of vulnerability types and severity
- Historical Analysis: Time-series vulnerability trend tracking
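
As an illustration of the automated triage use case, the sketch below groups finding types by severity from a JSON report string. The function name triage is hypothetical; only the field names come from the schema described earlier.

```python
import json
from collections import defaultdict

SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]


def triage(report_json):
    """Return {severity: sorted finding types}, most severe first."""
    report = json.loads(report_json)
    buckets = defaultdict(list)
    for finding in report.get("vulnerabilities", []):
        buckets[finding["severity"].upper()].append(finding["type"])
    # Dicts preserve insertion order, so iterating SEVERITY_ORDER
    # yields the most severe bucket first; empty buckets are skipped.
    return {sev: sorted(buckets[sev]) for sev in SEVERITY_ORDER if buckets[sev]}
```

A script like this could feed a ticket queue or a chat notification without touching the HTML report.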

Sources: wshawk/main.py:903-927, RELEASE_SUMMARY.md:53

CSV Report Format

CSV reports enable spreadsheet analysis and bulk data processing.

CSV Schema

| Column | Description |
|--------|-------------|
| Target | WebSocket URL |
| Vulnerability Type | Classification (e.g., "XSS") |
| Severity | CRITICAL/HIGH/MEDIUM/LOW |
| Payload | Attack vector |
| Description | Finding explanation |
| Recommendation | Remediation guidance |
| CVSS Score | Numeric score (if available) |
| Timestamp | ISO8601 detection time |

CSV Generation Flow

graph LR
    JSONReport["JSON Report Structure"]
    CSVWriter["csv.DictWriter"]
    
    JSONReport --> Iterator["Iterate vulnerabilities[]"]
    Iterator --> Flatten["Flatten nested objects"]
    Flatten --> CSVWriter
    CSVWriter --> WriteHeader["Write header row"]
    WriteHeader --> WriteRows["Write finding rows"]
    WriteRows --> CSVFile["wshawk_report_*.csv"]
    
    CSVFile --> Excel["Microsoft Excel"]
    CSVFile --> Pandas["Python pandas"]
    CSVFile --> SQL["SQL COPY/IMPORT"]
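
The flow above can be sketched with the standard library's csv.DictWriter. This is a hedged illustration, not the generate_csv_report() source: report_to_csv is a hypothetical name, and because the JSON schema documents no per-finding timestamp, the sketch assumes the scan's end_time is used for the Timestamp column.

```python
import csv
import io

CSV_COLUMNS = [
    "Target", "Vulnerability Type", "Severity", "Payload",
    "Description", "Recommendation", "CVSS Score", "Timestamp",
]


def report_to_csv(report):
    """Flatten the report dict into CSV text, one row per finding."""
    scan_info = report["scan_info"]
    buf = io.StringIO(newline="")  # let the csv module control line endings
    writer = csv.DictWriter(buf, fieldnames=CSV_COLUMNS)
    writer.writeheader()
    for finding in report["vulnerabilities"]:
        writer.writerow({
            "Target": scan_info["target"],
            "Vulnerability Type": finding.get("type", ""),
            "Severity": finding.get("severity", ""),
            "Payload": finding.get("payload", ""),
            "Description": finding.get("description", ""),
            "Recommendation": finding.get("recommendation", ""),
            "CVSS Score": finding.get("cvss_score", ""),
            # Assumption: fall back to the scan end time.
            "Timestamp": scan_info.get("end_time", ""),
        })
    return buf.getvalue()
```

DictWriter quotes fields that contain commas, quotes, or newlines, which matters because payloads routinely contain all three.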

CSV Export Advantages

- Data Pivoting: Excel pivot tables for vulnerability distribution analysis
- Bulk Operations: Mass update of remediation status
- Database Import: Direct loading into PostgreSQL/MySQL with COPY FROM
- Compliance Reporting: Merge with asset inventory for audit reports

Sources: RELEASE_SUMMARY.md:53

SARIF Report Format

SARIF (Static Analysis Results Interchange Format) is an OASIS standard for exchanging the output of analysis tools. WSHawk emits SARIF 2.1.0 so that findings can be consumed by CI/CD pipelines and code-scanning dashboards.

SARIF Structure for WSHawk

graph TB
    SARIFRoot["$schema: sarif-schema-2.1.0.json"]
    
    SARIFRoot --> Version["version: 2.1.0"]
    SARIFRoot --> Runs["runs[]"]
    
    Runs --> Tool["tool:<br/>WSHawk driver"]
    Runs --> Results["results[]"]
    
    Tool --> ToolName["name: WSHawk"]
    Tool --> ToolVersion["version: 3.0.0"]
    Tool --> InfoURI["informationUri:<br/>https://wshawk.rothackers.com"]
    
    Results --> Result["result object"]
    Result --> RuleID["ruleId:<br/>WS-SQL-001"]
    Result --> Level["level:<br/>error/warning/note"]
    Result --> Message["message.text:<br/>SQL injection detected"]
    Result --> Locations["locations[]"]
    
    Locations --> PhysicalLoc["physicalLocation"]
    PhysicalLoc --> ArtifactLoc["artifactLocation.uri:<br/>ws://target.com"]
    PhysicalLoc --> Region["region.snippet.text:<br/>Payload"]

SARIF Integration Points

| Platform | Integration Method | Documentation |
|----------|-------------------|---------------|
| GitHub | Code Scanning API | Upload via github/codeql-action/upload-sarif@v3 |
| GitLab | Security Dashboard | Artifact path in .gitlab-ci.yml |
| Azure DevOps | SARIF SAST Scans Tab | Publish via SARIF extension |
| SonarQube | External Issues | Generic Issue Import format |

SARIF Generation Example

{
  "$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
  "version": "2.1.0",
  "runs": [
    {
      "tool": {
        "driver": {
          "name": "WSHawk",
          "version": "3.0.0",
          "informationUri": "https://github.com/regaan/wshawk"
        }
      },
      "results": [
        {
          "ruleId": "WS-XSS-001",
          "level": "error",
          "message": {
            "text": "Cross-Site Scripting vulnerability detected in WebSocket message"
          },
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "ws://example.com/chat"
                },
                "region": {
                  "snippet": {
                    "text": "<script>alert(1)</script>"
                  }
                }
              }
            }
          ]
        }
      ]
    }
  ]
}
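
A conversion from the common report structure to the SARIF shape shown above might look like the sketch below. It is illustrative only: to_sarif, the severity-to-level mapping, and the fallback rule ID WS-GENERIC-001 are assumptions, since the source does not document how WSHawk assigns levels or rule IDs.

```python
SARIF_SCHEMA = (
    "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/"
    "master/Schemata/sarif-schema-2.1.0.json"
)

# Assumed mapping from WSHawk severities to SARIF result levels.
SEVERITY_TO_LEVEL = {
    "CRITICAL": "error",
    "HIGH": "error",
    "MEDIUM": "warning",
    "LOW": "note",
}


def to_sarif(report, rule_ids):
    """Map findings to SARIF results; rule_ids maps finding type to ruleId."""
    target = report["scan_info"]["target"]
    results = []
    for finding in report["vulnerabilities"]:
        results.append({
            # Hypothetical fallback ID for types without a known rule.
            "ruleId": rule_ids.get(finding["type"], "WS-GENERIC-001"),
            "level": SEVERITY_TO_LEVEL.get(finding["severity"].upper(), "warning"),
            "message": {"text": finding["description"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": target},
                    "region": {"snippet": {"text": finding["payload"]}},
                }
            }],
        })
    return {
        "$schema": SARIF_SCHEMA,
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {
                "name": "WSHawk",
                "version": "3.0.0",
                "informationUri": "https://github.com/regaan/wshawk",
            }},
            "results": results,
        }],
    }
```

SARIF defines only a small set of result levels, so several WSHawk severities necessarily collapse into one; the exact collapse chosen here is a judgment call.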

Sources: RELEASE_SUMMARY.md:54, docs/V3_COMPLETE_GUIDE.md:363-377

Report Export Mechanisms

Command-Line Export Options

WSHawk supports multiple export formats via CLI flags:

# Default: HTML only
wshawk-advanced ws://target.com

# Explicit format selection
wshawk-advanced ws://target.com --format html
wshawk-advanced ws://target.com --format json
wshawk-advanced ws://target.com --format csv
wshawk-advanced ws://target.com --format sarif

# Multiple formats
wshawk-advanced ws://target.com --format html --format json --format sarif

# All formats
wshawk-advanced ws://target.com --all-formats

Programmatic Export via Python API

import asyncio

from wshawk.scanner_v2 import WSHawkV2


async def main():
    scanner = WSHawkV2("ws://target.com")
    await scanner.run_heuristic_scan()

    # Generate all formats
    html_path = scanner.generate_html_report("report.html")
    json_report = scanner.generate_report()  # Returns dict
    csv_path = scanner.generate_csv_report("report.csv")
    sarif_path = scanner.generate_sarif_report("report.sarif")


# "await" is only valid inside a coroutine, so run the scan via asyncio.run()
asyncio.run(main())

Web Dashboard Export

The Web Management Dashboard provides browser-based export:

graph LR
    Dashboard["Web Dashboard<br/>Scan History View"]
    
    Dashboard --> ViewReport["View Report Button"]
    Dashboard --> ExportMenu["Export Dropdown"]
    
    ViewReport --> HTMLRender["Render HTML in browser"]
    
    ExportMenu --> DownloadHTML["Download HTML"]
    ExportMenu --> DownloadJSON["Download JSON"]
    ExportMenu --> DownloadCSV["Download CSV"]
    ExportMenu --> DownloadSARIF["Download SARIF"]
    
    DownloadHTML --> Browser["Browser Downloads"]
    DownloadJSON --> Browser
    DownloadCSV --> Browser
    DownloadSARIF --> Browser

REST API Export Endpoint:

GET /api/scans/{scan_id}/report?format=json
GET /api/scans/{scan_id}/report?format=html
GET /api/scans/{scan_id}/report?format=csv
GET /api/scans/{scan_id}/report?format=sarif

Authorization: Bearer <api_key>
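
A client for the endpoint above can be sketched with the standard library alone. The helper below only builds the request and does not send it; build_report_request is a hypothetical name, and the base URL is whatever host and port the dashboard is reachable on, which the source does not specify.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

ALLOWED_FORMATS = {"json", "html", "csv", "sarif"}


def build_report_request(base_url, scan_id, fmt, api_key):
    """Build a GET request for a scan report in the given format."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported report format: {fmt!r}")
    # Percent-encode the scan ID so it cannot alter the URL path.
    path = "/api/scans/" + quote(str(scan_id), safe="") + "/report"
    url = base_url.rstrip("/") + path + "?" + urlencode({"format": fmt})
    return Request(url, headers={"Authorization": "Bearer " + api_key}, method="GET")
```

Passing the returned object to urllib.request.urlopen() would perform the download; keeping construction separate makes the URL and header logic easy to check without a running dashboard.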

Sources: README.md:103-136, docs/V3_COMPLETE_GUIDE.md:310-330

SOC and SIEM Integration Patterns

Integration Architecture

graph TB
    WSHawk["WSHawk Scanner"]
    
    WSHawk --> FileExport["File-Based Export"]
    WSHawk --> APIExport["API-Based Export"]
    WSHawk --> WebhookPush["Webhook Push"]
    
    FileExport --> JSONFile["JSON Report"]
    FileExport --> CSVFile["CSV Report"]
    FileExport --> SARIFFile["SARIF Report"]
    
    APIExport --> DefectDojo["DefectDojo API<br/>/api/v2/import-scan/"]
    APIExport --> Jira["Jira API<br/>/rest/api/3/issue"]
    
    WebhookPush --> Slack["Slack Incoming Webhook"]
    WebhookPush --> Teams["Microsoft Teams Connector"]
    WebhookPush --> Discord["Discord Webhook"]
    
    JSONFile --> Splunk["Splunk HEC<br/>(HTTP Event Collector)"]
    JSONFile --> ELK["ELK Stack<br/>Logstash JSON input"]
    CSVFile --> QRadar["IBM QRadar<br/>CSV ingestion"]
    SARIFFile --> GitHubSec["GitHub Security Tab"]
    
    DefectDojo --> CentralizedVulnDB["Centralized Vuln DB"]
    Jira --> TicketingWorkflow["Ticketing Workflow"]

SIEM Integration: Splunk Example

Splunk HTTP Event Collector Configuration:

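# Export JSON report
wshawk-advanced ws://target.com --format json

# Send to Splunk HEC. The collector endpoint expects the payload wrapped in an
# "event" envelope, so wrap the report with jq before posting.
# -k skips TLS certificate verification; drop it once HEC presents a trusted certificate.
jq -c '{sourcetype: "wshawk", event: .}' wshawk_report_20250115_143022.json | \
  curl -k https://splunk.company.com:8088/services/collector \
    -H "Authorization: Splunk <HEC_TOKEN>" \
    --data-binary @-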

Splunk Search Queries:

# Find all critical WebSocket vulnerabilities
# (expand the array first so each finding is filtered and counted on its own)
index=security sourcetype=wshawk
| spath path=vulnerabilities{} output=vuln
| mvexpand vuln
| spath input=vuln
| search severity=CRITICAL
| stats count by type

# Time-series vulnerability trend
index=security sourcetype=wshawk
| spath path=vulnerabilities{} output=vuln
| mvexpand vuln
| spath input=vuln
| timechart span=1d count by type

SIEM Integration: ELK Stack Example

Logstash Pipeline Configuration:

input {
  file {
    path => "/var/log/wshawk/reports/*.json"
    codec => "json"
    type => "wshawk_scan"
  }
}

filter {
  if [type] == "wshawk_scan" {
    split {
      field => "vulnerabilities"
    }
    
    mutate {
      add_field => {
        "vuln_type" => "%{[vulnerabilities][type]}"
        "severity" => "%{[vulnerabilities][severity]}"
        "target" => "%{[scan_info][target]}"
      }
    }
  }
}

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "wshawk-findings-%{+YYYY.MM.dd}"
  }
}

SIEM Integration: IBM QRadar Example

QRadar Custom Log Source:

1. Create Custom Log Source Type: "WSHawk Scanner"
2. Configure CSV ingestion with field mapping
3. Create custom properties for vulnerability classification

QRadar AQL Query:

SELECT
  "Target" AS target_url,
  "Vulnerability Type" AS vuln_type,
  "Severity" AS severity,
  COUNT(*) AS occurrence_count
FROM events
WHERE LOGSOURCETYPENAME(devicetype) = 'WSHawk Scanner'
  AND "Severity" IN ('CRITICAL', 'HIGH')
GROUP BY "Target", "Vulnerability Type", "Severity"
ORDER BY occurrence_count DESC
LAST 7 DAYS

CI/CD Pipeline Integration Example

GitHub Actions Workflow:

name: WebSocket Security Scan

on:
  push:
    branches: [main]
  schedule:
    - cron: '0 2 * * *'  # Daily at 02:00 UTC

permissions:
  contents: read
  security-events: write  # Needed to upload SARIF to code scanning

jobs:
  wshawk-scan:
    runs-on: ubuntu-latest
    
    steps:
      - name: Run WSHawk
        run: |
          docker run --rm -v $(pwd)/reports:/reports \
            rothackers/wshawk:latest \
            wshawk-advanced ws://staging.company.com \
            --format sarif --format json
      
      - name: Upload SARIF to GitHub Security
        uses: github/codeql-action/upload-sarif@v3
        with:
          # Accepts a SARIF file or a directory of SARIF files
          sarif_file: reports
      
      - name: Archive Reports
        uses: actions/upload-artifact@v4
        with:
          name: security-reports
          path: reports/
      
      - name: Fail on Critical Findings
        run: |
          # Sum critical counts across every JSON report in the directory
          CRITICAL=$(jq -s 'map(.summary.critical) | add' reports/*.json)
          if [ "$CRITICAL" -gt 0 ]; then
            echo "Found $CRITICAL critical vulnerabilities"
            exit 1
          fi
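
Where jq is unavailable on the runner, the same gate can be expressed in Python. This is a sketch under the assumption that every JSON file in the report directory follows the report schema; count_critical and gate are hypothetical names.

```python
import json
from pathlib import Path


def count_critical(report_dir):
    """Sum summary.critical across all JSON reports in a directory."""
    total = 0
    for path in sorted(Path(report_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            report = json.load(handle)
        # Reports without a summary contribute nothing to the total.
        total += report.get("summary", {}).get("critical", 0)
    return total


def gate(report_dir):
    """Return a process exit code: 1 if any critical finding exists, else 0."""
    critical = count_critical(report_dir)
    if critical > 0:
        print(f"Found {critical} critical vulnerabilities")
        return 1
    return 0
```

A pipeline step could call sys.exit(gate("reports")) from a small wrapper script to fail the build.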

Sources: README.md:199-239, docs/V3_COMPLETE_GUIDE.md:352-377

Report Storage and Lifecycle Management

File System Storage

Default storage location for generated reports:

~/.wshawk/
├── reports/
│   ├── wshawk_report_20250115_143022.html
│   ├── wshawk_report_20250115_143022.json
│   ├── wshawk_report_20250115_143022.csv
│   └── wshawk_report_20250115_143022.sarif
├── screenshots/
│   └── xss_verification_*.png
└── traffic_logs/
    └── traffic_20250115_143022.log

Database Storage (Web Dashboard)

The Web Dashboard persists reports to SQLite with WAL mode:

graph TB
    ScanExecution["Scanner Execution"]
    
    ScanExecution --> DBInsert["INSERT INTO scans"]
    DBInsert --> ScansTable["scans table<br/>id, target, start_time, end_time"]
    
    ScanExecution --> VulnInsert["INSERT INTO vulnerabilities"]
    VulnInsert --> VulnsTable["vulnerabilities table<br/>scan_id, type, severity, payload"]
    
    ScansTable --> Query["SELECT * FROM scans<br/>ORDER BY start_time DESC"]
    VulnsTable --> Query
    
    Query --> DashboardView["Dashboard Scan History"]
    DashboardView --> ReportRegen["Regenerate Report<br/>from DB data"]

Database Schema:

-- Scans metadata table
CREATE TABLE scans (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    target TEXT NOT NULL,
    start_time TEXT NOT NULL,
    end_time TEXT,
    duration REAL,
    messages_sent INTEGER,
    messages_received INTEGER,
    status TEXT DEFAULT 'running'
);

-- Vulnerabilities findings table
CREATE TABLE vulnerabilities (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    scan_id INTEGER NOT NULL,
    type TEXT NOT NULL,
    severity TEXT NOT NULL,
    description TEXT,
    payload TEXT,
    response TEXT,
    recommendation TEXT,
    cvss_score REAL,
    FOREIGN KEY (scan_id) REFERENCES scans(id) ON DELETE CASCADE
);

-- Index for fast severity filtering
CREATE INDEX idx_severity ON vulnerabilities(severity);
CREATE INDEX idx_scan_id ON vulnerabilities(scan_id);
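
The "regenerate report from DB data" step can be sketched with Python's sqlite3 module. The example uses an in-memory database and an abbreviated copy of the schema above; open_db and load_report are hypothetical names. One point worth noting is that SQLite only enforces ON DELETE CASCADE when foreign keys are enabled on the connection.

```python
import sqlite3

# Abbreviated version of the schema above, enough for the round trip.
SCHEMA = """
CREATE TABLE scans (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    target TEXT NOT NULL,
    start_time TEXT NOT NULL,
    end_time TEXT
);
CREATE TABLE vulnerabilities (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    scan_id INTEGER NOT NULL,
    type TEXT NOT NULL,
    severity TEXT NOT NULL,
    payload TEXT,
    FOREIGN KEY (scan_id) REFERENCES scans(id) ON DELETE CASCADE
);
"""


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    # Foreign keys are off by default in SQLite; without this the
    # cascade declared in the schema is silently ignored.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def load_report(conn, scan_id):
    """Rebuild a report-shaped dict for one scan from the two tables."""
    scan = conn.execute(
        "SELECT target, start_time, end_time FROM scans WHERE id = ?", (scan_id,)
    ).fetchone()
    if scan is None:
        raise KeyError(scan_id)
    rows = conn.execute(
        "SELECT type, severity, payload FROM vulnerabilities "
        "WHERE scan_id = ? ORDER BY id",
        (scan_id,),
    )
    vulns = [dict(row) for row in rows]
    return {
        "scan_info": dict(scan),
        "vulnerabilities": vulns,
        "summary": {"total": len(vulns)},
    }
```

The resulting dictionary has the same top-level keys as the JSON report, so the format generators could be applied to it unchanged.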

Sources: RELEASE_SUMMARY.md:16-19, docs/V3_COMPLETE_GUIDE.md:122-125

Report Retention and Cleanup

Automatic Cleanup Configuration

Configure report retention in wshawk.yaml:

reports:
  retention_days: 30
  max_reports: 100
  auto_cleanup: true
  
  storage:
    path: "~/.wshawk/reports"
    compress_old: true  # gzip reports older than 7 days
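
How retention_days and max_reports interact is not spelled out in the source. The sketch below assumes a report is removed when it is older than the retention window or when it falls outside the newest max_reports entries; select_for_cleanup is a hypothetical helper, not WSHawk's cleanup routine.

```python
from datetime import datetime, timedelta


def select_for_cleanup(reports, now, retention_days=30, max_reports=100):
    """Pick reports to delete.

    reports: iterable of (name, modified_at) tuples.
    Returns names ordered newest first.
    """
    cutoff = now - timedelta(days=retention_days)
    newest_first = sorted(reports, key=lambda item: item[1], reverse=True)
    doomed = []
    for position, (name, modified_at) in enumerate(newest_first):
        too_old = modified_at < cutoff
        over_quota = position >= max_reports
        if too_old or over_quota:
            doomed.append(name)
    return doomed
```

Separating selection from deletion keeps the policy testable and allows a dry run before any file is removed.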

Manual Cleanup Commands

# Delete HTML reports older than 30 days (adjust the pattern for other formats)
find ~/.wshawk/reports -name "wshawk_report_*.html" -mtime +30 -delete

# Archive all HTML reports, then delete those older than 30 days
tar -czf wshawk_archive_$(date +%Y%m).tar.gz ~/.wshawk/reports/*.html
find ~/.wshawk/reports -name "*.html" -mtime +30 -delete

# Database vacuum (reclaim space)
sqlite3 ~/.wshawk/scans.db "VACUUM;"

Sources: docs/V3_COMPLETE_GUIDE.md:294-298

Report Format Comparison

| Feature | HTML | JSON | CSV | SARIF |
|---------|------|------|-----|-------|
| Primary Audience | Human analysts | APIs/Scripts | Data analysts | CI/CD pipelines |
| Styling | CSS embedded | None | None | None |
| Screenshots | Embedded base64 | External refs | Not supported | External refs |
| CVSS Scoring | Visual badges | Numeric values | Numeric column | Standardized rules |
| Payload Display | Syntax highlighted | Raw strings | Escaped strings | Snippet objects |
| File Size | Large (MB) | Medium | Small | Medium |
| Parsing Complexity | High (DOM) | Low (JSON) | Very Low (CSV) | Medium (JSON+Schema) |
| Machine Readable | No | Yes | Yes | Yes |
| Human Readable | Yes | Moderate | Yes (spreadsheet) | No |
| GitHub Integration | Manual upload | No | No | Native (Security tab) |
| SIEM Ingestion | No | Yes (HEC/REST) | Yes (file import) | No (wrong format) |

This comparison helps select the appropriate format based on downstream consumption requirements.

Sources: README.md:176-186, RELEASE_SUMMARY.md:52-56