Report Formats and Export
The following files were used as context for generating this wiki page:
- .github/workflows/ghcr-publish.yml
- README.md
- RELEASE_3.0.0.md
- RELEASE_SUMMARY.md
- docs/V3_COMPLETE_GUIDE.md
- wshawk/main.py
This document describes WSHawk's reporting and export capabilities, including the structure and generation of HTML, JSON, CSV, and SARIF formats. For configuration of external integrations that consume these reports (Jira, DefectDojo, webhooks), see Jira Integration, DefectDojo Integration, and Webhook Notifications. For the internal database storage of scan data, see Infrastructure Persistence Layer. For advanced report customization and CVSS scoring details, see Report Format and Output.
Report Generation Overview
WSHawk generates security assessment reports in multiple formats to serve different audiences and integration requirements. All reports contain the same core vulnerability data but are structured differently for their intended use case.
Report Generation Flow
graph TB
Scanner["WSHawkV2.run_heuristic_scan()"]
Vulns[("vulnerabilities[]<br/>List of findings")]
Scanner --> Vulns
Vulns --> GenReport["generate_report()"]
GenReport --> JSONStructure["JSON Report Structure<br/>{scan_info, vulnerabilities, summary}"]
JSONStructure --> HTMLGen["generate_html_report()"]
JSONStructure --> CSVGen["generate_csv_report()"]
JSONStructure --> SARIFGen["generate_sarif_report()"]
JSONStructure --> RawJSON["save_json_report()"]
HTMLGen --> HTMLFile["wshawk_report_YYYYMMDD_HHMMSS.html"]
CSVGen --> CSVFile["wshawk_report_YYYYMMDD_HHMMSS.csv"]
SARIFGen --> SARIFFile["wshawk_report_YYYYMMDD_HHMMSS.sarif"]
RawJSON --> JSONFile["wshawk_report_YYYYMMDD_HHMMSS.json"]
HTMLFile --> Human["Human Analysts"]
JSONFile --> SIEM["SIEM/SOC Platforms"]
CSVFile --> Spreadsheet["Excel/Data Analysis"]
SARIFFile --> GitHub["GitHub Security Tab<br/>GitLab Security Dashboard"]
Sources: wshawk/main.py:903-1000, README.md:176-186, RELEASE_SUMMARY.md:52-56
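The timestamped file names in the flow above can be produced with a few lines of Python. This is a minimal sketch: the helper name `report_filename` is hypothetical and is not taken from wshawk/main.py; only the `wshawk_report_YYYYMMDD_HHMMSS.<ext>` pattern comes from this page.

```python
from datetime import datetime
from typing import Optional

SUPPORTED_FORMATS = {"html", "json", "csv", "sarif"}


def report_filename(fmt: str, now: Optional[datetime] = None) -> str:
    """Build a name following wshawk_report_YYYYMMDD_HHMMSS.<ext> (hypothetical helper)."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    # Fall back to the current local time when no timestamp is supplied.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return f"wshawk_report_{stamp}.{fmt}"


print(report_filename("html", datetime(2025, 1, 15, 14, 30, 22)))
```

Passing the same timestamp for every format keeps the four files of one scan grouped together when sorted by name.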
Core Report Data Structure
All report formats derive from a common JSON data structure generated by the generate_report() method.
Report Schema
| Field | Type | Description |
|-------|------|-------------|
| scan_info | Object | Metadata about the scan execution |
| scan_info.target | String | WebSocket URL that was scanned |
| scan_info.start_time | ISO8601 | Scan initiation timestamp |
| scan_info.end_time | ISO8601 | Scan completion timestamp |
| scan_info.duration | Float | Total scan time in seconds |
| scan_info.scanner | String | Tool identifier ("WSHawk by Regaan") |
| scan_info.messages_sent | Integer | Total WebSocket messages transmitted |
| scan_info.messages_received | Integer | Total responses received |
| vulnerabilities | Array | List of vulnerability findings |
| summary | Object | Aggregated statistics |
| summary.total | Integer | Total number of findings |
| summary.critical | Integer | Count of CRITICAL severity findings |
| summary.high | Integer | Count of HIGH severity findings |
| summary.medium | Integer | Count of MEDIUM severity findings |
Vulnerability Object Structure
Each vulnerability in the vulnerabilities array contains:
| Field | Type | Description |
|-------|------|-------------|
| type | String | Vulnerability classification (e.g., "SQL Injection") |
| severity | Enum | CRITICAL, HIGH, MEDIUM, or LOW |
| description | String | Human-readable explanation of the finding |
| payload | String | Attack vector that triggered the vulnerability |
| response | String (Optional) | Server response snippet (truncated to 200 chars) |
| recommendation | String | Remediation guidance |
| cvss_score | Float (Optional) | CVSS v3.1 score (0.0-10.0) |
| cvss_vector | String (Optional) | CVSS v3.1 vector string |
| confidence | String (Optional) | Detection confidence: LOW, MEDIUM, HIGH |
Sources: wshawk/main.py:903-927
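To make the schema concrete, the following sketch assembles a dictionary with the documented fields and derives the summary counts from the findings list. The function name `build_report` and its parameters are illustrative assumptions; the actual generate_report() method may compute these values differently.

```python
from collections import Counter
from datetime import datetime


def build_report(target, vulnerabilities, start, end, sent=0, received=0):
    """Assemble the documented report structure (illustrative, not the WSHawk source)."""
    # Count findings per severity; missing severities count as zero.
    counts = Counter(v["severity"] for v in vulnerabilities)
    return {
        "scan_info": {
            "target": target,
            "start_time": start.isoformat(),
            "end_time": end.isoformat(),
            "duration": (end - start).total_seconds(),
            "scanner": "WSHawk by Regaan",
            "messages_sent": sent,
            "messages_received": received,
        },
        "vulnerabilities": vulnerabilities,
        "summary": {
            "total": len(vulnerabilities),
            "critical": counts["CRITICAL"],
            "high": counts["HIGH"],
            "medium": counts["MEDIUM"],
        },
    }
```

Note that the documented summary has no `low` counter, so LOW findings contribute to `total` only.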
HTML Report Format
HTML reports are the primary output for human analysts, featuring styled presentation with embedded screenshots and interactive elements.
HTML Report Features
graph LR
HTMLReport["wshawk_report_*.html"]
HTMLReport --> Header["Header Section<br/>Scanner branding<br/>Scan metadata"]
HTMLReport --> Executive["Executive Summary<br/>Vulnerability counts<br/>Severity distribution"]
HTMLReport --> Findings["Detailed Findings<br/>Per-vulnerability cards"]
HTMLReport --> Traffic["Traffic Logs<br/>Request/Response pairs"]
HTMLReport --> Fingerprint["Server Fingerprint<br/>Technology stack"]
HTMLReport --> Footer["Footer<br/>Timestamp<br/>Legal disclaimer"]
Findings --> VulnCard["Vulnerability Card"]
VulnCard --> CVSSBadge["CVSS Score Badge"]
VulnCard --> Payload["Payload Code Block"]
VulnCard --> Evidence["Evidence Section"]
VulnCard --> Remediation["Remediation Steps"]
Evidence --> Screenshot["Screenshot (if XSS)"]
Evidence --> ResponseSnippet["Response Snippet"]
HTML Generation Implementation
The generate_html_report() method constructs a self-contained HTML document with inline CSS.
Key Components:
- CSS Styling (wshawk/main.py:938-970)
  - Responsive design with flexbox layout
  - Color-coded severity badges (CRITICAL=red, HIGH=orange, MEDIUM=yellow)
  - Syntax-highlighted code blocks for payloads
  - Print-friendly media queries
- Dynamic Content Rendering (wshawk/main.py:971-1000)
  - Iterates over self.vulnerabilities to generate finding cards
  - Truncates long responses to prevent DOM bloat
  - HTML-escapes user-controlled data to prevent self-XSS
  - Embeds screenshots as base64 data URIs (when available)
- File Naming Convention
  - Pattern: wshawk_report_YYYYMMDD_HHMMSS.html
  - Example: wshawk_report_20250115_143022.html
Sources: wshawk/main.py:928-1000, README.md:176-186
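The escaping and truncation steps described above can be sketched as follows. The function `render_finding_card`, the CSS class names, and the markup are assumptions for illustration only; the 200-character limit mirrors the truncation noted in the vulnerability schema.

```python
import html

MAX_RESPONSE_CHARS = 200  # mirrors the truncation noted in the schema


def render_finding_card(vuln: dict) -> str:
    """Render one finding as an HTML fragment (hypothetical markup)."""
    severity = html.escape(vuln["severity"])
    response = vuln.get("response") or ""
    # Truncate before escaping so the limit applies to the raw server response.
    if len(response) > MAX_RESPONSE_CHARS:
        response = response[:MAX_RESPONSE_CHARS] + "..."
    # Every scanner- or server-controlled value passes through html.escape().
    return (
        f'<div class="finding {severity.lower()}">'
        f"<h3>{html.escape(vuln['type'])} [{severity}]</h3>"
        f"<pre>{html.escape(vuln['payload'])}</pre>"
        f"<p>{html.escape(response)}</p>"
        "</div>"
    )
```

Escaping matters here because payloads such as `<script>alert(1)</script>` would otherwise execute when an analyst opens the report.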
JSON Report Format
JSON reports provide machine-readable structured data for programmatic consumption.
JSON Export Example
{
"scan_info": {
"target": "ws://example.com/chat",
"start_time": "2025-01-15T14:30:22.123456",
"end_time": "2025-01-15T14:45:18.987654",
"duration": 896.864,
"scanner": "WSHawk by Regaan",
"messages_sent": 15432,
"messages_received": 14201
},
"vulnerabilities": [
{
"type": "SQL Injection",
"severity": "CRITICAL",
"description": "SQL injection vulnerability in WebSocket message",
"payload": "' OR '1'='1",
"response": "mysql_error: You have an error in your SQL syntax...",
"recommendation": "Use parameterized queries and input validation",
"cvss_score": 9.8,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"
}
],
"summary": {
"total": 12,
"critical": 3,
"high": 5,
"medium": 4
}
}
JSON Use Cases
- SIEM Ingestion: Parse findings into security information and event management platforms
- API Integration: Consumed by the Web Dashboard REST API docs/V3_COMPLETE_GUIDE.md:318-330
- Automated Triage: Programmatic analysis of vulnerability types and severity
- Historical Analysis: Time-series vulnerability trend tracking
Sources: wshawk/main.py:903-927, RELEASE_SUMMARY.md:53
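As an illustration of the automated triage use case, the sketch below parses a report and counts findings at or above a severity threshold, grouped by type. The function `triage` and the severity ordering list are assumptions; only the field names come from the schema above.

```python
import json
from collections import Counter

SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def triage(report_json: str, min_severity: str = "HIGH") -> dict:
    """Summarize findings at or above min_severity (illustrative helper)."""
    threshold = SEVERITY_ORDER.index(min_severity)
    report = json.loads(report_json)
    selected = [
        v for v in report["vulnerabilities"]
        if SEVERITY_ORDER.index(v["severity"]) >= threshold
    ]
    return {
        "target": report["scan_info"]["target"],
        "count": len(selected),
        "by_type": dict(Counter(v["type"] for v in selected)),
    }
```

The returned dictionary is small enough to forward to a ticketing system or chat notification without sending the full report.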
CSV Report Format
CSV reports enable spreadsheet analysis and bulk data processing.
CSV Schema
| Column | Description |
|--------|-------------|
| Target | WebSocket URL |
| Vulnerability Type | Classification (e.g., "XSS") |
| Severity | CRITICAL/HIGH/MEDIUM/LOW |
| Payload | Attack vector |
| Description | Finding explanation |
| Recommendation | Remediation guidance |
| CVSS Score | Numeric score (if available) |
| Timestamp | ISO8601 detection time |
CSV Generation Flow
graph LR
JSONReport["JSON Report Structure"]
CSVWriter["csv.DictWriter"]
JSONReport --> Iterator["Iterate vulnerabilities[]"]
Iterator --> Flatten["Flatten nested objects"]
Flatten --> CSVWriter
CSVWriter --> WriteHeader["Write header row"]
WriteHeader --> WriteRows["Write finding rows"]
WriteRows --> CSVFile["wshawk_report_*.csv"]
CSVFile --> Excel["Microsoft Excel"]
CSVFile --> Pandas["Python pandas"]
CSVFile --> SQL["SQL COPY/IMPORT"]
CSV Export Advantages
- Data Pivoting: Excel pivot tables for vulnerability distribution analysis
- Bulk Operations: Mass update of remediation status
- Database Import: Direct loading into PostgreSQL with COPY FROM or into MySQL with LOAD DATA INFILE
- Compliance Reporting: Merge with asset inventory for audit reports
Sources: RELEASE_SUMMARY.md:53
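The flatten-and-write flow above could look like the following sketch, which uses csv.DictWriter with the documented column names. The function `report_to_csv` is hypothetical. Because the vulnerability schema on this page has no per-finding timestamp field, the sketch assumes an optional `timestamp` key and falls back to scan_info.end_time.

```python
import csv
import io

CSV_COLUMNS = [
    "Target", "Vulnerability Type", "Severity", "Payload",
    "Description", "Recommendation", "CVSS Score", "Timestamp",
]


def report_to_csv(report: dict) -> str:
    """Flatten the JSON report into CSV text (illustrative, not the WSHawk source)."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=CSV_COLUMNS)
    writer.writeheader()
    info = report["scan_info"]
    for v in report["vulnerabilities"]:
        writer.writerow({
            "Target": info["target"],
            "Vulnerability Type": v["type"],
            "Severity": v["severity"],
            "Payload": v.get("payload", ""),
            "Description": v.get("description", ""),
            "Recommendation": v.get("recommendation", ""),
            "CVSS Score": v.get("cvss_score", ""),
            # Assumption: fall back to the scan end time when no timestamp exists.
            "Timestamp": v.get("timestamp", info.get("end_time", "")),
        })
    return buf.getvalue()
```

The csv module quotes payloads that contain commas or quotation marks, so attack strings survive a round trip through a spreadsheet import.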
SARIF Report Format
SARIF (Static Analysis Results Interchange Format) is an OASIS standard for security tool interoperability, designed for CI/CD pipeline integration.
SARIF Structure for WSHawk
graph TB
SARIFRoot["$schema: sarif-schema-2.1.0.json"]
SARIFRoot --> Version["version: 2.1.0"]
SARIFRoot --> Runs["runs[]"]
Runs --> Tool["tool:<br/>WSHawk driver"]
Runs --> Results["results[]"]
Tool --> ToolName["name: WSHawk"]
Tool --> ToolVersion["version: 3.0.0"]
Tool --> InfoURI["informationUri:<br/>https://wshawk.rothackers.com"]
Results --> Result["result object"]
Result --> RuleID["ruleId:<br/>WS-SQL-001"]
Result --> Level["level:<br/>error/warning/note"]
Result --> Message["message.text:<br/>SQL injection detected"]
Result --> Locations["locations[]"]
Locations --> PhysicalLoc["physicalLocation"]
PhysicalLoc --> ArtifactLoc["artifactLocation.uri:<br/>ws://target.com"]
PhysicalLoc --> Region["region.snippet.text:<br/>Payload"]
SARIF Integration Points
| Platform | Integration Method | Documentation |
|----------|-------------------|---------------|
| GitHub | Code Scanning API | Upload via github/codeql-action/upload-sarif@v2 |
| GitLab | Security Dashboard | Artifact path in .gitlab-ci.yml |
| Azure DevOps | SARIF SAST Scans Tab | Publish via SARIF extension |
| SonarQube | External Issues | Generic Issue Import format |
SARIF Generation Example
{
"$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
"version": "2.1.0",
"runs": [
{
"tool": {
"driver": {
"name": "WSHawk",
"version": "3.0.0",
"informationUri": "https://github.com/regaan/wshawk"
}
},
"results": [
{
"ruleId": "WS-XSS-001",
"level": "error",
"message": {
"text": "Cross-Site Scripting vulnerability detected in WebSocket message"
},
"locations": [
{
"physicalLocation": {
"artifactLocation": {
"uri": "ws://example.com/chat"
},
"region": {
"snippet": {
"text": "<script>alert(1)</script>"
}
}
}
}
]
}
]
}
]
}
Sources: RELEASE_SUMMARY.md:54, docs/V3_COMPLETE_GUIDE.md:363-377
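A mapping from the common report structure to the SARIF layout shown above might be sketched like this. The severity-to-level table, the rule ID lookup, and the fallback ID `WS-GENERIC-001` are assumptions for illustration; only `WS-SQL-001` and `WS-XSS-001` appear on this page, and the real generate_sarif_report() may assign IDs and levels differently.

```python
SARIF_SCHEMA = (
    "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/"
    "master/Schemata/sarif-schema-2.1.0.json"
)
# Assumed mapping from WSHawk severities to SARIF result levels.
LEVELS = {"CRITICAL": "error", "HIGH": "error", "MEDIUM": "warning", "LOW": "note"}
# Only these two rule IDs appear in the documentation; the lookup is hypothetical.
RULE_IDS = {"SQL Injection": "WS-SQL-001", "Cross-Site Scripting": "WS-XSS-001"}


def to_sarif(report: dict, tool_version: str = "3.0.0") -> dict:
    """Convert the JSON report into a SARIF 2.1.0 document (illustrative)."""
    target = report["scan_info"]["target"]
    results = []
    for v in report["vulnerabilities"]:
        results.append({
            "ruleId": RULE_IDS.get(v["type"], "WS-GENERIC-001"),
            "level": LEVELS.get(v["severity"], "warning"),
            "message": {"text": v["description"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": target},
                    "region": {"snippet": {"text": v["payload"]}},
                }
            }],
        })
    return {
        "$schema": SARIF_SCHEMA,
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "WSHawk", "version": tool_version}},
            "results": results,
        }],
    }
```

Serializing the result with `json.dumps(to_sarif(report), indent=2)` yields a document with the same shape as the example above.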
Report Export Mechanisms
Command-Line Export Options
WSHawk supports multiple export formats via CLI flags:
# Default: HTML only
wshawk-advanced ws://target.com
# Explicit format selection
wshawk-advanced ws://target.com --format html
wshawk-advanced ws://target.com --format json
wshawk-advanced ws://target.com --format csv
wshawk-advanced ws://target.com --format sarif
# Multiple formats
wshawk-advanced ws://target.com --format html --format json --format sarif
# All formats
wshawk-advanced ws://target.com --all-formats
Programmatic Export via Python API
import asyncio

from wshawk.scanner_v2 import WSHawkV2


async def main():
    scanner = WSHawkV2("ws://target.com")
    # run_heuristic_scan() is a coroutine, so it must be awaited inside an async function
    await scanner.run_heuristic_scan()
    # Generate all formats
    html_path = scanner.generate_html_report("report.html")
    json_report = scanner.generate_report()  # Returns dict
    csv_path = scanner.generate_csv_report("report.csv")
    sarif_path = scanner.generate_sarif_report("report.sarif")


asyncio.run(main())
Web Dashboard Export
The Web Management Dashboard provides browser-based export:
graph LR
Dashboard["Web Dashboard<br/>Scan History View"]
Dashboard --> ViewReport["View Report Button"]
Dashboard --> ExportMenu["Export Dropdown"]
ViewReport --> HTMLRender["Render HTML in browser"]
ExportMenu --> DownloadHTML["Download HTML"]
ExportMenu --> DownloadJSON["Download JSON"]
ExportMenu --> DownloadCSV["Download CSV"]
ExportMenu --> DownloadSARIF["Download SARIF"]
DownloadHTML --> Browser["Browser Downloads"]
DownloadJSON --> Browser
DownloadCSV --> Browser
DownloadSARIF --> Browser
REST API Export Endpoint:
GET /api/scans/{scan_id}/report?format=json
GET /api/scans/{scan_id}/report?format=html
GET /api/scans/{scan_id}/report?format=csv
GET /api/scans/{scan_id}/report?format=sarif
Authorization: Bearer <api_key>
Sources: README.md:103-136, docs/V3_COMPLETE_GUIDE.md:310-330
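A client for the export endpoint could be sketched with the standard library as follows. The helper `build_export_request` and the base URL used in the usage note are hypothetical; the path, query parameter, and Bearer header follow the endpoint listing above.

```python
import urllib.request
from urllib.parse import quote, urlencode

SUPPORTED_FORMATS = ("json", "html", "csv", "sarif")


def build_export_request(base_url: str, scan_id, fmt: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for /api/scans/{scan_id}/report (illustrative helper)."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    query = urlencode({"format": fmt})
    # Percent-encode the scan ID so it cannot alter the URL path.
    scan_part = quote(str(scan_id), safe="")
    url = f"{base_url.rstrip('/')}/api/scans/{scan_part}/report?{query}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
```

Assuming a dashboard reachable at a placeholder address such as `http://localhost:5000`, the request can be sent with `urllib.request.urlopen(build_export_request("http://localhost:5000", 42, "sarif", api_key))`.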
SOC and SIEM Integration Patterns
Integration Architecture
graph TB
WSHawk["WSHawk Scanner"]
WSHawk --> FileExport["File-Based Export"]
WSHawk --> APIExport["API-Based Export"]
WSHawk --> WebhookPush["Webhook Push"]
FileExport --> JSONFile["JSON Report"]
FileExport --> CSVFile["CSV Report"]
FileExport --> SARIFFile["SARIF Report"]
APIExport --> DefectDojo["DefectDojo API<br/>/api/v2/import-scan/"]
APIExport --> Jira["Jira API<br/>/rest/api/3/issue"]
WebhookPush --> Slack["Slack Incoming Webhook"]
WebhookPush --> Teams["Microsoft Teams Connector"]
WebhookPush --> Discord["Discord Webhook"]
JSONFile --> Splunk["Splunk HEC<br/>(HTTP Event Collector)"]
JSONFile --> ELK["ELK Stack<br/>Logstash JSON input"]
CSVFile --> QRadar["IBM QRadar<br/>CSV ingestion"]
SARIFFile --> GitHubSec["GitHub Security Tab"]
DefectDojo --> CentralizedVulnDB["Centralized Vuln DB"]
Jira --> TicketingWorkflow["Ticketing Workflow"]
SIEM Integration: Splunk Example
Splunk HTTP Event Collector Configuration:
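# Export JSON report
wshawk-advanced ws://target.com --format json
# Wrap the report in the HEC event envelope (the event endpoint expects an "event" key)
# and send it to Splunk HEC
jq -c '{sourcetype: "wshawk", index: "security", event: .}' wshawk_report_20250115_143022.json | \
curl -k https://splunk.company.com:8088/services/collector \
-H "Authorization: Splunk <HEC_TOKEN>" \
-d @-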
Splunk Search Queries:
# Find all critical WebSocket vulnerabilities
index=security sourcetype=wshawk
| spath vulnerabilities{}.severity
| search vulnerabilities{}.severity=CRITICAL
| stats count by vulnerabilities{}.type
# Time-series vulnerability trend
index=security sourcetype=wshawk
| timechart span=1d count(vulnerabilities{}) by vulnerabilities{}.type
SIEM Integration: ELK Stack Example
Logstash Pipeline Configuration:
input {
file {
path => "/var/log/wshawk/reports/*.json"
codec => "json"
type => "wshawk_scan"
}
}
filter {
if [type] == "wshawk_scan" {
split {
field => "vulnerabilities"
}
mutate {
add_field => {
"vuln_type" => "%{[vulnerabilities][type]}"
"severity" => "%{[vulnerabilities][severity]}"
"target" => "%{[scan_info][target]}"
}
}
}
}
output {
elasticsearch {
hosts => ["http://elasticsearch:9200"]
index => "wshawk-findings-%{+YYYY.MM.dd}"
}
}
SIEM Integration: IBM QRadar Example
QRadar Custom Log Source:
- Create Custom Log Source Type: "WSHawk Scanner"
- Configure CSV ingestion with field mapping
- Create custom properties for vulnerability classification
QRadar AQL Query:
SELECT
"Target" as target_url,
"Vulnerability Type" as vuln_type,
"Severity" as severity,
COUNT(*) as occurrence_count
FROM events
WHERE "Log Source" = 'WSHawk Scanner'
AND "Severity" IN ('CRITICAL', 'HIGH')
AND deviceTime > NOW() - 7 DAYS
GROUP BY target_url, vuln_type, severity
ORDER BY occurrence_count DESC
CI/CD Pipeline Integration Example
GitHub Actions Workflow:
name: WebSocket Security Scan
on:
  push:
    branches: [main]
  schedule:
    - cron: '0 2 * * *' # Daily at 2 AM UTC
permissions:
  contents: read
  security-events: write # required to upload SARIF results
jobs:
  wshawk-scan:
    runs-on: ubuntu-latest
    steps:
      - name: Run WSHawk
        run: |
          docker run --rm -v $(pwd)/reports:/reports \
            rothackers/wshawk:latest \
            wshawk-advanced ws://staging.company.com \
            --format sarif --format json
      - name: Upload SARIF to GitHub Security
        uses: github/codeql-action/upload-sarif@v2
        with:
          # Accepts a file or a directory of SARIF files; glob patterns are not documented as supported
          sarif_file: reports
      - name: Archive Reports
        uses: actions/upload-artifact@v3
        with:
          name: security-reports
          path: reports/
      - name: Fail on Critical Findings
        run: |
          # Sum the critical counts across every JSON report in the directory
          CRITICAL=$(jq -s 'map(.summary.critical) | add' reports/*.json)
          if [ "$CRITICAL" -gt 0 ]; then
            echo "Found $CRITICAL critical vulnerabilities"
            exit 1
          fi
Sources: README.md:199-239, docs/V3_COMPLETE_GUIDE.md:352-377
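Where jq is unavailable on the build agent, the "fail on critical findings" gate can be written in Python instead. This is a sketch under the assumption that every JSON file in the report directory follows the schema on this page; the function names `critical_total` and `main` are illustrative.

```python
import json
import sys
from pathlib import Path


def critical_total(report_dir: str) -> int:
    """Sum summary.critical across all JSON reports in a directory."""
    total = 0
    for path in sorted(Path(report_dir).glob("*.json")):
        report = json.loads(path.read_text(encoding="utf-8"))
        # Treat a missing summary or counter as zero critical findings.
        total += int(report.get("summary", {}).get("critical", 0))
    return total


def main(argv) -> int:
    """Return 1 when any critical finding exists, otherwise 0."""
    report_dir = argv[1] if len(argv) > 1 else "reports"
    critical = critical_total(report_dir)
    if critical > 0:
        print(f"Found {critical} critical vulnerabilities")
        return 1
    return 0
```

A pipeline step would end the script with `sys.exit(main(sys.argv))` so that the non-zero return value fails the job.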
Report Storage and Lifecycle Management
File System Storage
Default storage location for generated reports:
~/.wshawk/
├── reports/
│ ├── wshawk_report_20250115_143022.html
│ ├── wshawk_report_20250115_143022.json
│ ├── wshawk_report_20250115_143022.csv
│ └── wshawk_report_20250115_143022.sarif
├── screenshots/
│ └── xss_verification_*.png
└── traffic_logs/
└── traffic_20250115_143022.log
Database Storage (Web Dashboard)
The Web Dashboard persists reports to SQLite with WAL mode:
graph TB
ScanExecution["Scanner Execution"]
ScanExecution --> DBInsert["INSERT INTO scans"]
DBInsert --> ScansTable["scans table<br/>id, target, start_time, end_time"]
ScanExecution --> VulnInsert["INSERT INTO vulnerabilities"]
VulnInsert --> VulnsTable["vulnerabilities table<br/>scan_id, type, severity, payload"]
ScansTable --> Query["SELECT * FROM scans<br/>ORDER BY start_time DESC"]
VulnsTable --> Query
Query --> DashboardView["Dashboard Scan History"]
DashboardView --> ReportRegen["Regenerate Report<br/>from DB data"]
Database Schema:
-- Scans metadata table
CREATE TABLE scans (
id INTEGER PRIMARY KEY AUTOINCREMENT,
target TEXT NOT NULL,
start_time TEXT NOT NULL,
end_time TEXT,
duration REAL,
messages_sent INTEGER,
messages_received INTEGER,
status TEXT DEFAULT 'running'
);
-- Vulnerabilities findings table
CREATE TABLE vulnerabilities (
id INTEGER PRIMARY KEY AUTOINCREMENT,
scan_id INTEGER NOT NULL,
type TEXT NOT NULL,
severity TEXT NOT NULL,
description TEXT,
payload TEXT,
response TEXT,
recommendation TEXT,
cvss_score REAL,
FOREIGN KEY (scan_id) REFERENCES scans(id) ON DELETE CASCADE
);
-- Index for fast severity filtering
CREATE INDEX idx_severity ON vulnerabilities(severity);
CREATE INDEX idx_scan_id ON vulnerabilities(scan_id);
Sources: RELEASE_SUMMARY.md:16-19, docs/V3_COMPLETE_GUIDE.md:122-125
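The schema above can be exercised with Python's built-in sqlite3 module. The sketch below uses an in-memory database and a reduced set of columns, so it does not enable WAL mode, and the sample rows are invented for illustration. One detail worth noting: SQLite enforces ON DELETE CASCADE only when foreign keys are switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite ignores foreign key actions unless this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE scans (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    target TEXT NOT NULL,
    start_time TEXT NOT NULL,
    status TEXT DEFAULT 'running'
);
CREATE TABLE vulnerabilities (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    scan_id INTEGER NOT NULL,
    type TEXT NOT NULL,
    severity TEXT NOT NULL,
    payload TEXT,
    FOREIGN KEY (scan_id) REFERENCES scans(id) ON DELETE CASCADE
);
CREATE INDEX idx_severity ON vulnerabilities(severity);
CREATE INDEX idx_scan_id ON vulnerabilities(scan_id);
""")

# Insert one scan and three sample findings (invented data).
scan_id = conn.execute(
    "INSERT INTO scans (target, start_time) VALUES (?, ?)",
    ("ws://example.com/chat", "2025-01-15T14:30:22"),
).lastrowid
conn.executemany(
    "INSERT INTO vulnerabilities (scan_id, type, severity, payload) VALUES (?, ?, ?, ?)",
    [
        (scan_id, "SQL Injection", "CRITICAL", "' OR '1'='1"),
        (scan_id, "XSS", "HIGH", "<script>alert(1)</script>"),
        (scan_id, "XSS", "HIGH", "<img src=x onerror=alert(1)>"),
    ],
)

# Severity counts for one scan, as a dashboard view might query them.
rows = conn.execute(
    "SELECT severity, COUNT(*) FROM vulnerabilities "
    "WHERE scan_id = ? GROUP BY severity ORDER BY severity",
    (scan_id,),
).fetchall()

# Deleting the scan removes its findings through the cascade.
conn.execute("DELETE FROM scans WHERE id = ?", (scan_id,))
remaining = conn.execute("SELECT COUNT(*) FROM vulnerabilities").fetchone()[0]
```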
Report Retention and Cleanup
Automatic Cleanup Configuration
Configure report retention in wshawk.yaml:
reports:
retention_days: 30
max_reports: 100
auto_cleanup: true
storage:
path: "~/.wshawk/reports"
compress_old: true # gzip reports older than 7 days
Manual Cleanup Commands
# Delete reports older than 30 days
find ~/.wshawk/reports -name "wshawk_report_*.html" -mtime +30 -delete
# Archive old reports
tar -czf wshawk_archive_$(date +%Y%m).tar.gz ~/.wshawk/reports/*.html
find ~/.wshawk/reports -name "*.html" -mtime +30 -delete
# Database vacuum (reclaim space)
sqlite3 ~/.wshawk/scans.db "VACUUM;"
Sources: docs/V3_COMPLETE_GUIDE.md:294-298
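The retention settings can be read as two independent limits: an age cutoff and a cap on the number of reports kept. The sketch below implements that reading. It is an assumption about how `retention_days` and `max_reports` interact, not a description of WSHawk's own cleanup code, and the function names are hypothetical.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def select_expired(reports, now, retention_days=30, max_reports=100):
    """Return the paths to delete from an iterable of (path, mtime_seconds) pairs."""
    cutoff = now - retention_days * SECONDS_PER_DAY
    newest_first = sorted(reports, key=lambda item: item[1], reverse=True)
    expired = []
    for rank, (path, mtime) in enumerate(newest_first):
        # Delete when the report is too old or falls outside the newest max_reports.
        if mtime < cutoff or rank >= max_reports:
            expired.append(path)
    return expired


def cleanup(report_dir="~/.wshawk/reports", retention_days=30, max_reports=100):
    """Apply the retention policy to wshawk_report_* files in report_dir."""
    directory = Path(report_dir).expanduser()
    files = [(p, p.stat().st_mtime) for p in directory.glob("wshawk_report_*")]
    for path in select_expired(files, time.time(), retention_days, max_reports):
        path.unlink()
```

Keeping the selection logic in a pure function makes the policy easy to test without touching the file system.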
Summary of Report Format Comparison
| Feature | HTML | JSON | CSV | SARIF |
|---------|------|------|-----|-------|
| Primary Audience | Human analysts | APIs/Scripts | Data analysts | CI/CD pipelines |
| Styling | CSS embedded | None | None | None |
| Screenshots | Embedded base64 | External refs | Not supported | External refs |
| CVSS Scoring | Visual badges | Numeric values | Numeric column | Standardized rules |
| Payload Display | Syntax highlighted | Raw strings | Escaped strings | Snippet objects |
| File Size | Large (MB) | Medium | Small | Medium |
| Parsing Complexity | High (DOM) | Low (JSON) | Very Low (CSV) | Medium (JSON+Schema) |
| Machine Readable | No | Yes | Yes | Yes |
| Human Readable | Yes | Moderate | Yes (spreadsheet) | No |
| GitHub Integration | Manual upload | No | No | Native (Security tab) |
| SIEM Ingestion | No | Yes (HEC/REST) | Yes (file import) | No (wrong format) |
This comparison helps select the appropriate format based on downstream consumption requirements.
Sources: README.md:176-186, RELEASE_SUMMARY.md:52-56