Performance Optimization

This page documents GeoHazardWatch's performance characteristics, known bottlenecks, and optimization strategies.

Server Startup Performance

Server startup time is critical for developer experience and production deployments. This section records the baseline measurements, the root causes identified, and the optimizations implemented.

Baseline Measurements

Initial testing with ~2,800 pages revealed significant startup delays:

Metric	Before	After	Improvement
Total startup time	48+ seconds	~3 seconds	16x faster
Link graph build	54 seconds	0.15 seconds	360x faster
Page count	2,834	3,225	(more pages, faster startup)

Root Causes Identified

Triple Page Scan on Startup

All pages were read from disk three separate times during initialization:

Pass	Component	Purpose	Original Time
1	`FileSystemProvider.refreshPageList()`	Build page cache	~0.4s
2	`LunrSearchProvider.buildIndex()`	Build search index	~1.5s
3	`RenderingManager.buildLinkGraph()`	Build link graph	~41s

Link Graph O(n²) Behavior

RenderingManager.buildLinkGraph() exhibited quadratic complexity:

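Pages were read one at a time with await pageManager.getPage()
Each link on a page triggered pageNameMatcher.findMatch(), a linear search over all page names
At ~10 links per page, 2,834 pages produce about 28,340 searches, each scanning up to 2,834 names: roughly 80 million comparisons in the worst case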

Sequential Async Reads

All three passes awaited each file read before starting the next instead of issuing reads in parallel.

Benchmark results for different read patterns:

Pattern	Time (2,834 files)
Sync reads	71ms
Parallel async	122ms
Sequential async	268ms
Sequential + YAML parse	411ms

Optimizations Implemented

Content Caching in FileSystemProvider

Page content is now cached during refreshPageList()
Eliminates 2 of 3 disk passes during startup
Files modified: src/providers/FileSystemProvider.ts
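
The caching idea can be sketched as follows. This is a minimal illustration, not the actual FileSystemProvider code: the class name CachingProvider, the injected readFile function, the diskReads counter, and the file names are all hypothetical, chosen so the example runs without touching the disk.

```typescript
// Hypothetical sketch: read each file once during the page-list refresh and keep
// the content in memory so later startup passes do not return to disk.
type ReadFile = (path: string) => string;

class CachingProvider {
  private contentCache = new Map<string, string>();
  private readFile: ReadFile;
  public diskReads = 0; // counts calls to the injected reader, for illustration only

  constructor(readFile: ReadFile) {
    this.readFile = readFile;
  }

  // Single pass over the files: populate the cache while building the page list.
  refreshPageList(paths: string[]): void {
    this.contentCache.clear();
    for (const path of paths) {
      this.contentCache.set(path, this.readFile(path));
      this.diskReads++;
    }
  }

  // Later consumers (search index, link graph) are served from memory.
  getContent(path: string): string | undefined {
    return this.contentCache.get(path);
  }
}

// Usage with an in-memory stand-in for the file system.
const files = new Map<string, string>([
  ["a.md", "Links to [B]"],
  ["b.md", "Plain text"],
]);
const provider = new CachingProvider((path) => files.get(path) ?? "");
provider.refreshPageList(Array.from(files.keys()));
provider.getContent("a.md"); // e.g. the search index pass
provider.getContent("a.md"); // e.g. the link graph pass
console.log(provider.diskReads); // 2: one read per file, however many passes follow
```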

PageNameMatcher Index

Added buildIndex() method for O(1) lookups instead of O(n)
Hash-based index replaces linear search through all page names
Files modified: src/utils/PageNameMatcher.ts
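
A hash-based index of this kind might look like the sketch below. It is an assumption-laden illustration rather than the real PageNameMatcher: the class name IndexedNameMatcher, the normalization rule (trim plus lowercase), and the sample page names are all hypothetical.

```typescript
// Hypothetical sketch: a name matcher backed by a Map for constant-time lookups.
// The normalization rule (trim + lowercase) is an assumption, not the project's rule.
class IndexedNameMatcher {
  private index = new Map<string, string>();

  // Normalize a name so lookups ignore case and surrounding whitespace.
  private static normalize(name: string): string {
    return name.trim().toLowerCase();
  }

  // Build the index once, in O(n) over all page names.
  buildIndex(pageNames: string[]): void {
    this.index.clear();
    for (const name of pageNames) {
      const key = IndexedNameMatcher.normalize(name);
      // Keep the first page registered under a given key.
      if (!this.index.has(key)) {
        this.index.set(key, name);
      }
    }
  }

  // Look up a link target in O(1) on average; returns null when nothing matches.
  findMatch(linkTarget: string): string | null {
    return this.index.get(IndexedNameMatcher.normalize(linkTarget)) ?? null;
  }
}

const matcher = new IndexedNameMatcher();
matcher.buildIndex(["Main Page", "Seismic Alerts", "Flood Maps"]);
console.log(matcher.findMatch("seismic alerts")); // "Seismic Alerts"
console.log(matcher.findMatch("Missing Page")); // null
```

The one-time O(n) cost of buildIndex() is paid once at startup, after which each of the tens of thousands of link lookups is a single Map access instead of a scan over every page name.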

Parallel Page Loading

RenderingManager.buildLinkGraph() now uses Promise.all() for parallel loading
Batch processing instead of sequential await in loops
Files modified: src/managers/RenderingManager.ts
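
The batching pattern can be sketched as below. This is not the RenderingManager implementation: loadInBatches, the loadPage callback, the default batch size of 100, and the stand-in loader are hypothetical names and values used only to show Promise.all() applied per batch.

```typescript
// Hypothetical helper: load pages in fixed-size batches, each batch in parallel.
// `loadPage` stands in for whatever async page read the caller uses.
async function loadInBatches<T>(
  names: string[],
  loadPage: (name: string) => Promise<T>,
  batchSize = 100 // assumed value; tune for the storage backend
): Promise<T[]> {
  if (batchSize < 1) {
    throw new RangeError("batchSize must be at least 1");
  }
  const results: T[] = [];
  for (let i = 0; i < names.length; i += batchSize) {
    const batch = names.slice(i, i + batchSize);
    // All reads in the batch start together; Promise.all preserves input order.
    const loaded = await Promise.all(batch.map((name) => loadPage(name)));
    results.push(...loaded);
  }
  return results;
}

// Usage with a stand-in loader that resolves immediately.
const fakeLoad = async (name: string) => ({ name, length: name.length });
loadInBatches(["A", "BB", "CCC"], fakeLoad, 2).then((pages) => {
  console.log(pages.map((p) => p.name)); // ["A", "BB", "CCC"]
});
```

Batching bounds the number of reads in flight at once, which matters on storage where opening thousands of files simultaneously could exhaust file handles or connections.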

Metadata-Only Operations

Replaced getPage() with getPageMetadata() where only metadata is needed
getPageMetadata() is synchronous and requires no disk I/O
Affected files: WikiRoutes.ts, UserManager.ts, ImportManager.ts

Storage Performance

Performance varies significantly based on storage type:

Storage Type	Per-file latency	Estimated Startup (3 passes)
Local SSD	0.03ms	2-3 seconds
NAS (1Gbps)	2-5ms	30-60 seconds
Cloud (S3)	50-100ms	10-15 minutes

Content caching is particularly important for NAS and cloud storage where per-file latency is high.

Markup Parser Performance

The markup parser includes built-in performance monitoring for parsing operations.

Configuration Options

{
  "amdwiki": {
    "markup": {
      "performance": {
        "monitoring": true,
        "alertThresholds": {
          "parseTime": 100
        }
      },
      "cache": {
        "metricsEnabled": true
      }
    }
  }
}

Available Metrics

Parse time per page
Cache hit/miss ratios
Alert thresholds for slow operations
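
To illustrate how a parse-time alert threshold could be applied, here is a minimal sketch. It does not reflect the parser's actual monitoring code: timedParse, ParseTiming, and the injectable clock are hypothetical, and the sketch assumes the configured parseTime value is in milliseconds.

```typescript
// Hypothetical wrapper: time a parse call and flag it when it exceeds the threshold.
// Assumes the threshold is expressed in milliseconds.
interface ParseTiming<T> {
  result: T;
  elapsedMs: number;
  slow: boolean;
}

function timedParse<T>(
  parse: () => T,
  thresholdMs: number,
  now: () => number = () => Date.now() // injectable clock, for deterministic tests
): ParseTiming<T> {
  const start = now();
  const result = parse();
  const elapsedMs = now() - start;
  return { result, elapsedMs, slow: elapsedMs > thresholdMs };
}

// Usage with a fake clock that reports 0 ms, then 150 ms.
let clockCalls = 0;
const fakeClock = () => (clockCalls++ === 0 ? 0 : 150);
const timing = timedParse(() => "<p>parsed</p>", 100, fakeClock);
console.log(timing.slow); // true, because 150 ms exceeds the 100 ms threshold
```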

Best Practices

Use SSD storage for the data directory when possible
Avoid network storage for large instances (>1000 pages)
Monitor startup time after configuration changes
Use getPageMetadata() instead of getPage() when only metadata is needed

Future Optimizations

Potential improvements not yet implemented:

Incremental link graph updates instead of full rebuild
Startup timing metrics in logs for monitoring

For instances with 10,000+ pages, consider switching to ElasticsearchSearchProvider. Persistent indexing means no cold rebuild on restart, and each page save is a single-document upsert rather than a full in-memory rebuild. See Configuration Properties Reference for the config keys.
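
As an illustration of the incremental link graph idea listed above, the sketch below updates backlinks for a single page on save instead of rebuilding the whole graph. Since this optimization is not yet implemented, everything here is hypothetical: the LinkGraph class, its method names, and the page names are assumptions made for the example.

```typescript
// Hypothetical sketch: keep backlinks current by diffing one page's links on save.
class LinkGraph {
  private outgoing = new Map<string, Set<string>>();
  private backlinks = new Map<string, Set<string>>();

  // Replace the links of a single page without touching the rest of the graph.
  updatePage(page: string, newLinks: string[]): void {
    const previous = this.outgoing.get(page) ?? new Set<string>();
    const next = new Set<string>(newLinks);

    // Remove backlinks for targets the page no longer references.
    previous.forEach((target) => {
      if (!next.has(target)) {
        this.backlinks.get(target)?.delete(page);
      }
    });

    // Add backlinks for newly referenced targets.
    next.forEach((target) => {
      if (!previous.has(target)) {
        let sources = this.backlinks.get(target);
        if (!sources) {
          sources = new Set<string>();
          this.backlinks.set(target, sources);
        }
        sources.add(page);
      }
    });

    this.outgoing.set(page, next);
  }

  // Pages that link to the given page, sorted for stable output.
  getBacklinks(page: string): string[] {
    return Array.from(this.backlinks.get(page) ?? new Set<string>()).sort();
  }
}

const graph = new LinkGraph();
graph.updatePage("A", ["B", "C"]);
graph.updatePage("D", ["B"]);
graph.updatePage("A", ["C"]); // A drops its link to B
console.log(graph.getBacklinks("B")); // ["D"]
console.log(graph.getBacklinks("C")); // ["A"]
```

The work per save is proportional to the number of links on the edited page rather than to the total page count.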

See also Configuration for configuration options and System Settings for server settings.

GitHub Issue #250 - Original performance analysis