Observatory Dashboard

A static HTML dashboard for runs, benchmarks, and lessons learned from your CCO sessions.

What is the Observatory?

The Observatory is a static HTML dashboard that visualizes your CCO runtime history. It provides a comprehensive view of past runs, benchmark comparisons, and cross-run insights to help you understand patterns and improve your workflow.

Unlike dynamic dashboards that require a running server, the Observatory generates static HTML files that can be viewed offline and shared easily. The dashboard is built from run artifacts and benchmark data collected during CCO execution.

ℹ️ The Observatory is regenerated after notable runs or benchmarks land. Use python3 scripts/build_report_dashboard.py to rebuild it manually.

Dashboard Location

The Observatory dashboard is available at:

File                                     Description
docs/observatory/index.html              The main dashboard view - open this file in a browser
docs/observatory/dashboard-source.html   The source template for generating the dashboard

Quick Access

Open docs/observatory/index.html directly in your browser to view the dashboard. No server required.

Features

Run Tracking

The Observatory maintains a complete history of all CCO runs with full artifact preservation:

  • View past runs with artifacts - Browse through previous executions including their outputs, metrics, and logs
  • Trace execution flow via run_id - Use the unique run identifier to track exactly what happened during each session (see the loading sketch after this list)
  • Branch-based isolation - Each run maintains its own branch context for clean reproducibility
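
As an illustration, the sketch below loads one run's preserved artifacts by run_id. The directory layout it assumes (.cco/runs/<run_id>/ containing a metadata.json) is invented for the example, not a documented CCO contract; adapt the paths to wherever your run artifacts actually live.

python
import json
from pathlib import Path

# ASSUMED layout: .cco/runs/<run_id>/ holds metadata.json plus whatever
# outputs, metrics, and logs the run preserved. Adjust to your repo.
RUNS_DIR = Path(".cco/runs")

def load_run(run_id: str) -> dict:
    """Load one run's metadata and list its preserved artifacts."""
    run_dir = RUNS_DIR / run_id
    metadata = json.loads((run_dir / "metadata.json").read_text())
    metadata["artifacts"] = sorted(p.name for p in run_dir.iterdir())
    return metadata

if __name__ == "__main__":
    run = load_run("2025-01-15-abc123")  # illustrative run_id
    print(run["artifacts"])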

Benchmark Comparisons

Compare performance across different runs and configurations (see the comparison sketch after this list):

  • Context usage metrics - Track how efficiently CCO uses context across different scenarios
  • Role distribution charts - Visualize which agents are being used most frequently
  • Task completion rates - Measure success rates and identify bottlenecks
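
A comparison can be as simple as a percentage diff of the numeric metrics two runs share. This is a minimal sketch; the metric names (context_tokens, completion_rate) are illustrative, not a fixed CCO schema.

python
def compare(baseline: dict, candidate: dict) -> dict:
    """Percentage change of each shared numeric metric, candidate vs baseline."""
    return {
        key: round(100.0 * (candidate[key] - baseline[key]) / baseline[key], 1)
        for key in baseline.keys() & candidate.keys()
        if isinstance(baseline[key], (int, float)) and baseline[key] != 0
    }

# Invented numbers for two runs, purely to show the shape of the output.
run_a = {"context_tokens": 52_000, "completion_rate": 0.88}
run_b = {"context_tokens": 47_500, "completion_rate": 0.91}
print(compare(run_a, run_b))  # e.g. {'context_tokens': -8.7, 'completion_rate': 3.4}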

Lessons Learned

The Observatory captures institutional knowledge from your runs (see the distillation sketch after this list):

  • Cross-run memory insights - Patterns that emerge across multiple runs are automatically distilled
  • Pattern recognition - Common success paths and failure modes are highlighted
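
To make the distillation idea concrete, here is a small sketch that surfaces tags recurring across runs. The lesson records and tag names are invented for the example; real entries would be read from run artifacts.

python
from collections import Counter

# Invented per-run lesson records; real ones would come from run artifacts.
lessons = [
    {"run_id": "r1", "tags": ["flaky-test", "retry-helped"]},
    {"run_id": "r2", "tags": ["flaky-test", "context-gap"]},
    {"run_id": "r3", "tags": ["context-gap"]},
]

# A tag that shows up in several runs is a cross-run pattern worth surfacing.
counts = Counter(tag for entry in lessons for tag in entry["tags"])
print([tag for tag, n in counts.most_common() if n >= 2])
# ['flaky-test', 'context-gap']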

Paper Alignment Metrics

Measure how well your CCO usage aligns with the research paper's framework (arxiv:2602.20478); a computation sketch follows the table:

Metric                       Description
Tasks Count                  Total number of tasks executed, showing workflow volume
Agent Role Distribution      Which specialized agents are being invoked and how often
Context Retrieval Hit Rate   Percentage of context lookups that found relevant information
Completion Rates             Share of tasks completed versus abandoned
Task Patterns                Recurring patterns identified in task types and execution flows
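
For concreteness, here is a sketch computing two of these metrics from hypothetical run records; the field names are invented for the example.

python
# Invented run records; the field names are not a documented CCO schema.
runs = [
    {"tasks_done": 9, "tasks_total": 10, "ctx_hits": 14, "ctx_lookups": 20},
    {"tasks_done": 7, "tasks_total": 8, "ctx_hits": 11, "ctx_lookups": 12},
]

done = sum(r["tasks_done"] for r in runs)
total = sum(r["tasks_total"] for r in runs)
hits = sum(r["ctx_hits"] for r in runs)
lookups = sum(r["ctx_lookups"] for r in runs)

print(f"completion rate: {done / total:.0%}")        # 89%
print(f"retrieval hit rate: {hits / lookups:.0%}")   # 78%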

Parallel Workstream Visualization

The Observatory includes an SVG Gantt chart showing concurrent runs:

  • Concurrent run tracking - See multiple CCO sessions running in parallel
  • Timeline visualization - Understand temporal relationships between runs
  • Resource utilization - Identify periods of high and low activity

Gantt Chart Features

The SVG-based Gantt visualization is interactive and shows (see the rendering sketch after this list):

  • Run start/end times
  • Duration and overlap between concurrent sessions
  • Agent invocation density across time
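
The sketch below shows the core of any such view: render each run's interval as an SVG rectangle, one row per run, so overlapping bars expose concurrency. It is analogous in spirit to the Observatory's chart, not its actual implementation, and the run data is invented.

python
# Invented (run_id, start_minute, end_minute) tuples for three runs.
runs = [("run-a", 0, 40), ("run-b", 25, 70), ("run-c", 55, 90)]

ROW_H, SCALE = 24, 6  # pixels per row and per minute
bars = []
for i, (run_id, start, end) in enumerate(runs):
    # <title> gives a hover tooltip, a minimal form of interactivity.
    bars.append(
        f'<rect x="{start * SCALE}" y="{i * ROW_H}" '
        f'width="{(end - start) * SCALE}" height="{ROW_H - 6}" '
        f'fill="steelblue"><title>{run_id}</title></rect>'
    )

svg = (f'<svg xmlns="http://www.w3.org/2000/svg" width="600" '
       f'height="{len(runs) * ROW_H}">' + "".join(bars) + "</svg>")
with open("gantt.svg", "w") as f:
    f.write(svg)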

Accessing the Observatory

bash
# Open directly in browser
open docs/observatory/index.html

# Or rebuild the dashboard first
python3 scripts/build_report_dashboard.py

Files Reference

Path                                     Purpose
docs/observatory/index.html              Generated dashboard - view this in a browser
docs/observatory/dashboard-source.html   Source template - edit this to customize the dashboard structure

Metrics Dashboard

The Observatory includes a comprehensive metrics dashboard that provides deep visibility into your CCO usage patterns.

Available Metrics

  • Total Tasks - Cumulative count of all tasks across runs
  • Role Distribution - Breakdown of which specialized agents handled work
  • Context Retrieval Hits - How often relevant context was found in memory
  • Completion Rates - Percentage of tasks that finished successfully
  • Task Patterns - Identified recurring execution patterns

Using Metrics for Continuous Improvement

Review the metrics dashboard regularly to:

  • Identify underutilized agents that could help with certain task types
  • Spot context retrieval patterns that indicate memory gaps
  • Track completion rate trends to catch degradation early (see the trend sketch after this list)
  • Compare benchmark results across different model configurations
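
One way to catch degradation early, sketched below with invented numbers: compare the average completion rate of the most recent runs against the long-run average. The window size and tolerance are illustrative knobs, not CCO defaults.

python
# Completion rates per run, oldest to newest (invented numbers).
rates = [0.92, 0.91, 0.93, 0.82, 0.78, 0.75]

WINDOW, TOLERANCE = 3, 0.05  # illustrative; tune to your history
recent = sum(rates[-WINDOW:]) / WINDOW
overall = sum(rates) / len(rates)

if recent < overall - TOLERANCE:
    print(f"degrading: recent {recent:.0%} vs overall {overall:.0%}")
    # degrading: recent 78% vs overall 85%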

Interpreting the Dashboard

The dashboard uses visual indicators to help you quickly assess status (see the classification sketch after this list):

  • Green indicators - Metrics within expected ranges
  • Yellow indicators - Metrics approaching threshold limits
  • Red indicators - Metrics requiring attention or correction
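
A threshold check like the one below is enough to reproduce that traffic-light logic for a higher-is-better metric. The cutoffs shown are illustrative; the Observatory's own thresholds may differ.

python
def indicator(value: float, warn: float, alert: float) -> str:
    """Classify a higher-is-better metric as green, yellow, or red."""
    if value >= warn:
        return "green"   # within the expected range
    if value >= alert:
        return "yellow"  # approaching the threshold
    return "red"         # needs attention or correction

for rate in (0.91, 0.78, 0.55):
    print(rate, indicator(rate, warn=0.85, alert=0.70))
# 0.91 green / 0.78 yellow / 0.55 red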