Observatory Dashboard
A static HTML dashboard for runs, benchmarks, and lessons learned from your CCO sessions.
What is the Observatory?
The Observatory is a static HTML dashboard that visualizes your CCO runtime history. It provides a comprehensive view of past runs, benchmark comparisons, and cross-run insights to help you understand patterns and improve your workflow.
Unlike dynamic dashboards that require a running server, the Observatory generates static HTML files that can be viewed offline and shared easily. The dashboard is built from run artifacts and benchmark data collected during CCO execution.
The Observatory is regenerated after notable runs or benchmarks land. To rebuild it manually, run `python3 scripts/build_report_dashboard.py`.
Dashboard Location
The Observatory dashboard is available at:
| File | Description |
|---|---|
| `docs/observatory/index.html` | The main dashboard view - open this file in a browser |
| `docs/observatory/dashboard-source.html` | The source template for generating the dashboard |
Quick Access
Open `docs/observatory/index.html` directly in your browser to view the dashboard. No server required.
Features
Run Tracking
The Observatory maintains a complete history of all CCO runs with full artifact preservation:
- View past runs with artifacts - Browse through previous executions including their outputs, metrics, and logs
- Trace execution flow via run_id - Use the unique run identifier to track exactly what happened during each session
- Branch-based isolation - Each run maintains its own branch context for clean reproducibility
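Tracing a run by its `run_id` can be as simple as a keyed file lookup. The sketch below is a minimal illustration, assuming a hypothetical layout where each run's artifact record is stored as a JSON file named after its `run_id`; the actual artifact store may be organized differently.

```python
import json
from pathlib import Path

def load_run(artifact_dir: str, run_id: str) -> dict:
    """Return the artifact record for one run, looked up by run_id.

    Assumes a hypothetical <artifact_dir>/<run_id>.json layout;
    adapt the path to match your real artifact store.
    """
    path = Path(artifact_dir) / f"{run_id}.json"
    return json.loads(path.read_text())
```

With records keyed by `run_id` like this, tracing a session is a single lookup rather than a log search.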
Benchmark Comparisons
Compare performance across different runs and configurations:
- Context usage metrics - Track how efficiently CCO uses context across different scenarios
- Role distribution charts - Visualize which agents are being used most frequently
- Task completion rates - Measure success rates and identify bottlenecks
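The role-distribution and completion-rate comparisons above boil down to simple aggregation over run records. As a hedged sketch (the field names `roles`, `completed`, and `tasks` are assumptions, not the real artifact schema):

```python
from collections import Counter

def role_distribution(runs):
    """Count how often each agent role appears across run records.

    Each run is a dict with a hypothetical "roles" list; the real
    artifact schema may name this differently.
    """
    counts = Counter()
    for run in runs:
        counts.update(run.get("roles", []))
    return dict(counts)

def completion_rate(runs):
    """Fraction of tasks completed across all runs (0.0 if no tasks)."""
    done = sum(r.get("completed", 0) for r in runs)
    total = sum(r.get("tasks", 0) for r in runs)
    return done / total if total else 0.0
```

Comparing these numbers between two sets of runs is then just calling each function twice with different inputs.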
Lessons Learned
The Observatory captures institutional knowledge from your runs:
- Cross-run memory insights - Patterns that emerge across multiple runs are automatically distilled
- Pattern recognition - Common success paths and failure modes are highlighted
Paper Alignment Metrics
Measure how well your CCO usage aligns with the research paper's framework (arxiv:2602.20478):
| Metric | Description |
|---|---|
| Tasks Count | Total number of tasks executed, showing workflow volume |
| Agent Role Distribution | Which specialized agents are being invoked and how often |
| Context Retrieval Hit Rate | Percentage of context lookups that found relevant information |
| Completion Rates | Ratio of completed vs abandoned tasks |
| Task Patterns | Identified patterns in task types and execution flows |
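Of the metrics in the table, the context retrieval hit rate is the least self-explanatory; it is the share of lookups that returned relevant context. A minimal sketch, assuming each lookup is recorded as a dict with a boolean `hit` field (an illustrative schema, not the dashboard's actual one):

```python
def hit_rate(lookups):
    """Context retrieval hit rate: fraction of lookups that found
    relevant context. Returns 0.0 when there were no lookups."""
    if not lookups:
        return 0.0
    return sum(1 for lookup in lookups if lookup.get("hit")) / len(lookups)
```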
Parallel Workstream Visualization
The Observatory includes an SVG Gantt chart showing concurrent runs:
- Concurrent run tracking - See multiple CCO sessions running in parallel
- Timeline visualization - Understand temporal relationships between runs
- Resource utilization - Identify periods of high and low activity
Gantt Chart Features
The SVG-based Gantt visualization shows:
- Run start/end times
- Duration and overlap between concurrent sessions
- Agent invocation density across time
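The core of an SVG Gantt chart like this is one `rect` per run, positioned by start time and sized by duration. The sketch below is a simplified stand-in for the dashboard's real renderer, assuming runs arrive as `(run_id, start_min, end_min)` tuples in minutes from an arbitrary origin:

```python
def gantt_svg(runs, px_per_min=2, row_h=20):
    """Render a minimal SVG Gantt chart: one bar per run, one row each.

    `runs` is a list of (run_id, start_min, end_min) tuples -- a
    hypothetical input shape, not the dashboard's actual data model.
    """
    bars = []
    for i, (run_id, start, end) in enumerate(runs):
        x = start * px_per_min
        width = (end - start) * px_per_min
        y = i * row_h
        # <title> gives each bar a hover tooltip with its run_id.
        bars.append(
            f'<rect x="{x}" y="{y}" width="{width}" height="{row_h - 4}">'
            f'<title>{run_id}</title></rect>'
        )
    height = len(runs) * row_h
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" height="{height}">'
        + "".join(bars)
        + "</svg>"
    )
```

Overlap between concurrent sessions falls out of the geometry: bars in different rows that share an x-range were running at the same time.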
Accessing the Observatory
```bash
# Open directly in browser
open docs/observatory/index.html

# Or rebuild the dashboard first
python3 scripts/build_report_dashboard.py
```
Files Reference
| Path | Purpose |
|---|---|
| `docs/observatory/index.html` | Generated dashboard - view this in a browser |
| `docs/observatory/dashboard-source.html` | Source template - edit this to customize dashboard structure |
Metrics Dashboard
The Observatory includes a comprehensive metrics dashboard that provides deep visibility into your CCO usage patterns.
Available Metrics
- Total Tasks - Cumulative count of all tasks across runs
- Role Distribution - Breakdown of which specialized agents handled work
- Context Retrieval Hits - How often relevant context was found in memory
- Completion Rates - Percentage of tasks that finished successfully
- Task Patterns - Identified recurring execution patterns
Using Metrics for Continuous Improvement
Review the metrics dashboard regularly to:
- Identify underutilized agents that could help with certain task types
- Spot context retrieval patterns that indicate memory gaps
- Track completion rate trends to catch degradation early
- Compare benchmark results across different model configurations
Interpreting the Dashboard
The dashboard uses visual indicators to help you quickly assess status:
- Green indicators - Metrics within expected ranges
- Yellow indicators - Metrics approaching threshold limits
- Red indicators - Metrics requiring attention or correction
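The green/yellow/red scheme is a two-threshold mapping. The helper below is an illustrative sketch; the threshold values and the `higher_is_better` flag are assumptions for the example, since the dashboard's actual cut-offs are defined by the build script:

```python
def indicator(value, warn, alert, higher_is_better=True):
    """Map a metric value to a green/yellow/red status string.

    `warn` is the boundary below which a metric turns yellow and
    `alert` the boundary below which it turns red (for metrics where
    higher is better; the flag flips the comparison for metrics like
    latency, where lower is better). Thresholds are illustrative.
    """
    if not higher_is_better:
        value, warn, alert = -value, -warn, -alert
    if value >= warn:
        return "green"
    if value >= alert:
        return "yellow"
    return "red"
```

For example, with a completion-rate target of 0.9 (warn) and floor of 0.7 (alert), a run at 0.8 would render yellow.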