HF Space - self-improving scientific agents

Self-Improving Open Scientific Agents

A research demo for scientific AI agents that route tasks through an OpenClaw-style gateway, test Hermes-compatible reasoning lanes, critique their own outputs, and improve through benchmarks, memory, and reproducibility checks.
OpenClaw-style orchestration | Hermes-compatible backend lane | Self-critique + benchmark gate | Statistics + numerical analysis + decision agents

Control Room

Agent team
Reasoning backend lane
Orchestration layer
Memory mode
Benchmark gate
Safety mode

When off, benchmark-approved lessons are staged but not promoted into next-run memory.
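
The staging behavior described above can be sketched roughly as follows. This is only an illustration, not the demo's actual code: the class `LessonStore`, the field `promote_to_memory`, and the method `record` are hypothetical names, and the toggle is modeled as a plain boolean because the page does not say which control it belongs to.

```python
from dataclasses import dataclass, field


@dataclass
class LessonStore:
    """Hypothetical store that keeps staged lessons apart from next-run memory."""

    promote_to_memory: bool = True  # assumed toggle; the name is illustrative
    staged: list[str] = field(default_factory=list)
    memory: list[str] = field(default_factory=list)

    def record(self, lesson: str, benchmark_approved: bool) -> str:
        """Stage an approved lesson and promote it only when the toggle is on."""
        if not benchmark_approved:
            # Lessons that fail the benchmark gate are neither staged nor promoted.
            return "rejected"
        self.staged.append(lesson)
        if self.promote_to_memory:
            self.memory.append(lesson)
            return "promoted"
        # Toggle off: the lesson stays staged and does not reach next-run memory.
        return "staged"
```

With `promote_to_memory=False`, an approved lesson appears in `staged` but `memory` stays empty, which matches the "staged but not promoted" wording.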

Deployment note: this demo is deterministic and sandboxed. It demonstrates workflow, critique, benchmarking, and product direction without claiming unrestricted autonomous external-system access.
Shared Scientific Agent Protocol (scientific-agent-protocol-v1)
The self-improving agent uses the same seven-stage handoff contract to judge and improve the Statistics, Numerical Analysis, and Optimal Control + OR agents.
1. Question: Turn the request into a scoped scientific question and data/model brief.

2. Assumptions: Record data, model, numerical, and decision assumptions before method choice.

3. Method Choice: Select a defensible statistical, numerical, or optimization method with alternatives.

4. Code Artifact: Produce Python, R, and MATLAB/Octave drafts with aligned variables and parameters.

5. Diagnostics: Run checks for missingness, stability, feasibility, uncertainty, or reproducibility.

6. Critique: Let the judge/critic challenge overclaims, weak evidence, and missing tests.

7. Final Memo: Return a cautious memo with evidence, limits, next steps, and downloadable artifacts.
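
One way to picture the seven-stage handoff contract is as an ordered checklist that every agent fills in the same sequence. The sketch below is a minimal illustration under that assumption: the protocol identifier and stage names come from the page, while `Handoff`, `submit`, `next_stage`, and `is_complete` are hypothetical names, and the strict-ordering rule is an assumption rather than documented behavior of the demo.

```python
from dataclasses import dataclass, field
from typing import Optional

PROTOCOL_ID = "scientific-agent-protocol-v1"  # identifier shown on the page

# Stage names taken from the page, in their listed order.
STAGES = (
    "question",
    "assumptions",
    "method_choice",
    "code_artifact",
    "diagnostics",
    "critique",
    "final_memo",
)


@dataclass
class Handoff:
    """Hypothetical record of one agent's progress through the protocol."""

    agent: str
    outputs: dict[str, str] = field(default_factory=dict)

    def next_stage(self) -> Optional[str]:
        """Return the first stage without output, or None when all are done."""
        for stage in STAGES:
            if stage not in self.outputs:
                return stage
        return None

    def submit(self, stage: str, content: str) -> None:
        """Accept output only for the next expected stage (assumed rule)."""
        expected = self.next_stage()
        if expected is None:
            raise ValueError("all stages are already complete")
        if stage != expected:
            raise ValueError(f"expected stage {expected!r}, got {stage!r}")
        self.outputs[stage] = content

    def is_complete(self) -> bool:
        return self.next_stage() is None
```

Because the same `STAGES` tuple is shared, a judge could apply one set of checks to a Statistics, Numerical Analysis, or Optimal Control + OR handoff without special-casing each agent.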