
Retrieval, memory, and replay

Aegis treats memory as part of EvidenceGraph, not as a second parallel store. The goal is to recall the right evidence at the right time, preserve provenance, and keep long-horizon work recoverable without replaying everything.

Retrieval scopes

Retrieval should choose scopes deliberately instead of searching everything. The active work posture determines which scopes are opened.

turn

Use for the current request packet and fresh tool outputs.

session

Use for tactical details that still belong to the current exchange.

lineage

Use for resume chains, interrupted work, recent decisions, blockers, and pending next steps.

workspace

Use for project-specific architecture decisions, repo rules, and durable work artifacts.

profile

Use for long-lived preferences, boundaries, and relationship evidence.

A normal active turn should usually search turn, then session, then lineage. A resume path should instead search lineage first, then session, then workspace.
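The scope ordering described above could be captured as a small lookup table. This is a minimal sketch: the posture names "active" and "resume" and the helper `scopes_for` are illustrative assumptions, not identifiers taken from Aegis.

```python
from enum import Enum


class Scope(str, Enum):
    TURN = "turn"
    SESSION = "session"
    LINEAGE = "lineage"
    WORKSPACE = "workspace"
    PROFILE = "profile"


# Hypothetical posture names; the search order per posture follows the text.
SCOPE_ORDER = {
    "active": [Scope.TURN, Scope.SESSION, Scope.LINEAGE],
    "resume": [Scope.LINEAGE, Scope.SESSION, Scope.WORKSPACE],
}


def scopes_for(posture: str) -> list[Scope]:
    """Return the scopes to open, in search order, for a work posture."""
    try:
        return list(SCOPE_ORDER[posture])
    except KeyError:
        raise ValueError(f"unknown posture: {posture!r}") from None
```

Keeping the order in data rather than in branching code makes it easy to add a posture that opens profile or workspace first without touching the retriever.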

Structured turn storage

Aegis does not keep turns as one flat transcript blob. Instead, the long-horizon design uses a structured turn record with four semantic slots:

  • Observation — what the agent perceived at the start of the turn
  • Reasoning — what it decided, planned, or evaluated during the turn
  • Action — what it replied, executed, or mutated
  • Outcome — what succeeded, failed, or stayed open after reconciliation

This structure enables three important behaviors:

  • slot-aware compression instead of one lossy summary
  • targeted retrieval by slot
  • better tracing of action chains and decision history across time
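As a rough illustration, the four-slot record and slot-targeted retrieval might look like the sketch below. The class name `TurnRecord`, the `turn_id` field, and the substring matching in `retrieve_slot` are assumptions made for the example; a real retriever would use the layered pipeline rather than substring search.

```python
from dataclasses import dataclass

SLOTS = ("observation", "reasoning", "action", "outcome")


@dataclass
class TurnRecord:
    turn_id: str        # hypothetical identifier, not named in the design
    observation: str    # what the agent perceived at the start of the turn
    reasoning: str      # what it decided, planned, or evaluated
    action: str         # what it replied, executed, or mutated
    outcome: str        # what succeeded, failed, or stayed open

    def slot(self, name: str) -> str:
        """Return the text of one named slot."""
        if name not in SLOTS:
            raise KeyError(name)
        return getattr(self, name)


def retrieve_slot(records: list[TurnRecord], name: str, needle: str) -> list[str]:
    """Targeted retrieval: match the needle against a single slot only."""
    needle = needle.lower()
    return [r.turn_id for r in records if needle in r.slot(name).lower()]
```

Because each slot is a separate field, a query about failures can look only at outcomes and ignore turns that merely mention the word "failed" in their observation.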

Reasoning availability tiers

Not every provider exposes the same level of reasoning detail. The memory design keeps that explicit instead of pretending all traces are equal.

The reasoning slot can be stored as one of four tiers:

  • raw_trace
    • provider-visible reasoning when policy and budget allow it
  • structured_rationale
    • normalized decisions, rejected options, and blocker analysis
  • decision_summary
    • the shortest durable explanation of why the turn moved in one direction
  • none
    • no durable reasoning trace is available, so later replay relies on observation, action, and outcome evidence only

The retriever must never invent hidden raw reasoning that the provider or runtime did not expose.
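One way to keep the tier explicit is to derive it only from what was actually captured. In this sketch the tier values come from the list above, while the function `select_tier` and its parameters are hypothetical.

```python
from enum import Enum
from typing import Optional


class ReasoningTier(str, Enum):
    RAW_TRACE = "raw_trace"
    STRUCTURED_RATIONALE = "structured_rationale"
    DECISION_SUMMARY = "decision_summary"
    NONE = "none"


def select_tier(
    raw_trace: Optional[str],
    rationale: Optional[dict],
    summary: Optional[str],
    allow_raw: bool,
) -> ReasoningTier:
    """Pick the richest tier that is really available.

    allow_raw stands in for the policy and budget check; nothing is
    synthesized when a richer tier is missing.
    """
    if raw_trace and allow_raw:
        return ReasoningTier.RAW_TRACE
    if rationale:
        return ReasoningTier.STRUCTURED_RATIONALE
    if summary:
        return ReasoningTier.DECISION_SUMMARY
    return ReasoningTier.NONE
```

The function only ever steps down to a tier that has stored content, which is the property the rule against inventing raw reasoning depends on.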

Replay guardrails

Replay is for correctness, not spectacle. The runtime should surface the smallest replay slice needed to recover the current decision context.

That means:

  • default to summaries or structured rationale when that is enough
  • open deeper replay only when the active work posture justifies it
  • keep provenance explicit so operators know whether a rationale came from raw trace, runtime projection, or later replay
  • preserve budget discipline instead of flooding the prompt with old history
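These guardrails could be approximated by a selector that prefers shallow evidence, caps depth by posture, and stops at a token budget. The sketch below is an assumption-heavy illustration: `ReplayItem`, the numeric depth scale, and `bounded_replay_slice` are invented for the example, and only the three provenance labels come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplayItem:
    text: str
    depth: int       # assumed scale: 0 = summary, 1 = structured rationale, 2 = raw trace
    provenance: str  # "raw_trace", "runtime_projection", or "later_replay"
    tokens: int


def bounded_replay_slice(
    items: list[ReplayItem], max_depth: int, token_budget: int
) -> list[ReplayItem]:
    """Select shallow items first, never exceeding max_depth or the budget."""
    chosen: list[ReplayItem] = []
    used = 0
    for item in sorted(items, key=lambda i: (i.depth, i.tokens)):
        if item.depth > max_depth:
            continue  # the posture does not justify deeper replay
        if used + item.tokens > token_budget:
            continue  # keep budget discipline
        chosen.append(item)
        used += item.tokens
    return chosen
```

Each selected item carries its provenance label, so an operator reading the slice can tell a raw trace from a runtime projection.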

Compression strategy

The design uses slot-aware compression rather than one global summarizer. Compression runs as a maintenance process, not as a step in every turn.

Level 0 — raw

Keep the full structured record for the freshest turns.

Level 1 — slot_summarized

Compress each slot independently so tool calls, actions, and outcomes are not collapsed into vague prose.

Level 2 — merged

Merge adjacent turns that belong to the same work item into a work episode. This preserves the net decision chain and final state change.

Level 3 — archived

Reduce a work episode to a durable memory of:

  • what the goal was
  • what was decided
  • what was produced
  • what the outcome was
  • what corrections mattered
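The first three levels can be sketched as follows. The level names come from the text; the per-slot character limits, the truncation that stands in for a real summarizer, and the shape of the merged episode are all assumptions for illustration.

```python
from enum import IntEnum

ELLIPSIS = "\u2026"


class CompressionLevel(IntEnum):
    RAW = 0
    SLOT_SUMMARIZED = 1
    MERGED = 2
    ARCHIVED = 3


def slot_summarize(record: dict[str, str], max_chars: dict[str, int]) -> dict[str, str]:
    """Level 1: shorten each slot on its own limit so slots never blend.

    Truncation is a placeholder for a real per-slot summarizer.
    """
    out = {}
    for slot, text in record.items():
        limit = max_chars.get(slot, len(text))
        if len(text) <= limit:
            out[slot] = text
        else:
            out[slot] = text[: max(limit - 1, 0)].rstrip() + ELLIPSIS
    return out


def merge_episode(turns: list[dict[str, str]]) -> dict[str, object]:
    """Level 2: fold adjacent turns of one work item into a work episode."""
    return {
        "observation": turns[0]["observation"],                      # where the work started
        "decisions": [t["reasoning"] for t in turns if t["reasoning"]],  # net decision chain
        "actions": [t["action"] for t in turns if t["action"]],
        "outcome": turns[-1]["outcome"],                             # final state change
    }
```

Slots without an explicit limit pass through unchanged, so tool calls and outcomes can stay verbatim while a long observation is shortened.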

Retrieval pipeline

The candidate pipeline is layered:

  1. resolve intent and active scopes
  2. gather cheap lexical candidates with SQLite FTS5
  3. expand recall with the mmbert-embed-32k-2d-matryoshka backbone
  4. walk graph links across work items, artifacts, lineage, and profile context
  5. rerank by active work relevance, recency, provenance quality, correction lineage, and relationship fit
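Step 2 could be prototyped with Python's built-in sqlite3 module, assuming the underlying SQLite build includes FTS5. The table name `evidence`, its columns, and the helper `lexical_candidates` are hypothetical; only the use of FTS5 comes from the text.

```python
import sqlite3


def open_index() -> sqlite3.Connection:
    """Create an in-memory FTS5 index; scope is stored but not tokenized."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE evidence USING fts5(scope UNINDEXED, body)")
    return conn


def lexical_candidates(conn, query: str, scopes: list[str], limit: int = 20):
    """Return (rowid, scope, body) rows matching the query within the open scopes."""
    if not scopes:
        return []
    placeholders = ",".join("?" for _ in scopes)
    sql = (
        "SELECT rowid, scope, body FROM evidence "
        f"WHERE evidence MATCH ? AND scope IN ({placeholders}) "
        "ORDER BY bm25(evidence) LIMIT ?"  # bm25() is lower for better matches
    )
    return conn.execute(sql, [query, *scopes, limit]).fetchall()
```

Filtering by scope in the same query keeps the cheap lexical pass aligned with step 1, so later stages only see candidates from scopes that were deliberately opened.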

The recommended embedding modes are:

  • 64d for broad low-cost prefiltering
  • 256d for default online recall
  • 768d for difficult recovery, rebuild, replay, and evaluation paths
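With Matryoshka-style embeddings, a lower-dimensional mode is typically obtained by keeping the leading components of the full vector and re-normalizing. The sketch below shows that truncation in plain Python; the mode names in `EMBED_DIMS` are assumptions, and the code does not call the embedding model itself.

```python
import math

# Hypothetical mode names mapped to the dimensions recommended above.
EMBED_DIMS = {"prefilter": 64, "online": 256, "recovery": 768}


def truncate_embedding(vec: list[float], dims: int) -> list[float]:
    """Keep the first dims components and rescale to unit length."""
    if dims <= 0 or dims > len(vec):
        raise ValueError(f"dims must be in 1..{len(vec)}, got {dims}")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else list(head)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity for vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))
```

A prefilter pass could score candidates at 64 dimensions and rescore only the survivors at 256 or 768, reusing one stored full-length vector per item.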

Resume packet design

A wake or resume path should reconstruct a compact ResumePacket from:

  • active WorkGraph state
  • pending blockers and deadlines
  • top recalled evidence with reasons
  • relevant profile constraints
  • replay-ready action or reasoning evidence only when it is actually needed
  • optional procedure overlays for known workflows

The aim is not to replay the whole past. The aim is to recover the next correct move.
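A compact packet along these lines might be modeled as below. Only the name ResumePacket and the list of inputs come from the text; every field name, the `RecalledEvidence` type, and the `top_k` cutoff in `build_resume_packet` are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RecalledEvidence:
    evidence_id: str
    reason: str  # why this item was recalled


@dataclass
class ResumePacket:
    work_state: dict                                  # active WorkGraph state
    blockers: list[str] = field(default_factory=list)
    deadlines: list[str] = field(default_factory=list)
    evidence: list[RecalledEvidence] = field(default_factory=list)
    profile_constraints: list[str] = field(default_factory=list)
    replay_evidence: list[str] = field(default_factory=list)    # empty unless needed
    procedure_overlays: list[str] = field(default_factory=list)  # optional


def build_resume_packet(
    work_state: dict,
    blockers: list[str],
    deadlines: list[str],
    recalled: list[RecalledEvidence],
    constraints: list[str],
    *,
    top_k: int = 5,
    replay: Optional[list[str]] = None,
    overlays: Optional[list[str]] = None,
) -> ResumePacket:
    """Assemble a packet, keeping only the top_k recalled items."""
    return ResumePacket(
        work_state=work_state,
        blockers=list(blockers),
        deadlines=list(deadlines),
        evidence=list(recalled[:top_k]),  # assumes recalled is already ranked
        profile_constraints=list(constraints),
        replay_evidence=list(replay or []),
        procedure_overlays=list(overlays or []),
    )
```

Replay evidence defaults to empty, so a caller has to opt in explicitly before any action or reasoning trace is added to the packet.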

Where to go next

Once evidence and replay are governed correctly, Aegis can safely learn from outcomes. Continue with Learning and procedures.