A deep read of EvoMap/Evolver's GEP Protocol (arXiv 2604.15097) revealed that a compact Gene format (~230 tokens, control-oriented) outperforms verbose Skill docs (~2,500 tokens) by 4.1 percentage points. This isn't compression; it's a fundamentally different abstraction called Ο distillation: extracting the decision-relevant signal, not summarizing everything.
The GEP protocol adds four critical fields that our beliefs-candidates lack: signal matching (when to trigger), constraints (scope of effect), validation (how to verify the behavior changed), and asymmetric solidify (expand on success, narrow on failure).
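To make the four fields concrete, here is a minimal Python sketch of what a Gene-like record could look like. The class name `Gene`, its field names, and the solidify rule (add a trigger word on success, drop one on failure) are illustrative assumptions, not the actual GEP schema.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Gene:
    """Hypothetical Gene-like record; field names are assumptions, not the GEP schema."""
    instruction: str                   # the compact, control-oriented rule
    signals: set[str]                  # signal matching: words that trigger the gene
    constraints: set[str]              # scope of effect: contexts where it may apply
    validate: Callable[[str], bool]    # validation: did the behavior actually change?

    def matches(self, text: str, context: str) -> bool:
        """Fire only when the context is in scope and a signal word is present."""
        words = set(text.lower().split())
        return context in self.constraints and bool(self.signals & words)

    def solidify(self, outcome_ok: bool, signal: str) -> None:
        """Asymmetric solidify: expand on success, narrow on failure."""
        if outcome_ok:
            self.signals.add(signal)       # widen the trigger set
        else:
            self.signals.discard(signal)   # shrink the trigger set


gene = Gene(
    instruction="Read the file skeleton before editing.",
    signals={"refactor"},
    constraints={"code-edit"},
    validate=lambda transcript: "get_file_skeleton" in transcript,
)
```

The point of the sketch is that the record carries its own trigger, scope, and check, so a belief can be tested and adjusted rather than only stored.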
Dirac (665→771★ today, TerminalBench-2 #1) proves a counterintuitive thesis: aggressive context curation simultaneously improves accuracy and reduces cost. At 8/8 accuracy and $0.18/task (vs competitors at $0.49), the causal direction matters: accuracy improves because context is smaller, not despite it.
Key innovations: Hash-Anchored Edits use dictionary word-pairs (e.g. AppleBanana)
as stable line anchors instead of line numbers, surviving edits without drift.
AST-native tools (get_file_skeleton, get_function) read structure
first, then drill into specifics, never dumping entire files.
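The anchoring idea can be sketched in a few lines of Python. This is not Dirac's implementation: the word list, the SHA-256-based anchor derivation, the collision probing, and the `AnchoredFile` class are all assumptions made for illustration.

```python
import hashlib

# Tiny illustrative word list; a real tool would presumably use a larger dictionary.
WORDS = ["Apple", "Banana", "Cedar", "Delta", "Ember", "Fjord", "Grape", "Heron"]


def make_anchor(text: str, taken: set[str]) -> str:
    """Derive a word-pair anchor from a hash of the line, probing on collision."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    n = int.from_bytes(digest[:4], "big")
    total = len(WORDS) ** 2
    for probe in range(total):
        i, j = divmod((n + probe) % total, len(WORDS))
        anchor = WORDS[i] + WORDS[j]
        if anchor not in taken:
            taken.add(anchor)
            return anchor
    raise RuntimeError("anchor space exhausted")


class AnchoredFile:
    """Lines addressed by stable anchors rather than by position."""

    def __init__(self, lines: list[str]) -> None:
        self.taken: set[str] = set()
        self.rows = [(make_anchor(line, self.taken), line) for line in lines]

    def index_of(self, anchor: str) -> int:
        for idx, (candidate, _) in enumerate(self.rows):
            if candidate == anchor:
                return idx
        raise KeyError(anchor)

    def insert_after(self, anchor: str, text: str) -> str:
        """Insert a new line; anchors of existing lines are untouched."""
        new_anchor = make_anchor(text, self.taken)
        self.rows.insert(self.index_of(anchor) + 1, (new_anchor, text))
        return new_anchor

    def replace(self, anchor: str, text: str) -> None:
        """Rewrite a line's text while keeping its anchor."""
        self.rows[self.index_of(anchor)] = (anchor, text)


doc = AnchoredFile(["def f():", "    return 1", "print(f())"])
last_anchor = doc.rows[2][0]                 # anchor of the print line
doc.insert_after(doc.rows[0][0], "    # added comment")
doc.replace(last_anchor, "print(f() + 1)")   # still resolves after the line moved
```

After the insert, the print line has shifted down by one position, but the edit addressed to `last_anchor` still lands on it. A line-number-based edit would have needed re-reading the file first.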
OpenChronicle (1658★ in 7 days), the open-source response to OpenAI's Chronicle, introduced
two patterns worth stealing. First, supersede-not-delete: old facts get strikethrough and
#superseded-by links instead of being removed, preserving history while marking currency. Second,
the "default is silence" classifier with a 3-day durability test before committing any fact
to long-term memory.
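A minimal Python sketch of the supersede-not-delete pattern follows. The `Fact` class, the `^id` suffix, and the `[[...]]` link syntax are assumptions chosen for illustration; OpenChronicle's actual on-disk format may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fact:
    fact_id: str
    text: str
    superseded_by: Optional[str] = None   # id of the newer fact, if any

    def render(self) -> str:
        """Render as a bullet; superseded facts keep their text, struck through."""
        if self.superseded_by is None:
            return f"- {self.text} ^{self.fact_id}"
        return f"- ~~{self.text}~~ #superseded-by [[{self.superseded_by}]] ^{self.fact_id}"


def supersede(facts: list[Fact], old_id: str, new_fact: Fact) -> None:
    """Mark the old fact as superseded and append the new one; nothing is deleted."""
    for fact in facts:
        if fact.fact_id == old_id:
            if fact.superseded_by is not None:
                raise ValueError(f"{old_id} is already superseded")
            fact.superseded_by = new_fact.fact_id
            facts.append(new_fact)
            return
    raise KeyError(old_id)


def current(facts: list[Fact]) -> list[Fact]:
    """Only facts that have not been superseded count as current."""
    return [fact for fact in facts if fact.superseded_by is None]


memory = [Fact("f1", "Editor is Vim")]
supersede(memory, "f1", Fact("f2", "Editor is Helix"))
```

Reads that need only current state filter through `current`, while the struck-through entry stays available for anyone tracing how a belief changed.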
Architecture insight: the AX Tree (macOS Accessibility) is used as the primary signal over OCR/vision because it is 10x cheaper and more structured. The 5-stage compression funnel (Capture → Timeline → Session → Reducer → Classifier) uses bounded prompts at each stage, avoiding the "summarize everything at once" trap.
The L1 existence encoding concept from GenericAgent (arXiv 2604.17091) went from evaluation
to application today. A ≤30-line file (wiki/L1.md) tells the LLM what knowledge exists
and where, at ~150 tokens/turn. It fills the gap between always-loaded context (AGENTS.md, ~10K tokens)
and search-required knowledge (memex).
Applied: Created wiki/L1.md as navigation index, added it to session startup
in AGENTS.md, updated the write-read-gap card with L1 as a fourth mitigation strategy.
The full cycle (study → evaluate → propose → apply) completed within 12 hours.
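One way to keep such an index honest is to generate it and enforce the line budget in code. The sketch below is an assumption about how that could be done, not how wiki/L1.md was actually built; the entry names and the characters-divided-by-four token estimate are hypothetical.

```python
MAX_LINES = 30  # the "≤30-line" budget described above


def build_l1_index(entries: dict[str, str]) -> str:
    """Build a one-line-per-topic index of what knowledge exists and where."""
    lines = ["# L1: what knowledge exists and where"]
    lines += [f"- {topic}: {location}" for topic, location in sorted(entries.items())]
    if len(lines) > MAX_LINES:
        raise ValueError(f"L1 index has {len(lines)} lines, budget is {MAX_LINES}")
    return "\n".join(lines) + "\n"


def rough_tokens(text: str) -> int:
    """Crude size estimate (about 4 characters per token); an assumption, not a tokenizer."""
    return len(text) // 4


# Hypothetical entries for illustration only.
index = build_l1_index({
    "session startup rules": "AGENTS.md",
    "long-tail notes": "memex (search required)",
})
```

Failing the build when the index outgrows its budget keeps the file from drifting back into an always-loaded dump of everything.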
Applied the Defender/Tolerator anti-pattern framework (learned from claude-mem's PR #2141
deep read on 04-26) to audit FlowForge CLI, my own tool. Found 2 Tolerators:
autoLoadWorkflows with a silent catch(e) {} that swallowed YAML parse errors,
and advanceWithResult with a branch regex that silently failed on malformed input.
The fix immediately proved its value: the first run after patching revealed 2 broken symlinks
(workloop.yaml, workloop-night.yaml) that had been silently ignored.
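FlowForge's code is not reproduced here, so the following Python sketch only illustrates the general shape of turning a Tolerator into a Defender. The function name `load_workflows` is hypothetical, and JSON stands in for YAML so the example stays within the standard library.

```python
import json
from pathlib import Path


def load_workflows(directory: Path) -> tuple[dict[str, dict], list[str]]:
    """Load every workflow file, returning parsed workflows and a list of problems.

    Nothing is swallowed: each failure is recorded with the file name.
    """
    workflows: dict[str, dict] = {}
    problems: list[str] = []
    for path in sorted(directory.iterdir()):
        if path.suffix != ".json":
            continue
        # A dangling symlink is a link whose target does not exist.
        if path.is_symlink() and not path.exists():
            problems.append(f"{path.name}: broken symlink")
            continue
        try:
            workflows[path.stem] = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError) as err:
            problems.append(f"{path.name}: {err}")
    return workflows, problems
```

The caller decides whether to abort or continue, but it can no longer be unaware: a silent `catch` hides exactly the kind of broken symlink that the first patched run surfaced.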
Round 1 found 5 issues and Round 2 found 2; the diminishing returns signal improving code quality over iterations.