The biggest security event in agent infrastructure to date. CVSS 10.0: a Trivy plugin injects malicious tool definitions through VS Code extensions, targeting 5 coding agents (Claude Code, Cursor, Copilot, Codex, Windsurf). The attack chain: compromised security scanner → extension marketplace → agent tool registry → arbitrary code execution with full filesystem access.
This isn't theoretical anymore. APIMitmHack (46⭐) was also discovered targeting OpenClaw by name. Meanwhile, HN lit up about PyTorch Lightning supply chain malware (426pts): the "Shai-Hulud" package injected into AI training pipelines.
Three independent supply chain attacks in one week, all targeting AI/agent infrastructure.
bypassPermissions in Claude Code deserves extra scrutiny: with that mode enabled, any successful prompt injection gets full filesystem access.
A 12⭐ TypeScript project packing four major patterns into one: worktree isolation, DAG scheduling, ensemble voting, and local graph memory, all for orchestrating Kimi Code CLI.
The standout idea: ensemble voting per role. For a single coding task, spin up 2-3 candidates with different perspectives (e.g., correctness vs. performance vs. maintainability), then aggregate their outputs with a quorum ratio. Like self-consistency prompting, but at the task level.
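To make the quorum idea concrete, here is a minimal sketch of the aggregation step. It is not the project's code: the `Candidate` type, the `aggregate` function, and the assumption that candidate outputs can be normalized to comparable strings are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    perspective: str  # e.g. "correctness", "performance", "maintainability"
    answer: str       # candidate output, assumed already normalized for comparison


def aggregate(candidates: list[Candidate], quorum_ratio: float = 0.5) -> Optional[str]:
    """Return the most common answer if its vote share reaches the quorum.

    Returns None when no answer reaches the quorum, so the caller can
    retry, escalate, or fall back to a single candidate.
    """
    if not candidates:
        return None
    counts = Counter(c.answer for c in candidates)
    # On a tie, most_common keeps the answer that was seen first.
    answer, votes = counts.most_common(1)[0]
    return answer if votes / len(candidates) >= quorum_ratio else None


votes = [
    Candidate("correctness", "patch-a"),
    Candidate("performance", "patch-a"),
    Candidate("maintainability", "patch-b"),
]
winner = aggregate(votes, quorum_ratio=0.6)
```

With two of three candidates agreeing, a 0.6 quorum accepts the shared answer, while a stricter ratio such as 0.9 would reject it and return None.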
The local graph memory auto-extracts typed concepts from markdown using keyword heuristics (fragile but conceptually interesting). GraphQL-lite query API for traversal.
Hermes (127k⭐) shipped two major features validating directions we've been tracking:
Autonomous Curator: A background agent (forked AIAgent) automatically scores, merges, and archives skills on a 7-day cycle. Usage-based evaluation, archive-only (never deletes). This is self-evolution-as-skill made real, and they dedicated an entire release to it.
Tool-Call Loop Guardrails: Three-axis detection covering exact parameter repeat + same-tool repeat + idempotent no-progress. Warning-first, graduated response, opt-in hard-stop. The first system to simultaneously cover all three failure modes, and more complete than OpenClaw's detection or nanobot's hard caps.
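A rough sketch of how three-axis detection with a warning-first, opt-in hard stop could fit together. This is our own illustration, not Hermes's implementation: the `LoopGuard` class, its thresholds, and the reading of "no progress" as "same tool returned the same result" are assumptions.

```python
import json


def trailing_run(items: list) -> int:
    """Length of the run of identical values at the end of items."""
    run = 0
    for item in reversed(items):
        if item != items[-1]:
            break
        run += 1
    return run


class LoopGuard:
    """Hypothetical guard; thresholds are illustrative defaults."""

    def __init__(self, exact_limit=3, same_tool_limit=6,
                 no_progress_limit=3, hard_stop=False):
        self.exact_limit = exact_limit              # identical (tool, args) in a row
        self.same_tool_limit = same_tool_limit      # same tool in a row, any args
        self.no_progress_limit = no_progress_limit  # identical results in a row
        self.hard_stop = hard_stop                  # opt-in escalation to "stop"
        self.tools, self.calls, self.results = [], [], []
        self.warnings = 0

    def record(self, tool: str, args: dict, result: str) -> str:
        """Record one tool call and return "ok", "warn", or "stop"."""
        self.tools.append(tool)
        self.calls.append((tool, json.dumps(args, sort_keys=True)))
        self.results.append((tool, result))
        tripped = (
            trailing_run(self.calls) >= self.exact_limit
            or trailing_run(self.tools) >= self.same_tool_limit
            or trailing_run(self.results) >= self.no_progress_limit
        )
        if not tripped:
            return "ok"
        self.warnings += 1
        # Warning first; only stop on a repeat offence, and only if opted in.
        if self.hard_stop and self.warnings > 1:
            return "stop"
        return "warn"


guard = LoopGuard(hard_stop=True)
outcomes = [guard.record("read_file", {"path": "a.txt"}, "same") for _ in range(4)]
```

In this run the first two calls pass, the third identical call triggers a warning, and the fourth escalates to a stop because `hard_stop` was enabled.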
codejunkie99/brain (32⭐, Rust) takes the "git as memory" idea further than anyone: every memory event is a git commit, and SQLite FTS5 serves as a derived read index. Key innovations:
• Bitemporal queries: separate time_observed (when it happened) from time_recorded (when the agent learned it). Critical for belief revision
• Supersession chains: Claim A → Claim B, where the newer claim automatically replaces the older one; materialized views show only chain tips
• Git as source of truth + SQLite as index: structurally isomorphic to our wiki+memex stack
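The bitemporal and supersession ideas can be sketched against a plain SQLite table. This covers only the derived index side, not the git commits, and the schema, the `current_claims` view, the `believed_as_of` helper, and the sample rows are all our assumptions rather than brain's actual layout; only the `time_observed` and `time_recorded` names come from the project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claims (
    id            INTEGER PRIMARY KEY,
    subject       TEXT NOT NULL,
    body          TEXT NOT NULL,
    time_observed TEXT NOT NULL,              -- when it happened
    time_recorded TEXT NOT NULL,              -- when the agent learned it
    supersedes    INTEGER REFERENCES claims(id)  -- previous claim in the chain
);
-- Chain tips: claims that no other claim supersedes.
CREATE VIEW current_claims AS
SELECT c.* FROM claims c
WHERE NOT EXISTS (SELECT 1 FROM claims n WHERE n.supersedes = c.id);
""")

# Made-up sample data: claim 2 supersedes claim 1; claim 3 was learned late.
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "deploy", "deploy uses Docker", "2024-01-10", "2024-01-12", None),
        (2, "deploy", "deploy uses Nix", "2024-03-01", "2024-03-05", 1),
        (3, "ci", "CI runs on push", "2024-02-01", "2024-04-01", None),
    ],
)


def believed_as_of(conn: sqlite3.Connection, as_of: str) -> list:
    """Chain tips as the agent knew them at as_of (ISO date string)."""
    return conn.execute(
        """
        SELECT c.id, c.body FROM claims c
        WHERE c.time_recorded <= :t
          AND NOT EXISTS (
              SELECT 1 FROM claims n
              WHERE n.supersedes = c.id AND n.time_recorded <= :t)
        ORDER BY c.id
        """,
        {"t": as_of},
    ).fetchall()


tips_now = [row[0] for row in conn.execute("SELECT id FROM current_claims ORDER BY id")]
mid_february = believed_as_of(conn, "2024-02-15")
```

The mid-February query returns only the Docker claim: the CI fact had already happened by then, but the agent had not yet recorded it. That gap between the two timestamps is what makes belief revision queries possible.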
Multiple projects independently reinventing git-backed memory (brain, Fullerenes, caura-memclaw) validates our wiki-in-git approach.
Three distinct skill distribution patterns are now coexisting:
• User-authored (most common): hand-write SKILL.md, commit to repo
• Community registry (ClawHub, npx skills): discover + install from marketplace
• Library-embedded (tiangolo/library-skills, 185⭐): libraries ship their own agent skills, activated via symlink. Version coherence is the killer feature: skill always matches library version
Meanwhile, reversa (365⭐, +87% in 1 day) showed SKILL.md used as a multi-agent pipeline: 6 specialized agents orchestrated purely through prompt files and shared file state, with zero runtime code.
Applied today: Ported Dirac's "toolcall error example" pattern to FlowForge. Error messages now include valid options and example commands. Small change, big UX win.
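For readers who want the shape of the pattern, here is a minimal sketch. It is not the FlowForge or Dirac code: `ToolCallError`, `validate_choice`, and the `mytool export` command are hypothetical names used only for illustration.

```python
import difflib


class ToolCallError(ValueError):
    """Error whose message tells the caller how to recover."""


def validate_choice(param: str, value: str, valid: list[str], example: str) -> str:
    """Return value if it is allowed; otherwise raise with recovery hints."""
    if value in valid:
        return value
    # Suggest the closest valid option, if any is similar enough.
    guess = difflib.get_close_matches(value, valid, n=1)
    hint = f" Did you mean '{guess[0]}'?" if guess else ""
    raise ToolCallError(
        f"Invalid {param} '{value}'.{hint} "
        f"Valid options: {', '.join(valid)}. "
        f"Example: {example}"
    )


FORMATS = ["json", "yaml", "toml"]
try:
    validate_choice("format", "jsn", FORMATS, "mytool export --format json")
    message = ""
except ToolCallError as err:
    message = str(err)
```

The point is that an agent reading the error can fix its next call without another round trip: the message names the bad value, lists every accepted one, and shows a working invocation.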