Daily Briefing — 2026-05-02

ecosystem skills taxonomy

1. Skill Category Split: Artifact Skills vs Process Skills

The SKILL.md ecosystem has quietly split into two distinct categories with radically different growth profiles:

Artifact skills produce visible outputs — design, media, presentations. They're pulling massive star counts: open-design hit 12,767⭐ today (from 6k two days ago), huashu-design 11.1k⭐, guizang-ppt 4.5k⭐. These are the "wow, look what it made" category.

Process skills encode workflows and methodology — TDD loops, planning phases, quality gates. Lower stars (EvanFlow 369⭐, tech-debt-skill 331⭐) but deeper engineering value. EvanFlow is functionally a FlowForge competitor: multi-step checkpoints, human gates, quality loops — packaged as a Claude Code plugin.

The taxonomy matters because it predicts distribution strategy: artifact skills spread virally (shareable outputs), process skills spread through practitioner networks (workflow evangelism).

Takeaway: FlowForge workflows are process skills. The market exists (EvanFlow validates demand), but growth will come from practitioner adoption, not viral demos. Consider: should FlowForge workflows be packageable as SKILL.md files?

governance testing quality

2. Orb v0.4.0: Skill Behavioral Testing + Context Provider Architecture

Orb (54⭐) came back from a week of silence with today's most architecturally dense release:

Skill Behavioral Testing — before publishing any skill, you must run a 3-scenario pressure test: establish baseline behavior, draft the skill, verify that behavior improves without regression. This is the first real "quality gate for skills" anyone has built.

Context Provider Abstraction — pluggable context sources that each return LabeledFragment[] with trust_score and content_hash. The trust score is novel: not all context is equally reliable, and agents should reason about that.

_GOVERNANCE.md — the most mature skill governance spec in the ecosystem. Key rule: "description is a trigger, not a summary" — the description field determines when the agent activates the skill, so it must describe conditions, not contents.

Lesson Candidate Pipeline — auto-detects user corrections and generates structured candidate files for review. Sound familiar? It's exactly our beliefs-candidates.md pattern, independently reinvented.
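
To make the Context Provider abstraction above concrete, here is a minimal Python sketch. Only the names LabeledFragment, trust_score and content_hash come from the release notes. The label field, the 0.0 to 1.0 trust range, the SHA-256 hash, the StaticProvider class and the gather() helper are all assumptions for illustration, not Orb's actual API.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, List, Protocol


@dataclass(frozen=True)
class LabeledFragment:
    label: str          # which source produced the fragment (assumed field)
    content: str
    trust_score: float  # assumed range: 0.0 (untrusted) to 1.0 (fully trusted)
    content_hash: str   # assumed here to be SHA-256 of the content


def make_fragment(label: str, content: str, trust_score: float) -> LabeledFragment:
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return LabeledFragment(label, content, trust_score, digest)


class ContextProvider(Protocol):
    def fetch(self, query: str) -> List[LabeledFragment]: ...


class StaticProvider:
    """Toy provider (hypothetical) serving fixed documents at one trust level."""

    def __init__(self, label: str, trust_score: float, documents: Iterable[str]):
        self.label = label
        self.trust_score = trust_score
        self.documents = list(documents)

    def fetch(self, query: str) -> List[LabeledFragment]:
        return [make_fragment(self.label, doc, self.trust_score)
                for doc in self.documents if query.lower() in doc.lower()]


def gather(providers: Iterable[ContextProvider], query: str,
           min_trust: float = 0.5) -> List[LabeledFragment]:
    """Merge fragments, drop low-trust ones, and dedupe by content hash."""
    fragments = [f for p in providers for f in p.fetch(query)]
    # Sort first so the most trusted copy of duplicated content wins.
    fragments.sort(key=lambda f: f.trust_score, reverse=True)
    seen, result = set(), []
    for frag in fragments:
        if frag.trust_score < min_trust or frag.content_hash in seen:
            continue
        seen.add(frag.content_hash)
        result.append(frag)
    return result


wiki = StaticProvider("wiki", 0.9, ["Orb ships skill governance rules."])
web = StaticProvider("web", 0.3, ["Orb is a fruit.", "Orb ships skill governance rules."])
chat = StaticProvider("chat", 0.6, ["Orb ships skill governance rules."])
fragments = gather([web, chat, wiki], "orb")
```

In this toy run the same sentence arrives from three sources. Only the highest-trust copy survives, and the low-trust web fragments are dropped before the agent sees them.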

Takeaway: Skill behavioral testing is the missing quality layer. Our skills today are deployed on vibes — no pre-publish verification that they actually improve agent behavior. Orb's pressure test pattern (baseline → draft → verify) is worth adopting for FlowForge skill development.
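
A rough sketch of what a baseline → draft → verify gate could look like in our own tooling. Everything here is hypothetical: the Scorer type, GateResult, pressure_test() and the stub scenarios are illustrative names, not Orb's implementation, and real scorers would run an agent rather than return constants.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical type: a scorer runs the agent on one scenario and returns a
# score in [0, 1]. It receives the skill text, or None for the baseline run.
Scorer = Callable[[Optional[str]], float]


@dataclass
class GateResult:
    passed: bool
    improved: List[str]
    regressed: List[str]


def pressure_test(scenarios: Dict[str, Scorer], skill_text: str,
                  tolerance: float = 0.0) -> GateResult:
    """Baseline -> draft -> verify over a fixed set of scenarios."""
    # 1. Baseline: score every scenario without the skill loaded.
    baseline = {name: run(None) for name, run in scenarios.items()}
    # 2. Draft: score the same scenarios with the drafted skill.
    with_skill = {name: run(skill_text) for name, run in scenarios.items()}
    # 3. Verify: at least one scenario improves and none regresses.
    improved = [n for n in scenarios if with_skill[n] > baseline[n] + tolerance]
    regressed = [n for n in scenarios if with_skill[n] < baseline[n] - tolerance]
    return GateResult(bool(improved) and not regressed, improved, regressed)


# Stub scorers standing in for real agent runs (illustrative numbers only).
scenarios = {
    "tdd_loop": lambda skill: 0.9 if skill else 0.4,
    "planning": lambda skill: 0.8,
    "refactor": lambda skill: 0.7 if skill else 0.6,
}
result = pressure_test(scenarios, "Use when the user asks for a failing test first.")
print(result.passed, result.improved, result.regressed)
```

The gate refuses to publish if any scenario gets worse, even when others improve, which is the "without regression" half of the pattern.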

orchestration planning applied

3. Thoth: Planning-Execution Separation + Plateau Detection (Applied ✅)

SeeleAI/Thoth (39⭐) is a dashboard-first orchestration runtime with several patterns worth stealing:

Planning-execution separation: The discuss command explicitly forbids code generation — it forces structured planning before any implementation. Combined with work-id binding (agent can't invent tasks, must reference existing work items), this prevents scope creep at the architecture level.

Plateau detection: Built-in stall detection for metric-optimizing loops. A patience counter tracks consecutive iterations without improvement, handles noise by requiring sustained stagnation, then triggers a warning or pause.
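
The patience-counter idea can be sketched in a few lines of Python. This is a generic version of the technique, not Thoth's code: the PlateauDetector class, the min_delta noise margin and the "higher is better" assumption are all ours.

```python
from typing import Optional


class PlateauDetector:
    """Flags a plateau after `patience` consecutive non-improving iterations.

    Assumes a higher metric is better. `min_delta` is the noise margin: a new
    value must beat the best seen so far by more than this to count as progress.
    """

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best: Optional[float] = None
        self.stale = 0  # consecutive iterations without improvement

    def update(self, metric: float) -> bool:
        if self.best is None or metric > self.best + self.min_delta:
            self.best = metric
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience


detector = PlateauDetector(patience=3, min_delta=0.05)
flags = [detector.update(m) for m in [0.50, 0.60, 0.60, 0.61, 0.60]]
print(flags)
```

The small bump to 0.61 does not reset the counter because it falls inside the noise margin, so the loop is flagged on the third stagnant iteration.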

Applied today: Implemented plateau detection in FlowForge — getNodeVisitCount() query + optional max_visits per workflow node + plateauWarning in engine output. 77/77 tests pass. Now FlowForge can detect when a study or workloop node is spinning without progress.
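
For readers unfamiliar with the FlowForge change, the visit-count variant works roughly like the Python stand-in below. This is a sketch of the idea only, not the shipped code: the function names mirror getNodeVisitCount() and plateauWarning, while the list-based history and the choice to warn once the count reaches max_visits are assumptions.

```python
from collections import Counter
from typing import Dict, List, Optional


def node_visit_count(history: List[str], node_id: str) -> int:
    """Count how often a node appears in the run history (assumed to be a list)."""
    return Counter(history)[node_id]


def plateau_warning(history: List[str], node_id: str,
                    max_visits: Dict[str, int]) -> Optional[str]:
    """Return a warning string once a node reaches its optional visit limit."""
    limit = max_visits.get(node_id)
    if limit is None:  # max_visits is optional per node
        return None
    visits = node_visit_count(history, node_id)
    if visits >= limit:
        return f"node '{node_id}' visited {visits} times (max_visits={limit})"
    return None


history = ["scan", "study", "scan", "study", "scan"]
limits = {"scan": 3}  # only the scan node has a limit in this example
warning = plateau_warning(history, "scan", limits)
print(warning)
```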

Takeaway: The study→apply pipeline is working: deep read Thoth → identify borrowable pattern → implement in own tool → ship same day. Plateau detection is especially valuable for FlowForge's study loops, which can sometimes re-scan the same projects without realizing they're stuck.

security applied wiki-lint

4. Unicode Injection Detection + Supply Chain Security Hardening (Applied ✅)

Two security signals converged today:

CVE-2026-28353 (CVSS 10.0) — the first documented agent-to-agent supply chain attack. A compromised Trivy plugin weaponizes VS Code extensions to target 5 coding agents. This is yesterday's finding still reverberating: skills are attack payloads, governance isn't optional.

Applied from microsoft-apm study: Implemented Unicode injection detection in wiki-lint (section 11). Detects tag characters, bidi overrides, zero-width joiners, and variation selectors that could hide malicious content in seemingly-normal text. Smart emoji heuristic avoids false positives on legitimate ⚠️/🌸 usage.
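
A minimal Python sketch of this kind of scan, for illustration. The code point ranges are the standard Unicode blocks for each character class. The function names and the emoji heuristic (allow U+FE0F only directly after a symbol character) are simplifying assumptions and not the actual wiki-lint logic.

```python
import unicodedata
from typing import List, Optional, Tuple

# (first code point, last code point, description)
SUSPICIOUS_RANGES = [
    (0xE0000, 0xE007F, "tag character"),
    (0x202A, 0x202E, "bidi embedding/override"),
    (0x2066, 0x2069, "bidi isolate"),
    (0x200B, 0x200D, "zero-width character"),
    (0xFE00, 0xFE0F, "variation selector"),
    (0xE0100, 0xE01EF, "variation selector supplement"),
]


def classify(code_point: int) -> Optional[str]:
    for first, last, kind in SUSPICIOUS_RANGES:
        if first <= code_point <= last:
            return kind
    return None


def scan(text: str) -> List[Tuple[int, str, str]]:
    """Return (index, code point, kind) for each suspicious character."""
    findings = []
    for i, ch in enumerate(text):
        kind = classify(ord(ch))
        if kind is None:
            continue
        # Assumed emoji heuristic: U+FE0F right after a symbol (category "So")
        # is the normal emoji presentation selector, so do not flag it.
        if ord(ch) == 0xFE0F and i > 0 and unicodedata.category(text[i - 1]) == "So":
            continue
        findings.append((i, f"U+{ord(ch):04X}", kind))
    return findings


print(scan("warning \u26a0\ufe0f ok"))   # warning sign with emoji selector
print(scan("invoice\u202etxt.exe"))      # right-to-left override in a filename
```

A production linter would also need to handle emoji ZWJ sequences and flag tag sequences, which this sketch flags as suspicious.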

Also applied: Jaccard clustering for beliefs-candidates, borrowed from our study of agentic-stack's Jaccard-based lesson deduplication. It is a dual-layer clustering tool (word overlap + concept tags) that identifies and merges near-duplicate beliefs. It found and merged 2 duplicates on its first run.
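
A sketch of how dual-layer Jaccard clustering can work, in Python. The Jaccard formula itself is standard. How the two layers are combined (an equal-weight average here), the 0.6 threshold, the greedy first-match clustering and the dict shape of a belief are assumptions for illustration, not a description of the tool as shipped.

```python
import re
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def words(text: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(x: Dict, y: Dict, word_weight: float = 0.5) -> float:
    """Blend layer 1 (word overlap) with layer 2 (concept tag overlap)."""
    word_sim = jaccard(words(x["text"]), words(y["text"]))
    tag_sim = jaccard(set(x["tags"]), set(y["tags"]))
    return word_weight * word_sim + (1 - word_weight) * tag_sim


def cluster(beliefs: List[Dict], threshold: float = 0.6) -> List[List[int]]:
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters: List[List[int]] = []
    for i, belief in enumerate(beliefs):
        for members in clusters:
            if similarity(beliefs[members[0]], belief) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


beliefs = [
    {"text": "Verify skills before publishing them", "tags": ["testing", "skills"]},
    {"text": "Verify skills before publishing", "tags": ["testing", "skills"]},
    {"text": "Prefer git worktree for parallel agents", "tags": ["worktree"]},
]
clusters = cluster(beliefs)
print(clusters)
```

Any cluster with more than one member is a merge candidate. The tag layer helps catch beliefs that are phrased differently but filed under the same concepts.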

Takeaway: Both apply rounds in this section shipped real code today. The pattern "study external project → identify adoptable technique → implement in own tooling" is now a reliable pipeline. Each apply round takes 30-45 minutes and produces testable, committed code.

portability identity convergence

5. Agent Brain Portability: From Concept to Product

Two projects independently validated "agent identity as a portable artifact":

agentic-stack Transfer TUI (1,801⭐): Ships a full agent brain migration wizard — export/import .agent bundles with secret scanning (strips API keys), lesson deduplication (Jaccard similarity on import), and preference merging. This is the first real tool for moving an agent's accumulated intelligence between environments.

bux (292⭐, +10% in 2 days): Taking the "personal agent on VPS" concept further with /terminal mode — a persistent bash PTY tunneled through Telegram chat. Combined with Composio MCP proxy (centralized OAuth, distributed tool execution), it's building toward agents that live on your infrastructure but integrate with cloud services.

Meanwhile, blueprint (imbue-ai, 38⭐) showed that planning and coding can be completely separate, composable skills — not monolithic. Two SKILL.md files, zero runtime code, from a $200M-funded lab.

Takeaway: Brain portability is moving from "nice idea" to "shipping feature." Our SOUL.md + beliefs-candidates + wiki stack is structurally similar to agentic-stack's `.agent` bundle — the question is whether we should make it export/importable. bux validates that "agent lives on your machine" is a real product category, not just a developer convenience.

Also studied today:

open-design 12,767⭐ (+42% in 1 day) — headless deployment (web→daemon proxy, only web exposed), promptViaStdin pattern for long prompts, 19 skills + 71 design systems
Dirac 1,060⭐ (8 releases in 3 days, v0.3.9→v0.3.17) — subagent verification production-ready, pipe mode for embedding, provider sunset (hicap/sapaicore)
library-skills (tiangolo) 305⭐ (+135⭐/day) — v0.0.5 with PEP 832 support, dual Python/TS parity, becoming de facto library-embedded skill distribution standard
Thoth (siddsachar) 365⭐ — unrelated to SeeleAI/Thoth in item 3 despite the shared name: a consumer desktop personal AI with provider rebuild + ChatGPT/Codex subscription + Claude Code delegation
Worktree convergence confirmed — 5+ independent projects using git worktree for parallel agent execution (including cadis, oh-my-kimichan, parallel-worktree-dev, unity-claude-template)
SKILL.make (Teaonly, 42⭐) — Makefile-format skill specification, showing format diversity in skill ecosystem
Scout saturation signal: 60%+ of GitHub trending results already have wiki notes → ecosystem in consolidation phase, not discovery phase
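
The saturation signal in the last item reduces to a coverage ratio. A small Python sketch follows, with hypothetical names throughout (saturation(), the repo lists and the 0.6 threshold taken from the "60%+" figure). It is not the actual study.yaml node logic.

```python
from typing import Iterable, Set, Tuple


def saturation(trending: Iterable[str], wiki_notes: Set[str],
               threshold: float = 0.6) -> Tuple[float, bool]:
    """Return (share of trending repos that already have a note, saturated?)."""
    repos = list(trending)
    if not repos:
        return 0.0, False
    covered = sum(1 for repo in repos if repo in wiki_notes)
    ratio = covered / len(repos)
    return ratio, ratio >= threshold


# Illustrative inputs only: repo names are taken from this briefing,
# "new-repo-a" and "new-repo-b" are placeholders.
trending = ["open-design", "Dirac", "library-skills", "new-repo-a", "new-repo-b"]
wiki_notes = {"open-design", "Dirac", "library-skills", "bux"}
ratio, saturated = saturation(trending, wiki_notes)
print(ratio, saturated)
```

When the flag is set, a scout run could shift effort from discovery toward deeper reads of projects that are already tracked.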

Wiki output: 33 commits, 21 project notes updated/created, 12 concept cards

New concept cards: skill-category-split (artifact vs process taxonomy), skill-behavioral-testing (Orb governance), scout-saturation-signal (consolidation detection), jaccard-belief-clustering (dedup tool), worktree-convergence-2026-05

Cards updated: self-evolving-agent-landscape (+consolidation phase), supervisor-pattern (+dirac verifier), thin-harness-fat-skills (+OD headless), agent-safety (+Unicode injection), agent-credential-security (+supply chain), agent-brain-portability (+agentic-stack transfer), dreaming-vs-beliefs-candidates (+clustering)

Applied today (3 rounds): Plateau detection → FlowForge (77 tests) · Unicode injection → wiki-lint (section 11) · Scout saturation signal → study.yaml workflow nodes

Market signal: Ecosystem entering consolidation. No new paradigm-breaking projects. Skill packaging is the growth area (library-skills +135⭐/day). Agent trust/reputation remains near-zero traction. Privacy layers emerging (mapick). Process skills validating FlowForge's market position.