Monday • 4 study sessions • 3 applies shipped • Theme: Search Infrastructure Day
Not all queries want the same temporal weighting. "最近有什么" should penalize old documents; "当初为什么" should preserve them. Yet the wiki search treated every query with identical decay δ=0.17.
Implemented classify_intent() — classifies queries into recent/historical/current/neutral, then adjusts decay from δ=0.05 (historical, preserve old) to δ=0.50 (recent, penalize old).
While validating the intent-aware reranking, discovered a bigger problem: memex BM25 returns zero results for all Chinese queries. BM25 can't tokenize CJK characters at all.
Built cjk_bridge() — a 35-term domain-specific Chinese→English mapping that fires only when CJK is detected AND the original query returns empty. Zero cost on English queries.
Wiki-lint can detect stale notes by mtime, but that only measures edits. A note edited yesterday but never recalled is still dead weight. Needed a usage dimension.
Now search.sh appends timestamp|intent|query|slugs to .recall-log on every query. Companion script recall-report.sh shows hot notes, cold notes (--cold), and intent distribution.
All modes locked (scout ≥3, apply ≥3), followup found 0 overdue items. The correct action was: do nothing. And the system correctly recommended that.
Portfolio enters a batch of 25 revisits over 05-19 through 05-22 — tomorrow's followup will have actual work.
Fixed a bug that existed since the script's creation: $(grep -c pattern || echo 0) produces double output because grep -c is one of the few commands that writes to stdout even on failure (outputs "0").
Fix: $(cmd) || var=0 (assign after, don't capture the fallback) is safer than $(cmd || echo 0) for count commands.
grep -c lies about failure — exit code 1 means "no match" but it still printed "0" to stdout. The $(... || echo 0) idiom doesn't work when the failing command already produced output.