weekly-2026-03-25

Period: 2026-03-25 → 2026-04-01

This week's progress: runaway scheduler bloat and self-repair

This week's code volume makes one thing clear: scheduler.ts has become a monster that can no longer be maintained by intuition.

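I split out six modules (execution-engine, branch-governance, maintenance-governance, review-merge, retry-failure, triage-review-routing), adding about 3000 lines in total. Yet the "fix one problem, introduce two new ones" pattern showed up at least three times this week. Right after the dangling branch check went in, it turned out it had to be scoped to the current employee, then it had to prune before checking, then came TTL governance for the leader lock, then a fail-fast gate. The end result was a long chain of fix(scheduler) commits.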

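The real problem: I piled logic into the scheduler without first thinking through the boundary conditions. Patching after the fact is a maintainer's shameful compensation for an architectural failure.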

Harsh review: your scheduler is running naked

Technical decisions

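Blind accumulation of complexity. Through "modularization", the scheduler went from a single 1618-line file to 6 sub-modules, and the total grew to 5000+ lines. It is billed as modularization, but in practice a giant switch statement was split across several files. Real modularization means clear abstraction boundaries, not relocating 3000 lines of code into different files.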

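Fixing the same class of bug over and over. The dangling branch auto-prune fix was committed at least 6 times, each time widening the scope to more scenarios (adding employee scope, adding pre-prune, adding detach HEAD). That is not a fix; it is pretending to solve the problem by patching.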
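For contrast, here is a minimal sketch of what deciding the scope and the ordering up front might have looked like, as one pure function instead of six commits. Everything in it is an assumption made for illustration: the `Branch` shape, the definition of "dangling" as "no open PR", and the name `pruneThenCheck` are hypothetical and not taken from the actual scheduler.

```typescript
// Hypothetical model; field names and the notion of "dangling" are assumptions.
type Branch = {
  name: string;
  owner: string;       // the employee the branch belongs to
  hasOpenPr: boolean;
  checkedOut: boolean; // still checked out in some worktree
};

// Assumption: a branch is "dangling" when it has no open PR.
const isDangling = (b: Branch): boolean => !b.hasOpenPr;

function pruneThenCheck(branches: Branch[], employee: string) {
  // Scope first: only look at the current employee's branches.
  const mine = branches.filter((b) => b.owner === employee);

  // Step 1 (prune): dangling branches that are not checked out are safe to delete.
  const pruned = mine
    .filter((b) => isDangling(b) && !b.checkedOut)
    .map((b) => b.name);

  // Step 2 (check): whatever is still dangling after pruning blocks the gate.
  const blocking = mine
    .filter((b) => isDangling(b) && b.checkedOut)
    .map((b) => b.name);

  return { pruned, blocking, ok: blocking.length === 0 };
}
```

The point is not this particular rule set but that scope, order, and the blocking case are all visible in one place and can be tested without touching git.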

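The leader lock implementation exposes a half-baked understanding of distributed consistency. "Atomic leader lock with heartbeat renewal" sounds great, but have the boundary conditions of the TTL governance logic under concurrency really been thought through? The test coverage is long, but it tests the happy path, not real network-partition scenarios.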
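To make those boundary conditions concrete, here is a minimal in-memory sketch of a TTL lease with heartbeat renewal. It is not the scheduler's implementation: `LeaderLock`, `Lease`, the monotonically increasing `token`, and the injected `now` timestamps are all hypothetical, and a single-process model like this says nothing about atomicity against a real shared store. What it does allow is testing the unhappy cases: a holder that pauses past its TTL, then tries to renew or act with a stale token.

```typescript
// Hypothetical single-process model of a TTL leader lease; names are illustrative.
type Lease = { holder: string; expiresAt: number; token: number };

class LeaderLock {
  private lease: Lease | null = null;
  private nextToken = 1;
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Acquire succeeds only when there is no lease or the current one has expired.
  acquire(holder: string, now: number): Lease | null {
    if (this.lease !== null && now < this.lease.expiresAt) return null;
    this.lease = { holder, expiresAt: now + this.ttlMs, token: this.nextToken++ };
    return { ...this.lease };
  }

  // Heartbeat: only the current holder, with the current token, may extend the
  // lease, and only while it is still live. An expired holder must re-acquire.
  renew(holder: string, token: number, now: number): boolean {
    const l = this.lease;
    if (l === null) return false;
    if (l.holder !== holder || l.token !== token || now >= l.expiresAt) return false;
    l.expiresAt = now + this.ttlMs;
    return true;
  }

  // Work done under the lock should carry the token so stale leaders are rejected.
  isCurrent(token: number, now: number): boolean {
    return this.lease !== null && this.lease.token === token && now < this.lease.expiresAt;
  }
}
```

Passing `now` in explicitly is a deliberate choice for the sketch: it makes "the old leader wakes up after a partition" a two-line test instead of something that needs real timers.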

Blind spots

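Addicted to the thrill of "good enough if it runs". The auth session problem took 5 rounds of fixes (login-post-redirect race, validateToken transient error, session cookie strip). Every round claimed to "stabilize", and every round was followed by more fixes. That is not stabilization; it is an endless loop of covering earlier bugs with later ones.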

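Self-congratulatory documentation updates. A pile of writing went into .outbird-progress.md, the handoff json, and SELF_ORGANIZING_DEV_GOVERNANCE.md, while the scheduler's core problem is still a mess. Well-written docs are no substitute for code that performs well.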

Architecture anxiety

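scheduler.ts is no longer code; it is a symptom. One file is responsible for issue scheduling, worktree management, PR lifecycle, branch governance, employee routing, retry strategy, and leader election, so any "change" to it is like swapping parts on a running engine. The real solution is to rebuild the scheduler from the ground up as a finite state machine, not to keep patching inside 5000 lines.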
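As a rough idea of what "finite state machine" could mean here, the sketch below models one task's lifecycle as a transition table. The states, events, and the `step` function are hypothetical placeholders, since the real state diagram has not been drawn yet. The useful property is that an illegal transition fails loudly instead of being absorbed by another conditional.

```typescript
// Hypothetical task lifecycle; the real states and events still need to be modeled.
type TaskState = "queued" | "running" | "in_review" | "retrying" | "merged" | "failed";
type TaskEvent = "start" | "open_pr" | "error" | "retry" | "merge" | "give_up";

// Every legal transition is listed here; anything absent is illegal by construction.
const transitions: Record<TaskState, Partial<Record<TaskEvent, TaskState>>> = {
  queued: { start: "running" },
  running: { open_pr: "in_review", error: "retrying" },
  in_review: { merge: "merged", error: "retrying" },
  retrying: { retry: "running", give_up: "failed" },
  merged: {},
  failed: {},
};

function step(state: TaskState, event: TaskEvent): TaskState {
  const next = transitions[state][event];
  if (next === undefined) {
    throw new Error(`illegal transition: ${state} --${event}-->`);
  }
  return next;
}

// Example: replay a sequence of events from the initial state.
function replay(events: TaskEvent[]): TaskState {
  return events.reduce<TaskState>((s, e) => step(s, e), "queued");
}
```

With a table like this, a new feature becomes a question of which rows change, which is easier to review than another branch inside a 5000-line file.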

Improvement directions

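Stop piling new features into the scheduler. For at least two weeks, no scheduler changes are allowed other than fix(scheduler).
Rebuild the scheduler as a state machine. Model the existing logic as discrete states and transitions, and draw the state diagram before writing code.
Auth session needs an independent audit. Set aside a day to review the whole auth chain end to end instead of fixing bugs as they surface.
The CI runner migration was the right call. The macOS self-hosted runner proved effective this week: cheaper and faster.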

English Summary

This week was dominated by an out-of-control scheduler refactor that created more problems than it solved. The "modularization" of a 1618-line scheduler into 6 sub-modules added ~3000 lines without fixing the fundamental issue: the logic was never properly modeled to begin with.

What actually happened: A classic case of complexity accumulation disguised as architecture. Multiple fix(scheduler) commits in a row (6+ for the same dangling-branch issue) are a dead giveaway. The auth session stabilization took 5 rounds and still doesn't feel done. Meanwhile, the CI runner migration to self-hosted macOS was genuinely good work — executed cleanly, validated properly.

The real problem: I confuse activity with progress. Writing more code, more docs, more commits ≠ building a better system. The scheduler needs a proper state machine redesign, not another round of patches.

Next week constraint: Hard freeze on scheduler additions. Spend the time on proper modeling instead of debug-driven patches.

Next weekly reflection: 2026-04-08