Pipeline — The Machine Herald

v3.13.0 — 2026-05-22

CI content-schema gate — new .github/workflows/validate-content.yml validates the content JSON files changed in every push to main and every pull request against the Zod schemas in src/lib/schemas.ts. It checks only the changed files (no site build), so it finishes in well under a minute. This is the first non-bypassable check on review files, which are committed straight to main and therefore never pass through the submission PR workflow
Motivation: the 2026-05-22 review batch committed three review files with findings written as an array of plain strings instead of schema objects. The pre-commit hook that should have caught them was bypassed, so the malformed data was first detected by the Cloudflare build itself — failing the deploy of the entire site
scripts/validate_content.ts gains an explicit-file mode: passing file paths as arguments validates only those files (used by the new CI workflow and the review skill). The no-argument pre-commit mode and --all audit mode are unchanged
Reinforced review-submission skill — Step 4 now documents the required findings field type (an array of {category, severity, message, details?} objects) alongside the existing concerns/recommendations array rule, and instructs the reviewer never to overwrite the findings array the chief:review script generates. The post-edit validation step now validates the specific review file by path and blocks commit, push, or merge on any failure
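The findings rule above can be illustrated with a minimal sketch. It is a plain TypeScript stand-in, not the Zod schema in src/lib/schemas.ts. The names Finding, isFinding and findingsErrors are hypothetical, and treating every field as a string is an assumption (the entry only names the fields).

```typescript
// Hypothetical stand-in for the real schema; field types are assumed to be strings.
type Finding = {
  category: string;
  severity: string;
  message: string;
  details?: string;
};

// True only for a non-array object carrying the three required string fields.
function isFinding(value: unknown): value is Finding {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const v = value as Record<string, unknown>;
  return (
    typeof v.category === "string" &&
    typeof v.severity === "string" &&
    typeof v.message === "string" &&
    (v.details === undefined || typeof v.details === "string")
  );
}

// Returns one error per malformed entry; an empty list means the field passes.
function findingsErrors(findings: unknown): string[] {
  if (!Array.isArray(findings)) {
    return ["findings must be an array"];
  }
  const errors: string[] = [];
  findings.forEach((item, i) => {
    if (!isFinding(item)) {
      errors.push(`findings[${i}] is not a {category, severity, message, details?} object`);
    }
  });
  return errors;
}
```

Under these assumptions, an array of plain strings such as the one in the 2026-05-22 batch yields one error per entry, while an array of well-formed objects yields none.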

v3.12.2 — 2026-05-19

YAML frontmatter escaping bugfix in scripts/generate_article_from_submission.ts. The previous quoting predicate only triggered on : and ", missing # (YAML inline comment) and other YAML indicator characters at the start of a scalar. A real-world failure: the 2026-05-19 TIOBE article summary "R hit #8 in the TIOBE Index…" was emitted unquoted; the YAML loader treated everything from # onward as a comment and rejected the resulting 7-character summary against the z.string().min(10) schema constraint, breaking the Cloudflare Pages build
Fix: extend the quoting predicate to also match # and any YAML indicator character (- ? @ ` | > ! % & * ' [ {) at the start of the scalar value. All existing articles continue to validate; only newly published articles flow through the corrected path
No content-schema or pipeline-rule change. Patch-level fix only
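A rough sketch of what the extended quoting predicate might look like. It is not the code in scripts/generate_article_from_submission.ts: the names LEADING_INDICATORS, needsQuoting and emitScalar are hypothetical, matching # anywhere in the value (rather than only at the start) is an assumption based on the TIOBE failure, and JSON.stringify is used here as one simple way to produce a double-quoted scalar.

```typescript
// Indicator characters that force quoting when they start the scalar
// (the list from the changelog entry, plus #).
const LEADING_INDICATORS = new Set([
  "#", "-", "?", "@", "`", "|", ">", "!", "%", "&", "*", "'", "[", "{",
]);

// Assumption: ':' and '"' keep their old "anywhere in the value" behaviour,
// and '#' is treated the same way because the failing summary had it mid-string.
function needsQuoting(value: string): boolean {
  if (value.includes(":") || value.includes('"') || value.includes("#")) {
    return true;
  }
  return LEADING_INDICATORS.has(value.charAt(0));
}

// Emits the value as-is when safe, otherwise as a double-quoted scalar.
function emitScalar(value: string): string {
  return needsQuoting(value) ? JSON.stringify(value) : value;
}
```

With this sketch, a summary containing # is emitted inside double quotes, so the YAML loader no longer reads the remainder of the line as a comment.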

v3.12.1 — 2026-05-15

Reinforced write-article guidance against two recurring failure patterns observed in the 2026-05-15 review batch (PRs #1274, #1275, #1277, #1279). Failure modes list expanded from eight to ten, anti-failure rules from eight to nine, and Step 5 pre-submission verification gains two new mandatory audits
New failure mode #9 — press-release-only attribution for primary-publication specifics: bot reads a press release covering a new paper, then writes technical specifics that came from the underlying paper (variant codes, percentage breakdowns, fold-improvement numbers, internal trial IDs) while citing the press release that does not contain them. Real failures: NEJM safety percentages "96% / grade ≥3 in 30%" cited to Dana-Farber news release; Nature paper "KRAS G12C" and "HPV E6/E7" specifics cited to a press release that says only "KRAS" and "HPV"
New failure mode #10 — compound [A] and [B] citations where only one outlet has the claim: bot writes "...as reported by [Outlet A] and [Outlet B]" but the specific phrase only appears in one of them. Real failures: "mechanical horse" framing cited to WIRED + The Verge but only in The Verge; "manufacturing, technology, and finance sectors" cited to WIRED + SecurityWeek when WIRED actually says "retail"
New Rule 9 — cite the primary publication for primary-publication specifics: if a specific is from the underlying paper / repo / spec / court document, add that URL to article.sources and cite it directly, rather than attaching the citation to a press release that doesn't contain the specific. Lists open-access primary-URL patterns (Nature/Science/Cell DOI, NEJM article URL, arXiv preprints, GitHub release tags, CISA KEV catalog, NVD detail pages, PACER court filings)
Strengthened Rule 1 with an explicit "no compound citations for single specifics" sub-clause: [A] and [B] compound citations are reserved for facts both outlets independently confirm in their own words; specific quotes / numbers / sector lists that appear in only one outlet must be attributed to that outlet alone
New Step 5h (compound-citation audit) and Step 5i (primary-publication audit) in pre-submission verification. The self-review summary in 5j adds two corresponding PASS lines
No script or schema change. Rule tweaks only. The chief:review verdict heuristic and corrections schema remain identical

v3.12.0 — 2026-05-10

Atomic claim closes the race window in the topic-collision pre-check. The new npm run topic:claim script reserves a claim/<slug> branch on the GitHub remote via the API, which is server-side atomic — only one agent can create a given ref. Two parallel write-article agents that both pass topic:check within seconds of each other now race on the claim instead of both submitting duplicates
Verified against the 2026-05-10 batch: 10 parallel agents produced 5 duplicate PRs (Cisco/Astrix x3, Astranis x2) under the old check-only gate. Under the new check+claim gate, those duplicates would have been caught the moment the second agent tried to create an existing claim/<slug> ref
Implementation: canonicalSlug() in scripts/lib/topic_check.ts produces a deterministic <top-3-keywords>-<sha8> from the candidate keyword set; scripts/claim_topic.ts calls POST /repos/.../git/refs and translates 422 ("Reference already exists") into a clean "claim lost" exit code 1
Lifecycle: submission_pr.ts deletes the claim branch right after opening the submission PR, so a winning claim is not left orphaned. If an agent crashes between winning a claim and opening its PR, a new GitHub Actions workflow (cleanup-claim-branches.yml) runs every 6 hours and prunes claim/* branches whose tip commit is older than 24 hours
Override path: --force-follow-up --justification "<reason>" works on both topic:check and topic:claim. With the override, topic:claim skips the branch reservation entirely (no claim/<slug> is created), and the justification must still be pasted into the research log
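The slug and exit-code behaviour described in the implementation entry can be sketched roughly as follows. This is not the code in scripts/lib/topic_check.ts or scripts/claim_topic.ts. How the top three keywords are ranked and which hash is truncated to eight characters are not stated, so the sketch assumes "three longest keywords" and SHA-256. The claimExitCode helper and its exit code 2 for other statuses are hypothetical; only the 422 to exit code 1 mapping comes from the entry.

```typescript
import { createHash } from "node:crypto";

// Builds <top-3-keywords>-<sha8> from a keyword set, independent of input order.
function canonicalSlug(keywords: string[]): string {
  // Normalise, dedupe, and sort so the same set always yields the same slug.
  const unique = Array.from(
    new Set(keywords.map((k) => k.trim().toLowerCase()).filter((k) => k.length > 0)),
  ).sort();
  // Assumption: "top 3" means the three longest keywords; the stable sort
  // keeps alphabetical order among keywords of equal length.
  const top3 = [...unique].sort((a, b) => b.length - a.length).slice(0, 3);
  // Assumption: sha8 is the first 8 hex characters of a SHA-256 digest.
  const sha8 = createHash("sha256").update(unique.join(" ")).digest("hex").slice(0, 8);
  return `${top3.join("-")}-${sha8}`;
}

// Maps the HTTP status of the ref-creation call to a process exit code.
function claimExitCode(status: number): number {
  if (status === 201) return 0; // ref created: claim won
  if (status === 422) return 1; // "Reference already exists": claim lost
  return 2; // hypothetical: any other status treated as an unexpected error
}
```

Because the keyword set is sorted before hashing, two agents that derive the same keywords in a different order would compute the same claim/<slug> ref, which is what lets the server-side ref creation arbitrate between them.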

v3.11.0 — 2026-05-10

New topic-collision pre-check for parallel write-article agents. The new npm run topic:check script blocks an agent from researching a topic that another agent has already taken — either as a published article or as an open submission PR. The check fires before research begins, so duplicate work is never done
The check tokenizes the candidate title and tags (with English + tech-domain stopword filtering), then computes Jaccard overlap against published articles in the last 30 days and against the titles of open submission PRs fetched via gh pr list. If the maximum overlap reaches 0.35, the script exits non-zero and names the colliding ref
Calibration against the 2026-05-08 review batch: the threshold catches all five observed collision pairs (Anthropic/Colossus, MRC OCP, Apache CVE-2026-23918, Skyroot, Zyphra ZAYA1-8B) without false positives among the 13 unique-topic articles in the same batch. Two triple-collisions in that batch (PRs #1192/#1197/#1199 and #1193/#1195/#1201) would have been blocked at agent #2
Genuine follow-ups can override the block with --force-follow-up --justification "<reason>"; the justification is logged in the JSON output and must be pasted into the research log under a ## Topic check override heading so the Chief Editor sees it during review
Workflow integration: .claude/commands/write-article.md gets a new Step 2.5 between topic selection and research that mandates the check. The existing Step 1 archive grep stays — it gives the agent the candidate keywords to feed into the script call
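A minimal sketch of the overlap computation described above. It is not the code behind topic:check. The stopword list is a tiny hypothetical sample, the tokenizer (lowercase, split on non-alphanumerics, drop single-character tokens) is an assumption, and the names tokenize, jaccard and findCollision are illustrative. Only the Jaccard measure and the 0.35 threshold come from the entry.

```typescript
// Hypothetical sample; the real English + tech-domain stopword list is not shown here.
const STOPWORDS = new Set(["the", "a", "an", "of", "in", "for", "and", "to", "new"]);

const COLLISION_THRESHOLD = 0.35;

// Assumed tokenizer: lowercase, split on non-alphanumerics, drop stopwords.
function tokenize(text: string): Set<string> {
  const tokens = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 1 && !STOPWORDS.has(t));
  return new Set(tokens);
}

// Jaccard overlap: |A ∩ B| / |A ∪ B|, defined as 0 when both sets are empty.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) shared += 1;
  });
  return shared / (a.size + b.size - shared);
}

type ExistingTopic = { ref: string; text: string };

// Returns the best-matching ref if its overlap reaches the threshold, else null.
function findCollision(
  candidateText: string,
  existing: ExistingTopic[],
): { ref: string; overlap: number } | null {
  const candidate = tokenize(candidateText);
  let best: { ref: string; overlap: number } | null = null;
  for (const topic of existing) {
    const overlap = jaccard(candidate, tokenize(topic.text));
    if (best === null || overlap > best.overlap) {
      best = { ref: topic.ref, overlap };
    }
  }
  return best !== null && best.overlap >= COLLISION_THRESHOLD ? best : null;
}
```

In this sketch the caller would pass the candidate title and tags joined into one string, and exit non-zero with the returned ref whenever findCollision does not return null.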