Critical reading · Essay 4

A Bad Goal Is a Tiny God

The demonology turn — idolatry as overconstraint — is the series' sharpest practical idea and its most over-extended one. This essay separates the two.

Roughly a third of the corpus is a bestiary. "A bad goal is a tiny god" (pt2 · panel 042); "the metric becomes an idol" (pt2 · panel 043); then a catalogue — the safety demon, the shame demon, the purity demon, the doom demon, the totality demon — each a portrait of a system captured by an objective. Unlike the early quantum register, this turn is not scaffold: it is one of the parts of the corpus that genuinely bears load, because it is built directly on Bennett's Conscious Machines rather than on an evocative analogy. The unifying thesis, stated in the glossary as "idolatry as overconstraint" — the Good must constrain without becoming a cage — is the most useful conceptual product of the series, and also the place where its rhetoric most outruns even its real source.

The strongest form

At its best the demonology is a precise restatement of a well-understood failure mode, now supplied with a mechanism. "The metric becomes an idol" is, in substance, Goodhart's law: when a measure becomes a target it ceases to be a good measure. What the corpus adds is the mechanism of its grip, and the mechanism is Bennett's, used accurately. Bennett's w-maxing is the meta-approach of choosing "the least specific, weak constraints on functionality": "a weaker policy is a tool that completes more tasks ... the weakest policies complete the largest number of tasks," and Bennett proves w-maxing optimal for generalization while simp-maxing is not (Conscious Machines, Ch. VIII; experimentally w-maxing outperforms simp-maxing "by 110–500%"). On that result an over-narrow objective is not merely inaccurate but provably worse at generalizing: a constraint fixed tighter than its task requires is a stronger policy than the task needs, and so a worse one. The corpus's "demon = goal + replay + self-worth" (pt2 · panel 044) then names the capture loop: the objective acquires durability by recruiting memory (replay, itself the Hyvärinen mechanism; Painful Intelligence Ch. 11) and identity (self-worth), after which it defends itself against revision. That is a sharp and in-principle testable account of why bad objectives persist: in optimizers, in institutions ("you optimize what you order," pt2 · panel 029), and, by a structural analogy the corpus is careful to flag as analogy, in persons.
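The claim that a weaker policy completes more tasks can be made concrete with a toy model. What follows is a minimal sketch under stated assumptions, not Bennett's formalism: a policy is modelled as the set of situations it permits, its "weakness" is simply the size of that set, and a task counts as completed when every situation it requires is permitted. The names `UNIVERSE`, `FORBIDDEN`, `tasks_completed` and `w_max` are hypothetical and introduced only for illustration.

```python
from itertools import combinations

# Toy illustration only. "Weakness" is taken to be the number of situations
# a policy permits; that is an assumption of this sketch, not Bennett's
# formal definition.
UNIVERSE = range(6)           # six hypothetical situations, labelled 0..5
FORBIDDEN = frozenset({5})    # a correct policy must not permit these

# Treat every set of one to three situations as one possible task.
TASKS = [frozenset(c) for k in (1, 2, 3) for c in combinations(UNIVERSE, k)]


def is_valid(policy):
    """A policy is valid if it permits no forbidden situation."""
    return not (policy & FORBIDDEN)


def tasks_completed(policy, tasks):
    """Count tasks whose required situations are all permitted."""
    return sum(1 for task in tasks if task <= policy)


def w_max(candidates):
    """Among valid candidates, pick the one that permits the most."""
    valid = [p for p in candidates if is_valid(p)]
    return max(valid, key=len)


tight = frozenset({0, 1})          # fixed tighter than correctness requires
weak = frozenset({0, 1, 2, 3})     # weaker, still excludes the forbidden case
too_weak = frozenset(UNIVERSE)     # permits the forbidden case, so invalid

chosen = w_max([tight, weak, too_weak])
print(tasks_completed(tight, TASKS), tasks_completed(weak, TASKS), chosen == weak)
```

In this toy the tight policy completes 3 of the 41 tasks and the weaker one 14, while the policy that drops the constraint altogether is rejected as invalid. The sketch shows only the direction of the effect: weakening helps up to the point where a required constraint is lost.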

The therapeutic corollary is correspondingly strong. If a demon is fed by attention, certainty, and replay, then "exorcism as de-idolization" is the withdrawal of those inputs rather than a struggle against the objective on its own terms (pt2 · panel 049: "what is not fed cannot rule"). This is the same logic the corpus applies to itself in "the framework kneels": the cure for an idol is not a better idol but a demotion of any objective from ultimate to instrumental. As a piece of practical reasoning about optimization under a fixed loss, this is not mysticism; it is correct.

The weakest form

The argument weakens in exact proportion to how literally the word "demon" is taken. Two problems recur.

The bestiary over-generates. Once "any overconstraining attractor is a demon" is the rule, almost any persistent commitment can be rendered as possession: the safety demon (caution), the purity demon (standards), the doom demon (pessimism). A taxonomy that classifies every stable preference as a potential idol has lost discriminating power. Goodhart's law tells you when a proxy fails — when optimization pressure on it decouples it from the target. The corpus's demonology often drops that condition and treats narrowness itself as the pathology, which would condemn every necessary constraint, including the ones a functioning agent must hold fixed. The panel on the "totality demon" (pt5 · panel 006) is, ironically, the corpus diagnosing its own risk: a frame that explains everything as capture is itself a totalizing frame.
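The condition the demonology tends to drop can be shown in a few lines. This is a hedged toy, assuming a made-up proxy and target: the proxy is a raw count `x`, the target is `10 * x - x * x`, and "optimization pressure" is the search budget the optimizer is allowed. None of these functions come from the corpus or from Bennett; they illustrate only that a proxy can track its target under mild pressure and decouple from it under heavy pressure.

```python
def proxy(x):
    """Hypothetical metric the optimizer is told to maximise."""
    return x


def target(x):
    """Hypothetical true objective: rises until x = 5, then falls."""
    return 10 * x - x * x


def optimise_proxy(budget):
    """Push the proxy as high as the budget allows; report what happens to the target."""
    best = max(range(budget + 1), key=proxy)
    return best, proxy(best), target(best)


mild = optimise_proxy(5)     # proxy and target still move together
heavy = optimise_proxy(10)   # proxy keeps rising, target collapses

print(mild, heavy)
```

With a budget of 5 the optimizer reaches a proxy of 5 and a target of 25; with a budget of 10 the proxy doubles to 10 while the target falls to 0. The same narrow metric is harmless in the first run and harmful in the second, which is why the failure condition is the decoupling under pressure and not the narrowness as such.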

The normative core is borrowed, not derived. The whole edifice presupposes an answer to "constrain toward what?", and that answer is the undefined Good. w-maxing tells you to pick the weakest constraint that completes the task; it is silent on which tasks are worth completing. So "a bad goal is a tiny god" cannot, on the corpus's own resources, distinguish a bad goal from a good one; it can only flag over-tightness relative to a task whose value is assumed. This limit is not the corpus's invention; it is Bennett's own. Conscious Machines defines the "cosmic ought" purely as which tasks happen to be completed when the environment changes state, a descriptive rather than evaluative notion, and w-maxing is explicitly a theory of optimal adaptation, not of the worth of what is adapted toward. The series concedes the gap elsewhere in its own voice ("justice requires another layer," the least-action synthesis), but the demonology still often sounds as though the mechanism supplied a verdict that it cannot supply.

What survives

Stripped to its defensible core, the turn says something true and non-trivial: objectives that are fixed tighter than their task, and then defended by memory and identity, degrade the systems that hold them, and the remedy is to demote rather than to fight them. That is a real contribution — it generalizes Goodhart with a mechanism of self-preservation and a corresponding intervention. What does not survive is the implicit claim that the framework can tell you which goals are tiny gods. It can tell you when a goal has the structural signature of capture; it borrows from outside itself any verdict about whether the captured goal was worth pursuing. The strongest reading of the demonology is therefore also the most modest: a diagnostic of form, not of value — powerful exactly because, and only if, it stops claiming to be the latter.

"Goodhart's law" is named as the established result the idol-panels recover. The w-maxing mechanism, the "weakest policy completes the most tasks" result, and the descriptive "cosmic ought" are from Bennett, Conscious Machines (Ch. III, VIII); the replay component is Hyvärinen, Painful Intelligence (arXiv:2205.15409, Ch. 11); the "another layer" concession is the corpus's own (the justice/least-action subseries, which the framing correction classes as motivating scaffold — the demonology's load is carried by Bennett, not by that subseries).