Pure Signal: AI Intelligence
Today's content is about reading the artifacts AI labs do publish — system prompts and model configs — to understand what's actually changing inside these systems.
What Claude's System Prompt Diff Reveals About 4.7
Simon Willison turned Anthropic's published system prompt history into a proper git timeline, then diffed Opus 4.6 (February 5) against Opus 4.7 (April 16). The result is a surprisingly detailed window into how Anthropic's priorities are shifting.
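Willison's git timeline makes the history browsable, but the core comparison needs nothing more than a unified diff. A minimal sketch in Python, assuming you've saved each published prompt version to a local text file (the filenames here are hypothetical):

```python
# Minimal sketch: diff two locally saved system prompt snapshots.
# Anthropic publishes the prompts as web pages, so each version would
# first be saved to a plain text file; filenames are hypothetical.
import difflib


def prompt_diff(old_path: str, new_path: str) -> str:
    """Return a unified diff between two system prompt snapshots."""
    with open(old_path) as f:
        old_lines = f.readlines()
    with open(new_path) as f:
        new_lines = f.readlines()
    return "".join(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_path, tofile=new_path
        )
    )


if __name__ == "__main__":
    print(prompt_diff("opus-4.6.txt", "opus-4.7.txt"))
```

Putting the snapshots in an actual git repository, as Willison did, buys you the full timeline (`git log -p`) rather than a single pairwise diff.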
The most structurally significant addition is a new `<acting_vs_clarifying>` section that explicitly biases Claude toward action over asking questions. The directive: when a tool could resolve the ambiguity, call the tool rather than asking the user to do the lookup. "Acting with tools is preferred over asking the person to do the lookup themselves." This pairs with a new `tool_search` mechanism — Claude is now instructed to call `tool_search` before ever saying "I don't have access to X", confirming capability gaps only after checking whether a relevant tool exists but was deferred. The implication is an architecture where Claude's available toolset is dynamic and partially hidden from its own context.
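The lookup-before-declining rule reduces to a small piece of control flow. A purely illustrative sketch (the function and registry names are stand-ins, not Anthropic's actual mechanism):

```python
# Illustrative sketch of the "check before declaring a capability gap" rule.
# Names here (search_tools, loaded_tools) are stand-ins, not a real API.
def handle_capability_request(capability: str, loaded_tools: set,
                              search_tools) -> str:
    if capability in loaded_tools:
        # Tool is already in context: act with it directly.
        return f"use:{capability}"
    # Before saying "I don't have access", check the deferred tool registry.
    matches = search_tools(capability)
    if matches:
        return f"load:{matches[0]}"
    # Only now is a capability gap confirmed.
    return "decline: no tool available"
```

The point of the pattern is the middle branch: the model's loaded toolset is a subset of its available toolset, so "not in context" no longer implies "not available".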
The tool ecosystem has expanded visibly. The system prompt now names Claude in Chrome (a browsing agent), Claude in Excel, and Claude in PowerPoint — the last of which wasn't present in 4.6. Claude Cowork can invoke all of these as tools, suggesting the collaboration/agent surface is growing faster than the model itself.
Several behavioral guardrails were tightened or added. The child safety section was "greatly expanded" and wrapped in a `<critical_child_safety_instructions>` tag with a new propagation rule: once Claude refuses a request on child safety grounds, all subsequent requests in that conversation must be approached with extreme caution. A new section on disordered eating bars Claude from providing specific nutrition, diet, or exercise numbers (calorie counts, targets, step-by-step plans) anywhere in a conversation where a user shows signs of an eating disorder, even when the request is framed as harm reduction. And Claude now has explicit permission to decline forced yes/no answers on contested questions — directly guarding against a class of screenshot attacks that try to extract polarizing one-word responses.
Just as interesting is what was removed. The prohibition on asterisk-style emotes and verbal tics like "genuinely," "honestly," and "straightforward" is gone — the 4.7 model apparently no longer needs to be told to avoid these behaviors. The explicit note that "Donald Trump is the current president" (a workaround for models with pre-January-2021 knowledge cutoffs that would otherwise hedge) is also gone, reflecting a knowledge cutoff Anthropic now places at January 2026.
One caveat worth flagging: Anthropic's published system prompts don't include tool descriptions, which Willison argues may be more important documentation than the prompt itself. He extracted the full tool list by asking Claude directly — 21 named tools including `tool_search`, `web_search`, `web_fetch`, `conversation_search`, `places_search`, `weather_fetch`, and several visualization and connector utilities. That list appears unchanged from 4.6.
Inspecting Open-Weight Architectures Without Trusting the Paper
Sebastian Raschka documents a workflow that's become increasingly necessary: papers for industry open-weight models are often less detailed than they used to be, so waiting for a thorough technical report before understanding an architecture is no longer a reliable strategy.
His approach starts with the official technical report but quickly moves to the Hugging Face config file and the `transformers` reference implementation. "Working code doesn't lie" — the config gives you ground truth on dimensions, attention heads, layer counts, and activation functions, while the implementation reveals design choices that papers elide entirely (gating mechanisms, normalization placement, tying decisions).
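The config side of this workflow is a few lines of JSON inspection. A rough sketch, assuming a downloaded `config.json` that uses Llama-style field names (other model families name these keys differently):

```python
# Minimal sketch: pull ground-truth architecture facts from a Hugging Face
# config.json. Field names follow Llama-style configs; other families
# (e.g. MoE models) use additional or different keys.
import json


def summarize_config(path: str) -> dict:
    with open(path) as f:
        cfg = json.load(f)
    return {
        "layers": cfg.get("num_hidden_layers"),
        "hidden_size": cfg.get("hidden_size"),
        "attention_heads": cfg.get("num_attention_heads"),
        "kv_heads": cfg.get("num_key_value_heads"),  # fewer than heads => GQA
        "activation": cfg.get("hidden_act"),
        "vocab_size": cfg.get("vocab_size"),
    }
```

As Raschka notes, the config gives dimensions but not design rationale; for gating, normalization placement, and weight tying you still have to read the `transformers` implementation itself.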
Raschka is explicit that this is intentionally manual. You could automate the config-parsing and diagram generation, but the goal is understanding, not extraction — and doing a handful of these by hand is still among the best exercises for building genuine architectural intuition. He also flags the obvious limit: this workflow doesn't apply to closed models. For ChatGPT, Claude, or Gemini, you're back to behavioral inference.
The contrast with Willison's work is direct: Willison is doing behavioral archaeology on a closed model through its system prompt. Raschka is doing structural archaeology on open models through their weights and code. Together they represent the two primary methods practitioners have for understanding AI systems whose internals aren't disclosed — and neither method gets you the full picture.
Closing thought: The combination of Anthropic's system prompt publishing and Raschka's config-inspection workflow points to an underappreciated asymmetry in AI transparency. Open-weight models let you verify architecture claims precisely but tell you nothing about training; closed models tell you almost nothing structural but (when labs are willing) can publish behavioral constraints. Neither gives you what you'd actually want — a reproducible understanding of why the model behaves the way it does. The system prompt diff is useful precisely because it's rare: most labs don't publish this at all.
TL;DR

- Claude 4.7's system prompt shows Anthropic pushing harder toward tool-first action over clarification, with a new dynamic `tool_search` mechanism that gates any "I can't access X" response.
- New guardrails in 4.7 cover disordered eating, forced yes/no screenshot attacks, and propagating child safety refusals across a conversation — while removed guardrails suggest the model itself now handles those behaviors without explicit instruction.
- Raschka's architecture workflow — config files and reference implementations over papers — is the practical response to industry labs publishing less detailed technical reports for open-weight models.
Compiled from 2 sources · 3 items
- Simon Willison (2)
- Sebastian Raschka (1)
HN Signal: Hacker News
Today felt like a day of reckoning on Hacker News — reckoning with what AI actually costs, what overpriced cloud infrastructure actually costs, and what it takes to build something that genuinely lasts. Three distinct conversations, all circling the same underlying question: are we getting our money's worth?
The AI Bill Is Coming Due
The story generating the most raw discussion today — 509 comments — wasn't about a product launch or a research paper. It was a tool called "tokens.billchambers.me" that lets you compare how many tokens (the billable units that AI models use to process text) the same prompt consumes across different versions of Claude, Anthropic's AI model. The finding is stark: Opus 4.7 uses roughly 45% more tokens than Opus 4.6 for identical inputs, apparently due to a change in the model's tokenizer (the component that breaks text into processable chunks). For small prompts, the gap can exceed 2x.
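The comparison the tool performs can be sketched generically. Anthropic's tokenizers aren't public, so this version takes any two text-to-tokens callables (for real models you'd get counts from the API's usage metadata rather than tokenizing locally):

```python
# Sketch of the tokenizer comparison, abstracted over tokenizers.
# tokenize_old / tokenize_new are any callables mapping text to a token list.
def token_overhead(prompt: str, tokenize_old, tokenize_new) -> float:
    """Percent increase in token count from the old to the new tokenizer."""
    old_n = len(tokenize_old(prompt))
    new_n = len(tokenize_new(prompt))
    return (new_n - old_n) / old_n * 100
```

The reported 45% figure is an average over real prompts; because tokenizer changes hit short strings disproportionately (fixed overhead amortizes worse), the per-prompt overhead can exceed 100% on small inputs, matching the "2x" observation.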
The original poster, anabranch, said they were "surprised that it's 45%." They should be. If you're building a product on top of these AI models, that's not a rounding error — that's a restructured business model. Commenter Shailendra_S described their workaround: "a dual-model setup — use the cheaper/faster model for the heavy lifting where quality variance doesn't matter much, and only route to the expensive one when the output is customer-facing." Others were less accommodating. Dakiol wrote flatly: "We dropped Claude... We'll be keeping an eye on open models." The pessimist thread, led by commenter coldtea, drew an uncomfortable comparison to the history of software tooling: "It's going to be a very expensive game, and the masses will be left with subpar local versions."
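Shailendra_S's dual-model setup reduces, in sketch form, to a routing rule plus a cost function (the model names and per-million-token rates below are illustrative, not real prices):

```python
# Sketch of the dual-model routing pattern described in the thread: default
# to the cheap model, escalate only for customer-facing output.
# Model names and rates are illustrative, not real prices.
CHEAP = ("cheap-model", 0.25)      # (name, USD per 1M tokens)
PREMIUM = ("premium-model", 5.00)


def route(customer_facing: bool) -> tuple:
    """Pick a (model, rate) pair based on where the output goes."""
    return PREMIUM if customer_facing else CHEAP


def cost_usd(n_tokens: int, customer_facing: bool) -> float:
    _, rate = route(customer_facing)
    return n_tokens / 1_000_000 * rate
```

The economics only work if most traffic takes the cheap branch, which is exactly why a tokenizer change that inflates every request's token count restructures the whole calculation.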
Weaving into this was a thoughtful post titled "Thoughts and Feelings Around Claude Design" — Anthropic's recently launched AI-native design tool — which opened up a parallel conversation about what AI disruption actually looks like in creative tooling. The post argues that Figma, the dominant design tool used by most product teams, is caught in an awkward middle: too established to pivot cleanly, too expensive to ignore as alternatives emerge. Commenter ianstormtaylor put it bluntly, accusing Figma of adopting "an extraction mindset too early... right when the ground beneath them is starting to shift." The community was split: operatingthetan argued that "front-end, UX, design, and product have become one role" and the market just hasn't caught up — while markbao pushed back, calling vibe-coded AI apps "a bicycle being compared to an airplane." And troupo shared their own Claude Code horror story: a 2,000-line CSS file for a 7,000-line app, every color duplicated, Tailwind classes fighting with custom CSS fighting with inline styles.
The education angle rounded out this theme. A Colorado college instructor went viral for requiring students to write essays on physical typewriters — not to be quirky, but because AI-generated text is undetectable and largely unavoidable on digital devices. The HN crowd was skeptical but not dismissive. Recursivedoubts, a CS instructor, described their own shift to paper-and-pencil quizzes, "becoming an expert with the department printer." The deeper debate is whether the problem is cheating at all: onesociety2022 asked, "If AI can do the work, maybe the test should be more focused on what AI can't do?" Singpolyma3 took an even harder line: "If students cheat they hurt only themselves."
The Great Cloud Pricing Revolt
The top story of the day by points — 781 — was a developer's detailed account of migrating their infrastructure from DigitalOcean to Hetzner, a German cloud provider mostly unknown outside of Europe five years ago. The result: $14,000 USD in annual savings. That number sparked vigorous debate about whether it's meaningful for a business of any size, but the comments made clear this is a pattern, not a one-off. Testing22321 moved from Rackspace ($120/month) to Hetzner ($35). Pennomi moved from AWS to Hetzner and saved $1,200 a year — adding that "AWS has kind of become a scam." Nixpulvis wrote, "We need more competition across the board. These savings are insane and DO should be sweating."
The recurring explanation in the comments: publicly traded cloud companies need to show revenue growth, so prices creep up until the deal breaks. Commenter xhkkffbf made this structural argument directly: "They need to boost prices to show revenue growth. At some point, they become a bad deal." The caveats were real — Hetzner doesn't offer the managed databases, auto-scaling, and hand-held reliability that justify AWS pricing for larger teams. Pellepelster noted you lose "the managed part of the whole cloud promise" when you self-host. And OutOfHere pointed out that Hetzner itself raised prices 30-40% recently, and has a troubling history of sudden account terminations. The migration may be smart; the destination is not utopian.
One comment worth pulling out: antirez — a username longtime HN readers will recognize as the creator of Redis — noted they recently migrated two servers from Linode and DigitalOcean to Hetzner. "The two servers had tens of different sites running, implemented in different languages, with obsolete libraries, MySQL and Redis instances. A total mess. Well: Claude Code migrated it all." It's a neat irony: AI tools helping developers escape the expensive cloud providers that AI tools themselves are making more expensive.
Engineering Built to Last
The day's quieter but perhaps most emotionally resonant thread was a deep dive into the electromechanical angle computer inside the B-52 bomber's star tracker — a device that allowed the aircraft to navigate by locking onto a star and computing a precise heading, accurate to a tenth of a degree. Ken Shirriff, the author (who commented as kens), reverse-engineered the device in painstaking detail. This was a mechanical computer, built in an era before microprocessors, performing trigonometry through gears, cams, and rotating resolvers. Commenter po1nt captured the mood: "Everytime I read articles like that, I envy the engineers that worked in development of such tools... And here I am fighting gitlab pipelines." Notably, commenter 0xfaded highlighted the author's disclosure at the top of the article: "I didn't use AI to write this." That note earned appreciation.
NASA's Voyager 1 story connected directly: engineers have shut off one of the spacecraft's instruments to preserve power, buying another year of operational life. Voyager 1 is nearly 50 years old, traveling so far from Earth that commands take 23 hours to arrive. Commenter jedberg put it best: "Imagine deploying your bug fix and having to wait two days to find out if it worked!" The spacecraft still has roughly a decade of power remaining. The community's reaction was quietly reverent.
These three threads share something: a growing impatience with systems that extract value without delivering it, and a corresponding appreciation for things that simply work. Whether that's a 50-year-old spacecraft still phoning home from beyond the solar system, a German cloud provider undercutting Silicon Valley on price, or a professor who just wants her students to put words on paper themselves — there's an appetite for the real, the durable, the honest.
TL;DR

- AI is getting pricier: Anthropic's Opus 4.7 uses 45% more tokens than its predecessor, and Claude Design is upending the design tool landscape — neither cheaply nor cleanly.
- Developers are fleeing expensive cloud providers in large numbers, with Hetzner emerging as the migration destination of choice and saving teams thousands annually.
- Engineering that lasts earned the day's warmest reception, from a B-52's star-tracking mechanical computer to Voyager 1 still phoning home after nearly 50 years in space.