Inside Anthropic’s race to train the next AI: why pre-training still decides who wins

Anthropic’s Nick Joseph pulls back the curtain on the hidden engine of modern AI—pre-training—revealing how scaling laws, compute bottlenecks, and even “cursed bugs” shape the future of Claude and the broader race to build smarter, safer models.

Anthropic’s head of pre-training, Nick Joseph, paints a picture of AI progress that is both simple and brutal. “My OKR is still the same as day one: make the loss go down,” he says. In a wide-ranging conversation about the company’s pre-training program, Joseph argues that the teams who can most reliably push that core metric will shape the products people actually use, and the economics rivals must live with.

That stance might sound unfashionable in a year obsessed with “reasoning” and post-training tricks. Yet the industry’s own history keeps backing Joseph up. Since OpenAI popularized the idea that language model loss falls predictably as you scale model size, data and compute, most big step-changes have come from running bigger, cleaner, longer training runs rather than inventing exotic new objectives. The original scaling-laws work quantified that now-famous power-law curve and found architectural details mattered far less than simply turning the crank on scale.
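For readers who want a feel for what “predictable” means here, a minimal sketch of the power-law form from the original scaling-laws paper (Kaplan et al., 2020) follows. The constants are approximate published fits from that paper, not Anthropic’s internal numbers; treat the output as an order-of-magnitude illustration rather than a forecast.

```python
# Illustrative sketch of the power-law relationship described in the
# scaling-laws work (Kaplan et al., 2020). The constants are approximate
# published fits, not Anthropic's numbers.

def predicted_loss(n_params: float, alpha: float = 0.076, n_c: float = 8.8e13) -> float:
    """Cross-entropy loss predicted from parameter count alone: L(N) = (N_c / N) ** alpha."""
    return (n_c / n_params) ** alpha

for n in (1e8, 1e9, 1e10, 1e11):
    print(f"{n:.0e} params -> predicted loss ~{predicted_loss(n):.2f}")
```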

Joseph’s account also doubles as a reality check on what it takes to keep turning that crank. Pre-training is not a whiteboard exercise. It is a months-long test of whether thousands of chips spread across rooms, buildings and sometimes different cloud providers can behave like a single machine. If one chip dies, an entire distributed job can stall. Engineers chase esoteric errors that only appear at scale and weeks into a run. That is the kind of fragility customers never see when a chatbot feels snappy on their phone, but it is the reason they sometimes run into rate limits. Make the model too big or the architecture too communication-heavy and the inference team cannot serve it cheaply, which means fewer tokens for end users and higher costs for developers building on top. Joseph says pre-training and inference must be co-designed, because the former quietly determines the problem the latter needs to solve.
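A back-of-the-envelope sketch makes that fragility concrete. Assuming each accelerator fails independently with some mean time between failures (the figures below are illustrative assumptions, not numbers from the interview), the gap between job-stalling interruptions shrinks roughly in proportion to fleet size:

```python
# Back-of-the-envelope sketch of why large synchronous training jobs feel fragile.
# The MTBF figure is an illustrative assumption, not fleet data from Anthropic:
# if any single accelerator failure stalls the whole job, interruptions arrive
# roughly mtbf_hours / num_chips hours apart.

def expected_hours_between_interruptions(num_chips: int, mtbf_hours: float = 50_000.0) -> float:
    """Assuming independent failures, the whole job stalls about every mtbf_hours / num_chips hours."""
    return mtbf_hours / num_chips

for chips in (1, 1_000, 10_000):
    hours = expected_hours_between_interruptions(chips)
    print(f"{chips:>6} chips -> an interruption roughly every {hours:,.1f} hours")
```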

The practical, engineering-first lens matters because Anthropic is competing in a market where rivals are pulling in opposite directions. On one side are labs that argue the scaling story has diminishing returns and want to reorient investment to richer objectives and world models. Meta’s chief AI scientist Yann LeCun has been among the most vocal critics of over-indexing on scale, saying larger models do not automatically gain the common sense and grounded understanding people need. On the other side are labs like Anthropic that keep finding room on the curve and turn those gains into visible product leaps with Claude. Even companies that shocked the industry by lowering training bills, like DeepSeek, did it by squeezing more out of the same basic pipeline through aggressive efficiency and mixture-of-experts routing rather than abandoning autoregressive pre-training. The market reaction showed how cost curves can be as disruptive as new features.
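For context on the routing idea mentioned above, here is a minimal, purely illustrative sketch of top-k mixture-of-experts gating; it is not DeepSeek’s or Anthropic’s code, just the shape of the technique: each token activates only k experts, so compute per token does not grow with the total expert count.

```python
import numpy as np

# Minimal sketch of mixture-of-experts routing: each token is sent to only a
# few experts, so per-token compute scales with k rather than with the total
# number of experts. Purely illustrative; not any lab's production code.

rng = np.random.default_rng(0)
d_model, n_experts, k = 16, 8, 2

token = rng.standard_normal(d_model)                 # one token's hidden state
router = rng.standard_normal((d_model, n_experts))   # learned routing weights

logits = token @ router
top_k = np.argsort(logits)[-k:]                      # indices of the k chosen experts
weights = np.exp(logits[top_k]) / np.exp(logits[top_k]).sum()  # softmax over the chosen k

print(f"token routed to experts {top_k.tolist()} with weights {np.round(weights, 2).tolist()}")
```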

Joseph does not deny that post-training matters. He frames it as the faster feedback loop where you “tune the personality” and fix sharp edges. That is where Anthropic’s constitutional AI work lives, which tries to encode behavioral rules in a way that can be audited and adjusted. It answers a question ordinary users actually care about: whether the assistant will refuse harmful requests while still being useful. Anthropic’s more recent “constitutional classifiers” extend that idea to monitor prompts and outputs for jailbreak attempts, boosting refusal rates on risky queries in internal tests and early deployments. The tradeoff is higher serving cost, another reminder that safety features are not just ethics. They are product and infrastructure choices that show up in user experience and margins.
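The general pattern is easy to sketch, even if Anthropic’s real system is far more sophisticated. In the hypothetical outline below, classify_risk and generate are placeholder functions, not real APIs; the point is that screening both the prompt and the draft response adds extra model calls, which is where the serving-cost tradeoff comes from.

```python
# Hypothetical sketch of the pattern described above: a safety classifier
# screens both the prompt and the draft response before anything is returned.
# classify_risk and generate are placeholders, not Anthropic's API; every
# extra check is an extra call, which is where the serving cost shows up.

def classify_risk(text: str) -> float:
    """Stand-in for a learned classifier; returns a risk score in [0, 1]."""
    risky_markers = ("how to make a weapon", "bypass the safety")
    return 1.0 if any(m in text.lower() for m in risky_markers) else 0.1

def generate(prompt: str) -> str:
    """Stand-in for the underlying language model."""
    return f"[model response to: {prompt}]"

def answer(prompt: str, threshold: float = 0.5) -> str:
    if classify_risk(prompt) > threshold:   # screen the input
        return "I can't help with that."
    draft = generate(prompt)
    if classify_risk(draft) > threshold:    # screen the output too
        return "I can't help with that."
    return draft

print(answer("Explain how transformers work"))
```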

The other rising concern is data. Pre-training thrives on oceans of text, but the internet of 2025 is filling with model-written content. That creates two headaches. First, researchers struggle to measure contamination when evaluation sets leak into training corpora, which can inflate benchmark scores and mislead customers about real-world gains. Second, the flood of synthetic text can degrade future training unless labs get better at filtering and weighting. Recent surveys and shared-task reports document the scope of contamination and the limits of current AI-text detectors, especially in adversarial or multilingual settings. The academic community’s consensus is that there is no drop-in, perfectly reliable detector, which means data governance is becoming a core differentiator for labs. Users do not need to follow the methodology minutiae to feel the effects. Cleaner pre-training usually shows up as more grounded answers, fewer hallucinations and better domain transfer in the apps they rely on.
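One common decontamination heuristic, sketched below with illustrative parameters rather than any specific lab’s pipeline, is to flag training documents that share long n-grams with evaluation items and then drop or down-weight them.

```python
# Sketch of a common decontamination heuristic: flag a training document if it
# shares a long n-gram with any evaluation item. The n-gram length and the
# example strings are illustrative choices, not a specific lab's pipeline.

def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def is_contaminated(train_doc: str, eval_items: list[str], n: int = 8) -> bool:
    doc_grams = ngrams(train_doc, n)
    return any(doc_grams & ngrams(item, n) for item in eval_items)

eval_items = ["the quick brown fox jumps over the lazy dog near the river bank"]
train_doc = "blog post: the quick brown fox jumps over the lazy dog near the river bank today"
print(is_contaminated(train_doc, eval_items))  # True -> drop or down-weight this document
```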

Joseph’s own path helps explain why Anthropic keeps betting on operational excellence. Before joining the company at its founding, he worked at OpenAI and earlier at Vicarious, the robotics startup later acquired by Alphabet’s Intrinsic. That lineage blends applied safety instincts with shop-floor pragmatism about compute, networking, profilers and debuggers. It also fits Anthropic’s public posture as the smaller lab trying to out-execute bigger platforms through efficiency and safety research that feels product-proximate rather than abstract.

Why this matters now is simple. The winners in foundation models set the default experience for everyone else. If Anthropic’s pre-training keeps converting compute into predictable quality, developers get steadier APIs, enterprises get fewer compliance surprises, and consumers get assistants that feel less brittle. If the company stumbles on any of the hard problems Joseph highlights, from month-six training bugs to serving economics, it cedes ground to OpenAI, Google or the next cost-disruptor. The stakes extend beyond market share. As models climb capability ladders, alignment choices made during and after pre-training define how much autonomy companies should safely expose to users. Constitutional approaches are a bid to give society a steering wheel and not just a faster engine. Whether that promise holds up at larger scales will be one of the most consequential tests in AI over the next year.

For now, Joseph’s message is that pre-training is still the gravity well of modern AI. Everything else orbits it.

Watch: YC’s Ankit Gupta interviews Nick Joseph, Anthropic’s head of pre-training, on why next-token training still drives progress, what it takes to wire up thousands of GPUs, profiling at scale, co-design with inference, data quality in a synthetic web, and where RL fits. A candid look at scaling laws, bugs that can derail months, and the org choices behind Claude’s roadmap. Watch the full conversation on YouTube.
