Thinking Machines Lab

Ex-OpenAI CTO Mira Murati launches Tinker to simplify AI model training

Tinker could reshape who gets to build cutting-edge models — and how — in a domain long dominated by deep pockets and hardware power.

A new player entered the crowded AI landscape today. Thinking Machines Lab, a startup founded earlier this year by former OpenAI CTO Mira Murati, announced Tinker, an API that lets researchers fine-tune large language models without building their own training infrastructure.

The promise is straightforward: instead of wrangling GPU clusters and distributed training code, users send training steps through a simple Python interface. The system currently supports popular open-weight models such as Meta’s LLaMA and Alibaba’s Qwen, including massive mixture-of-experts variants with hundreds of billions of parameters.

Tinker relies on LoRA, a method that fine-tunes only small adapter modules rather than the full model. LoRA has become a common cost-saving technique in the AI community, but Thinking Machines is offering it as a managed service. The company says switching from a small model to a much larger one is as simple as changing a string in code.
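
For intuition, LoRA freezes the pretrained weights and trains only a low-rank update: instead of modifying a full weight matrix W, it learns two small matrices B and A so the effective weight becomes W + BA. The NumPy sketch below illustrates the general idea only; it is not Tinker's implementation.

    import numpy as np

    d, k, r = 1024, 1024, 16          # layer dimensions and a small LoRA rank
    W = np.random.randn(d, k) * 0.01  # frozen pretrained weight (never updated)
    A = np.random.randn(r, k) * 0.01  # trainable low-rank factor
    B = np.zeros((d, r))              # trainable; zero-initialized so W + B @ A == W at the start

    def lora_forward(x):
        # Original projection plus the low-rank correction (B @ A).
        return x @ W.T + x @ (B @ A).T

    y = lora_forward(np.random.randn(2, k))
    print(y.shape)  # (2, 1024)
    print(f"trainable: {A.size + B.size:,} params vs. {W.size:,} for full fine-tuning")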

Groups at Princeton, Stanford, Berkeley, and Redwood Research have already been testing the platform. Their projects range from mathematical theorem proving to reinforcement learning experiments with large multi-agent setups.

The timing is notable. Fine-tuning has traditionally required expensive infrastructure, putting it out of reach for smaller labs and independent developers. Platforms like Hugging Face have lowered the barrier with open-source libraries, but those still assume you can access hardware. Tinker shifts the burden to Thinking Machines’ own clusters, making access more like a service than a toolkit.

The stakes are high. If Tinker works at scale, it could erode the advantage of well-funded labs that control compute resources, letting smaller teams run serious experiments. It also raises questions about cost models, safety of custom-trained models, and whether researchers will trust a third-party service with their data.

Tinker is in private beta and free for now. Usage-based pricing will come later, though no details have been released.

What Tinker actually offers (from the docs)

Tinker is not meant to be a black-box model trainer. Rather, it gives you primitives so you write your training logic yourself. The service handles the distributed machinery under the hood.

You’ll typically run your own code on a simple local machine (a CPU environment is fine, since the heavy lifting happens on Tinker’s clusters), interacting with Tinker via:

  • forward_backward() — run forward pass and backward pass, accumulate gradients
  • optim_step() — apply optimizer updates
  • sample() — generate tokens/output using current model parameters
  • save_state() / loading routines — checkpointing so you can resume training or export weights

You decide your loss functions, evaluation logic, data pipelines, and RL environments. Tinker doesn’t impose a fine-tuning pipeline; it just gives you the building blocks.
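
As a rough illustration, here is how a supervised loop might compose those primitives. The client object and method signatures below are assumptions made for the sketch, not the exact Tinker API; only the primitive names come from the docs.

    def train(client, batches, num_epochs=1, checkpoint_every=100):
        """Hypothetical supervised fine-tuning loop over Tinker-style primitives.

        `client` is assumed to expose forward_backward(), optim_step() and
        save_state(); the exact signatures and return values are illustrative.
        """
        step = 0
        for epoch in range(num_epochs):
            for batch in batches:
                # Forward and backward pass run on Tinker's cluster; gradients accumulate there.
                loss = client.forward_backward(batch)

                # Apply the optimizer update to the LoRA adapter weights.
                client.optim_step()

                step += 1
                if step % checkpoint_every == 0:
                    # Checkpoint so training can be resumed or the weights exported later.
                    client.save_state(name=f"checkpoint-{step}")
                    print(f"epoch {epoch} step {step}: loss={loss:.4f}")

Because the loop lives in your own code, data handling, loss definitions, and evaluation stay under your control; Tinker only executes the heavy steps remotely.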

Changing which base model you start with is as simple as changing a string in your code — the API abstracts away swapping model weights.
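
In practice that means model selection is just a configuration value. The helper below is purely illustrative (the real client constructor and parameter names may differ); the point is that the identifier string is the only thing that changes.

    def make_training_config(base_model: str) -> dict:
        # Hypothetical stand-in for creating a Tinker training client:
        # the base model is selected purely by its identifier string.
        return {"base_model": base_model, "method": "lora"}

    small = make_training_config("Qwen/Qwen3-8B")          # quick, cheap iteration
    large = make_training_config("Qwen/Qwen3-235B-A22B")   # large MoE model: same code, new string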

Supported models and limits:

  • It supports open-weight models such as the LLaMA line and Qwen, including Mixture-of-Experts (MoE) variants (e.g. Qwen3-235B-A22B).
  • But it does not currently support full fine-tuning. Instead, Tinker uses LoRA (Low-Rank Adaptation), meaning only adapter modules are trained.
  • You can download your resulting trained weights from Tinker and use them elsewhere (e.g. with another inference stack), so you’re not locked in; a sketch of reusing an exported adapter follows this list.
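
For example, assuming the exported adapter is in a standard LoRA format (an assumption about the export, not something the docs spell out), it could be attached to the base model with an off-the-shelf stack such as Hugging Face’s transformers and peft:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_id = "Qwen/Qwen3-8B"  # whichever supported base model the adapter was trained on
    base = AutoModelForCausalLM.from_pretrained(base_id)
    tokenizer = AutoTokenizer.from_pretrained(base_id)

    # Attach the LoRA adapter downloaded from Tinker (path is hypothetical).
    model = PeftModel.from_pretrained(base, "./tinker-export/my-adapter")

    inputs = tokenizer("The capital of France is", return_tensors="pt")
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=10)[0]))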

That said, some features are still “on deck” per the docs: image input for multimodal models and full fine-tuning are both listed as future additions.

The docs show how you’d write:

  • Supervised learning loops (classic fine-tuning)
  • Reinforcement learning loops, where your “environment” might call the sample() method to get model actions
  • Preference-based fine-tuning / RLHF (or variants like DPO)
  • Prompt distillation, evaluation setups, hyperparameter sweeps

You get examples and recipes (“the Tinker Cookbook”) showing how these setups can be layered on top of the basic primitives.
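
To give a flavor of the RL case, here is a heavily simplified sketch in which the environment gets its action by sampling from the current model and the reward weights the training signal. Every name below is hypothetical; the Cookbook’s actual recipes will differ.

    def rl_loop(client, env, num_iterations=100):
        """Hypothetical RL-style loop built on sample(), forward_backward() and optim_step()."""
        for _ in range(num_iterations):
            prompt = env.reset()

            # The "environment" calls sample() to get the model's action for this episode.
            action = client.sample(prompt, max_tokens=64)
            reward = env.score(action)

            # REINFORCE-style update: weight the sampled completion by its reward.
            client.forward_backward({"prompt": prompt, "completion": action, "weight": reward})
            client.optim_step()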

Head over to the Tinker Docs to learn more.
