Life of a Model on Google Cloud TPU

Join us for a demo showcasing the power of Google Cloud TPU by walking through the complete lifecycle of a model, from post-training to inference at massive scale.

We’ll show you how to post-train a model using new RL capabilities in MaxText and Tunix, then take that same model and serve it on TPU seamlessly with the new JAX backend in vLLM.

Learn how to rightsize a workload for TPU for both training and inference, optimizing for performance per dollar from beginning to end.
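As a preview of the serving step described above, here is a minimal sketch of loading a post-trained checkpoint and generating text with vLLM's Python API; it assumes a TPU host with a vLLM build that includes the TPU backend, and the model name is a placeholder for whichever checkpoint comes out of post-training.

# Minimal serving sketch with vLLM; model name is a hypothetical placeholder.
from vllm import LLM, SamplingParams

# On a TPU host with the TPU backend installed, vLLM targets TPU devices.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
params = SamplingParams(temperature=0.7, max_tokens=128)

# Generate a completion for a single prompt and print it.
outputs = llm.generate(["Summarize the benefits of Cloud TPU for inference."], params)
for out in outputs:
    print(out.outputs[0].text)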

Sponsor(s): 
Google
Speaker(s): 

Author:

Brittany Rockwell

Product Manager
Google Cloud

Brittany Rockwell is a Product Manager for AI Inference at Google Cloud, where she leads the development of the TPU backend in vLLM, the most popular OSS library for language model inference. She currently lives in Seattle, WA.

Author:

Kyle Meggs

Product Manager
Google Cloud

Kyle Meggs is a Product Manager for JAX on Google Cloud. He enjoys helping customers train LLMs with MaxText and diffusion models with MaxDiffusion on the latest TPUs and GPUs.

Session Type: 
General Session (Presentation)