Description

Generative AI is revolutionizing industries, and Kubernetes has fast become the backbone for deploying and managing these resource-intensive workloads. This book serves as a practical, hands-on guide for MLOps engineers, software developers, Kubernetes administrators, and AI professionals ready to combine AI innovation with the power of cloud native infrastructure. Authors Roland Huß and Daniele Zonca provide a clear road map for training, fine-tuning, deploying, and scaling GenAI models on Kubernetes, addressing challenges like resource optimization, automation, and security along the way.

With actionable insights and real-world examples, readers will learn to tackle the opportunities and complexities of managing GenAI applications in production environments. Whether you're experimenting with large-scale language models or facing the nuances of AI deployment at scale, you'll uncover the expertise you need to operationalize this exciting technology effectively.

  • Learn how to deploy LLMs more efficiently with optimized inference runtimes
  • Get hands-on with GPU scheduling, including hardware detection and multinode scaling
  • Monitor and understand LLM-specific metrics like Time to First Token and token throughput
  • Know when to fine-tune a model or when retrieval augmentation is the better choice
  • Discover how to evaluate models with standardized benchmarks before committing GPU resources
  • Learn to run agentic applications with secure tool integration, identity management, and persistent state

About the Author
Dr. Roland Huss is a seasoned software engineer with over 25 years of experience in the field. Currently working at Red Hat, he is the architect of OpenShift Serverless and a former member of the Knative TOC. Roland is a passionate Java and Golang coder and a sought-after speaker at tech conferences. An advocate of open source, he is an active contributor and enjoys growing chili peppers in his free time.
Details
Publication year: 2026
Genre: Imports, Computer Science
Category: Science & Technology
Medium: Paperback
Contents: Flexible binding (paperback)
ISBN-13: 9781098171926
ISBN-10: 1098171926
Language: English
Binding: Softcover / Paperback
Author: Huss, Roland
Zonca, Daniele
Manufacturer: O'Reilly Media
Responsible person for the EU: Libri GmbH, Europaallee 1, D-36244 Bad Hersfeld, gpsr@libri.de
Dimensions: 228 x 174 x 25 mm
By/With: Roland Huss (et al.)
Publication date: 31 March 2026
Weight: 0.704 kg
Item ID: 134825519
