Description

Overcome the challenges of building transactional guarantees on rapidly changing data by using Apache Hudi. With this practical guide, data engineers, data architects, and software architects will discover how to seamlessly build an interoperable lakehouse from disparate data sources and deliver faster insights using their query engine of choice.

Authors Shiyan Xu, Prashant Wason, Bhavani Sudha Saktheeswaran, and Rebecca Bilbro provide practical examples and insights to help you unlock the full potential of data lakehouses for different levels of analytics, from batch to interactive to streaming. You'll also learn how to evaluate storage choices and leverage built-in automated table optimizations to build, maintain, and operate production data applications.

  • Understand the need for transactional data lakehouses and the challenges associated with building them
  • Explore data ecosystem support provided by Apache Hudi for popular data sources and query engines
  • Perform different write and read operations on Apache Hudi tables and effectively use them for various use cases, including batch and stream applications
  • Apply different storage techniques and considerations such as indexing and clustering to maximize your lakehouse performance
  • Build end-to-end incremental data pipelines using Apache Hudi for faster ingestion and fresher analytics

About the Author
Shiyan Xu is a Founding Engineer at Onehouse, where he currently works as an Open Source Engineer. He has been an active contributor to Apache Hudi since 2019 and has served as a PMC member of the project since 2021. Prior to joining Onehouse, Shiyan worked as a tech lead manager at Zendesk, leading the development of a large-scale data lake platform using Apache Hudi. He is passionate about open source development and engaging with community users.
Details
Year of publication: 2025
Genre: Imports, Computer Science
Category: Natural Sciences & Technology
Medium: Paperback
Contents: Flexible binding (Paperback)
ISBN-13: 9781098173838
ISBN-10: 109817383X
Language: English
Binding: Softcover / Paperback
Author: Xu, Shiyan
Wason, Prashant
Saktheeswaran, Bhavani Sudha
Bilbro, Rebecca
Manufacturer: O'Reilly Media
Responsible person for the EU: Libri GmbH, Europaallee 1, D-36244 Bad Hersfeld, gpsr@libri.de
Dimensions: 233 x 178 x 15 mm
By/With: Shiyan Xu (et al.)
Publication date: 2 December 2025
Weight: 0.467 kg
Item ID: 134233717
