Building Data Pipelines Using Apache Beam

Printed Book
Sold as: EACH
Author: Meyen, Nuzhi
Date of Publication: 2026
Book classification: Computer & Technology
No. of pages: 350
Format: Paperback

This book is printed on demand and is non-refundable after purchase.

    About this Product

    Build Data Pipelines that Survive Scale, Failure, and Change

    Book Description

    Building Data Pipelines Using Apache Beam is a practical, production-focused guide to using Beam's unified programming model to write processing logic once and run it across multiple runners without rewriting core code.

    The book begins with the fundamentals of distributed data processing and Beam's core abstractions: PCollections, transforms, and pipeline design. You will then progress to stateful and stateless processing, event-time semantics, windows, triggers, watermarks, state, and timers, building the mental models required to reason about correctness at scale.

    What you will learn

    ● Design scalable batch and streaming pipelines with Apache Beam

    ● Implement event-time processing using windows, triggers, watermarks, state, and timers

    ● Build portable pipelines that execute consistently across multiple runners

    ● Apply advanced transformations and coders for efficient data processing

    ● Optimize pipelines for performance, latency, fault tolerance, and cost efficiency

    ● Deploy, monitor, debug, and operate production-grade data pipelines

    Who Is This Book For?

    This book is written for Data Engineers, Senior Data Engineers, Analytics Engineers, Data Architects, and Platform Engineers who design, build, or operate batch and streaming data systems. Readers should be comfortable with Python or Java, SQL, and basic distributed-systems concepts such as parallelism, fault tolerance, and event-time processing, and should have some familiarity with cloud-based data platforms.

    Table of Contents

    1. Introduction to Apache Beam and Data Processing

    2. Stateful and Stateless Processing with Apache Beam

    3. Handling Event Time, Windows, and Triggers

    4. Building Pipelines with Apache Beam

    5. Transformations and Coders in Apache Beam

    6. Advanced Pipeline Optimization Techniques

    7. Deploying Apache Beam Pipelines on Different Runners

    8. Monitoring, Debugging, and Tuning Apache Beam Pipelines

    9. Case Studies: Apache Beam in the Real World

    Index
