Large-Scale Data Engineering and Analytics in Cloud

Performance Tuning and Cost Optimization / Internals, Research, Consulting

  • AWS, Kinesis

    Kinesis Client Library (KCL 2.x) Consumer – Load Balancing, Rebalancing – Taking, Renewing and Stealing Leases

    May 20, 2020

    For zero-downtime, large-scale systems, you can run multiple compute clusters in different availability zones.

    The Kinesis Client Library (KCL) 2.x consumer helps you build highly scalable, elastic and fault-tolerant streaming data processing pipelines for Amazon Kinesis. Let’s review some of the KCL internals related to load balancing and the response to compute node or cluster failures, and how you can tune and monitor these activities.
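
    To make the idea of taking and stealing leases concrete, here is a minimal sketch of how a worker might decide which shard leases to claim. It is a simplified illustration, not the actual KCL code: the function `plan_leases`, its parameters and the `max_to_steal` limit are hypothetical names, and lease expiry is reduced to an owner of `None`.

```python
import math
from collections import Counter


def plan_leases(leases, me, workers, max_to_steal=1):
    """Return the shard IDs that worker `me` should try to claim.

    leases       -- dict mapping shard ID to owner name, or None if the
                    lease is unowned or expired (hypothetical model)
    workers      -- names of all workers currently considered active
    max_to_steal -- illustrative cap on leases stolen in one pass
    """
    # Fair share: every active worker should hold about this many leases.
    target = math.ceil(len(leases) / len(workers))
    mine = sum(1 for owner in leases.values() if owner == me)
    needed = target - mine
    if needed <= 0:
        return []  # already at or above the fair share

    # Taking: prefer leases that nobody currently holds.
    free = sorted(s for s, owner in leases.items() if owner is None)
    if free:
        return free[:needed]

    # Stealing: pick the most loaded other worker, but only if it holds
    # more than its fair share.
    counts = Counter(o for o in leases.values() if o is not None and o != me)
    if not counts:
        return []
    victim, held = max(counts.items(), key=lambda kv: (kv[1], kv[0]))
    if held <= target:
        return []
    victim_leases = sorted(s for s, owner in leases.items() if owner == victim)
    return victim_leases[:min(needed, max_to_steal, held - target)]


if __name__ == "__main__":
    leases = {f"shard-{i}": "w1" for i in range(4)}
    # A new worker w2 joins; every lease is held by w1, so w2 has to steal.
    print(plan_leases(leases, "w2", ["w1", "w2"]))
```

    With a small steal limit, a newly started worker approaches its fair share gradually over several passes instead of taking half of the shards at once.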

    dmtolpeko

Recent Posts

  • Mar 19, 2021 Spark – Reading Parquet – Why the Number of Tasks can be Much Larger than the Number of Row Groups
  • Mar 07, 2021 Spark – Reading Parquet – Predicate Pushdown for LIKE Operator – EqualTo, StartsWith and Contains Pushed Filters
  • Jan 15, 2021 Parquet 1.x File Format – Footer Content
  • Jan 02, 2021 Flink and S3 Entropy Injection for Checkpoints
  • Jun 25, 2020 Hadoop YARN – Monitoring Resource Consumption by Running Applications in Multi-Cluster Environments
