Large-Scale Data Engineering and Analytics in Cloud

Performance Tuning and Optimization / Internals, Research

  • About
  • About
  • Auto Scaling,  Presto

    Presto – Query Auto Scaling Limitations – Low Utilization After Upscale – GROUP BY Partitioned Stage, query-manager.required-workers

    December 4, 2019

    One of the powerful features of Presto is auto scaling of compute resources for already running queries. When a cluster is idle you can reduce its size or even terminate the cluster, and then dynamically add nodes when necessary depending on the compute requirements of queries.

    Unfortunately, there are some limitations since not all stages can scale on the fly, so you have to use some workarounds to have reasonable cold start performance.

    Read More
    dmtolpeko

Recent Posts

  • Jan 15, 2021 Parquet 1.x File Format – Footer Content
  • Jan 02, 2021 Flink and S3 Entropy Injection for Checkpoints
  • Jun 25, 2020 Hadoop YARN – Monitoring Resource Consumption by Running Applications in Multi-Cluster Environments
  • Jun 18, 2020 How Map Column is Written to Parquet – Converting JSON to Map to Increase Read Performance
  • Jun 09, 2020 Flink Streaming to Parquet Files in S3 – Massive Write IOPS on Checkpoint

Archives

  • January 2021 (2)
  • June 2020 (4)
  • May 2020 (8)
  • April 2020 (3)
  • February 2020 (3)
  • December 2019 (5)
  • November 2019 (4)
  • October 2019 (1)
  • September 2019 (2)
  • August 2019 (1)
  • May 2019 (9)
  • April 2019 (2)
  • January 2019 (3)
  • December 2018 (4)
  • November 2018 (1)
  • October 2018 (6)
  • September 2018 (2)

Categories

  • Amazon (11)
  • Auto Scaling (1)
  • AWS (25)
  • Cost Optimization (1)
  • CPU (2)
  • Data Skew (1)
  • Distributed (1)
  • EC2 (1)
  • EMR (10)
  • ETL (2)
  • Flink (5)
  • Hadoop (14)
  • Hive (17)
  • Hue (1)
  • I/O (18)
  • JVM (3)
  • Kinesis (1)
  • Logs (1)
  • Memory (7)
  • Monitoring (4)
  • ORC (5)
  • Parquet (5)
  • Pig (2)
  • Presto (3)
  • Qubole (2)
  • RDS (1)
  • S3 (17)
  • Snowflake (6)
  • Spark (2)
  • Storage (12)
  • Tez (10)
  • YARN (18)

Meta

  • Log in
  • Entries feed
  • Comments feed
  • WordPress.org
Savona Theme by Optima Themes