For zero-downtime, large-scale systems you can have multiple compute clusters located in different availability zones.
The Kinesis KCL 2.x Consumer is very helpful to build highly scalable, elastic and fault-tolerant streaming data processing pipelines for Amazon Kinesis. Let’s review some of the KCL internals related to the load balancing and response to compute node/cluster failures and how you can tune and monitor such activities.