AlloyDB Hot Standby: Faster failovers, consistent performance

AlloyDB for PostgreSQL is a fully managed, PostgreSQL-compatible database service designed for the most demanding enterprise workloads. It combines the best of PostgreSQL with the power of Google, delivering exceptional performance, scalability, and availability. We are continuously innovating to make AlloyDB even more resilient, and today, we’re excited to announce a significant upgrade to our High Availability (HA) architecture: Hot Standby.

Understanding AlloyDB HA Architecture

An AlloyDB primary instance configured for high availability consists of an active node and a standby node, located in different zones within a region for resilience. AlloyDB’s cloud-native architecture separates compute and storage to allow for individual scaling of each resource. Database write-ahead logs (WAL) are synchronously written to a regional log persistor, ensuring durability, while data blocks reside in AlloyDB’s regional storage service. A load balancer directs traffic to the current active node using a stable IP address.
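Because the load balancer keeps the address stable, an application usually only needs to reconnect to the same endpoint once a failover completes. The following is a minimal sketch of such a retry loop, not an AlloyDB API: `connect` is a hypothetical callable supplied by the application (for example, a wrapper around its database driver) that is assumed to raise `ConnectionError` while the instance is unreachable, and the attempt count and delays are illustrative values.

```python
import time


def connect_with_retry(connect, max_attempts=8, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep):
    """Call `connect()` until it succeeds, backing off exponentially.

    `connect` is a hypothetical zero-argument callable provided by the
    application. It is assumed to raise ConnectionError while the
    endpoint is unreachable. `sleep` is injectable so the loop can be
    tested without waiting.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and surface the last error to the caller
            sleep(delay)
            delay = min(delay * 2, max_delay)  # double the wait, up to a cap
```

With these defaults the loop waits 0.5, 1, 2, 4, and then 8 seconds between attempts. Real drivers raise their own exception types, so the `except` clause would need to match whichever driver the application uses.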

In the traditional HA model, if the active node became unavailable, AlloyDB would automatically initiate a failover. The standby node, previously idle from a PostgreSQL perspective, would start the database, process any remaining logs, and then take over. While this ensures high availability, the database startup time and the subsequent cache warming period could impact application recovery time and performance.

Introducing AlloyDB Hot Standby: The New Architecture

With the new Hot Standby capability, we’ve transformed the role of the standby node. Instead of sitting idle, the standby node now runs PostgreSQL and continuously applies WAL records streamed from the active node. This architectural shift brings two main advantages:

  1. Dramatically Reduced Failover Times: Because PostgreSQL is already running, initialized, and actively replicating on the standby, the time required to promote it to primary in the event of a failure is significantly shorter. The system detects the failure (typically within 30 seconds), promotes the standby, and redirects connections. The database startup phase on the standby is eliminated, reducing overall downtime and improving your Recovery Time Objective (RTO).

  2. Consistent Performance After Failover: Since the Hot Standby node is actively replaying logs, its memory caches (like the PostgreSQL buffer cache) are kept “warm.” They contain much of the same frequently accessed data as the primary node’s caches. When a failover occurs, the new primary can serve requests at optimal speed almost immediately. This avoids the performance “brownout” typically seen while caches warm up from disk, ensuring application performance remains stable.
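One way to reason about the first advantage is to treat failover time as a sum of phases and see which phases Hot Standby removes or shrinks. The sketch below is only a back-of-the-envelope model: the phase breakdown follows the description above, but the class name and every duration are hypothetical placeholders, not measured AlloyDB figures.

```python
from dataclasses import dataclass


@dataclass
class FailoverPhases:
    """Phase durations in seconds. All values used below are made up."""
    detect: float      # time to notice the active node is unavailable
    startup: float     # PostgreSQL startup on the standby (legacy HA only)
    wal_replay: float  # applying WAL the standby has not yet processed
    redirect: float    # promoting the standby and redirecting connections

    def total(self) -> float:
        return self.detect + self.startup + self.wal_replay + self.redirect


# Hypothetical numbers chosen only to illustrate the structure.
legacy = FailoverPhases(detect=10.0, startup=20.0, wal_replay=15.0, redirect=3.0)

# Hot Standby: PostgreSQL is already running, so startup is zero, and
# because WAL is applied continuously the remaining backlog is assumed small.
hot = FailoverPhases(detect=10.0, startup=0.0, wal_replay=1.0, redirect=3.0)

print(f"legacy: {legacy.total():.0f}s, hot standby: {hot.total():.0f}s")
```

In this model the detection and redirect phases are the same in both setups; the difference comes entirely from the startup and replay terms.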

And the best part? This substantial enhancement to availability and resilience comes at no additional cost to you.

See Hot Standby in Action

We’ve prepared a short demonstration to illustrate the difference between the new Hot Standby HA and the legacy HA setup. In the video, we run a benchmark load on two AlloyDB instances and trigger a failover on both simultaneously.

[Video: AlloyDB Hot Standby failover demonstration]

As you can see in the demo:

  • The instance with Hot Standby completes the failover in approximately 15 seconds. Crucially, its transactions per second (TPS) rate returns to pre-failover levels almost immediately.

  • The instance with legacy HA takes noticeably longer to complete the failover. Even after it comes back online, its TPS is significantly lower and takes several minutes to ramp back up to the original level as its caches warm up.

This side-by-side comparison clearly shows the benefits of Hot Standby in minimizing downtime and eliminating the post-failover performance impact.
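If you want to run a similar comparison against your own benchmark output, the two quantities discussed above (how long the instance was unavailable, and how long throughput took to return to normal) can be derived from a per-second TPS series. The helper below is a minimal sketch under stated assumptions: the function name, the input format (one sample per second, starting when the failover is triggered), and the 90% recovery threshold are all illustrative choices, not part of any AlloyDB tooling.

```python
def summarize_failover(tps, baseline, recovered_fraction=0.9):
    """Summarize a failover from per-second TPS samples.

    `tps` holds one transactions-per-second sample per second, starting
    at the moment the failover is triggered. `baseline` is the
    pre-failover TPS. Returns (outage_seconds, seconds_to_recover);
    seconds_to_recover is None if throughput never reaches the threshold.
    """
    threshold = baseline * recovered_fraction

    # Outage: count the leading samples with no committed transactions.
    outage = 0
    for sample in tps:
        if sample > 0:
            break
        outage += 1

    # Recovery: first second at which throughput is back near the baseline.
    recovered = next((i for i, s in enumerate(tps) if s >= threshold), None)
    return outage, recovered
```

A series that drops to zero and then jumps straight back to the baseline yields two equal numbers, while a slow post-failover ramp shows up as a recovery time well beyond the outage. The helper reports the first sample that crosses the threshold, so a noisy series may need smoothing first.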

Get Started with Enhanced HA

Hot Standby is rolling out to newly created AlloyDB instances running PostgreSQL 18, which get the upgraded HA behavior automatically. It will roll out to earlier major versions in the coming months. You can continue to rely on AlloyDB’s 99.99% SLA, now backed by even faster failovers and more predictable post-failover performance.

This enhancement underscores our commitment to providing a best-in-class, enterprise-grade managed PostgreSQL experience.

To learn more about AlloyDB’s High Availability features, please refer to the official documentation. New to AlloyDB? Try it out today!