Study AWS Aurora Serverless v2

Amazon launched Aurora Serverless v2 on Apr 21, 2022. Aurora Serverless v2 is designed for applications with variable workloads: DB capacity changes dynamically as the workload changes, so you don’t need to provision capacity for peak load. I evaluated Aurora Serverless v2 to check its actual behavior and its fitness for production applications.

In this post, Aurora Serverless v2 PostgreSQL was used for evaluation. Note that the MySQL-compatible edition is out of scope.

Pricing

According to the Aurora pricing page, 1 Aurora capacity unit (ACU) = approximately 2 gibibytes (GiB) of memory, corresponding CPU, and networking. Note that I use the terms “ACU” and “DB capacity” interchangeably in this document.

In the US East region as of Mar. 1, 2024, Aurora Standard costs $0.12 per ACU-hour.

Reference: provisioned on-demand instance

db.r7g.large (2 vCPU, 16 GiB memory, up to 12.5 Gbps network bandwidth) costs $0.276 per hour.

The per-vCPU-hour cost of Aurora Serverless v2 seems reasonable. On the other hand, its per-memory-hour cost is much worse than that of a provisioned instance. This means applications with the following performance characteristic are likely to be better suited to Aurora Serverless v2 than to a provisioned instance: high CPU demand at peak time, with a working set small enough that memory cost does not dominate.
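A back-of-the-envelope comparison from the list prices above makes the memory-cost gap concrete. This is a sketch: AWS does not publish the exact vCPU share of an ACU, so only the per-memory comparison and the break-even capacity are firm.

```python
# Back-of-the-envelope cost comparison from the list prices quoted above.
# The vCPU share of an ACU is not published, so treat this as a sketch.

ACU_PRICE = 0.12              # USD per ACU-hour (Aurora Standard, US East)
ACU_MEMORY_GIB = 2            # 1 ACU ~= 2 GiB of memory

R7G_LARGE_PRICE = 0.276       # USD per hour (db.r7g.large, on demand)
R7G_LARGE_MEMORY_GIB = 16

serverless_per_gib = ACU_PRICE / ACU_MEMORY_GIB               # 0.06 USD/GiB-hour
provisioned_per_gib = R7G_LARGE_PRICE / R7G_LARGE_MEMORY_GIB  # ~0.01725 USD/GiB-hour

# Average capacity at which serverless matches the on-demand db.r7g.large price:
break_even_acu = R7G_LARGE_PRICE / ACU_PRICE                  # 2.3 ACU

print(f"serverless:  ${serverless_per_gib:.4f} per GiB-hour")
print(f"provisioned: ${provisioned_per_gib:.4f} per GiB-hour")
print(f"break-even average capacity: {break_even_acu:.1f} ACU")
```

In other words, per GiB-hour, serverless costs roughly 3.5 times the on-demand provisioned price, and serverless is cheaper than an always-on db.r7g.large only if the average capacity stays below about 2.3 ACU.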

Aurora Serverless v2 is priced only for on-demand use. A provisioned instance is more cost-efficient if you use reserved instances; e.g., if you pay all upfront for a 3-year reserved instance of db.r6.large, you can save 65% compared to the on-demand price. If your purpose in using Aurora Serverless v2 is to save cost, you should carefully compare the cost of serverless against a provisioned instance.
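To see how much a reserved instance shifts the break-even point, here is a sketch under an assumption I'm making for illustration: that the reserved instance class has an on-demand price comparable to the $0.276/hour db.r7g.large quoted above, with the 65% all-upfront 3-year saving mentioned in the text.

```python
# Sketch: how a reserved instance shifts the serverless break-even point.
# Assumption (not from the source): the reserved class has an on-demand
# price comparable to the $0.276/hour db.r7g.large quoted above.
ON_DEMAND = 0.276
RESERVED = ON_DEMAND * (1 - 0.65)   # effective hourly price after 65% saving
ACU_PRICE = 0.12                    # USD per ACU-hour

# Serverless only wins if the *average* capacity stays below this:
break_even_acu = RESERVED / ACU_PRICE

print(f"reserved effective price: ${RESERVED:.4f}/hour")
print(f"break-even average capacity vs reserved: {break_even_acu:.2f} ACU")
```

Under that assumption, the break-even average capacity drops to well under 1 ACU, which is why a careful comparison matters if cost saving is your goal.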

Considerations for DB instance configurations

Capacity configuration

Although Aurora Serverless v2 scales adaptively, there are some considerations to avoid a performance cliff in case of a sudden surge of requests. In particular, a minimum capacity that is too small is problematic, for the following reasons.

We recommend setting the minimum to a value that allows each DB writer or reader to hold the working set of the application in the buffer pool. That way, the contents of the buffer pool aren’t discarded during idle periods.

(From How Aurora Serverless v2 works)

Note that the “buffer pool” is called “shared buffers” in PostgreSQL terminology.

The scaling rate for an Aurora Serverless v2 DB instance depends on its current capacity. The higher the current capacity, the faster it can scale up. If you need the DB instance to quickly scale up to a very high capacity, consider setting the minimum capacity to a value where the scaling rate meets your requirement.

(From Considerations for the minimum capacity value)
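The minimum and maximum capacity are set on the cluster. For reference, a minimal sketch of how this could look with boto3; the cluster identifier is hypothetical, the 0.5–128 ACU range reflects the limits at the time of writing, and actually calling AWS requires credentials, so the API call is kept behind a separate function.

```python
# Sketch: setting the Serverless v2 capacity range via boto3.
# The cluster identifier below is hypothetical; the 0.5-128 ACU range
# (in 0.5 steps) reflects the limits at the time of writing.

def serverless_v2_scaling_configuration(min_acu: float, max_acu: float) -> dict:
    """Build the ServerlessV2ScalingConfiguration argument for
    rds.modify_db_cluster."""
    if not (0.5 <= min_acu <= max_acu <= 128):
        raise ValueError("capacity must satisfy 0.5 <= min <= max <= 128")
    return {"MinCapacity": min_acu, "MaxCapacity": max_acu}

def apply_capacity(cluster_id: str, min_acu: float, max_acu: float) -> None:
    import boto3  # imported here so the sketch runs without AWS installed
    rds = boto3.client("rds")
    rds.modify_db_cluster(
        DBClusterIdentifier=cluster_id,
        ServerlessV2ScalingConfiguration=serverless_v2_scaling_configuration(
            min_acu, max_acu
        ),
        ApplyImmediately=True,
    )

# Example: apply_capacity("my-aurora-cluster", min_acu=2.0, max_acu=32.0)
```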

DB parameters

There are some differences in the DB parameters related to DB capacity between a provisioned instance and Aurora Serverless v2. I give only an overview here; for details, see the AWS documentation on working with parameter groups.

Expected scaling behaviors

The documentation describes how and when Aurora Serverless v2 triggers scaling events.

Some important points are quoted below.

Scaling is fast because most scaling operations keep the writer or reader on the same host. In the rare cases that an Aurora Serverless v2 writer or reader is moved from one host to another, Aurora Serverless v2 manages the connections automatically.

Aurora Serverless v2 scaling can happen while database connections are open, while SQL transactions are in process, while tables are locked, and while temporary tables are in use. Aurora Serverless v2 doesn’t wait for a quiet point to begin scaling. Scaling doesn’t disrupt any database operations that are underway.

Scaling of reader instances

Readers in promotion tiers 0 and 1 scale at the same time as the writer. That scaling behavior makes readers in priority tiers 0 and 1 ideal for availability. That’s because they are always sized to the right capacity to take over the workload from the writer in case of failover.

Evaluation

I ran a simple benchmark to evaluate the scaling behavior of Aurora Serverless v2, in particular how quickly and how smoothly DB capacity scales under load.

Note that the next sections describe the experiments in detail. If you are not interested, please skip to the “Overall observations from experiments” section.

Evaluation method

Evaluation environment

I used an Aurora cluster running PostgreSQL 15.5 with one writer instance of the db.serverless class, i.e. Serverless v2, in the us-east-1 region. DB parameters were left at their default values except for some logging-related ones; e.g., pg_stat_statements was enabled.

Since the documentation says that the scaling rate depends on the current DB capacity, I tried several min/max DB capacity settings.

Benchmark workload

Since our application tends to be bottlenecked on write workload, I used a write-only workload for the benchmark.

The benchmark application runs multiple threads, and each thread repeatedly executes a query that upserts (mostly updates) 20 tuples. The average tuple size was 57 bytes including the tuple header, i.e. 1,140 bytes per query were written on average. The whole working set basically fits into shared_buffers even at the smallest ACU I used (ACU = 1, shared_buffers = 384 MiB). The workload was varied by changing the number of threads from 1 to 32.
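The client logic can be sketched roughly as follows. The table name, columns, and key space are hypothetical (the post does not give the schema); each thread would execute the generated statement in a loop through a PostgreSQL driver such as psycopg2.

```python
# Sketch of the write-only benchmark client. Table name, columns, and
# key space are hypothetical; the original post does not give the schema.
import random
import string

BATCH = 20  # tuples upserted per query, as in the benchmark description

def build_upsert(batch_size: int = BATCH) -> str:
    """Build a multi-row upsert; reusing keys makes it mostly updates."""
    placeholders = ", ".join(["(%s, %s)"] * batch_size)
    return (
        "INSERT INTO bench_kv (id, payload) "
        f"VALUES {placeholders} "
        "ON CONFLICT (id) DO UPDATE SET payload = EXCLUDED.payload"
    )

def make_params(batch_size: int = BATCH, keyspace: int = 100_000) -> list:
    """Flat parameter list matching the placeholders above."""
    params = []
    for _ in range(batch_size):
        params.append(random.randrange(keyspace))                           # id
        params.append("".join(random.choices(string.ascii_letters, k=40)))  # payload
    return params

# Per-thread loop (requires a live connection, so not executed here):
#   cur.execute(build_upsert(), make_params())
#   conn.commit()
```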

Each benchmark run was long enough for throughput to become steady. The benchmark client machine was large enough not to be a bottleneck unless explicitly noted; I monitored the client while running the benchmark and confirmed it was not actually a bottleneck.

Monitoring method of scaling behavior

Because Aurora Serverless v2 resizes shared_buffers along with the instance's capacity, scaling can be observed from within PostgreSQL by watching shared_buffers with psql's \watch:

postgres=# \x
Expanded display is on.
postgres=# select pg_size_pretty(setting::bigint * 8192) shared_buffers from pg_settings where name = 'shared_buffers';
-[ RECORD 1 ]--+-------
shared_buffers | 128 MB

postgres=# \watch 1
Mon Mar  4 17:01:47 2024 (every 1s)

-[ RECORD 1 ]--+-------
shared_buffers | 128 MB

Mon Mar  4 17:01:48 2024 (every 1s)

-[ RECORD 1 ]--+-------
shared_buffers | 128 MB
...
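The observed shared_buffers value can be converted to a rough ACU estimate. This is a sketch assuming shared_buffers scales linearly with capacity, calibrated with the single data point from my setup (ACU = 1 corresponds to shared_buffers = 384 MiB).

```python
# Rough ACU estimate from observed shared_buffers (a sketch).
# Assumption: shared_buffers scales linearly with ACU; the calibration
# point (1 ACU <-> 384 MiB) is from my experimental setup.
MIB_PER_ACU = 384

def estimate_acu(shared_buffers_mib: float) -> float:
    return shared_buffers_mib / MIB_PER_ACU

print(estimate_acu(384))   # 1.0
print(estimate_acu(1536))  # 4.0
```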

Experiments and results

Basic scaling behavior (experiment with DB min/max = 1.0/4.0, Bench threads = 1)

As the first experiment, I used relatively small DB capacity configuration (min = 1.0, max = 4.0) and ran the benchmark with single thread. I observed how Aurora Serverless v2 scaled while running the benchmark.

Observations

Stress DB more

The scaling rate depends on the current DB capacity, so I evaluated how scaling behavior varies with the workload and the DB capacity settings. Through the next several experiments, I stressed the DB more and observed scaling behavior under heavier load.

Each experiment setup is described by the combination of the DB capacity configuration (referred to as “DB min/max”) and the number of benchmark threads (referred to as “Bench threads”).

DB min/max = 1.0/4.0, Bench threads = 4

I executed the benchmark with 4 threads, the same number as the max DB capacity, to check the behavior when DB CPU utilization approaches 100%.

DB min/max = 1.0/4.0, Bench threads = 6

I executed the benchmark with enough threads to hit 100% CPU utilization.

DB min/max = 1.0/32.0, Bench threads = 16

I checked scaling behavior with a larger max DB capacity and a larger workload.

DB min/max = 4.0/32.0, Bench threads = 16

In this experiment, I checked the scale-up rate with a larger minimum DB capacity.

DB min/max = 8.0/32.0, Bench threads = 16

I increased the minimum DB capacity further than in the previous experiment.

DB min/max = 4.0/32.0, Bench threads = 32

Finally, I stressed the DB even more to see it scale up to higher ACUs. In this experiment, the benchmark client machine's performance saturated. I don't describe the client performance in detail, as it's out of scope for this document, but note that the load placed on the DB was not twice that of the 16-thread run.

Overall observations from experiments

Remaining questions

Summary

From a performance perspective, I think Aurora Serverless v2 is ready for production applications. (Of course, if you are actually considering it for production, you must evaluate it for your own use case.)

It can scale up DB capacity almost seamlessly; I noticed no significant outage or delay during benchmarking. Scaling up by a few ACUs is fairly fast and completed within a few seconds in my experiments. Larger scale-ups required tens of seconds to minutes, depending on the gap between the ACU when scale-up started and the target ACU. If you need to scale DB capacity up to a large ACU quickly, you should set the minimum ACU high enough to scale up fast. This means there is a trade-off between cost efficiency (a low minimum ACU) and latency (fast scale-up). As you may notice, some capacity planning is still needed even with Aurora Serverless v2. (Note that changing the min/max DB capacity requires a DB restart.) However, in my opinion it's easier than capacity planning for a provisioned instance.

Aurora Serverless v2 looks very promising so far, but considering its relatively high unit price (per vCPU-hour or per memory-hour), it isn't for all applications, only for those with a highly variable workload. The pricing of Aurora Serverless v2 is good for applications with high CPU demand at peak time. If your application works on a large data set, a provisioned instance may be preferable, because the per-memory-hour price of Aurora Serverless v2 is not good.

If you provision DB capacity for the peak workload and the peak doesn't last long, you likely have a chance to save DB cost by using Aurora Serverless v2 instead of a provisioned instance. The break-even point varies with many factors, so you should compare the cost of serverless and a provisioned instance for your own use case.

Another promising use case is a new application whose workload will gradually increase after launch. Aurora Serverless v2 frees you from estimating DB capacity for future demand.