Friday, February 28, 2014

When To Start Partitioning Your Database

As traffic volume grows, load on the primary transactional database becomes the main bottleneck, in particular disk I/O. In a 3- or 4-tier architecture, if the middle tiers are stateless, they can be scaled horizontally pretty easily. The main transactional database can also be horizontally or vertically partitioned and scaled, but that is not nearly as easy.

We recently vertically partitioned our database, splitting two fairly independent modules into two separate databases, but it took considerable time and effort from the entire team and forced us to postpone certain other projects. Horizontal partitioning, or sharding, which we had done once before, also takes time to set up and monitor, although typically the entire team does not get involved.

Engineers are trigger-happy when talking about sharding, but for a small team it is better to postpone such tasks as long as possible and focus on adding application functionality. One obvious alternative is to use higher-performance systems. We have started using AWS storage-optimized I2 instances and got amazing I/O throughput. From their web site:

"I2 and HI1 instances provide very fast SSD-backed instance storage optimized for very high random I/O performance, and provide high IOPS at a low cost. We recommend using I2 and HI1 instances for NoSQL databases like Cassandra and MongoDB which benefit from very high random I/O performance and the low request latency of direct-attached SSDs, and are an excellent choice for transactional systems that can manage their own resiliency."

However, there is a limit to how far we can scale up using high-performance systems, and it is not the IOPS limit. It is the backup.

Replication is a great way to back up databases, but alas, it is not enough. If an unwanted delete or truncate, or an incorrect update, gets executed on the master database by mistake, it propagates to all the slave databases, and the original data cannot be recovered. That is why point-in-time backup and recovery are important: they let us go back one hour or one day or however far we need to get a correct snapshot of the data.

Backup strategy is another topic for another day, but suffice it to say here that periodically we need to take a full backup of the database. That, combined with incremental or differential backups, lets us go back in time to get a correct snapshot of the data.
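
As a minimal sketch of what such a point-in-time restore might look like with MySQL (the file paths, host, database name, and timestamp below are all hypothetical), one could restore the last full backup and then replay the binary log up to just before the bad statement:

```python
import subprocess

# Hypothetical paths and timestamp; adjust to your own backup layout.
FULL_BACKUP = "/backups/mydb-full-2014-02-28.sql"
BINLOG = "/var/lib/mysql/mysql-bin.000042"
STOP_AT = "2014-02-28 14:59:00"  # just before the accidental statement

# 1. Restore the most recent full backup.
with open(FULL_BACKUP) as dump:
    subprocess.run(["mysql", "-u", "root", "mydb"], stdin=dump, check=True)

# 2. Replay binary-log events recorded after the backup, stopping just
#    before the bad statement -- the point-in-time recovery step.
#    (In practice you would also pass --start-position taken from the
#    full backup's recorded binlog coordinates.)
binlog_sql = subprocess.run(
    ["mysqlbinlog", f"--stop-datetime={STOP_AT}", BINLOG],
    capture_output=True, check=True,
).stdout
subprocess.run(["mysql", "-u", "root", "mydb"], input=binlog_sql, check=True)
```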

Depending on how far we go back in time, some data loss could happen. If we go back one hour, then one hour's data could be lost; if we go back one day, then one day's data could be lost. (Hopefully all or some of the lost data can be recovered from the replicated servers.)

How much data we are willing to lose in the worst case is directly dependent on how long it takes to take a full backup of the database. That really puts a cap on how far we can scale up the database with high-performance systems. After that, we need to start partitioning.

We need to decide how much data we are willing to lose in the worst case if something bad happens. As an example, let's say we decide on 6 hours. Then the full backup should complete well within 6 hours. On an m1.xlarge AWS instance, a MySQL database of about 200 GB took about 6 hours to complete a full backup. So the database needs to be partitioned as its size approaches 200 GB.
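
The same arithmetic applies to any instance type: measure the sustained full-backup throughput, multiply by the acceptable data-loss window, and the product is roughly the size at which partitioning becomes necessary. A quick sketch, using just the numbers from the example above:

```python
# Rough cap on database size before partitioning, derived from the
# acceptable data-loss window and the measured backup throughput.

db_size_gb = 200          # measured: full backup of a 200 GB database...
backup_hours = 6.0        # ...took about 6 hours on an m1.xlarge
throughput_gb_per_hr = db_size_gb / backup_hours   # ~33 GB/hour

max_loss_window_hr = 6.0  # worst-case data loss we are willing to accept
size_cap_gb = throughput_gb_per_hr * max_loss_window_hr

print(f"Backup throughput: {throughput_gb_per_hr:.0f} GB/hour")
print(f"Partition once the database approaches {size_cap_gb:.0f} GB")
```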

The above is just an example. On a high-I/O system, backups are much faster, and in our case 1 TB is the cutoff for now.

Correction: Actually, the time required for a full backup need not cap the database size. If the database is replicated to multiple slaves, backups can be taken at each slave and strategically staggered so that the data-loss window is brought down.
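
For instance, assuming each slave runs 6-hour full backups back-to-back (a simplifying assumption; the slave names below are hypothetical), staggering the start times means a fresh full backup completes somewhere in the fleet far more often, shrinking the worst-case window roughly in proportion to the number of slaves:

```python
# Staggered full backups across slaves: with N slaves each running a
# D-hour full backup back-to-back, offsetting their start times by D/N
# hours means a fresh full backup completes every D/N hours somewhere.

def stagger_schedule(slaves, backup_hours):
    """Return (slave, start_offset_hours) pairs that spread backup
    start times evenly across one backup cycle."""
    step = backup_hours / len(slaves)
    return [(slave, round(i * step, 2)) for i, slave in enumerate(slaves)]

slaves = ["slave-1", "slave-2", "slave-3"]   # hypothetical hosts
backup_hours = 6.0

for slave, offset in stagger_schedule(slaves, backup_hours):
    print(f"{slave}: start full backup at +{offset}h into each 6h cycle")

# A completed full backup now lands every 6 / 3 = 2 hours,
# instead of every 6, cutting the worst-case data-loss window.
print(f"Fresh full backup every {backup_hours / len(slaves):.1f} hours")
```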
