Friday, October 24, 2014

Experiences with a Large Table

We recently gained some experience dealing with one of our large tables. It is not really at big-data scale: we use MySQL for storage and run SQL queries, so we don't use NoSQL databases.

The table is growing at the rate of 2.6 million records per day. Row size is about 2.3KB. At this rate the table grows by 6GB per day, or 180GB per month.

In one year it would grow beyond 2TB, provided traffic remains constant, which it actually won't.
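To sanity-check these numbers against what is actually on disk, one option is to ask MySQL itself via information_schema. A rough sketch, where the schema and table names are placeholders for our actual ones, and where the row and size figures are estimates MySQL maintains rather than exact counts:

-- Approximate row count, average row length, and data/index size for one table.
SELECT TABLE_ROWS,
       AVG_ROW_LENGTH,
       ROUND(DATA_LENGTH  / 1024 / 1024 / 1024, 2) AS data_gb,
       ROUND(INDEX_LENGTH / 1024 / 1024 / 1024, 2) AS index_gb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'our_db'
  AND TABLE_NAME   = 'events';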

Now, this is not one of the primary tables that we deal with every day. The table is important, but we don't want to invest too much time and resources in managing it. We have an SSD drive of 1.6TB, and we want to avoid adding or upgrading servers or disks at this point.

What we did was reduce the row size to about 350 bytes by eliminating a bunch of diagnostic data that was being inserted. We moved the diagnostic data into log files and kept only the data required for queries and analytics in the table.
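The schema change itself amounts to dropping the diagnostic columns. A sketch along these lines, where the table and column names are purely illustrative and not our actual schema:

-- Hypothetical example: drop the diagnostic columns, keep only what queries need.
ALTER TABLE events
    DROP COLUMN debug_payload,
    DROP COLUMN request_headers,
    DROP COLUMN stack_trace;

Keep in mind that on a table of this size an ALTER may rebuild the whole table, so it needs to be scheduled for a quiet window.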

With this change, the table would grow by 900MB per day, or 27GB per month, or 330GB per year.

We have an SSD drive of 1.6TB, so we won't have to worry about disk space for 2-3 years.

A few points to remember while dealing with large tables:

  • Always have an auto-increment numeric primary key. Even if the table has a UUID or another type of unique key, an auto-increment numeric primary key helps with administration.
  • Always have a create-date column, which is of course a no-brainer, and index it. Archival or deletion will require the create-date column almost all the time (see the sketch after this list).
  • As we learnt from the experience above, keep the table small and do not add columns that can be avoided.
  • Allocate enough disk space to take full backups.
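Putting the first two points together, a minimal table skeleton would look roughly like this (table and index names are placeholders, and the real application columns are omitted), with the auto-increment primary key and an indexed create-date that archival jobs can filter on:

-- Illustrative skeleton only.
CREATE TABLE events (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    created_at DATETIME NOT NULL,
    -- ... application columns ...
    PRIMARY KEY (id),
    KEY idx_created_at (created_at)
) ENGINE=InnoDB;

-- Archival or deletion then becomes a range scan on the create-date index,
-- e.g. purging rows older than a year in small batches:
DELETE FROM events
WHERE created_at < NOW() - INTERVAL 1 YEAR
LIMIT 10000;

Deleting in small LIMIT-bounded batches keeps each transaction short, which matters on a table this large.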
