After 9 months sharing an Amazon EC2 instance with some of my other experimental projects it was finally time for TagWalk to “fly the nest”. The low spec virtual server was just not enough for TagWalk’s storage and data processing requirements. Looking at the economics of cost vs. performance, it made sense to pony up some money for a dedicated server as a new home for TagWalk.
Once the decision was made, it was very quick and painless to secure a new dedicated host. The hardest part was moving the TagWalk data mine from ‘the cloud’ to a real server.
It has been an interesting exercise, and an excellent opportunity to validate many aspects of the design, identify weaknesses, and learn some interesting and painful lessons — every day’s a school day right?
Non-technical:
- Migration is an excellent rehearsal for disaster recovery.
- Restoring a read-only mirror of the site was a good start goal.
- Create a journal file with all your command snippets.
Technical:
- Ubuntu Server 9.10 rocks.
- DNS can be very useful. Subdomains are great.
- Moving large volumes of data from one data center to another sounds daunting, but that was the easiest of jobs. The transfer rate was about 1.4MB/s and the generally speaking, the data could arrive faster than it could be dealt with.
- Sharding was a good idea which pays huge dividends (never in doubt)
- Waiting 20+ hours for a single table to restore really makes you question alternative methods of storage.
- Mono 2.4.3.4 has some issues which required VB.Net stuff to require “Option Strict”
MySQL Specific:
- MySQL imports can be *REALLY* slow. Under certain circumstances restoring large tables that have a UNIQUE INDEX can be very slow. I had heard anecdotal evidence but never really paid attention to it. Due to some rookie mistakes made along the way, it took me several attempts to restore my largest table. This problem alone set me back at least 2 days.
- Giving MySQL as much memory as possible can considerable improve import times. Optimize MySQL for the task in hand.
- MySQL’s innodb_file_per_table setting is the only way to go.
- The Linux nohup command is the only way to run long MySQL restore operations.
- Not using MySQL for the bulk of the data mine meant that for 70% of the data it was a zip-and-ship migration strategy. +1 for NoSQL!
- Representing a MD5 digest as 4 unsigned integers is much more efficient than a 32 character string.
- Restoring the data is easy, it is the indexes and constraint which take the time. A read-only version of your site can be up very quickly, the constraints are only necessary when you switch to read/write.
- MySQL table switcheroos with the RENAME TABLE command allows your site to use temporary copies of your tables which do not have all the indexes or use the right storage engine. The most effective large-table restore strategy I arrived at involved restoring MySQL tables into a MyISAM table without expensive indexes. Getting the site to use these for read only. Meanwhile, populate expensive versions of these tables with the full set of indexes (InnoDB + Unique indexes).
- Some big MySQL operations cannot be canceled. If an script load or INSERT/SELECT dies then MySQL want to “close tables” of “clean up”, this can take forever. In many cases, this required server reboots, and MySQL remove/installs to overcome.
The new server has more than double the memory which will give memcache and MySQL a much needed RAM massive boost. I also plumped for RAID as moving away from EC2′s Elastic Block Storage means a loss of redundancy on disk; a level of risk I am uncomfortable with, given the volumes of data
TagWalk has already settled into it’s new environment very quickly. The performance benefits are immediately apparent. As I begin to pay more attention I will be able to optimise this performance further. More importantly, the additional machine capacity will enable me to develop the next set of features.