Recovery Principles of MySQL Cluster 5.1
MySQL Cluster is a parallel main-memory database. It uses the normal MySQL software with a new storage engine, NDB Cluster. MySQL Cluster 5.1 has been adapted to also handle fields on disk. In this work a number of recovery principles of MySQL Cluster had to be adapted to handle very large data sizes. The article presents an efficient algorithm for synchronizing a starting node with very large data sets. It provides reasons for the unorthodox choice of a no-steal algorithm in the buffer manager. It also presents the algorithm to change the data.
MySQL Cluster is a parallel main-memory DBMS. MySQL Cluster 5.1 introduces fields on disk, which poses a number of challenges for the recovery architecture.
1.1 Description of MySQL Cluster
MySQL Cluster uses the normal MySQL Server technology paired with a new storage engine, NDB Cluster. Data within MySQL Cluster is synchronously replicated among the data nodes in the cluster. MySQL Cluster uses a shared-nothing architecture: data nodes in the cluster handle their own storage, and the only means of communication between the nodes is through messages. The main reason for choosing a shared-nothing architecture is to enable very fast fail-over at node failures. It also doesn’t rely on an advanced storage system, as most shared-disk systems do. This means that normal cluster hardware can also be used for building large database clusters.
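The shared-nothing idea described above can be illustrated with a toy sketch. This is not NDB Cluster's actual protocol: the class `DataNode`, the message format, and `replicated_write` are all hypothetical names chosen for illustration, and failure handling is left out entirely. It only shows nodes that own their storage, interact solely through messages, and acknowledge a write before it counts as done.

```python
class DataNode:
    """Toy data node: owns its storage; reachable only through messages."""

    def __init__(self, name):
        self.name = name
        self.storage = {}  # private to this node, never shared

    def receive(self, message):
        # Sending a message is the only way to read or change a node's state.
        if message["type"] == "write":
            self.storage[message["key"]] = message["value"]
            return {"type": "ack", "from": self.name}
        if message["type"] == "read":
            return {"type": "value", "value": self.storage.get(message["key"])}
        raise ValueError(f"unknown message type: {message['type']}")


def replicated_write(replicas, key, value):
    """Synchronous replication: succeed only once every replica has acked."""
    acks = [
        node.receive({"type": "write", "key": key, "value": value})
        for node in replicas
    ]
    return all(ack["type"] == "ack" for ack in acks)


nodes = [DataNode("node1"), DataNode("node2")]
committed = replicated_write(nodes, "k", "v")
```

Because each replica holds a full copy of the written data in its own storage, a surviving node can keep serving requests without access to the failed node's storage, which is what makes fast fail-over possible in this kind of design.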
Internally, MySQL Cluster uses a number of protocols. To handle single failures, most of these protocols have an extra protocol that handles failure of the master. To handle multiple failures, there is a further protocol that handles failure of the new master that takes over from the old master.
Applications can use the normal MySQL interface from PHP, Perl, Java, C++, C and so forth. The only difference is that the tables are created with ENGINE=NDBCLUSTER. In versions 4.1 and 5.0 of MySQL Cluster, all data resides in main memory, distributed over the data nodes in the cluster.
1.2 Challenges for Node Restart
Main memory is fairly limited in size. A starting node is synchronised by copying the entire data set to it. Even a large node with 10 Gbyte of data can be synchronised in this manner in 10-20 minutes. With fields on disk, the data set in a data node in MySQL Cluster can easily be 1 TB. The current synchronisation method would then take about one day, which is obviously too long. A solution is needed.
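The one-day figure follows from scaling the copy time linearly with data size. A rough check, assuming the copy rate implied by the 10 Gbyte figure stays constant and taking 1 TB as 1000 Gbyte (both are assumptions, and the function name is illustrative):

```python
def full_copy_minutes(data_gbyte, ref_gbyte=10.0, ref_minutes=(10.0, 20.0)):
    """Scale the reference copy time linearly with data size.

    Assumes constant throughput, so time grows in proportion to size.
    Returns (low, high) estimates in minutes.
    """
    factor = data_gbyte / ref_gbyte
    return tuple(m * factor for m in ref_minutes)


low, high = full_copy_minutes(1000.0)  # 1 TB taken as 1000 Gbyte
print(f"{low / 60:.1f} to {high / 60:.1f} hours")
```

Under these assumptions the estimate is roughly 17 to 33 hours, which is the order of magnitude of one day.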
