Having written an application to work with an unsharded MySQL setup, how do we migrate the application to a sharded setup ?
The above problem can be divided into two parts
- Migrating the MySQL Store (with minimum disruption to ongoing transactions)
- Migrating the Application.
This blog deals with the first problem of migrating the MySQL Store. The second problem of migrating the application will be handled in a separate blog post, and we shall assume for now that we just replace the old unsharded non-Fabric aware application with the new Fabric aware application.
Most enterprises start with a single server setup to develop a prototype and later realize that they need to scale. This blog presents a way for such a setup to seamlessly scale to a sharded setup with minimal disruption of ongoing transactions.
The following is the schema in the MySQL Server running on localhost:13013
The employee database has the following tables
Creating the Sharded Configuration
Let us assume that our final sharded configuration will have three shards. Each of these shards will be present in one group and there will be one global group.
The tables employee, salaries and dept_emp need to be sharded based on emp_no while fuel_reimb is a global table that needs to be present in all the shards.
Bring the Target server into the Fabric ecosystem
Create a Fabric group and add the server to it. Promote the server to master.
mysqlfabric group create GROUPID1
mysqlfabric group add GROUPID1 localhost:13013 root ""
mysqlfabric group promote GROUPID1
The server is now part of the Fabric setup.
Creating the Global Group
The global group in a sharding setup is used to propagate schema updates to all tables in the sharding setup and updates to global tables throughout the sharding ecosystem. The global group contains
- The schema of the sharded tables
- The global tables
NOTE: Once we have restored the schema on the global group, until the global group is connected to the server group of the shard (GROUPID1) i.e. Until we create the sharding setup, we need to ensure that schema changing transactions do not occur on the server group of the shard. If schema changes occur in the server group of the shard, these will need to be propogated to the global group also, otherwise it will result in mismatching schemas in the global group and the server group of the shard. This can cause problems for global operations that expect the new schema. Once the sharding setup is created, schema changes can be sent to the global group.
NOTE: Similarly once the global tables have been restored on the global group, we need to prevent changes to the global groups until the sharding setup is created, after which the changes can be propagated to the global group.
mysqlfabric group create GLOBAL_GROUP
mysqlfabric group add GLOBAL_GROUP localhost:13011 root ""
mysqlfabric group promote GLOBAL_GROUP
Creating the data in the Global Group
Backup the schema of all the tables and restore them on the on the global group. Also backup the data in the global tables and restore them on the global group.
Use mysqldump to backup the schemas
mysqldump -d -u root --single-transaction --all-databases --socket <Server socket file for localhost:13013> > schema.sql
NOTE: --no-data can be used instead of -d in the above command.
Use mysql client to restore the schemas
mysql -u root --socket <Server socket file for localhost:13011> < schema.sql
Use mysqldump to backup the global tables
mysqldump -u root --no-create-info --single-transaction --socket <server socket file for localhost:13013> employee fuel_reimb > global.sql
The GTID_EXECUTED variable would have been set on the destination server. This causes applying the dump using the mysql client to fail. Reset this to allow restoring the global table.
Restore the data on the global tables
mysql --database=employee -u root --socket <server socket file for localhost:13011> < global.sql
Creating the Sharding Setup
Create a sharding definition
A sharding definition introduces a particular sharding scheme to which multiple tables related by a common sharding key can be mapped. A Sharding definition is composed of
- Sharding scheme – RANGE, HASH
- Global Group – Gets the global updates for all the shard tables mapped to this sharding definition. It also stores the global tables in this sharding definition.
mysqlfabric sharding define HASH GLOBAL_GROUP
Map the tables to the sharding definition
Each sharding definition is associated with a unique ID. We map this unique ID to the table being sharded. Each mapping maps a table and the column containing the shard key to the sharding definition.
mysqlfabric sharding add_mapping 1 employee.employee emp_no
mysqlfabric sharding add_mapping 1 employee.dept_emp emp_no
mysqlfabric sharding add_mapping 1 employee.salaries emp_no
Define the Shards
The shard definitions define the way the data is sharded based on the shard key. For a hash based sharding scheme, since the shards are automatically defined on the shard key, we do not need to manually define the lower_bounds for each shard.
Each shard is associated with a sharding definition and a group on which the data will be present. When we define a shard we also need to specify if it will be ENABLED for operations on shards.
mysqlfabric sharding add_shard 1 GROUPID1 ENABLED
The sharding setup now contains a global group and one shard containing all the data.
Once the sharding setup has been created we can start using the Fabric enabled connector to start executing transactions. This would require shutting down the non-Fabric-aware application and replacing it by a Fabric-aware application.
Split the shards
Split the shard containing the data. Splitting helps to distribute the load into another group. This is a way to scale our setup for both read and write loads. The splitting can happen while transactions are ongoing.
Add another group to which we want to split the data
mysqlfabric group create GROUPID2
Add the servers that are part of the split group
mysqlfabric group add GROUPID2 localhost:13009 root ""
Promote a server as master in the split group
mysqlfabric group promote GROUPID2
Split the shard
If you do not know the shard ID, dump the sharding information to find the shard ID.
mysqlfabric store dump_sharding_information
split the shard
mysqlfabric sharding split 1 GROUPID2
Now the data is distributed between the GROUPID1 and the GROUPID2.
The shards can be split further depending on need. For example to split shard with shard ID = 3 into a group with group ID = GroupID3 we will run the following command,
mysqlfabric sharding split 3 GROUPID3
After the above command is executed the topology would look like the following.
Please find a few thoughts on maintaining a sharded system here http://vnwrites.blogspot.com/2013/09/mysqlfabric-sharding-maintenance.html