Database systems with large data sets or high throughput applications can challenge the capacity of a single database server.
You may be able to purchase the most expensive and fastest CPU or storage on the market, and it still may not be enough to handle your workload. The only feasible way to scale beyond the constraints of a single host is to use multiple hosts working together, either as part of a cluster or connected through replication.
Scaling reads this way is very efficient: just add a node and you can use its additional processing power. Writes are a different matter. Historically, MySQL replication used a single thread to process writes, which was a serious limitation in a multi-user, highly concurrent environment. In MySQL 5.6, multiple schemas could be replicated in parallel. In MySQL 5.7, after the addition of a ‘logical clock’ scheduler, it became possible for a single-schema workload to benefit from the parallelization of multi-threaded replication. Galera Cluster for MySQL also allows multi-threaded replication by using multiple workers to apply writesets. Still, even with those enhancements, you get only an incremental improvement in write throughput; they do not solve the problem.
One solution is to split the data across multiple servers according to some pattern and, in that way, to split the writes across multiple MySQL hosts. This approach is called sharding. The idea is really simple: if a single database server cannot handle the volume of writes, split the data and store one part, which generates part of the write traffic, on one database host and the other part on another host.
In that way, each host has to handle only half of the writes, which should be well within its hardware limits.
We can further split the data and distribute it on more servers if our write workload grows.
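As a rough illustration of the idea above, here is a minimal Python sketch of routing writes to hosts by key. It assumes a hash-modulo split, which is only one possible pattern and is picked here purely for illustration. The host names and the `shard_for` and `split_writes` helpers are hypothetical.

```python
import zlib
from typing import Dict, Iterable, List, Sequence

# Hypothetical shard map: these host names are placeholders, not real servers.
SHARD_HOSTS = ["mysql-shard-0.example.internal", "mysql-shard-1.example.internal"]


def shard_for(key: str, hosts: Sequence[str] = SHARD_HOSTS) -> str:
    """Return the host that owns `key`, using a stable hash modulo the host count."""
    # zlib.crc32 gives the same value in every process, unlike the built-in
    # hash(), which is randomized per interpreter run for strings.
    bucket = zlib.crc32(key.encode("utf-8")) % len(hosts)
    return hosts[bucket]


def split_writes(keys: Iterable[str], hosts: Sequence[str] = SHARD_HOSTS) -> Dict[str, List[str]]:
    """Group keys by the host that would receive their writes."""
    per_host: Dict[str, List[str]] = {host: [] for host in hosts}
    for key in keys:
        per_host[shard_for(key, hosts)].append(key)
    return per_host


if __name__ == "__main__":
    # Assumed key format, for illustration only.
    keys = [f"customer:{i}" for i in range(1000)]
    for host, owned in split_writes(keys).items():
        print(host, len(owned))
```

With a plain modulo like this, changing the number of hosts changes the owner of most keys, so adding a server means moving data between hosts, not just updating the list.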
The actual implementation is more complex as there are numerous issues you need to solve before you can implement sharding.
The first, very important question you need to answer is: how are you going to split your data?
Let’s imagine your application is built out of multiple modules, or microservices if we want to be fashionable.
Assume it’s a large online store with a backend of several warehouses.