亚马逊扩展出Web服务器的数据库服务器

We have a PHP based web server that is hosted on an AMI on Amazon EC2. This web server uses a MySQL server running on another RDS Amazon instance. Both setup works perfectly. We are trying to scale out the setup by putting them behind a load balancer and spin up new instances as load goes up and down the other way.

We were able to achieve spinning up the new web server instances. All these web servers connect to one database and hence all seems to work well. However during this process the database server becomes the bottleneck. We would like to have the database server to be scaled out. The web server and database server are to be scaled out in pairs and obviously the database servers are to be in sync. It is not clear how this has to be achieved and any help or reference or tutorial will be of great help.

I searched on Internet and read through Amazon docs but all drew blank.

Frankly I also do not know whether this is a SO related question or a SU or ServerFault site question. Please be lenient in case if it is otherwise.

Scaling out a database server is much more complicated than scaling out a web server. You will need to create read-replica instances, which RDS makes fairly easy. Read-replicas are kept in sync with the primary database instance. However, as the name implies, read-replicas are read-only database instances. Any insert/update/delete commands need to go back to the primary database server.

What this means is that you can't use simple load balancing to distribute the load. A load balancer would need to inspect each SQL request and understand which commands can go to which servers. Amazon's Elastic Load Balancer is not capable of this. If you search for "MySQL load balancer" you can find lots of articles and tutorials for setting up software like HAProxy to intelligently distribute the load across a cluster of MySQL instances.

Alternatively you can code your application to have knowledge of the different database servers available, and send updates to the primary server while sending queries to one of the read-replica servers. In your scenario I could see a case where each time you spin up a web server you also spin up a MySQL read-replica which that web server uses for queries.


As a side note, since you are using MySQL on Amazon RDS you should really take a look at Amazon's Aurora database. It is MySQL compatible so you don't have to change a single line of code, and it generally performs better than MySQL. If you switch to Aurora you may find that your database isn't as much of a bottleneck.

Amazon also recently added load balancing support to Aurora. With this new feature you still have a primary database for updates, but now you only have to keep track of a single read-replica endpoint which will distribute query load across all your read replica instances.