The monitoring, auto scaling, and elastic load balancing features of the Amazon EC2 services give you easy on-demand access to capabilities that once required a complicated system architecture and a large hardware investment.
Any real-world web application must have the ability to scale. This can take the form of vertical scaling, where larger and higher capacity servers are rolled in to replace the existing ones, or horizontal scaling, where additional servers are placed side-by-side (architecturally speaking) with the existing resources. Vertical scaling is sometimes called a scale-up model, and horizontal scaling is sometimes called a scale-out model.
At first, vertical scaling appears to be the easiest way to add capacity. You start out with a server of modest means and use it until it no longer meets your needs. You purchase a bigger one, move your code and data over to it, and abandon the old one. Performance is good until the newer, larger system reaches its capacity. You purchase again, repeating the process until your hardware supplier informs you that you’re running on the largest hardware that they have, and that you’ve no more room to grow. At this point you’ve effectively painted yourself into a corner.
Vertical scaling can be expensive. Each time you upgrade to a bigger system you also make a correspondingly larger investment. If you’re actually buying hardware, your first step-ups cost you thousands of dollars; your later ones cost you tens or even hundreds of thousands of dollars. At some point you may have to invest in a similarly expensive backup system, which will remain idle unless the unthinkable happens and you need to use it to continue operations.
Horizontal scaling is slightly more complex, but far more flexible and scalable in the long term. Instead of upgrading to a bigger server, you obtain another one (presumably of the same size, although there’s no requirement for this to be the case) and arrange to share the storage and processing load across two servers. When two servers no longer meet your needs, you add a third, a fourth, and so on. This scale-out model allows you to add resources incrementally and economically. As your fleet of servers grows, you can actually increase the reliability of your system by eliminating dependencies on any particular server.
Of course, sharing the storage and processing load across a fleet of servers is sometimes easier said than done. Loosely coupled systems tied together with SQS message queues, like those we saw and built in the previous chapter, can usually scale easily. Systems that rely on a traditional relational database or another centralized data store can be more difficult to scale.
Monitoring, Scaling, and Load Balancing
We’ll need several services in order to build a horizontally scaled system that automatically scales to handle load.
First, we need to know how hard each server is working. We have to establish how much data is moving in and out across the network, how many disk reads and writes are taking place, and how much of the time the CPU (Central Processing Unit) is busy. This functionality is provided by Amazon CloudWatch. After CloudWatch has been enabled for an EC2 instance or an elastic load balancer, it captures and stores this information so that it can be used to control scaling decisions.
Second, we require a way to observe the system performance, using it to make decisions to add more EC2 instances (because the system is too busy) or to remove some running instances (because there’s too little work for them to do). This functionality is provided by the EC2 auto scaling feature. The auto scaling feature uses a rule-driven system to encode the logic needed to add and remove EC2 instances.
Third, we need a method for routing traffic to each of the running instances. This is handled by the EC2 elastic load balancing feature. Working in conjunction with auto scaling, elastic load balancing distributes traffic to EC2 instances located in one or more Availability Zones within an EC2 region. It also uses configurable health checks to detect failing instances and to route traffic away from them.
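To make the routing idea concrete, here is a minimal Python sketch of round-robin distribution that skips instances failing a health check. It illustrates the general technique only and is not how elastic load balancing is implemented; the function name healthy_round_robin, the is_healthy callback, and the instance IDs are all hypothetical.

```python
from itertools import cycle, islice
from typing import Callable, Iterable, Iterator


def healthy_round_robin(
    instances: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Iterator[str]:
    """Yield instance IDs in rotation, skipping any that fail the health check.

    Hypothetical helper for illustration; a real load balancer runs its
    health checks in the background rather than once per request.
    """
    pool = list(instances)
    if not pool:
        return  # nothing to route to, so the generator yields nothing
    ring = cycle(pool)
    while True:
        # Try each instance at most once per request before giving up.
        for _ in range(len(pool)):
            candidate = next(ring)
            if is_healthy(candidate):
                yield candidate
                break
        else:
            raise RuntimeError("no healthy instances available")


# Made-up instance IDs; "i-b" is treated as failing its health check.
failing = {"i-b"}
router = healthy_round_robin(["i-a", "i-b", "i-c"], lambda i: i not in failing)
first_four = list(islice(router, 4))
print(first_four)
```

Because the health check is consulted on every pick, an instance that recovers is returned to the rotation automatically the next time its turn comes around.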
Figure 7-1 depicts how these features relate to each other.
An incoming HTTP load is balanced across a collection of EC2 instances. CloudWatch captures and stores system performance data from the instances. This data is used by auto scaling to regulate the number of EC2 instances in the collection.
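The feedback loop described above can be sketched as a simple threshold rule. This is an illustrative Python sketch under stated assumptions, not the auto scaling feature's actual rule format: the ScalingRule class, its threshold values, and the desired_instance_count function are hypothetical, and the CPU samples stand in for the kind of data CloudWatch would collect.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class ScalingRule:
    """Hypothetical rule; every value here is an assumption for illustration."""
    scale_out_above: float = 70.0  # add an instance above this average CPU %
    scale_in_below: float = 30.0   # remove an instance below this average CPU %
    min_instances: int = 2
    max_instances: int = 10


def desired_instance_count(
    current: int,
    cpu_samples: Sequence[float],
    rule: ScalingRule = ScalingRule(),
) -> int:
    """Return the instance count the rule asks for, given recent CPU samples."""
    if not cpu_samples:
        return current  # no data yet, so make no change
    average = mean(cpu_samples)
    if average > rule.scale_out_above:
        target = current + 1   # the fleet is too busy
    elif average < rule.scale_in_below:
        target = current - 1   # too little work for the fleet to do
    else:
        target = current
    # Keep the fleet within the configured bounds.
    return max(rule.min_instances, min(rule.max_instances, target))


busy = desired_instance_count(3, [85.0, 90.0, 78.0])
idle = desired_instance_count(3, [10.0, 20.0, 15.0])
print(busy, idle)
```

Adding or removing one instance at a time, with separate upper and lower thresholds, keeps the fleet from oscillating when the load hovers near a single cutoff.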
As you’ll soon see, you can use each of these features on its own or you can use them together. This modular model gives you a lot of flexibility and also allows you to learn about the features in an incremental fashion.
This article is from the book Host Your Web Site In The Cloud: Amazon Web Services Made Easy.