Administrators are occasionally faced with the task of sizing their application farms properly so that they can sustain network growth for years to come. They follow all the best practices:
- Understanding the company objectives and upcoming Internet related projects
- Establishing the estimated growth number of subscribers/users
- Working with the network team to understand how the network will grow
- Estimating the total traffic per second generated by clients and integrated systems
- Sizing the farms according to demographic demands
- Implementing traditional security technologies like load balancers, firewalls and intrusion prevention systems
They do everything right.
Unfortunately, after all the sizing work is done, they find that application traffic is increasing in unprecedented ways and that their applications are unable to handle the load spikes.
Their services and applications collapse again and again.
Finally, administrators are asked by their less-than-happy management team:
“How could you have sized the service/application so poorly?”
The lesson to be learned here is: You can do everything right when sizing your application/service, but you need to be ready for the unpredictable.
What is the unpredictable?
Attacks and legitimate traffic peaks.
Knowing this, how can you ready your network?
Step 1 of 3: Clean the pipe
Let's assume, for example, that you have an application running on a web farm and you're constantly being hammered by bogus HTTP traffic.
The best way to handle this traffic is to block it up front. Intrusion prevention systems, next-generation (NG) firewalls, anti-DDoS systems and traffic control engines (the DPI-based ones) are great tools to ensure that your farm receives only valid traffic.
Step 2 of 3: Close the pipe
Let's assume that your service/application server pool can handle 150,000 transactions per second (tps) at its best. If this is true, why would you ever allow more than the maximum supported number of transactions to reach your servers? Why not rate-limit the traffic to guarantee that your servers receive only the traffic they can handle?
This is a completely valid approach because we’re talking about protecting an infrastructure. Remember, our task as security and network professionals is to keep the network and the Internet running even when under attack or heavy load.
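As a rough illustration of the idea, here is a minimal token-bucket sketch in Python. It is not how any particular load balancer or firewall implements rate limiting; the class name, the burst size and the injectable clock are assumptions made for the example. Only the 150,000 tps figure comes from the scenario above.

```python
import time


class TokenBucket:
    """Token bucket: admits at most `rate` transactions per second on average."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)        # tokens added per second
        self.capacity = float(burst)   # maximum tokens that can accumulate
        self.tokens = float(burst)     # start with a full bucket
        self.clock = clock             # injectable so the limiter can be tested
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True if the transaction may pass, False if it should be dropped."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, never beyond the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Pool capacity from the example above; a burst of one second's worth is an assumption.
POOL_LIMIT_TPS = 150_000
farm_limiter = TokenBucket(rate=POOL_LIMIT_TPS, burst=POOL_LIMIT_TPS)
```

Each incoming transaction would call `farm_limiter.allow()` and be forwarded only when it returns True. In practice this control usually lives in the network device in front of the farm rather than in application code.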
Step 3 of 3: Estimate the load
The last important aspect to consider is controlling the traffic generated by clients and networks. I cannot imagine a computer, tablet, or smartphone that can generate more than 10 transactions per second, even considering all the traffic that happens in the background (e.g. updates, synchronizations, etc.) in addition to the human-generated traffic.
Establishing a limit on the transactions per second generated by each IP address sounds like a good idea, and it is something many administrators can implement (every case is different, of course).
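A per-IP limit like this can be sketched with a simple fixed-window counter. The snippet below is only an illustration: the class name and the one-second window are assumptions, and the default of 10 tps is the rough per-device ceiling suggested above, not a recommended value.

```python
import time
from collections import defaultdict


class PerClientLimiter:
    """Fixed-window counter: at most `limit` transactions per key per second."""

    def __init__(self, limit=10, clock=time.monotonic):
        self.limit = limit
        self.clock = clock              # injectable so the limiter can be tested
        self.window = None              # the one-second window being counted
        self.counts = defaultdict(int)  # transactions seen per key in this window

    def allow(self, key):
        """Return True if `key` (for example, a source IP) is still under its limit."""
        window = int(self.clock())
        if window != self.window:       # a new second has started: reset all counters
            self.window = window
            self.counts.clear()
        self.counts[key] += 1
        return self.counts[key] <= self.limit


limiter = PerClientLimiter(limit=10)
print(limiter.allow("198.51.100.23"))   # first transaction from this address passes
```

A fixed window is the simplest option, but it lets a client send up to twice the limit across a window boundary. A sliding window or a token bucket per key smooths that out at the cost of more state.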
But sometimes you cannot simply rate-limit the transactions per second generated by an IP address, because many IP addresses are used for network address translation (NAT). This means you may have many clients sharing a single IP address. In those cases, you can monitor the traffic generated by each subnet and estimate a safe limit to control the aggregated traffic. You can also control the traffic load directed at your individual servers.
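One way to picture the subnet approach is to count transactions per enclosing subnet instead of per address. This is a hedged sketch only: the /24 prefix, the class and function names, and the limit used in the usage example are assumptions, and a real limit should come from monitoring your own traffic.

```python
import ipaddress
import time
from collections import defaultdict


def subnet_key(ip, prefix=24):
    """Map an IPv4 address to its enclosing subnet, e.g. 203.0.113.7 -> 203.0.113.0/24."""
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))


class SubnetLimiter:
    """Fixed-window counter on the aggregated traffic of each subnet."""

    def __init__(self, limit, prefix=24, clock=time.monotonic):
        self.limit = limit              # aggregated tps allowed per subnet
        self.prefix = prefix            # subnet size used for aggregation
        self.clock = clock              # injectable so the limiter can be tested
        self.window = None
        self.counts = defaultdict(int)

    def allow(self, ip):
        """Return True if the subnet containing `ip` is still under its limit."""
        window = int(self.clock())
        if window != self.window:       # a new second has started: reset all counters
            self.window = window
            self.counts.clear()
        key = subnet_key(ip, self.prefix)
        self.counts[key] += 1
        return self.counts[key] <= self.limit


# Hypothetical limit: 500 tps for all clients behind the same /24.
nat_limiter = SubnetLimiter(limit=500)
print(subnet_key("203.0.113.7"))        # 203.0.113.0/24
```

The same counter could be keyed by destination server instead of source subnet to cap the load directed at an individual server.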
Next steps
Sizing your service and application properly is a critical task, but it's just as important to implement countermeasures that ensure your system runs under the conditions it was designed for. Failure to recognize and implement those controls can jeopardize your service/application and your business.
Be ready for the unpredictable.
Best Regards