Scaling Your Web Application: Basic Steps

It’s not enough to build apps for your business, you need to optimize them. One effective way is to scale. In this article, you’ll learn about code optimization, architecture optimization, and how to build scalable web applications in general.


Gearheart suggests asking yourself the following questions:

  • are the database queries optimal (EXPLAIN analysis, use of indexes)?
  • is the data stored correctly (SQL vs NoSQL)?
  • is caching used?
  • no unnecessary queries to FS or database?
  • are the data processing algorithms optimal?
  • are the environment settings optimal: Apache/Nginx, MySQL/PostgreSQL, PHP/Python?

As each of these issues can be the subject of a separate article, their detailed examination within the framework of this article is manifestly excessive. It is important to understand that before starting to scale an application, it is highly desirable to optimize its work as much as possible – in fact, it is then possible that no scaling will be necessary.


Suppose you have already optimized your application, but it is still not able to handle the load. In this case, the obvious solution is to spread the application across multiple hosts in order to increase the overall performance of the application by increasing the available resources. This approach is officially called “scaling” the application. More specifically, scalability is the ability of a system to increase its performance by increasing the amount of resources available to it.

There are two types of scalability: vertical and horizontal. Vertical scalability involves increasing application performance by adding resources (CPU, memory, disk) within a node (host). Horizontal scaling is typical of distributed applications and involves increasing application performance by adding another node.

Clearly the easiest way is a simple hardware upgrade (CPU, memory, disk) – i.e. vertical scaling. Moreover, this approach does not require any modification of the application. However, the vertical scaling quickly reaches its limit, after which the developer and administrator have no choice but to switch to horizontal scaling of the application.

Application Architecture

Most web applications are a priori distributed, because their architecture can be divided into at least three layers: web-server, business logic (application), data (database, static).

Each of these layers can be scaled. So if your system has an application and a database residing on the same host, the first step should definitely be to separate them on different hosts.

The bottleneck

When scaling the system, the first thing to do is to determine which of the layers is the “bottleneck”, i.e. slower than the rest of the system. To start, you can use trivial utilities like top (htop) to estimate CPU/memory consumption and df, iostat to estimate disk consumption. However, it is desirable to provide a separate host with combat load emulation (using AB or JMeter), on which you can profile the application using utilities such as xdebug, oprofile, etc. You can use utilities like pgFouine to identify narrow database queries (of course this is best done based on combat server logs).

It usually depends on the architecture of the application, but in general the most likely bottleneck candidates are the database and the code. If your application handles a lot of user data, the bottleneck is probably static storage.

Database scaling

As mentioned above, the bottleneck of modern applications is often the database. The issues with this are generally divided into two categories: performance and the need to store a large amount of data.

You can reduce the load on the database by dividing it into multiple hosts. There is an acute difficulty of synchronization between them, which can be resolved by implementing the master/slave scheme with synchronous or asynchronous replication. For PostgreSQL you can use Slony-I for synchronous replication and PgPool-II or WAL (9.0) for asynchronous replication. To solve the problem of separating read and write requests, as well as balancing the load between slaves, you can configure a special database access layer (PgPool-II).

The concern of storing large amounts of data in the case of relational databases can be solved by partitioning (“partitioning” in PostgreSQL), or by deploying the database on a distributed database like Hadoop DFS.

You can read more about both solutions in the excellent PostgreSQL configuration book.

1.However, for storing large amounts of data, the best solution is sharding, which is an inherent advantage of most NoSQL databases (eg MongoDB).

2. Additionally, NoSQL databases in general run faster than their SQL brethren due to lack of overhead for query analysis/optimization, data structure integrity checking, etc The topic of comparing relational and NoSQL databases is also quite extensive and deserves a separate article.

3. Note separately the experience of Facebook, which uses MySQL without JOIN selections. This strategy allows them to scale the database much more easily, while shifting the load from the database to the code, which, as will be described below, scales more easily than the database.

Code scaling

  • The complexity of scaling code depends on how many shared resources your hosts need to run your application. Will it be sessions only or will you have to share caches and files? In any case, the first thing to do is to run copies of the application on multiple hosts with the same environment.
  • Next, you need to configure load/query balancing between these hosts. You can do this both over TCP (HAProxy), HTTP (nginx) or DNS.
  • The next step, Gearheart mentioned, is to make static files, cache, and web application sessions available on each host. For sessions, you can use a server running on the network (for example, Memcached). As a cache server, it makes sense to use the same Memcached, but on a different host, of course.
  • Static files can be mounted from shared file storage via NFS/CIFS or using distributed FS (HDFS, GlusterFS, Ceph).

It is also possible to store files in a database (e.g. Mongo GridFS), thus solving the problem of availability and scalability (taking into account that for NoSQL database, the problem of scalability is solved by sharding).

Note separately the issue of deployment on multiple hosts. How to make sure that the user clicking on “Update” does not see different versions of the application? The simplest solution, in my opinion, would be to exclude from the configuration load balancer (web server) hosts that are not updated and activate them sequentially as updates occur. You can also bind users to specific hosts by cookie or IP. If the update requires significant database changes, the easiest way is to temporarily close the project.

Comments are closed.