To build reliable, fault-tolerant infrastructure, load balancing across servers is essential. This is the job of the load balancer: a service that distributes work within a cluster and keeps the system running even if one of the servers fails.
In this article you will learn how load-balancing algorithms and methods work, what points of failure are, and how a load balancer differs from a proxy.
What is load balancing
Load balancing is a method of distributing network traffic and tasks between network devices.
At the start of service development, all the components (frontend, backend, database) may live on the same server. As the load grows, it can be scaled vertically: switch to a more powerful server configuration, or quickly add resources to a cloud server by increasing the number of vCPUs or the amount of memory.
This will be enough for a while, but eventually the server's capacity may become insufficient, and the tasks will be split across several servers: the frontend moves to a separate server, the backend to a second one, and the database to yet another machine. Each of these servers can also be "upgraded" vertically.
Horizontal scaling, where the number of servers dedicated to one task grows, is the next step in infrastructure development. It allows workloads to be managed better and makes the infrastructure more flexible.
With horizontal scaling, a new infrastructure element is required - a load balancer to distribute traffic.
Balancing is applicable to the following devices:
- proxy server,
- DPI system,
- DNS server,
- network adapter.
Why a load balancer is needed
An increase in the number of users and the volume of traffic leads to an increase in the load on the service infrastructure. The load balancer ensures that no single server is overloaded with traffic and that data moves efficiently between cluster components.
Fault tolerance is the main goal. Application complexity is growing, the number of points of failure is increasing, and they may be located at different infrastructure levels, be they servers or networks. The balancer allows you to avoid a single point of failure - the part of the system that, if it fails, stops all work. If one server fails, the balancer distributes traffic to the rest of the infrastructure.
The balancer allows you to make better use of resources and serve requests faster. For example, if you have two database servers, the load balancer will make sure that both are equally loaded.
The balancer will also provide a smoother scaling of your infrastructure: when you grow horizontally - adding a new server to the cluster - it will quickly and accurately load the new "link" of the infrastructure.
Another important balancer feature is protection against DDoS attacks. The balancer delays binding the connection, so the backend servers do not see the client until the TCP handshake completes. Selectel's load balancer routes traffic through special algorithms that filter out up to 99.9% of TCP ACK/FIN/RST attacks.
Examples of points of failure
- The Internet entry point that gives users access to the service: the IP address and the network equipment through which traffic flows to the target servers.
- Physical servers or virtual machines on which the service is deployed. These are made redundant to prevent system crashes.
- The balancer itself. The hardware the balancer runs on can also be a weak link, which is why companies for which fault tolerance is critical make the balancer redundant as well.
Balancing layers on the OSI model
Open Systems Interconnection (OSI) is a model of the network protocol stack. In the OSI model, the functions of network devices are divided into distinct layers. Balancing occurs at layers L4 and L7.
L1. The physical layer exchanges signals between physical devices. At this layer, data is represented as bits.
L2. The data link layer is responsible for physical addressing: bits are assembled into frames, which carry sender and receiver addresses.
L3. The network layer performs routing and logical addressing between devices; frames are combined into data packets. Possible network faults are handled at this layer. A router is responsible for these functions.
L4. The transport layer maintains communication between end devices. Its main functions are to minimize data transmission delay and to preserve the integrity of delivered information. Depending on the transport protocol, data is split according to one of two principles.
Segmentation (TCP) divides a stream into parts when it exceeds what the network can carry at once. Division into datagrams (UDP) produces autonomous units, each with its own headers and destination address; datagrams reach their destination independently.
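The datagram principle can be seen in a short Python sketch (addresses and messages are illustrative): each UDP send produces an independent datagram, and its message boundary is preserved on receipt.

```python
import socket

# Minimal sketch: a UDP datagram is a self-contained unit with its own
# destination address, and recvfrom() returns exactly one datagram.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))        # let the OS pick a free port
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"first", addr)          # two independent datagrams
sender.sendto(b"second", addr)

msg1, _ = receiver.recvfrom(64)        # message boundaries are preserved
msg2, _ = receiver.recvfrom(64)
```

A TCP connection, by contrast, delivers a single byte stream with no such boundaries.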
The transport layer is the link between the two groups:
- Media layers, L1 to L3. These are where information is transferred between network devices.
- Host layers, L4 to L7. These run directly on desktops or mobile devices.
A Layer 4 load balancer tracks network information about protocols and application ports, combines this data with balancing algorithms, and selects the target server with the shortest response time.
L5. The session layer manages the communication session. Its functions include task synchronization and the creation, maintenance during inactivity, and termination of sessions. Starting with L5, the layers work with the data itself.
L6. The presentation layer encrypts data and converts it into a form that both server and user understand.
L7. The application layer handles communication between the user and the network. The top layer provides access to network services. The HTTP protocol running at L7 identifies client sessions and, based on cookies, ensures that a user's requests are delivered to the same server.
The top layer of the OSI model interacts with the layers below it: it inspects L4 traffic, sends error reports and requests to L6, and passes on service information.
Availability check. The balancer monitors server statuses and redirects connections if one of the servers fails. There are several parameters for assessing server status:
- responses to health-check requests, sent at a configurable interval in seconds,
- network latency,
- expected response codes for HTTP and HTTPS checks,
- success/fail threshold: the number of consecutive probes after which the server is kept in service or taken out of rotation.
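The success/fail threshold logic can be sketched as follows (class and parameter names are illustrative, not a real balancer API): the state flips only after N consecutive results in the same direction.

```python
class HealthCheck:
    """Sketch of consecutive success/fail threshold logic (illustrative names)."""

    def __init__(self, success_threshold=2, fail_threshold=3):
        self.success_threshold = success_threshold
        self.fail_threshold = fail_threshold
        self.successes = 0
        self.failures = 0
        self.healthy = True

    def record(self, ok):
        """Feed one probe result; a single blip does not change the state."""
        if ok:
            self.successes += 1
            self.failures = 0
            if self.successes >= self.success_threshold:
                self.healthy = True
        else:
            self.failures += 1
            self.successes = 0
            if self.failures >= self.fail_threshold:
                self.healthy = False
        return self.healthy

hc = HealthCheck()
for probe in [True, False, False, False, True, True]:
    hc.record(probe)   # goes unhealthy after 3 fails, recovers after 2 successes
```

Requiring several consecutive results prevents a single lost probe from pulling a healthy server out of rotation.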
Setting up connections. Connections follow the scheme: incoming request → balancer → server. Configurable settings include:
- a limit on the number of connections,
- connection timeout: the time to wait for a response when establishing a connection,
- inactivity timeout: the time during which a connection is considered active even if no data is transmitted,
- TCP timeout: the time to wait for data to be transmitted for inspection on an established connection.
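The difference between the connect timeout and the inactivity timeout can be sketched with plain sockets (addresses and timeout values are illustrative):

```python
import socket

# Connect timeout: how long to wait for the server to accept the connection.
# Inactivity timeout: how long to wait for data on an established connection.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))          # let the OS pick a free port
listener.listen(1)                       # a quiet server that never replies
addr = listener.getsockname()

conn = socket.create_connection(addr, timeout=1.0)  # connect timeout
conn.settimeout(0.2)                                # inactivity timeout
try:
    conn.recv(1)                         # no data ever arrives -> times out
    timed_out = False
except socket.timeout:
    timed_out = True
```

A balancer applies the same two limits on its connections to backend servers, dropping clients that connect but then go silent.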
Balancing algorithms
A balancing algorithm distributes incoming requests among the servers in the cluster. It helps solve several problems: optimizing resource utilization, achieving maximum throughput, reducing response time, and preventing overload of any single system component.
BGP Anycast. The advantage of this routing protocol is a single IP address for multiple servers. The least busy server responds to requests, which minimises delays in receiving traffic. The protocol supports flexible provisioning and the addition of new servers. Selectel uses BGP Anycast.
Round Robin. A cyclic service algorithm used by the balancer: RR distributes requests to servers evenly, in a cycle, according to pre-defined weights.
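A weighted Round Robin can be sketched in a few lines of Python (server names and weights are illustrative): each server appears in the cycle in proportion to its weight.

```python
import itertools

def weighted_round_robin(weights):
    """Yield servers cyclically; a server with weight w appears w times per cycle."""
    pool = [server for server, w in weights.items() for _ in range(w)]
    return itertools.cycle(pool)

# app1 has twice the capacity of app2, so it receives twice the requests
rr = weighted_round_robin({"app1": 2, "app2": 1})
order = [next(rr) for _ in range(6)]
```

Production implementations usually interleave weighted picks more smoothly, but the proportion of requests per server is the same.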
Least Connections. An algorithm that counts the number of connections to each server: every incoming request is sent to the server with the fewest active connections. Its weakness shows up when balancing between multiple frontend servers. When a user establishes a connection to a frontend server, that server holds the session. If another frontend server is less loaded, the user's next request may be sent to it, and the user will have to re-authenticate.
Sticky Sessions solves this problem: the server that processed the first request is assigned to the user's session. The session moves to another server only if the original server becomes unavailable.
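Both ideas can be combined in a small sketch (class, server, and session names are illustrative): a new session goes to the server with the fewest active connections, and an existing session stays pinned to its server while that server is up.

```python
class StickyLeastConnections:
    """Sketch of Least Connections with Sticky Sessions (illustrative names)."""

    def __init__(self, servers):
        self.alive = list(servers)                  # servers currently in service
        self.connections = {s: 0 for s in servers}  # active connections per server
        self.sessions = {}                          # session id -> pinned server

    def pick(self, session_id):
        pinned = self.sessions.get(session_id)
        if pinned in self.alive:                    # sticky: reuse the same server
            server = pinned
        else:                                       # new session or failed server:
            server = min(self.alive, key=lambda s: self.connections[s])
            self.sessions[session_id] = server      # pin for subsequent requests
        self.connections[server] += 1
        return server

lb = StickyLeastConnections(["fe1", "fe2"])
first = lb.pick("alice")    # least-loaded server handles the new session
again = lb.pick("alice")    # same session stays on the same server
```

If the pinned server is removed from `alive`, the next `pick` for that session falls back to least connections and re-pins it, matching the failover behaviour described above.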
Load balancing and proxying
Reverse proxy servers and load balancers act as intermediaries in communication between clients and servers. The terms "load balancer" and "proxy server" are often used interchangeably. For example, Selectel's fault-tolerant load balancer is a reverse proxy that distributes traffic between different company services located in different regions and availability zones.
The reverse proxy receives a request from a client, forwards it to a server that can handle it, and returns the response. Deploying a reverse proxy makes sense even if there is only one web or application server. Not all proxies are load balancers, but for the vast majority load balancing is a primary function.
A load balancer distributes incoming client requests among a group of servers, in each case returning the response from the selected server to the appropriate client. Load balancers are used when the volume of requests is too large to be handled efficiently by a single server. Deploying multiple servers also eliminates the problem of a single point of failure.
Benefits of a cloud balancer
For a long time, load balancers were hardware appliances with dedicated equipment configured for them. Cloud-based load balancers are now popular because they offer several advantages:
- Quick deployment of the load balancer in a new configuration. At Selectel, this can be done in a few minutes in the control panel interface.
- Hardware balancers can fail to cope with the load and break down. Cloud load balancers are more robust in this respect and are also easier to scale vertically.
- Cloud balancers are cheaper. For example, Selectel charges for load balancers on a pay-as-you-go (pay-per-capacity) model: the solution can be activated during a "hot" period for the business and deactivated when the need for balancing is no longer acute.
Cloud balancers support various protocol combinations for handling load at L4 and L7:
- TCP-TCP: classic L4 balancing,
- TCP-PROXY: client information is preserved and sent in a separate connection header,
- UDP-UDP: UDP is faster than TCP but less reliable,
- HTTP-HTTP: classic L7 balancing,
- HTTPS-HTTP: L7 balancing with encryption and SSL certificate termination on the balancer.
This concludes our review of load balancers. We covered modern network load balancing approaches, load balancer functionality, and load balancing algorithms.
Load balancers are a mandatory element of a complex infrastructure consisting of multiple servers and requiring smart approaches to traffic management.