Hypervisors virtual network performance comparison from a Virtualized load-balancer point of view

Introduction

At HAProxy Technologies, we develop and sell a load-balancer appliance called ALOHA (which stands for Application Layer Optimisation and High-Availability).
A few months ago, we managed to make it run on the most common hypervisors available:

  • VMware (ESX, vSphere)
  • Citrix XenServer
  • Microsoft Hyper-V
  • Xen (open source)
  • KVM

<ADVERTISEMENT>So whatever your hypervisor is, you can run an Aloha on top of it 🙂</ADVERTISEMENT>

Since a Load-Balancer appliance is Network IO intensive, we thought it was a good opportunity to bench each Hypervisor from a virtual network performance point of view.

More and more companies use virtualization in their infrastructures, so we guessed that a lot of people would be interested in the results of this benchmark; that’s why we decided to publish them on our blog.

Things to bear in mind about virtualization

One of the interesting features of virtualization is the ability to consolidate several servers onto a single piece of hardware.
As a consequence, the resources (CPU, memory, disk and network IOs) are shared between several virtual machines.
Another thing to take into account is that the hypervisor is a new “layer” between the hardware and the OS inside the VM, which means that it may have an impact on performance.

Purpose of benchmarking Hypervisors

First of all: WE ARE TOTALLY NEUTRAL AND HAVE NO INTEREST IN SAYING GOOD OR BAD THINGS ABOUT ANY HYPERVISOR.

Our main goal here is to check whether each hypervisor performs well enough to allow us to sell our Virtual Appliance on top of it.
From the tests we’ll run, we want to be able to measure the impact of each hypervisor on the Virtual Machine’s performance.

Benchmark platform and procedure

To run these tests, we use the same server for all hypervisors, just swapping the hard drive, in order to run each hypervisor independently.

Hypervisor hardware summarized below:

  • CPU: quad-core Intel Core i7 @ 3.4GHz
  • Memory: 16GB
  • Network card: 1Gbps copper (Intel, e1000e driver)

NOTE: we benched some other network cards and got UGLY results (see the conclusion).
NOTE: there is a single VM running on the hypervisor: the Aloha.

The Aloha Virtual Appliance used is the Aloha VA 4.2.5, with 1GB of memory and 2 vCPUs.
The client and WWW servers are physical machines plugged into the same LAN as the hypervisor.
The client tool is inject and the web server behind the Aloha VA is httpterm.
So basically, the only thing that changes between these tests is the hypervisor.

The Aloha is configured in reverse-proxy mode (using HAProxy) between the client and the server, load-balancing and analyzing HTTP requests.
We focused mainly on virtual networking performance: the number of HTTP connections per second and the associated bandwidth.
We ran the benchmark with different object sizes: 0, 1K, 2K, 4K, 8K, 16K, 32K, 48K, 64K.
NOTE: by “HTTP connection”, we mean a single HTTP request with its response over a single TCP connection, like in HTTP/1.0.

Basically, the 0K object test is used to get the number of connections per second the VA can do and the 64K object is used to measure the maximum bandwidth.

NOTE: the maximum bandwidth will be 1Gbps anyway, since we are limited by the physical NIC.
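A quick back-of-envelope check: 1Gbps is roughly 125MB/s, so with 64K objects the wire itself caps the rate at around 1,900 HTTP connections per second, headers not included.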

We are only going to bench network IOs, since this is what a load-balancer uses most intensively.
We won’t bench disk IOs…

Tested Hypervisors


We benched a native Aloha against the Aloha VA embedded in each of the hypervisors listed below:

  • Microsoft Hyper-V
  • RHEV (KVM based)
  • vSphere 5.0
  • Xen 4.1 on Ubuntu 11.10
  • XenServer 6.0

Benchmark results


Raw server performance (native tests, without any hypervisor)

For the first test, we run the Aloha on the server itself without any Hypervisor.
That way, we’ll have some figures on the capacity of the server itself. We’ll use those numbers later in the article to compare the impact of each Hypervisor on performance.

[Graph: native performance]

Microsoft Hyper-V


We tested Hyper-V on a Windows Server 2008 R2 machine.
For this hypervisor, 2 network cards are available:

  1. Legacy network adapter: emulates the network layer through the tulip driver.
    ==> With this driver, we got around 1.5K requests per second, which is really poor…
  2. Network adapter: requires the hv_netvsc driver, open-sourced by Microsoft and included in the Linux kernel since 2.6.32.
    ==> This is the driver we used for the tests.

[Graph: Hyper-V performance]

RHEV 3.0 Beta (KVM based)

RHEV is Red Hat’s hypervisor, based on KVM.
For the virtualization of the network layer, RHEV uses the virtio drivers.
Note that RHEV was still in beta when we ran this test.

VMware vSphere 5

There are 3 types of network cards available for vSphere 5.0:
1. Intel e1000: e1000 driver, emulates the network layer inside the VM.
2. VMXNET 2: allows network layer virtualization
3. VMXNET 3: allows network layer virtualization
The best results were obtained with the VMXNET 2 driver.

Note: we tested neither vSphere 4 nor ESX 3.5.

[Graph: vSphere 5.0 performance]

Xen OpenSource 4.1 on Ubuntu 11.10

Since CentOS 6.0 does not provide open-source Xen in its official repositories, we decided to use the latest Ubuntu server distribution (Oneiric Ocelot), with Xen 4.1 on top of it.
Xen provides two network interfaces:

  1. an emulated one, based on the 8139too driver
  2. a virtualized network layer, based on the xen-vnif driver

Of course, the results are much better with xen-vnif, so we used this driver for the tests.

[Graph: Xen 4.1 performance]

Citrix XenServer 6.0


The network driver used for XenServer is the same as for open-source Xen: xen-vnif.

[Graph: XenServer 6.0 performance]

Hypervisors comparison


HTTP connections per second


The graph below summarizes the HTTP connections per second capacity for each hypervisor.
It shows the hypervisor overhead: compare each hypervisor’s line to the light blue one, which represents the server capacity without any hypervisor.

[Graph: HTTP connections per second comparison]

Bandwidth usage


The graph below summarizes the bandwidth usage for each hypervisor.
Again, it shows the hypervisor overhead: compare each hypervisor’s line to the light blue one, which represents the server capacity without any hypervisor.

[Graph: bandwidth comparison]

Performance loss


Well, comparing hypervisors to each other is nice, but remember, we wanted to know how much performance is lost in the hypervisor layer.
The graph below shows, as a percentage, the loss generated by each hypervisor when compared to the native Aloha.
The higher the percentage, the worse for the hypervisor…

[Graph: performance loss comparison]

Conclusion

  • The hypervisor layer has a non-negligible impact on the networking performance of a virtualized load-balancer running in reverse-proxy mode.
    But I guess it would be the same for any VM which is network IO intensive.
  • The shorter the connections, the bigger the impact.
    For very long connections like TSE, IMAP, etc… virtualization might make sense.
  • vSphere seems ahead of its competitors from a performance point of view.
  • Hyper-V and Citrix XenServer deliver interesting performance.
  • RHEV (KVM) and open-source Xen can still improve their performance, unless this is related to our procedure.
  • Even if the VM no longer accesses the hardware directly, the hardware still has a huge impact on performance.
    For example, on vSphere, we could not go higher than 20K connections per second with a Realtek NIC in the server, while with an Intel e1000e we got up to 55K connections per second.
    So, even when you use virtualization, hardware counts!

load balancing, affinity, persistence, sticky sessions: what you need to know

Synopsis

To ensure high availability and performance of Web applications, it is now common to use a load-balancer.
While some people use layer 4 load-balancers, it is sometimes recommended to use layer 7 load-balancers to work more efficiently with the HTTP protocol.

NOTE: To better understand the difference between these load-balancers, please read the Load-Balancing FAQ.

A load-balancer in an infrastructure

The picture below shows how we usually install a load-balancer in an infrastructure:
[Diagram: a load-balancer deployed as a reverse proxy]
This is a logical diagram. When working at layer 7 (aka Application layer), the load-balancer acts as a reverse proxy.
So, from a physical point of view, it can be plugged anywhere in the architecture:

  • in a DMZ
  • in the server LAN
  • in front of the servers, acting as the default gateway
  • far away, in another, separate datacenter

Why is load-balancing web applications a problem?


Well, HTTP is not a connection-oriented protocol: the session is totally independent from the TCP connections.
Even worse, an HTTP session can be spread over several TCP connections…
When there is no load-balancer involved, there is no issue at all: the single application server is aware of the session information of all users, and whatever the number of client connections, they all reach that unique server.
When using several application servers, the problem occurs: what happens when a user sends requests to a server which is not aware of its session?
The user will get back to the login page, since the application server can’t access his session: he is considered a new user.

To avoid this kind of problem, there are several ways:

  • Use a clustered web application server, where the sessions are available to all the servers
  • Share the users’ session information in a database or a file system on the application servers
  • Use IP level information to maintain affinity between a user and a server
  • Use application layer information to maintain persistence between a user and a server

NOTE: you can mix the different techniques listed above.

Building a web application cluster

Only a few products on the market allow administrators to create a cluster (like WebLogic, Tomcat, JBoss, etc…).
I’ve never configured any of them, but from the administrators I talk to, it does not seem to be an easy task.
By the way, for web applications, clustering does not mean scaling. Later, I’ll write an article explaining why, even if you’re clustering, you may still need a load-balancer in front of your cluster to build a robust and scalable application.

Sharing users’ sessions in a database or a file system

This technique applies to application servers which have no clustering feature, or when you don’t want to enable it.
It is pretty simple: you choose a way to share your sessions, usually a file system like NFS or CIFS, a database like MySQL or SQL Server, or a memcached instance, then you configure each application server with the proper parameters to share the sessions and to access them when required.
I’m not going to give any details on how to do it here; just google the proper keywords and you’ll get answers very quickly.

IP source affinity to server

An easy way to maintain affinity between a user and a server is to use the user’s IP address: this is called source IP affinity.
There are a lot of issues with doing that, and I’m not going to detail them right now (TODO++: another article to write).
The only thing you have to know is that source IP affinity is the method of last resort when you want to “stick” a user to a server.
It will solve our issue only as long as the user keeps a single IP address and never changes it during the session.

Application layer persistence

Since a web application server has to identify each user individually, to avoid serving one user’s content to another, we may use this information, or at least try to reproduce the same behavior in the load-balancer, to maintain persistence between a user and a server.
The information we’ll use is the session cookie, either one set by the load-balancer itself or one set up by the application server.

What is the difference between persistence and affinity?

Affinity: when we use information from a layer below the application layer to maintain a client’s requests on a single server.
Persistence: when we use application layer information to stick a client to a single server.
Sticky session: a session maintained by persistence.

The main advantage of persistence over affinity is that it is much more accurate, but sometimes persistence is not doable, so we must rely on affinity.

With persistence, we are 100% sure that a user will get redirected to a single server.
With affinity, the user may or may not be redirected to the same server…

What is the interaction with load-balancing?


In a load-balancer, you can choose between several algorithms to pick a server from a web farm to forward your client requests to.
Some algorithms are deterministic: they use a piece of client-side information to choose the server, and always send the owner of this information to the same server. This is where you usually do affinity 😉 e.g. “balance source”.
Some algorithms are not deterministic: they choose the server based on internal information, whatever the client sent. This is where you do neither affinity nor persistence 🙂 e.g. “balance roundrobin” or “balance leastconn”.
I don’t want to go too deep into details here; this can be the purpose of a new article about load-balancing algorithms…
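To make the difference concrete, here is a minimal sketch of both kinds of backends (the names bk_affinity and bk_noaffinity are ours, just for illustration):

backend bk_affinity
  # deterministic: the source IP is hashed, so a given
  # client always reaches the same server
  balance source
  server s1 192.168.10.11:80 check
  server s2 192.168.10.21:80 check

backend bk_noaffinity
  # non-deterministic: requests are spread evenly across
  # the farm, regardless of who sent them
  balance roundrobin
  server s1 192.168.10.11:80 check
  server s2 192.168.10.21:80 check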

You may be wondering: “we have not yet spoken about persistence in this chapter”. That’s right, let’s do it.
As we saw previously, persistence means that the server can be chosen based on application layer information.
This means that persistence is another way to choose a server from a farm, just as a load-balancing algorithm does.

Actually, session persistence takes precedence over the load-balancing algorithm.
Let’s show this on a diagram:

      client request
            |
            V
    HAProxy Frontend
            |
            V
     backend choice
            |
            V
    HAproxy Backend
            |
            V
Does the request contain
persistence information ---------
            |                   |
            | NO                |
            V                   |
    Server choice by            |  YES
load-balancing algorithm        |
            |                   |
            V                   |
   Forwarding request <----------
      to the server

This means that when doing session persistence in a load-balancer, the following happens:

  • the first user request comes in without session persistence information
  • the request bypasses the session persistence server choice, since it carries no session persistence information
  • the request passes through the load-balancing algorithm, where a server is chosen and assigned to the client
  • the server answers back, setting its own session information
  • depending on its configuration, the load-balancer can either use this session information or set up its own before sending the response back to the client
  • the client sends a second request, now with the session information it learned during the first request
  • the load-balancer chooses the server based on this client-side information
  • the request DOES NOT PASS THROUGH the load-balancing algorithm
  • the server answers the request

and so on…

At HAProxy Technologies we say that “persistence is an exception to load-balancing”.
And the demonstration is just above.

Affinity configuration in HAProxy / Aloha load-balancer

The configuration below shows how to do affinity within HAProxy, based on client IP information:

frontend ft_web
  bind 0.0.0.0:80
  default_backend bk_web

backend bk_web
  # hash the client source IP to always pick the same server
  balance source
  hash-type consistent # optional, see the note below
  server s1 192.168.10.11:80 check
  server s2 192.168.10.21:80 check
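A note on “hash-type consistent”: with the default map-based hash, adding or removing a server reshuffles almost every client to a different server, while consistent hashing only moves a small fraction of them. It is optional, but usually what you want when servers may come and go.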

Web application persistence

In order to provide persistence at application layer, we usually use Cookies.
As explained previously, there are two ways to provide persistence using cookies:

  • Let the load-balancer set up a cookie for the session.
  • Use application cookies, such as ASP.NET_SessionId, JSESSIONID, PHPSESSID, or any other chosen name.

Session cookie setup by the Load-Balancer

The configuration below shows how to configure HAProxy / Aloha load balancer to inject a cookie in the client browser:

frontend ft_web
  bind 0.0.0.0:80
  default_backend bk_web

backend bk_web
  balance roundrobin
  cookie SERVERID insert indirect nocache
  server s1 192.168.10.11:80 check cookie s1
  server s2 192.168.10.21:80 check cookie s2

Two things to notice:

  1. the line “cookie SERVERID insert indirect nocache”:
    This line tells HAProxy to set up a cookie called SERVERID only if the user did not come with such a cookie. The “nocache” option also marks the response as non-cacheable, since this type of traffic is supposed to be personal and we don’t want any shared cache on the internet to store it.
  2. the statement “cookie XXX” on the server line definition:
    It provides the value of the cookie inserted by HAProxy. When the client comes back, HAProxy knows directly which server to choose for this client.

So what happens?

  1. At the first response, if the server chosen by the load-balancing algorithm is s1, HAProxy will send the client the following header:
    Set-Cookie: SERVERID=s1
  2. For the second request, the client will send a request containing the header below:
    Cookie: SERVERID=s1
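One detail worth knowing: with “indirect”, HAProxy consumes the persistence cookie itself. The SERVERID cookie is removed from the request before it is forwarded, so the mechanism stays completely transparent to the application.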

Basically, this kind of configuration is compatible with active/active Aloha load-balancer cluster configuration.

Using application session cookie for persistence

The configuration below shows how to configure the HAProxy / Aloha load-balancer to use the cookie set up by the application server to maintain persistence between a client and a server:

frontend ft_web
  bind 0.0.0.0:80
  default_backend bk_web

backend bk_web
  balance roundrobin
  cookie JSESSIONID prefix nocache
  server s1 192.168.10.11:80 check cookie s1
  server s2 192.168.10.21:80 check cookie s2

Just replace JSESSIONID with your application’s cookie name. It can be anything, like the default ones from PHP and IIS: PHPSESSID and ASP.NET_SessionId.

So what happens?

  1. At the first response, the server will send the client the following header:
    Set-Cookie: JSESSIONID=i12KJF23JKJ1EKJ21213KJ
  2. when passing through HAProxy, the cookie is modified like this:
    Set-Cookie: JSESSIONID=s1~i12KJF23JKJ1EKJ21213KJ


    Note that the Set-Cookie header has been prefixed with the server cookie value (“s1” in this case), with a “~” used as a separator between this value and the application cookie value.

  3. For the second request, the client will send a request containing the header below:
    Cookie: JSESSIONID=s1~i12KJF23JKJ1EKJ21213KJ
  4. HAProxy will clean it up on the fly and restore the original cookie:
    Cookie: JSESSIONID=i12KJF23JKJ1EKJ21213KJ
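Note that in “prefix” mode HAProxy does not create any cookie of its own: it piggybacks on the application’s session cookie, so persistence automatically lasts exactly as long as the application session.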

Basically, this kind of configuration is compatible with active/active Aloha load-balancer cluster configuration.

What happens when my server goes down?

When doing persistence, if a server goes down, HAProxy will redispatch the user to another server.
Since the user gets connected to a new server, that server may not be aware of the session, so the user may be redirected to the login page.
But this is not a load-balancer problem; this is related to the application server farm.
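Redispatching is driven by the configuration. Here is a minimal sketch, reusing the cookie-insert backend from above:

backend bk_web
  balance roundrobin
  cookie SERVERID insert indirect nocache
  # if the server designated by the cookie is down,
  # break persistence and pick another live server
  option redispatch
  retries 3
  server s1 192.168.10.11:80 check cookie s1
  server s2 192.168.10.21:80 check cookie s2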

Protect Apache against “apache killer” script

What is Apache killer?

Apache Killer is a script which exploits an Apache vulnerability.
Basically, it abuses the Range header to make Apache exhaust its resources, which makes the webserver unstable.

Who is concerned?

Anybody running a website on Apache.
See the Apache announcement for details.

How can Aloha Load-Balancer help you?

The Aloha can clean up Range headers, limit the connection rate of malicious clients, and even emulate the success of the attack.

Protect against Range header

Basically, the attack consists of sending a lot of Range headers to the webserver.
So, if a “client” sends more than 10 Range headers, we can consider this an attack and clean them up.
Just add the two lines below to your layer 7 (HAProxy) backend configuration to protect your Apache web servers:

backend bk_http
[...]
  # Detect an ApacheKiller-like Attack
  acl weirdrangehdr hdr_cnt(Range) gt 10
  # Clean up the request
  reqidel ^Range if weirdrangehdr
[...]
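For the record: “hdr_cnt(Range)” counts the number of Range header lines in the request, and “reqidel” deletes every request header line matching the regular expression, case-insensitively. The request therefore reaches Apache stripped of its Range headers, and Apache simply serves the whole object.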

Protect against service abuser

Since this kind of attack is combined with a DoS, you can blacklist the bad guys with the configuration below.
It will flag users exceeding 10 connections over a 5s period, then hold their connections for 10s before answering a 503 HTTP response.

You should adjust the values below to your website traffic.

frontend ft_http
[...]
  option http-server-close

  # Stick table storing the abuser flag (gpc0) per source IP
  stick-table type ip size 1k expire 30s store gpc0
  # An IP is marked when the backend has set its gpc0 flag
  acl MARKED src_get_gpc0(ft_http) gt 0
  # tarpit marked attackers
  use_backend bk_tarpit if MARKED
  # If not marked, track the connection in this table
  tcp-request connection track-sc1 src if ! MARKED

  default_backend bk_http
[...]

backend bk_http
[...]
  # Table tracking the connection rate of each source IP
  stick-table type ip size 1k expire 30s store conn_rate(5s)
  # Track the request
  tcp-request content track-sc2 src
  # Mark as abuser above 10 connections within 5s
  acl ABUSER sc2_conn_rate gt 10
  # Side effect: raises the gpc0 flag in the frontend table
  acl MARKED_AS_ABUSER sc1_inc_gpc0
  # Reject connections considered abusive
  tcp-request content reject if ABUSER MARKED_AS_ABUSER
[...]
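A word on how the two tables work together: the backend table measures each source IP’s connection rate over 5s; when ABUSER matches, evaluating MARKED_AS_ABUSER increments gpc0 for that IP in the frontend table, so every subsequent connection from that IP matches MARKED in the frontend and goes straight to the tarpit backend until the table entry expires.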

# Slow down attackers
backend bk_tarpit
  mode http
  # hold the connection for 10s before answering
  timeout tarpit 10s
  # the tarpit answers with a 500 by default: override the
  # 500 error page so the client sees a 503 instead
  errorfile 500 /etc/errors/500_tarpit.txt
  # slowdown any request coming up to here
  reqitarpit .

Open a shell on your Aloha Load-Balancer, then:

  • create the directory /etc/errors/
  • create the file 500_tarpit.txt with the content below.

500_tarpit.txt:

HTTP/1.0 503 Service Unavailable
Cache-Control: no-cache
Connection: close
Content-Type: text/html
Content-Length: 310

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Error</title>
</head>
<body><h1>Something went wrong</h1></body>
</html>

Don’t forget to save your configuration with the command:

config save

layer 7 load-balancing proxy mode

This mode is sometimes called reverse-proxy.

The load-balancer is in the middle of all transactions between the user and the server.
It maintains two separated TCP connections:

  • One with the user: the load-balancer acts as a server. It takes requests and forwards responses.
  • One with the server: the load-balancer acts as a client: it forwards requests and gets responses.

This mode is the least intrusive in an architecture: there is nothing to change on any server or router.

TCP connection overview

[Diagram: layer 7 proxy mode, TCP connections]
The diagram shows clearly the two TCP connections maintained by the load-balancer.

Data flow

[Diagram: layer 7 proxy mode, data flow]
As a consequence of the proxy mode, the server sees connections coming from the load-balancer IP address and can only learn the client IP address at the application layer, e.g. through the HTTP X-Forwarded-For header.
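With HAProxy, passing the client IP at the application layer is a one-liner. A minimal sketch:

frontend ft_web
  bind 0.0.0.0:80
  # add an X-Forwarded-For header carrying the client IP
  option forwardfor
  default_backend bk_web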

Pros and cons

Pros

  • not intrusive
  • more secure: servers aren’t reached directly
  • allows protocol inspection and validation
  • can load-balance services even if both client and server are in the same subnet

Cons

  • backend servers only see one IP address: the one from the LB
  • “slower” than layer 4 load-balancing (we speak about micro-seconds)

When to use this mode?

  • when you need application layer intelligence (content switching, etc…)
  • in order to protect an application
  • when the application does not care about the user IP (or can use the X-Forwarded-For header)

layer 7 load-balancing transparent proxy mode

The load-balancer is in the middle of all transactions between the user and the server.
It maintains two separated TCP connections:

  • With the user: the load-balancer acts as a server. It takes requests and forwards responses.
  • With the server: the load-balancer acts as a client: it forwards requests and gets responses.

It is really close to the proxy mode, with one main difference: the load-balancer opens the connection to the server using the client IP address as the source IP.

Of course, the backend servers’ default gateway must be the load-balancer.
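On the HAProxy side, this is achieved with the “source” keyword, which requires TPROXY support in the kernel. A minimal sketch, reusing the bk_web farm from the previous articles:

backend bk_web
  balance roundrobin
  # connect to the servers using the client’s IP as source address
  source 0.0.0.0 usesrc clientip
  server s1 192.168.10.11:80 check
  server s2 192.168.10.21:80 check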

TCP connection overview

[Diagram: layer 7 transparent proxy mode, TCP connections]
The diagram shows clearly the two TCP connections maintained by the load-balancer.

Data flow

[Diagram: layer 7 transparent proxy mode, data flow]
Since the load-balancer opens the TCP connection to the server with the user’s IP address as source, the server must use the load-balancer as its default gateway.
Otherwise, the server would send its response directly to the client, and the client would drop it…

Pros and cons

Pros

  • servers see the client IP address at the network layer
  • secure: servers aren’t reached directly
  • allows protocol inspection and validation

Cons

  • intrusive: you must change the default gateway of the servers.
  • “slower” than layer 4 load-balancing (we speak about micro-seconds)
  • clients and servers must be in two different subnets.

When to use this mode?

  • when the load-balanced service needs the client IP at the network layer (e.g. anti-spam services)
  • when you need application layer intelligence (content switching, etc…)
  • in order to protect an application
