Tag Archives: performance

Load Balancing and Application Delivery for the Enterprise [Webinar]

Do you know what makes HAProxy Enterprise Edition different from HAProxy Community Edition?

HAPEE isn’t just HAProxy Community with paid support, and unlike some other products based on open source projects, HAPEE doesn’t strip away any of the capabilities of HAProxy Community.

You can learn what HAPEE is all about and how it can provide additional benefits to enterprises during our webinar on March 2nd.

In this webinar, we’ll present:

  • How HAProxy Enterprise Edition (HAPEE) is different from HAProxy Community Edition
  • Why HAPEE is the most up-to-date, secure, and stable version of HAProxy
  • How enterprises leverage HAPEE to scale-out environments in the cloud
  • How enterprises can increase their admin productivity using HAPEE
  • How HAPEE enables advanced DDOS protection and helps mitigate other attacks

Sign up here.

Serving ECC and RSA certificates on the same IP with HAProxy

ECC and RSA certificates and HTTPS

To keep this practical, we will not go into the theory of ECC or RSA certificates. Let’s just mention that ECC certificates can provide as much security as RSA with a much smaller key size, meaning much lower computational requirements on the server side. Sadly, many clients do not support ciphers based on ECC, so to maintain compatibility as well as provide good performance, we need to detect which type of certificate the client supports in order to serve the right one.

The above is usually achieved by analyzing the cipher suites sent by the client in the ClientHello message at the start of the SSL handshake, but we’ve opted for a much simpler approach that works very well with all modern browsers (clients).

Prerequisites

First you will need to obtain both RSA and ECC certificates for your web site; check the documentation of the registrar you are using. Once the certificates have been issued, make sure you download the appropriate intermediate certificates and create the bundle files for HAProxy to read.
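For example, the bundle files referenced in the configuration below could be assembled like this (file names are hypothetical; each PEM bundle must contain the certificate, any intermediates and the private key):

cat www.foo.com.crt rsa-intermediates.crt www.foo.com.key > /usr/local/haproxy/www.foo.com.pem
cat ecc.www.foo.com.crt ecc-intermediates.crt ecc.www.foo.com.key > /usr/local/haproxy/ecc.www.foo.com.pem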

To be able to use the required sample fetch, you will need at least HAProxy 1.6-dev3 (not yet released as of this writing), or you can clone the latest HAProxy from the git repository. The feature was introduced in commit 5fc7d7e.

Configuration

We will use chaining in order to achieve the desired functionality. You can use abstract sockets on Linux to get even more performance, but note the drawbacks described in the HAProxy documentation (see the sketch after the configuration below).

 frontend ssl-relay
 mode tcp
 bind 0.0.0.0:443
 use_backend ssl-ecc if { req.ssl_ec_ext 1 }
 default_backend ssl-rsa

 backend ssl-ecc
 mode tcp
 server ecc unix@/var/run/haproxy_ssl_ecc.sock send-proxy-v2

 backend ssl-rsa
 mode tcp
 server rsa unix@/var/run/haproxy_ssl_rsa.sock send-proxy-v2

 listen all-ssl
 bind unix@/var/run/haproxy_ssl_ecc.sock accept-proxy ssl crt /usr/local/haproxy/ecc.www.foo.com.pem user nobody
 bind unix@/var/run/haproxy_ssl_rsa.sock accept-proxy ssl crt /usr/local/haproxy/www.foo.com.pem user nobody
 mode http
 server backend_1 192.168.1.1:8000 check
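If you prefer the abstract sockets mentioned earlier, the unix@ paths can simply be replaced by abns@ names; a sketch of the lines that would change:

 backend ssl-ecc
 mode tcp
 server ecc abns@haproxy_ssl_ecc send-proxy-v2

 listen all-ssl
 bind abns@haproxy_ssl_ecc accept-proxy ssl crt /usr/local/haproxy/ecc.www.foo.com.pem user nobody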

The whole configuration revolves around the newly implemented sample fetch: req.ssl_ec_ext. This fetch detects the presence of the Supported Elliptic Curves Extension inside the ClientHello message. This extension is defined in RFC 4492 and, according to it, it SHOULD be sent with every ClientHello message by any client supporting ECC. We have observed that all modern clients send it correctly.

If the extension is detected, the client is sent through a unix socket to the frontend that will serve the ECC certificate. If not, the regular RSA certificate will be served.

Benchmark

We will provide full HAProxy benchmarks in the near future, but for the sake of comparison, let’s look at the difference on an E5-2680v3 CPU with OpenSSL 1.0.2.

256bit ECDSA:
                 sign      verify     sign/s   verify/s
              0.0000s     0.0001s    24453.3     9866.9

2048bit RSA:
                 sign      verify     sign/s   verify/s
            0.000682s   0.000028s     1466.4    35225.1
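For reference, these figures come from OpenSSL’s built-in benchmark; you can reproduce similar numbers with a command along these lines (the exact output format varies with the OpenSSL version):

openssl speed rsa2048 ecdsap256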

As you can see, looking at sign/s we are getting over 16 times the signing performance with ECDSA256 compared to RSA2048.

HAProxy’s load-balancing algorithm for static content delivery with Varnish

HAProxy’s load-balancing algorithms

HAProxy supports many load-balancing algorithms which may be used in many different types of use cases.
That said, cache servers, which most of the time deliver the static content of your web applications, may require some specific load-balancing algorithms.

HAProxy stands in front of your cache server for some good reasons:

  • SSL offloading (read PHK’s feeling about SSL, Varnish and HAProxy)
  • HTTP content switching capabilities
  • advanced load-balancing algorithms

The main purpose of this article is to show how HAProxy can be used to aggregate Varnish servers’ memory storage in some kind of “JBOD” mode (as in “Just a Bunch Of Disks“).
The main purpose of the examples delivered here is to optimize the resources on the cache, mainly its memory, in order to improve the HIT rate. This will also improve your application response time and make your site top ranked on google 🙂

Content Switching in HAProxy

This has been covered many times on this blog.
As a quick introduction for readers who are not familiar with HAProxy, let’s explain how it works.

Clients get connected to HAProxy through a frontend. Then HAProxy routes traffic to a backend (server farm), where the load-balancing algorithm is used to choose a server.
A frontend can point to multiple backends, and the choice of a backend is made through ACLs and use_backend rules.
ACLs can be formed using fetches. A fetch is a directive which instructs HAProxy where to get content from.

Enough theory, let’s make a practical example: splitting static and dynamic traffic using the following rules:

  • Static content is hosted on domain names starting with ‘static.’ and ‘images.’
  • Static content file extensions are ‘.jpg’ ‘.png’ ‘.gif’ ‘.css’ ‘.js’
  • Static content can match any of the rules above
  • anything which is not static is considered dynamic

The configuration snippet below should be integrated into the HAProxy frontend. It implements the rules above to do the traffic splitting. The Varnish servers will stand in the bk_static farm.

frontend ft_public
 <frontend settings>
 acl static_domain  req.hdr_beg(Host) -i static. images.
 acl static_content path_end          -i .jpg .png .gif .css .js
 use_backend bk_static if static_domain or static_content
 default_backend bk_dynamic
   
backend bk_static
 <parameters related to static content delivery>

The configuration above creates 2 named ACLs, ‘static_domain‘ and ‘static_content‘, which are used by the use_backend rule to route the traffic to the Varnish servers.

HAProxy and hash-based load-balancing algorithms


Later in this article, we’ll heavily use the hash-based load-balancing algorithms from HAProxy.
So here is some information (non-exhaustive, it would deserve a long blog article of its own) which will be useful for people wanting to understand what happens deep inside HAProxy.

The following parameters are taken into account when computing the hash:

  • number of servers in the farm
  • weight of each server in the farm
  • status of the servers (UP or DOWN)

If any of the parameters above changes, the whole hash computation changes as well, hence a request may hit another server. This may have a negative impact on the response time of the application (during a short period of time).
Fortunately, HAProxy supports ‘consistent’ hashing, which means that only the traffic related to the change will be impacted.
That’s why you’ll see a lot of hash-type consistent directives in the configuration samples below.

Load-balancing Varnish cache servers

Now, let’s focus on the magic we can add in the bk_static server farm.

Hashing the URL

HAProxy can hash the URL to pick a server. With this load-balancing algorithm, we guarantee that a single URL will always hit the same Varnish server.

hashing the URL path only


In the example below, HAProxy hashes the URL path, which goes from the first slash ‘/’ character up to the question mark ‘?’:

backend bk_static
  balance uri
  hash-type consistent

hashing the whole URL, including the query string


In some cases, the URL may carry variables in the query string, which means we must include the query string in the hash:

backend bk_static
  balance uri whole
  hash-type consistent

Query string parameter hash


That said, in some cases (APIs, etc.), hashing the whole URL is not enough. We may want to hash only a particular query string parameter.
This applies well in cases where the client itself can forge the URL and the parameters may be randomly ordered.
The configuration below tells HAProxy to apply the hash to the query string parameter named ‘id’ (e.g.: /image.php?width=512&id=12&height=256):

backend bk_static
  balance url_param id
  hash-type consistent

hash on an HTTP header


HAProxy can apply the hash to a specific HTTP header field.
The example below applies it to the Host header. This can be useful for people hosting many domain names with a few pages each, like dedicated user pages.

backend bk_static
  balance hdr(Host)
  hash-type consistent

Compose your own hash: concatenation of Host header and URL


HAProxy keeps getting more and more flexible, and we can use this flexibility in its configuration.
Imagine that in your Varnish configuration you have a storage hash key based on the concatenation of the Host header and the URI; then you may want to apply the same load-balancing algorithm in HAProxy, to optimize your caches.

The configuration below creates a new HTTP header field named X-LB which contains the Host header (converted to lowercase) concatenated to the request URI (also converted to lowercase).

backend bk_static
  http-request set-header X-LB %[req.hdr(Host),lower]%[req.uri,lower]
  balance hdr(X-LB)
  hash-type consistent
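For reference, the matching storage hash key on the Varnish side could be built in vcl_hash; a minimal sketch, assuming a Varnish 3.x VCL:

sub vcl_hash {
  # hash on the Host header concatenated with the URL, like the X-LB header above
  hash_data(req.http.host);
  hash_data(req.url);
  return (hash);
}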

Conclusion


HAProxy and Varnish work very well together. Each one can benefit from the performance and flexibility of the other.


HAProxy, high MySQL request rate and TCP source port exhaustion

Synopsis


At HAProxy Technologies, we provide professional services around HAProxy: this includes HAProxy itself, of course, but also the underlying OS tuning, advice and recommendations about the architecture, and sometimes we can also help customers troubleshoot application layer issues.
We don’t fix issues for the customer, but using information provided by HAProxy, we are able to narrow down the investigation area, saving the customer time and money.
The story I’m relating today comes from one of these engagements.

One of our customers is a hosting company which hosts some very busy PHP / MySQL websites. They have been using HAProxy successfully in front of their application servers.
They used to have a single MySQL server, which was a kind of SPOF and had to handle several thousand requests per second.
Sometimes they had issues with this DB: the clients (hence the web servers) would hang when using it.

So they decided to use MySQL replication and build an active/passive cluster. They also decided to split reads (SELECT queries) and writes (DELETE, INSERT, UPDATE queries) at the application level.
Then they were able to move the MySQL servers behind HAProxy.

Enough for the introduction 🙂 Today’s article discusses HAProxy and MySQL at high request rates, and an error some of you may already have encountered: TCP source port exhaustion (the famous high number of sockets in TIME_WAIT).

Diagram


So basically, we have here a standard web platform which uses HAProxy to load-balance MySQL:
(diagram: haproxy_mysql_replication)

The MySQL master server is used for the WRITE requests, and the READ requests are load-balanced by weight (the slaves have a higher weight than the master) across all the MySQL servers.

MySQL scalability

One way of scaling MySQL is to use replication: one MySQL server is designated as master and manages all the write operations (DELETE, INSERT, UPDATE, etc.). For each operation, it notifies all the MySQL slave servers. The slaves can then be used for reads only, offloading these types of requests from the master.
IMPORTANT NOTE: replication only scales the read part, so if your application requires many more writes, then this is not the method for you.

Of course, one MySQL slave server can be designated as master when the master fails! This also ensures MySQL high availability.

So, where is the problem ???

This type of platform works very well for the majority of websites. The problem occurs when you start having a high rate of requests. By high, I mean several thousand per second.

TCP source port exhaustion

HAProxy works as a reverse-proxy and so uses its own IP address to get connected to the server.
Any system has around 64K TCP source ports available to get connected to a remote IP:port. Once a combination of “source IP:port => dst IP:port” is in use, it can’t be re-used.
First lesson: you can’t have more than 64K open connections from an HAProxy box to a single remote IP:port couple. I think only people load-balancing MS Exchange RPC services or SharePoint with NTLM may one day reach this limit…
(well, it is possible to work around this limit using some tricks we’ll explain later in this article)

Why does TCP port exhaustion occur with MySQL clients???


As I said, the MySQL request rate was a few thousand per second, so we never ever reach this limit of 64K simultaneous open connections to the remote service…
What’s going on then???
Well, there is an issue with the MySQL client library: when a client sends its “QUIT” sequence, it performs a few internal operations and then immediately shuts down the TCP connection, without waiting for the server to do it. A basic tcpdump will show it to you easily.
Note that you won’t be able to reproduce this issue on a loopback interface, because the server answers fast enough… You must use a LAN connection and 2 different servers.
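For instance, a capture along these lines (interface name and addresses are placeholders) makes the early client-side FIN visible:

tcpdump -nn -i eth0 'host 10.0.0.1 and port 3306 and tcp[tcpflags] & (tcp-fin|tcp-rst) != 0'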

Basically, here is the sequence currently performed by a MySQL client:

Mysql Client ==> "QUIT" sequence ==> Mysql Server
Mysql Client ==>       FIN       ==> MySQL Server
Mysql Client <==     FIN ACK     <== MySQL Server
Mysql Client ==>       ACK       ==> MySQL Server

This leaves the client’s source port unavailable for twice the MSL (Maximum Segment Lifetime), which means 2 minutes.
Note: this type of close has no negative impact when the connection is made over a UNIX socket.

Explanation of the issue (much better than I could explain it myself):
“There is no way for the person who sent the first FIN to get an ACK back for that last ACK. You might want to reread that now. The person that initially closed the connection enters the TIME_WAIT state; in case the other person didn’t really get the ACK and thinks the connection is still open. Typically, this lasts one to two minutes.” (Source)

Since the source port is unavailable to the system for 2 minutes, this means that above roughly 533 MySQL requests per second you’re in danger of TCP source port exhaustion: 64000 (available ports) / 120 (seconds in 2 minutes) = 533.333.
This TCP port exhaustion appears on the MySQL client server itself, but also on the HAProxy box, because it forwards the client traffic to the server… And since we have many web servers, it happens much faster on the HAProxy box !!!!
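A quick way to see how close you are to the limit is to count the sockets stuck in TIME_WAIT towards the MySQL service (IP and port below are placeholders):

ss -tan state time-wait | grep -c '10.0.0.1:3306'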

Remember: at peak traffic, my customer had a few thousand requests/s….

How to avoid TCP source port exhaustion?


Here comes THE question!!!!
First, a “clean” sequence should be:

Mysql Client ==> "QUIT" sequence ==> Mysql Server
Mysql Client <==       FIN       <== MySQL Server
Mysql Client ==>     FIN ACK     ==> MySQL Server
Mysql Client <==       ACK       <== MySQL Server

Actually, this sequence happens when both the MySQL client and server are hosted on the same box and use the loopback interface; that’s why I said earlier that if you want to reproduce the issue you must add “latency” between the client and the server, and so use 2 boxes over the LAN.
So, until MySQL rewrites the client code to follow the sequence above, there won’t be any improvement here!!!!

Increasing source port range


By default, on a Linux box, you have around 28K source ports available (for a single destination IP:port):

$ sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768    61000

In order to get 64K source ports, just run:

$ sudo sysctl net.ipv4.ip_local_port_range="1025 65000"

And don’t forget to update your /etc/sysctl.conf file!!!
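For example, the persistent version of this setting might look like this in /etc/sysctl.conf (values matching the command above):

# /etc/sysctl.conf
net.ipv4.ip_local_port_range = 1025 65000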

Note: this should definitely also be applied on the web servers….

Allow usage of source port in TIME_WAIT


A few sysctls can be used to tell the kernel to reuse connections in TIME_WAIT faster:

net.ipv4.tcp_tw_reuse
net.ipv4.tcp_tw_recycle

tw_reuse can be used safely, but be careful with tw_recycle… it can have side effects: clients behind the same NAT might not be able to connect to the same device anymore. So only use it if your HAProxy box is fully dedicated to your MySQL setup.
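To check the current values and enable tw_reuse on the fly, something like the following should do (the tcp_tw_recycle knob only exists on older kernels):

$ sysctl net.ipv4.tcp_tw_reuse net.ipv4.tcp_tw_recycle
$ sudo sysctl net.ipv4.tcp_tw_reuse=1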

Anyway, these sysctls were already properly set (value = 1) on both the HAProxy and web servers.

Note: tw_reuse should definitely also be applied on the web servers….

Using multiple IPs to get connected to a single server


In the HAProxy configuration, you can specify on the server line the source IP address to use to get connected to a server, so just add more server lines with different IPs.
In the example below, the IPs 10.0.0.100 and 10.0.0.101 are configured on the HAProxy box:

[...]
  server mysql1     10.0.0.1:3306 check source 10.0.0.100
  server mysql1_bis 10.0.0.1:3306 check source 10.0.0.101
[...]

This allows us to open up to 128K source TCP ports…
The kernel is responsible for assigning a new TCP source port when HAProxy requests one. Despite improving things a bit, we still reached some source port exhaustion… We could not get over 80K connections in TIME_WAIT with 4 source IPs…

Let HAProxy manage TCP source ports


You can let HAProxy decide which source port to use when opening a new TCP connection, instead of the kernel. HAProxy has built-in functions for this which make it more efficient than a regular kernel.

Let’s update the configuration above:

[...]
  server mysql1     10.0.0.1:3306 check source 10.0.0.100:1025-65000
  server mysql1_bis 10.0.0.1:3306 check source 10.0.0.101:1025-65000
[...]

We managed to get 170K+ connections in TIME_WAIT with 4 source IPs… and no source port exhaustion anymore !!!!

Use a memcache


Fortunately, the devs at this customer are skilled and write flexible code 🙂
So they managed to move some requests from the MySQL DB to a memcache, opening far fewer connections.

Use MySQL persistent connections


This could prevent fine-grained load-balancing on the read-only farm, but it would be very efficient on the MySQL master server.

Conclusion

  • If you see some SOCKERR information messages in the HAProxy logs (mainly on health checks), you may be running out of TCP source ports.
  • Have skilled developers who write flexible code, where moving from one DB to another is made easy
  • This kind of issue can only happen with protocols or applications where the client closes the connection first
  • This issue can’t happen with HAProxy in HTTP mode, since it lets the server close the connection before sending a TCP RST


HAProxy log customization

Synopsis

One of the strengths of HAProxy is its logging system. It is very verbose and provides a lot of information.
The HAProxy HTTP log line is briefly explained in an HAProxy Technologies memo. It’s a must-have document when you have to analyze HAProxy‘s log lines to troubleshoot an issue.
Another interesting tool is HALog. It is available in HAProxy‘s sources, in the contrib directory. I’ll write an article about it later. To get an idea of how to use it, just have a look at the HAProxy Technologies howto related to halog and HTTP analysis.

Why customize HAProxy’s logs?


There may be several reasons why one would want to customize HAProxy’s logs:

  • the default log format is too complicated
  • there is too much information in the default log format
  • there is not enough information in the default log format
  • third-party log analyzers can hardly understand the default HAProxy log format
  • logs generated by HAProxy must be compliant with an existing format from an existing appliance in the architecture
  • … add your own reason here …

That’s why, at HAProxy Technologies, we felt the need to let our users create their own HAProxy log format.
As with compression in HAProxy, the job was done by Wlallemand.

HAProxy log format customization

Configuration directive

The directive which allows you to generate a home-made log format is simply called log-format.
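It can be used in a defaults, frontend or listen section. Below is a minimal sketch (the variables are described in the table further down; note that spaces in the format string must be escaped):

frontend ft_web
 bind :80
 mode http
 log global
 log-format %Ci:%Cp\ [%t]\ %ft\ %b/%s\ %st\ %B\ %{+Q}r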

Variables

The log-format directive understands variables.
A variable follows the rules below:

  • it is preceded by a percent character: ‘%’
  • it can take arguments in braces ‘{}‘.
  • If there are multiple arguments, they are separated by commas ‘,‘ within the braces.
  • Flags may be added or removed by prefixing them with a ‘+‘ or ‘-‘ sign.
  • spaces ‘ ‘ must be escaped (a space is considered a separator)

Currently available flags:

  • Q: quote a string
  • X: hexadecimal representation (IPs, Ports, %Ts, %rt, %pid)

Currently available variables:

  +---+------+-----------------------------------------------+-------------+
  | R | var  | field name (8.2.2 and 8.2.3 for description)  | type        |
  +---+------+-----------------------------------------------+-------------+
  |   | %o   | special variable, apply flags on all next var |             |
  +---+------+-----------------------------------------------+-------------+
  |   | %B   | bytes_read                                    | numeric     |
  |   | %Ci  | client_ip                                     | IP          |
  |   | %Cp  | client_port                                   | numeric     |
  |   | %Bi  | backend_source_ip                             | IP          |
  |   | %Bp  | backend_source_port                           | numeric     |
  |   | %Fi  | frontend_ip                                   | IP          |
  |   | %Fp  | frontend_port                                 | numeric     |
  |   | %H   | hostname                                      | string      |
  |   | %ID  | unique-id                                     | string      |
  |   | %Si  | server_IP                                     | IP          |
  |   | %Sp  | server_port                                   | numeric     |
  |   | %T   | gmt_date_time                                 | date        |
  |   | %Tc  | Tc                                            | numeric     |
  | H | %Tq  | Tq                                            | numeric     |
  | H | %Tr  | Tr                                            | numeric     |
  |   | %Ts  | timestamp                                     | numeric     |
  |   | %Tt  | Tt                                            | numeric     |
  |   | %Tw  | Tw                                            | numeric     |
  |   | %ac  | actconn                                       | numeric     |
  |   | %b   | backend_name                                  | string      |
  |   | %bc  | beconn                                        | numeric     |
  |   | %bq  | backend_queue                                 | numeric     |
  | H | %cc  | captured_request_cookie                       | string      |
  | H | %rt  | http_request_counter                          | numeric     |
  | H | %cs  | captured_response_cookie                      | string      |
  |   | %f   | frontend_name                                 | string      |
  |   | %ft  | frontend_name_transport ('~' suffix for SSL)  | string      |
  |   | %fc  | feconn                                        | numeric     |
  | H | %hr  | captured_request_headers default style        | string      |
  | H | %hrl | captured_request_headers CLF style            | string list |
  | H | %hs  | captured_response_headers default style       | string      |
  | H | %hsl | captured_response_headers CLF style           | string list |
  |   | %ms  | accept date milliseconds                      | numeric     |
  |   | %pid | PID                                           | numeric     |
  | H | %r   | http_request                                  | string      |
  |   | %rc  | retries                                       | numeric     |
  |   | %s   | server_name                                   | string      |
  |   | %sc  | srv_conn                                      | numeric     |
  |   | %sq  | srv_queue                                     | numeric     |
  | S | %sslc| ssl_ciphers (ex: AES-SHA)                     | string      |
  | S | %sslv| ssl_version (ex: TLSv1)                       | string      |
  | H | %st  | status_code                                   | numeric     |
  |   | %t   | date_time                                     | date        |
  |   | %ts  | termination_state                             | string      |
  | H | %tsc | termination_state with cookie status          | string      |
  +---+------+-----------------------------------------------+-------------+

    R = Restrictions : H = mode http only ; S = SSL only

Log format examples

Default log format

  • TCP log format
    log-format %Ci:%Cp [%t] %ft %b/%s %Tw/%Tc/%Tt %B %ts 
               %ac/%fc/%bc/%sc/%rc %sq/%bq
    
  • HTTP log format
    log-format %Ci:%Cp [%t] %ft %b/%s %Tq/%Tw/%Tc/%Tr/%Tt %st %B %cc 
               %cs %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r
    
  • CLF log format
    log-format %{+Q}o %{-Q}Ci - - [%T] %r %st %B "" "" %Cp 
               %ms %ft %b %s %Tq %Tw %Tc %Tr %Tt %tsc %ac %fc 
               %bc %sc %rc %sq %bq %cc %cs %hrl %hsl
    

Home made formats

  • Logging HTTP Host header, the URL, the status code, number of bytes read from server and the server response time
    capture request header Host len 32
    log-format %hr %r %st %B %Tr
    
  • SSL log format with: HAProxy path (frontend, backend and server name), client information (source IP and port), SSL information (protocol version and negotiated cipher), connection termination state, including a few literal strings:
    log-format frontend:%f %b/%s client_ip:%Ci client_port:%Cp SSL_version:%sslv SSL_cypher:%sslc %ts


HAProxy and gzip compression

Synopsis

Compression is a technique to reduce object size in order to reduce delivery time for objects over HTTP.
Until now, HAProxy did not include such a feature. But the guys at HAProxy Technologies worked hard on it (mainly David Du Colombier and @wlallemand).
HAProxy can now be considered a new option to compress HTTP streams, alongside nginx, apache or IIS which already do it.

Note that this is in early beta, so use it with care.

Compilation


Get the latest HAProxy git version by running a “git pull” in your HAProxy git directory.
If you don’t already have such a directory, then run:

git clone http://git.1wt.eu/git/haproxy.git

Once your HAProxy sources are updated, then you can compile HAProxy:

make TARGET=linux26 USE_ZLIB=yes

Configuration

This is a very simple test configuration:

listen ft_web
 option http-server-close
 mode http
 bind 127.0.0.1:8090 name http
 default_backend bk_web

backend bk_web
 option http-server-close
 mode http
 compression algo gzip
 compression type text/html text/plain text/css
 server localhost 127.0.0.1:80

Compression test

On my localhost, I have an Apache with compression disabled and a style.css object whose size is 16302 bytes.

Download without compression requested

curl -o/dev/null -D - "http://127.0.0.1:8090/style.css" 
HTTP/1.1 200 OK
Date: Fri, 26 Oct 2012 08:55:42 GMT
Server: Apache/2.2.16 (Debian)
Last-Modified: Sun, 11 Mar 2012 17:01:39 GMT
ETag: "a35d6-3fae-4bafa944542c0"
Accept-Ranges: bytes
Content-Length: 16302
Content-Type: text/css

100 16302  100 16302    0     0  5722k      0 --:--:-- --:--:-- --:--:-- 7959k

Download with compression requested

 curl -o/dev/null -D - "http://127.0.0.1:8090/style.css" -H "Accept-Encoding: gzip"
HTTP/1.1 200 OK
Date: Fri, 26 Oct 2012 08:56:28 GMT
Server: Apache/2.2.16 (Debian)
Last-Modified: Sun, 11 Mar 2012 17:01:39 GMT
ETag: "a35d6-3fae-4bafa944542c0"
Accept-Ranges: bytes
Content-Type: text/css
Transfer-Encoding: chunked
Content-Encoding: gzip

100  4036    0  4036    0     0  1169k      0 --:--:-- --:--:-- --:--:-- 1970k

In this example, the object size went from 16302 bytes to 4036 bytes, a reduction of about 75%.

Have fun !!!!


high performance WAF platform with Naxsi and HAProxy

Synopsis

I’ve already described WAFs in a previous article, where I spoke about WAF scalability with Apache and modsecurity.
One of the main issues with Apache and modsecurity is performance. To address this issue, an alternative exists: Naxsi, a Web Application Firewall module for nginx.

So using Naxsi and HAProxy as a load-balancer, we’re able to build a platform which meets the following requirements:

  • Web Application Firewall: achieved by nginx and Naxsi
  • High-availability: application server and WAF monitoring, achieved by HAProxy
  • Scalability: ability to adapt capacity to the upcoming volume of traffic, achieved by HAProxy
  • DDOS protection: blind and brutal attacks protection, slowloris protection, achieved by HAProxy
  • Content-Switching: ability to route only dynamic requests to the WAF, achieved by HAProxy
  • Reliability: ability to detect capacity overusage, this is achieved by HAProxy
  • Performance: deliver response as fast as possible, achieved by the whole platform

The picture below provides a better overview:

The LAB platform is composed of 6 boxes:

  • 2 ALOHA Load-Balancers (could be replaced by HAProxy 1.5-dev)
  • 2 WAF servers: CentOS 6.0, nginx and Naxsi
  • 2 Web servers: Debian + apache + PHP + dokuwiki

Nginx and Naxsi installation on CentOS 6

The purpose of this article is not to provide such a procedure, so please read this wiki article, which summarizes how to install nginx and Naxsi on CentOS 6.0.

Diagram

The diagram below shows the platform with HAProxy frontends (prefixed with ft_) and backends (prefixed with bk_). Each farm is composed of 2 servers.

Configuration

Nginx and Naxsi


Configure nginx as a reverse-proxy which listens in bk_waf and forwards traffic to ft_web. In the meantime, Naxsi is there to analyze the requests.

server {
 proxy_set_header Proxy-Connection "";
 listen       192.168.10.15:81;
 access_log  /var/log/nginx/naxsi_access.log;
 error_log  /var/log/nginx/naxsi_error.log debug;

 location / {
  include    /etc/nginx/test.rules;
  proxy_pass http://192.168.10.2:81/;
 }

 error_page 403 /403.html;
 location = /403.html {
  root /opt/nginx/html;
  internal;
 }

 location /RequestDenied {
  return 403;
 }
}
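The /etc/nginx/test.rules file included above is where Naxsi is enabled for this location and where the blocking thresholds live; a minimal sketch with the usual example thresholds (adjust to your needs, and remember that naxsi_core.rules must be included at the http level):

# /etc/nginx/test.rules
SecRulesEnabled;
DeniedUrl "/RequestDenied";
CheckRule "$SQL >= 8" BLOCK;
CheckRule "$RFI >= 8" BLOCK;
CheckRule "$TRAVERSAL >= 4" BLOCK;
CheckRule "$XSS >= 8" BLOCK;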

HAProxy Load-Balancer configuration


The configuration below allows the following advanced features:

  • DDOS protection on the frontend
  • abuser or attacker detection in bk_waf and blocking on the public interface (ft_waf)
  • Bypassing the WAF in case of overusage or unavailability

######## Default values for all entries till next defaults section
defaults
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  timeout connect 5s
  timeout http-keep-alive 1s
  # Slowloris protection
  timeout http-request 15s
  timeout queue 30s
  timeout tarpit 1m          # tarpit hold time
  backlog 10000

# public frontend where users get connected to
frontend ft_waf
  bind 192.168.10.2:80 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 10000

  # DDOS protection
  # Use General Purpose Counter (gpc) 0 in SC1 as a global abuse counter
  # Monitors the number of requests sent by an IP over a period of 10 seconds
  stick-table type ip size 1m expire 1m store gpc0,http_req_rate(10s),http_err_rate(10s)
  tcp-request connection track-sc1 src
  tcp-request connection reject if { sc1_get_gpc0 gt 0 }
  # Abuser means more than 100reqs/10s
  acl abuse sc1_http_req_rate(ft_waf) ge 100
  acl flag_abuser sc1_inc_gpc0(ft_waf)
  tcp-request content reject if abuse flag_abuser

  acl static path_beg /static/ /dokuwiki/images/
  acl no_waf nbsrv(bk_waf) eq 0
  acl waf_max_capacity queue(bk_waf) ge 1
  # bypass WAF farm if no WAF available
  use_backend bk_web if no_waf
  # bypass WAF farm if it reaches its capacity
  use_backend bk_web if static waf_max_capacity
  default_backend bk_waf

# WAF farm where users' traffic is routed first
backend bk_waf
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor header X-Client-IP
  option httpchk HEAD /waf_health_check HTTP/1.0

  # If the source IP generated 10 or more http errors over the defined period,
  # flag the IP as abuser on the frontend
  acl abuse sc1_http_err_rate(ft_waf) ge 10
  acl flag_abuser sc1_inc_gpc0(ft_waf)
  tcp-request content reject if abuse flag_abuser

  # Specific WAF checking: a DENY means everything is OK
  http-check expect status 403
  timeout server 25s
  default-server inter 3s rise 2 fall 3
  server waf1 192.168.10.15:81 maxconn 100 weight 10 check
  server waf2 192.168.10.16:81 maxconn 100 weight 10 check

# Traffic secured by the WAF arrives here
frontend ft_web
  bind 192.168.10.2:81 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 1000
  # route health check requests to a specific backend to avoid graph pollution in ALOHA GUI
  use_backend bk_waf_health_check if { path /waf_health_check }
  default_backend bk_web

# application server farm
backend bk_web
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  cookie SERVERID insert indirect nocache
  default-server inter 3s rise 2 fall 3
  option httpchk HEAD /
  # get connected on the application server using the user ip
  # provided in the X-Client-IP header setup by ft_waf frontend
  source 0.0.0.0 usesrc hdr_ip(X-Client-IP)
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 cookie server1 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 cookie server2 check

# backend dedicated to WAF checking (to avoid graph pollution)
backend bk_waf_health_check
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  default-server inter 3s rise 2 fall 3
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 check

Detecting attacks


On the load-balancer


The ft_waf frontend stick table tracks two pieces of information: http_req_rate and http_err_rate, which are respectively the HTTP request rate and the HTTP error rate generated by a single IP address.
HAProxy automatically blocks an IP which has generated 100 or more requests over a period of 10s, or 10 or more errors (WAF 403 responses included) in 10s. The user is blocked for 1 minute as long as he keeps on abusing.
Of course, you can set the above values to whatever you need: it is fully flexible.

To know the status of IPs in your load-balancer, just run the command below:

echo show table ft_waf | socat /var/run/haproxy.stat - 
# table: ft_waf, type: ip, size:1048576, used:1
0xc33304: key=192.168.10.254 use=0 exp=4555 gpc0=0 http_req_rate(10000)=1 http_err_rate(10000)=1

Note: The ALOHA Load-balancer does not provide the watch tool, but you can monitor the content of the table live with the command below:

while true ; do echo show table ft_waf | socat /var/run/haproxy.stat - ; sleep 2 ; clear ; done

On the WAF


Every Naxsi error log entry appears in /var/log/nginx/naxsi_error.log. E.g.:

2012/10/16 13:40:13 [error] 10556#0: *10293 NAXSI_FMT: ip=192.168.10.254&server=192.168.10.15&uri=/testphp.vulnweb.com/artists.php&total_processed=3195&total_blocked=2&zone0=ARGS&id0=1000&var_name0=artist, client: 192.168.10.254, server: , request: "GET /testphp.vulnweb.com/artists.php?artist=0+div+1+union%23foo*%2F*bar%0D%0Aselect%23foo%0D%0A1%2C2%2Ccurrent_user HTTP/1.1", host: "192.168.10.15:81"

A Naxsi log line is less obvious than a modsecurity one. The rule which matched is provided by the idX arguments.
There were no false positives during the test; I had to craft a request to make Naxsi match it 🙂 .

Conclusion


Today, we saw that it’s easy to build a scalable and well-performing WAF platform in front of any web application.
The WAF is able to tell HAProxy which IPs to automatically blacklist (through error rate monitoring), which is convenient since the attacker won’t bother the WAF for a certain amount of time 😉
The platform is able to detect WAF farm availability and to bypass it in case of total failure; we even saw that it is possible to bypass the WAF for static content if the farm is running out of capacity. The purpose is to deliver a good end-user experience without lowering security too much.
Note that it is possible to route all the static content to the web servers (or a static farm) directly, whatever the status of the WAF farm.
This makes the platform fully scalable and flexible.
Thanks to HAProxy, the architecture is very flexible: I could switch my Apache + modsecurity to nginx + Naxsi with no issues at all 🙂 This could be done as well for any third-party WAF appliance.
Note that I did not try any Naxsi advanced features like the learning mode or the UI.
