Category Archives: optimization

ALOHA Pocket is coming…

Well, this project is not really a secret anymore and people have started asking about it, so let me present the beast:

This is the ALOHA Pocket. Probably the smallest load balancer you have ever seen from any vendor. It is a full-featured ALOHA with layer 4/7, SSL, VRRP, the complete web interface with templates, the logs… It consumes less than a watt (0.75 W to be precise), is powered over USB, and can run for about ten hours on a single 2200 mAh battery. Still, it achieves more than a thousand connections per second and forwards 70 Mbps between its two ports. Yes, this is more than what some applications we have seen in the field deliver on huge servers consuming 1000 times this power and running with 4000 times this amount of RAM. This is made possible by our highly optimized, lightweight products, which are so energy efficient and need so few resources that they can run on almost anything (and of course, they shine when running on powerful hardware).

Obviously nobody wants to run their production on this, it would not look serious! But we found that this is the ideal format to bring your machine everywhere, for demos, for tests, to develop in the train, or even just to tease friends. And it’s so cool that I have several of them on my desk and others in my bag, and I use them all day long for various tests. While using it, I found it so much more convenient than a VM when explaining high availability to someone that we realized it is the format of choice for students discovering load balancing and high availability. Another nice thing is that since it has two ports, it’s perfect for plugging between your PC and the LAN to observe the HTTP communications between your browser and the application you’re developing.

So we decided to prepare one hundred of them that we’ll offer to students and interns working on a load balancing project, in exchange for their promise to blog about their project’s progress. If needed, we can even send them a cluster of two. And who knows, maybe among these, someone will have a great idea and develop a worldwide successful project, and then we’ll be very proud to have provided the initial spark that made this possible. And if it helps students build a career around load balancing, we’ll be quite proud to have transmitted this passion as well!

We still have a few things to complete before it can go wild, such as a bit of documentation explaining how to get started with it. But if you think you’re going to work on a load balancing project, or you are joining a company as an intern and will be doing some stuff with web servers, this can be the perfect way to discover this amazing world and to design solutions which resist real failures caused by pulling off a cable, not just the clean “power down” button pressed in a VM. Start thinking about it now so you can reserve one (or a pair) when we launch it in the upcoming weeks. Conversely, if you absolutely want one, you just have to find a load balancing project to work on 🙂

In any case, don’t wait too long to think about your project, because stock is limited; if there is too much demand, we’ll have to be selective about the projects we support for the last ones.

Stay tuned!

HAProxy’s load-balancing algorithm for static content delivery with Varnish

HAProxy’s load-balancing algorithms

HAProxy supports many load-balancing algorithms which can be used in many different types of use cases.
That said, cache servers, which deliver most of the static content of your web applications, may require some specific load-balancing algorithms.

HAProxy stands in front of your cache servers for some good reasons:

  • SSL offloading (read PHK’s feeling about SSL, Varnish and HAProxy)
  • HTTP content switching capabilities
  • advanced load-balancing algorithms

The main purpose of this article is to show how HAProxy can be used to aggregate the memory storage of several Varnish servers in some kind of “JBOD” mode (like “Just a Bunch Of Disks”).
The examples delivered here aim to optimize the resources on the cache side, mainly its memory, in order to improve the HIT rate. This will also improve your application response time and make your site top ranked on Google 🙂

Content Switching in HAProxy

This has been covered many times on this blog.
As a quick introduction for readers who are not familiar with HAProxy, let’s explain how it works.

Clients get connected to HAProxy through a frontend. HAProxy then routes traffic to a backend (server farm), where a load-balancing algorithm is used to choose a server.
A frontend can point to multiple backends; the choice of a backend is made through ACLs and use_backend rules.
ACLs are built using fetches. A fetch is a directive which tells HAProxy where to get content from (a header, the URL, etc.).

Enough theory, let’s take a practical example: splitting static and dynamic traffic using the following rules:

  • Static content is hosted on domain names starting with ‘static.’ and ‘images.’
  • Static content file extensions are ‘.jpg’ ‘.png’ ‘.gif’ ‘.css’ ‘.js’
  • Static content can match any of the rules above
  • anything which is not static is considered dynamic

The configuration snippet below should be integrated into the HAProxy frontend. It implements the rules above to do the traffic splitting. The Varnish servers will stand in the bk_static farm.

frontend ft_public
 <frontend settings>
 acl static_domain  hdr_beg(Host)     -i static. images.
 acl static_content path_end          -i .jpg .png .gif .css .js
 use_backend bk_static if static_domain or static_content
 default_backend bk_dynamic
   
backend bk_static
 <parameters related to static content delivery>

The configuration above creates two named ACLs, ‘static_domain’ and ‘static_content’, which are used by the use_backend rule to route the traffic to the Varnish servers.

HAProxy and hash-based load-balancing algorithms


Later in this article, we’ll make heavy use of the hash-based load-balancing algorithms from HAProxy.
So here is some background (non-exhaustive, it would deserve a long blog article of its own) which will be useful for people wanting to understand what happens deep inside HAProxy.

The following parameters are taken into account when computing the hash:

  • number of servers in the farm
  • weight of each server in the farm
  • status of the servers (UP or DOWN)

If any of the parameters above changes, the whole hash computation changes too, hence a request may hit another server. This may have a negative impact on the response time of the application (during a short period of time).
Fortunately, HAProxy allows ‘consistent’ hashing, which means that only the traffic related to the change will be impacted.
That’s why you’ll see a lot of hash-type consistent directives in the configuration samples below.
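To illustrate the difference, here is a minimal sketch (the backend names are ours); the only change between the two modes is the hash-type directive:

backend bk_cache_mapbased
  balance uri
  # default 'map-based' mode: a modulo over the weighted number of running
  # servers; any change in the farm remaps almost all requests
  hash-type map-based

backend bk_cache_consistent
  balance uri
  # 'consistent' mode: only the requests owned by the changed server move
  hash-type consistent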

Load-Balancing Varnish cache servers

Now, let’s focus on the magic we can add in the bk_static server farm.

Hashing the URL

HAProxy can hash the URL to pick a server. With this load-balancing algorithm, we guarantee that a single URL will always hit the same Varnish server.

hashing the URL path only


In the example below, HAProxy hashes the URL path, which runs from the first slash ‘/’ character up to the question mark ‘?’:

backend bk_static
  balance uri
  hash-type consistent

hashing the whole url, including the query string


In some cases, the URL may contain variables in the query string, which means we must include the query string in the hash:

backend bk_static
  balance uri whole
  hash-type consistent

Query string parameter hash


That said, in some cases (APIs, etc…), hashing the whole URL is not enough. We may want to hash only a particular query string parameter.
This applies well to cases where the client can forge the URL itself and the parameters may be randomly ordered.
The configuration below tells HAProxy to apply the hash to the query string parameter named ‘id’ (e.g.: /image.php?width=512&id=12&height=256):

backend bk_static
  balance url_param id
  hash-type consistent

hash on an HTTP header


HAProxy can apply the hash to a specific HTTP header field.
The example below applies it to the Host header. This can be used by people hosting many domain names with only a few pages each, like dedicated user pages.

backend bk_static
  balance hdr(Host)
  hash-type consistent

Compose your own hash: concatenation of Host header and URL


Nowadays, HAProxy is becoming more and more flexible, and we can use this flexibility in its configuration.
Imagine that in your Varnish configuration you have a storage hash key based on the concatenation of the Host header and the URI; you may then want to apply the same load-balancing algorithm in HAProxy, to optimize your caches.

The configuration below creates a new HTTP header field named X-LB which contains the Host header concatenated with the request URI (both converted to lowercase).

backend bk_static
  http-request set-header X-LB %[req.hdr(Host),lower]%[req.uri,lower]
  balance hdr(X-LB)
  hash-type consistent

Conclusion


HAProxy and Varnish work very well together; each one can benefit from the performance and flexibility of the other.


Web application name to backend mapping in HAProxy

Synopsis

Let’s take a web application platform which many HTTP Host headers point to.
Of course, this platform hosts many backends, and HAProxy is used to perform content switching based on the Host header to route HTTP traffic to each backend.

HAProxy map


HAProxy 1.5 introduced a cool feature: converters. One converter type is map.
Long story short: a map allows HAProxy to map an input string to an output string.

A map is stored in a flat file which is loaded by HAProxy on startup. It is composed of two columns: the input string on the left, the output string on the right:

in out

Basically, if you call the map above and give it the ‘in’ string, it will return ‘out’.
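As an illustration only (the header names, file path and default value are made up for the example), here is how such a map could be called from an HAProxy configuration, using the map converter:

 http-request set-header X-Out %[req.hdr(X-In),map(/etc/haproxy/sample.map,unknown)]

With the two-column file above stored in /etc/haproxy/sample.map, a request carrying “X-In: in” gets an X-Out header valued “out”; any other input falls back to the default “unknown”.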

Mapping

Now, the interesting part of the article 🙂

As stated in the introduction, we want to map hundreds of Host headers to tens of backends.

The old way of mapping: acl and use_backend rules

Before maps, we had to use ACLs and use_backend rules, like below:

frontend ft_allapps
 [...]
 use_backend bk_app1 if { hdr(Host) -i app1.domain1.com app1.domain2.com }
 use_backend bk_app2 if { hdr(Host) -i app2.domain1.com app2.domain2.com }
 default_backend bk_default

You need one use_backend rule per application.

This works nicely for a few backends and a few domain names, but this type of configuration is hardly scalable…

The new way of mapping: one map and one use_backend rule

Now we can use map to achieve the same purpose.

First, let’s create a map file called domain2backend.map with the following content: the domain name on the left, the backend name on the right:

#domainname  backendname
app1.domain1.com bk_app1
app1.domain2.com bk_app1
app2.domain1.com bk_app2
app2.domain2.com bk_app2

And now, HAProxy configuration:

frontend ft_allapps
 [...]
 use_backend %[req.hdr(host),lower,map_dom(/etc/hapee-1.5/domain2backend.map,bk_default)]

Here is what HAProxy will do:

  1. req.hdr(host) ==> fetch the Host header from the HTTP request
  2. lower ==> convert the string into lowercase
  3. map_dom(/etc/hapee-1.5/domain2backend.map,bk_default) ==> look up the lowercase Host header in the map and return the matching backend name if found; if not found, return the default backend name provided as the second argument (bk_default)
  4. route traffic to the backend name returned by the map

Now, adding a new content switching rule just means adding one new line in the map content (and reloading HAProxy). No regexes; map data is stored in a tree, so the lookup time is very low compared to matching many strings across many ACLs and many use_backend rules.
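As a side note, recent HAProxy versions also allow editing the map content at runtime through the stats socket, without any reload. A quick sketch, assuming a socket was declared in the global section with “stats socket /var/run/haproxy.sock mode 600 level admin”:

# add a new domain-to-backend mapping on the fly
$ echo "add map /etc/hapee-1.5/domain2backend.map app3.domain1.com bk_app3" | socat stdio /var/run/haproxy.sock

Keep in mind this runtime change lives in memory only: the map file itself must still be updated for the change to survive a reload.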

simple is beautiful!!!

HAProxy map content auto update


If you are an HAPEE user (this will soon be available on the ALOHA too), you can use the lb-update module to update the content of the map automatically.
Add the following statement to your configuration:

dynamic-update
 update id domain2backend.map url https://10.0.0.1/domain2backend.map delay 60s timeout 5s retries 3 map


HAProxy advanced Redis health check

Introduction

Redis is an open-source NoSQL database working on a key/value model.
One interesting feature of Redis is that it can write data to disk, and a master can synchronize many slaves.

HAProxy can load-balance Redis servers with no issue at all.
There is even a built-in health check for Redis in HAProxy.
Unfortunately, until now there was no easy way for HAProxy to detect the status of a Redis server (master or slave node), so people usually hacked this part of the architecture.

As written in the title of this post, we’ll learn today how to build a simple Redis infrastructure thanks to the newest HAProxy advanced send/expect health checks.
This feature is available in HAProxy 1.5-dev20 and above.

The purpose is to make the Redis infrastructure as simple as possible and to ease failover for the web servers. HAProxy will have to detect which node is the MASTER and route all connections to it.

Redis high availability diagram with HAProxy

Below is an ASCII-art diagram of HAProxy load-balancing Redis servers:

+----+ +----+ +----+ +----+
| W1 | | W2 | | W3 | | W4 |   Web application servers
+----+ +----+ +----+ +----+
    \     |    |     /
     \    |    |    /
      \   |    |   /
        +---------+
        | HAProxy |
        +---------+
           /   \
       +----+ +----+
       | R1 | | R2 |           Redis servers
       +----+ +----+

The scenario is simple:

  • 4 web application servers need to store and retrieve data to/from a Redis database
  • one HAProxy server (two would be better) to load-balance Redis connections
  • at least 2 Redis servers in an active/standby mode with replication

Configuration

Below is the HAProxy configuration for this setup:

defaults REDIS
 mode tcp
 timeout connect  4s
 timeout server  30s
 timeout client  30s

frontend ft_redis
 bind 10.0.0.1:6379 name redis
 default_backend bk_redis

backend bk_redis
 option tcp-check
 tcp-check send PING\r\n
 tcp-check expect string +PONG
 tcp-check send info\ replication\r\n
 tcp-check expect string role:master
 tcp-check send QUIT\r\n
 tcp-check expect string +OK
 server R1 10.0.0.11:6379 check inter 1s
 server R2 10.0.0.12:6379 check inter 1s

The health check sequence above allows HAProxy to consider only the Redis master server as UP in the farm and to redirect connections to it.
When the Redis master server fails, the remaining nodes elect a new one. HAProxy will detect the new master thanks to this health check sequence.
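You can also validate the check sequence by hand, sending the same payload HAProxy sends, for example with socat (the address comes from the configuration above; output trimmed):

$ printf 'PING\r\ninfo replication\r\nQUIT\r\n' | socat stdio tcp:10.0.0.11:6379
+PONG
[...]
role:master
[...]
+OK

A slave would answer “role:slave” to the second probe and would therefore be seen as DOWN by HAProxy.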

It does not require any third-party tool and makes failover transparent.


Configure syslog-ng to log readable HTTP URL from HAProxy

This tip is provided by Exosec.
Exosec provides a very good monitoring product called POM, based on Nagios, with strong added value such as very simple administration, application monitoring, etc…
For some of their projects, they use either HAProxy or the ALOHA Load-Balancer (heh, what else???) and they export log entries into syslog-ng for storage and later analysis.

HAProxy’s log


HAProxy’s logs are very powerful since they provide a lot of information about the request and the status of the platform at the moment of the request.
For a readable description of HAProxy’s logs, please consult the ALOHA memo dedicated to HAProxy’s HTTP log line description.

One weakness of the log line is that it logs only the path and the query string of each URL: neither the server name nor the protocol appears.
Well, HAProxy allows us to log the Host header, which is fine, and there is a tilde ‘~’ after the frontend name when the connection is made over SSL.

Using syslog-ng’s flexible configuration, we can re-order things to make HAProxy log the URL exactly as it was sent by the client, like:
“http://www.domain.tld/url/path”

Configuration

HAProxy configuration


Note: this is a very minimalistic configuration, not recommended in production 🙂

global
 log 127.0.0.1:514 local2

frontend ft_http
 bind 127.0.0.1:8080
 bind 127.0.0.1:8081 ssl crt /etc/haproxy/haproxy.pem
 option http-server-close
 mode http
 log global
 option httplog
 # Mandatory to build the URL:
 capture request header Host       len 32
 # Optional, just for statistics:
 capture request header User-Agent len 200
 default_backend bk_http

backend bk_http
 option http-server-close
 mode http
 log global
 option httplog

 server srv1 10.0.0.1:80

Syslog-ng configuration


The configuration below reproduces an HAProxy log line, but replaces the URL part with something more readable.

Note: if you capture a different number of HTTP headers in HAProxy (the current example captures 2 headers), you may have to update the parser p_haproxy_headers_req and the destination d_haproxy_full.

source s_loopback { syslog(ip(127.0.0.1) port(514) transport("udp")); };

destination d_haproxy_full {
     file("/var/log/haproxy.$YEAR-$MONTH-$DAY.log"
          template("$DATE $FULLHOST $PROGRAM: ${HAPROXY.CLIENT_IPPORT} [${HAPROXY.DATE}] ${HAPROXY.FRONTEND} ${HAPROXY.BACKEND} ${HAPROXY.TIME} ${HAPROXY.STATUS_CODE} ${HAPROXY.BYTES_READ} ${HAPROXY.COOKIE_REQ} ${HAPROXY.COOKIE_RESP} ${HAPROXY.TERM_STATE} ${HAPROXY.RUN_STATE} ${HAPROXY.QUEUE_STATE} {${HAPROXY.HEADERS_REQ}} "${HAPROXY.METHOD} ${HAPROXY.FRONTEND_PROTOCOL}://${HAPROXY.HOST}${HAPROXY.URL} ${HAPROXY.HTTP_VERSION}"n")
          group(adm) perm(0640) dir_perm(0750) template_escape(no)
         );
};

filter f_haproxy { program("haproxy"); };
filter f_frontend_ssl { match("~ "); };

rewrite r_set_frontend_protocol {
  set("http", value("HAPROXY.FRONTEND_PROTOCOL") condition(filter(f_haproxy)));
  set("https", value("HAPROXY.FRONTEND_PROTOCOL") condition(filter(f_frontend_ssl)));
};

parser p_haproxy {
  csv-parser(columns("HAPROXY.CLIENT_IPPORT", "HAPROXY.DATE",
                     "HAPROXY.FRONTEND", "HAPROXY.BACKEND",
                     "HAPROXY.TIME", "HAPROXY.STATUS_CODE",
                     "HAPROXY.BYTES_READ", "HAPROXY.COOKIE_REQ",
                     "HAPROXY.COOKIE_RESP", "HAPROXY.TERM_STATE",
                     "HAPROXY.RUN_STATE", "HAPROXY.QUEUE_STATE",
                     "HAPROXY.HEADERS_REQ", "HAPROXY.REQUEST")
             flags(escape-double-char,strip-whitespace)
             delimiters(" ")
             quote-pairs('""[]{}'));
};

parser p_haproxy_request {
  csv-parser(columns("HAPROXY.METHOD", "HAPROXY.URL",
                     "HAPROXY.HTTP_VERSION")
             delimiters(" ")
             flags(escape-none)
             template("${HAPROXY.REQUEST}"));
};

parser p_haproxy_headers_req {
  csv-parser(columns("HAPROXY.HOST", "HAPROXY.USER_AGENT")
             delimiters("|")
             flags(escape-none)
             template("${HAPROXY.HEADERS_REQ}"));
};

log {
  source(s_loopback);
  filter(f_haproxy);
  parser(p_haproxy);
  parser(p_haproxy_request);
  parser(p_haproxy_headers_req);
  rewrite(r_set_frontend_protocol);
  destination(d_haproxy_full);
};

This configuration can be downloaded from the HAProxy Technologies GitHub: https://raw.github.com/exceliance/haproxy/master/logs/syslog-ng_full_http_url.conf

And the result looks like below at the end of the logged line:

[...] "GET http://test.domain.tld/blah HTTP/1.1"
[...] "GET https://test.domain.tld/blah HTTP/1.1"


HAProxy, high mysql request rate and TCP source port exhaustion

Synopsis


At HAProxy Technologies, we provide professional services around HAProxy: this includes HAProxy itself, of course, but also the underlying OS tuning, advice and recommendations about the architecture, and sometimes we also help customers troubleshoot application layer issues.
We don’t fix issues for the customer, but using information provided by HAProxy, we are able to narrow down the investigation area, saving the customer’s time and money.
The story I’m relating today comes from one of these engagements.

One of our customers is a hosting company which hosts some very busy PHP / MySQL websites. They were already using HAProxy successfully in front of their application servers.
They used to have a single MySQL server, which was a kind of SPOF and had to handle several thousand requests per second.
Sometimes, they had issues with this DB: the clients (hence the web servers) would hang when using it.

So they decided to use MySQL replication and build an active/passive cluster. They also decided to split reads (SELECT queries) and writes (DELETE, INSERT, UPDATE queries) at the application level.
They were then able to move the MySQL servers behind HAProxy.

Enough for the introduction 🙂 Today’s article discusses HAProxy and MySQL at high request rates, and an error some of you may already have encountered: TCP source port exhaustion (the famous high number of sockets in TIME_WAIT).

Diagram


So basically, we have here a standard web platform which involves HAProxy to load-balance MySQL:
[diagram: HAProxy load-balancing a replicated MySQL setup (one master for writes, slaves for reads)]

The MySQL master server receives all WRITE requests, while READ requests are load-balanced, with weights (the slaves having a higher weight than the master), across all the MySQL servers; see the sketch below.
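As a sketch of what this can look like in HAProxy (the IP addresses, ports, weights and check user below are ours, not the customer’s values), with one listener per query type:

listen mysql_writes
 bind 10.0.0.2:3306
 mode tcp
 option mysql-check user haproxy_check
 server master 10.0.0.10:3306 check

listen mysql_reads
 bind 10.0.0.3:3306
 mode tcp
 option mysql-check user haproxy_check
 server master 10.0.0.10:3306 check weight 10
 server slave1 10.0.0.11:3306 check weight 20
 server slave2 10.0.0.12:3306 check weight 20

The application sends writes to the first listener and reads to the second one; the lower weight on the master keeps most of the SELECT traffic on the slaves (the mysql-check user must exist in MySQL).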

MySQL scalability

One way of scaling MySQL is to use replication: one MySQL server is designated as the master and manages all the write operations (DELETE, INSERT, UPDATE, etc…). For each operation, it notifies all the MySQL slave servers. The slaves can then be used for reading only, offloading these requests from the master.
IMPORTANT NOTE: replication only allows the read part to scale, so if your application requires many more writes, then this is not the method for you.

Of course, a MySQL slave server can be promoted to master when the master fails! This also ensures MySQL high availability.

So, where is the problem ???

This type of platform works very well for the majority of websites. The problem occurs when you start having a high rate of requests; by high, I mean several thousand per second.

TCP source port exhaustion

HAProxy works as a reverse proxy and so uses its own IP address to get connected to the server.
Any system has around 64K TCP source ports available to get connected to a remote IP:port. Once a combination of “source IP:port => dst IP:port” is in use, it can’t be re-used.
First lesson: you can’t have more than 64K open connections from an HAProxy box to a single remote IP:port couple. I think only people load-balancing MS Exchange RPC services or SharePoint with NTLM may one day reach this limit…
(Well, it is possible to work around this limit using some hacks we’ll explain later in this article.)

Why does TCP port exhaustion occur with MySQL clients???


As I said, the MySQL request rate was a few thousand per second, so we never ever reached this limit of 64K simultaneously opened connections to the remote service…
What’s up then???
Well, there is an issue with the MySQL client library: when a client sends its “QUIT” sequence, it performs a few internal operations and then immediately shuts down the TCP connection, without waiting for the server to do it. A basic tcpdump will show it to you easily.
Note that you won’t be able to reproduce this issue on a loopback interface, because the server answers fast enough… You must use a LAN connection and 2 different servers.

Basically, here is the sequence currently performed by a MySQL client:

Mysql Client ==> "QUIT" sequence ==> Mysql Server
Mysql Client ==>       FIN       ==> MySQL Server
Mysql Client <==     FIN ACK     <== MySQL Server
Mysql Client ==>       ACK       ==> MySQL Server

This leaves the client connection (hence its source port) unavailable for twice the MSL (Maximum Segment Lifetime), which means 2 minutes.
Note: this type of close has no negative impact when the connection is made over a UNIX socket.

Explanation of the issue (much better than I could explain it myself):
“There is no way for the person who sent the first FIN to get an ACK back for that last ACK. You might want to reread that now. The person that initially closed the connection enters the TIME_WAIT state; in case the other person didn’t really get the ACK and thinks the connection is still open. Typically, this lasts one to two minutes.” (Source)

Since a source port remains unavailable to the system for 2 minutes, above 533 MySQL requests per second you’re in danger of TCP source port exhaustion: 64000 (available ports) / 120 (number of seconds in 2 minutes) = 533.333.
This TCP port exhaustion appears on the MySQL client server itself, but also on the HAProxy box, because it forwards the client traffic to the server… And since we have many web servers, it happens much faster on the HAProxy box!!!!
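A simple way to see how close you are to the limit is to count the sockets in TIME_WAIT on the HAProxy box; a Linux sketch (the count is approximate, ss prints a header line):

$ ss -nt state time-wait | wc -l

When this figure approaches the size of your source port range, exhaustion is imminent.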

Remember: at traffic spikes, my customer had a few thousand requests per second….

How to avoid TCP source port exhaustion?


Here comes THE question!!!!
First, a “clean” sequence should be:

Mysql Client ==> "QUIT" sequence ==> Mysql Server
Mysql Client <==       FIN       <== MySQL Server
Mysql Client ==>     FIN ACK     ==> MySQL Server
Mysql Client <==       ACK       <== MySQL Server

Actually, this sequence happens when both the MySQL client and server are hosted on the same box and use the loopback interface; that’s why I said earlier that if you want to reproduce the issue, you must add “latency” between the client and the server, and hence use 2 boxes over the LAN.
So, until MySQL rewrites the code to follow the sequence above, there won’t be any improvement here!!!!

Increasing source port range


By default, on a Linux box, you have around 28K source ports available (for a single destination IP:port):

$ sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768    61000

In order to get 64K source ports, just run:

$ sudo sysctl net.ipv4.ip_local_port_range="1025 65000"

And don’t forget to update your /etc/sysctl.conf file!!!
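For the record, the matching line in /etc/sysctl.conf would be:

net.ipv4.ip_local_port_range = 1025 65000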

Note: this should definitely be applied on the web servers too….

Allow re-using source ports in TIME_WAIT


A few sysctls can be used to tell the kernel to re-use connections in TIME_WAIT faster:

net.ipv4.tcp_tw_reuse
net.ipv4.tcp_tw_recycle

tw_reuse can be used safely, but be careful with tw_recycle… it can have side effects: several clients behind the same NAT device might not be able to get connected anymore. So only use it if your HAProxy box is fully dedicated to your MySQL setup.

Anyway, these sysctls were already properly set up (value = 1) on both the HAProxy box and the web servers.

Note: tw_reuse should definitely be applied on the web servers too….

Using multiple IPs to get connected to a single server


In the HAProxy configuration, you can specify on the server line which source IP address to use to get connected to a server; so just add more server lines with different source IPs.
In the example below, the IPs 10.0.0.100 and 10.0.0.101 are configured on the HAProxy box:

[...]
  server mysql1     10.0.0.1:3306 check source 10.0.0.100
  server mysql1_bis 10.0.0.1:3306 check source 10.0.0.101
[...]

This allows us to open up to 128K source TCP ports…
The kernel is responsible for assigning a new TCP source port when HAProxy requests one. Despite improving things a bit, we still reached source port exhaustion… We could not get over 80K connections in TIME_WAIT with 4 source IPs…

Let HAProxy manage TCP source ports


You can let HAProxy decide which source port to use when opening a new TCP connection, on behalf of the kernel. HAProxy has built-in functions for this which make it more efficient at the task than a regular kernel.

Let’s update the configuration above:

[...]
  server mysql1     10.0.0.1:3306 check source 10.0.0.100:1025-65000
  server mysql1_bis 10.0.0.1:3306 check source 10.0.0.101:1025-65000
[...]

We managed to get over 170K connections in TIME_WAIT with 4 source IPs… and no source port exhaustion anymore!!!!

Use a memcache


Fortunately, the developers at this customer are skilled and write flexible code 🙂
So they managed to move some requests from the MySQL DB to a memcache, opening far fewer connections.

Use MySQL persistent connections


This could prevent fine-grained load-balancing on the read-only farm, but it would be very efficient on the MySQL master server.

Conclusion

  • If you see SOCKERR information messages in HAProxy logs (mainly on health checks), you may be running out of TCP source ports.
  • Have skilled developers who write flexible code, where moving from one DB to another is made easy.
  • This kind of issue can only happen with protocols or applications where the client closes the connection first.
  • This issue can’t happen with HAProxy in HTTP mode, since it lets the server close the connection before sending a TCP RST.


Websockets load-balancing with HAProxy

Why Websocket ???


The HTTP protocol is connection-less and only the client can request information from a server. A server cannot contact a client on its own… HTTP is purely half-duplex. Furthermore, a server can answer only once to a client request.
Some websites or web applications require the server to update the client from time to time. There were a few ways to do so:

  • the client requests the server at regular intervals to check whether new information is available (polling)
  • the client sends a request to the server and the server answers as soon as it has information to provide (also known as long polling)

But those methods have many drawbacks due to HTTP limitations.
So a new protocol has been designed: websockets, which allow two-way (full duplex) communication between a client and a server over a single TCP connection. Furthermore, a websocket re-uses the HTTP connection it was initialized on, which means it uses the standard HTTP TCP port.

How does websocket work ???

Basically, a websocket starts with an HTTP request like the one below:

GET / HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: avkFOZvLE0gZTtEyrZPolA==
Host: localhost:8080
Sec-WebSocket-Protocol: echo-protocol

The most important part is the “Connection: Upgrade” header, which lets the server know that the client wants to switch to another protocol, whose name is provided by the “Upgrade: websocket” header.

When a server with websocket capability receives the request above, it answers with a response like the one below:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: tD0l5WXr+s0lqKRayF9ABifcpzY=
Sec-WebSocket-Protocol: echo-protocol

The most important part is the status code 101, which acknowledges the protocol switch (from HTTP to websocket), as well as the “Connection: Upgrade” and “Upgrade: websocket” headers.

From then on, the TCP connection used for the HTTP request/response exchange is used for the websocket: whenever a peer wants to interact with the other peer, it can use it.

The websocket ends as soon as one peer decides to close it, or when the TCP connection is closed.

HAProxy and Websockets


As seen above, there are 2 protocols embedded in websockets:

  1. HTTP: for the websocket setup
  2. TCP: websocket data exchange

HAProxy must be able to support websockets on these two protocols without breaking the TCP connection at any time.
There are 2 things to take care of:

  1. being able to switch a connection from HTTP to TCP without breaking it
  2. smartly managing timeouts for both protocols at the same time

Fortunately, HAProxy embeds all you need to load-balance websockets properly and can meet the 2 requirements above.
It can even route regular HTTP traffic and websocket traffic to different backends, and perform websocket-aware health checks (setup phase only).

The diagram below shows how things happen and the HAProxy timeouts involved in each phase:
[diagram: the websocket setup phase and tunnel phase, with the HAProxy timeouts involved in each]

During the setup phase, HAProxy can work in HTTP mode, processing layer 7 information. It automatically detects the Connection: Upgrade exchange and is ready to switch to tunnel mode if the upgrade negotiation succeeds. During this phase, there are 3 timeouts involved:

  1. timeout client: client inactivity
  2. timeout connect: allowed TCP connection establishment time
  3. timeout server: allowed time to the server to process the request

If everything goes well, the websocket is established, and HAProxy then switches to tunnel mode; no data is analyzed anymore (and anyway, websockets do not speak HTTP). There is a single timeout involved:

  1. timeout tunnel: takes precedence over the client and server timeouts

timeout connect is not used since the TCP connection is already established 🙂

Testing websocket with node.js

node.js is a platform which can host applications. It has a websocket module we’ll use in the test below.

Here is the procedure to install node.js and the websocket module on Debian Squeeze.
The example code comes from https://github.com/Worlize/WebSocket-Node (at the bottom of the page).

So basically, I’ll have 2 servers, each one hosting web pages on Apache and an echo application over websocket hosted by node.js. High availability and routing are managed by HAProxy.

Configuration


Simple configuration


In this configuration, the websocket and the web server are served by the same application.
HAProxy switches automatically from HTTP to tunnel mode when the client requests a websocket.

defaults
  mode http
  log global
  option httplog
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  backlog 10000
  timeout client          25s
  timeout connect          5s
  timeout server          25s
# timeout tunnel available in ALOHA 5.5 or HAProxy 1.5-dev10 and higher
  timeout tunnel        3600s
  timeout http-keep-alive  1s
  timeout http-request    15s
  timeout queue           30s
  timeout tarpit          60s
  default-server inter 3s rise 2 fall 3
  option forwardfor

frontend ft_web
  bind 192.168.10.3:80 name http
  maxconn 10000
  default_backend bk_web

backend bk_web                      
  balance roundrobin
  server websrv1 192.168.10.11:8080 maxconn 10000 weight 10 cookie websrv1 check
  server websrv2 192.168.10.12:8080 maxconn 10000 weight 10 cookie websrv2 check

Advanced Configuration


The configuration below allows requests to be routed based on either the Host header (if you have a dedicated host for your websocket calls) or the Connection and Upgrade headers (required to switch to websocket).
In the backend dedicated to websockets, HAProxy validates the setup phase and also ensures the user is requesting a valid application name.
HAProxy also performs a websocket health check, sending a Connection upgrade request and expecting a 101 response status code. We can’t go further on the health check for now.
Optional: the web server is hosted on Apache, but could be hosted by node.js as well.

defaults
  mode http
  log global
  option httplog
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  backlog 10000
  timeout client          25s
  timeout connect          5s
  timeout server          25s
# timeout tunnel available in ALOHA 5.5 or HAProxy 1.5-dev10 and higher
  timeout tunnel        3600s
  timeout http-keep-alive  1s
  timeout http-request    15s
  timeout queue           30s
  timeout tarpit          60s
  default-server inter 3s rise 2 fall 3
  option forwardfor



frontend ft_web
  bind 192.168.10.3:80 name http
  maxconn 60000

## routing based on Host header
  acl host_ws hdr_beg(Host) -i ws.
  use_backend bk_ws if host_ws

## routing based on websocket protocol header
  acl hdr_connection_upgrade hdr(Connection)  -i upgrade
  acl hdr_upgrade_websocket  hdr(Upgrade)     -i websocket

  use_backend bk_ws if hdr_connection_upgrade hdr_upgrade_websocket
  default_backend bk_web



backend bk_web                                                   
  balance roundrobin                                             
  option httpchk HEAD /                                          
  server websrv1 192.168.10.11:80 maxconn 100 weight 10 cookie websrv1 check
  server websrv2 192.168.10.12:80 maxconn 100 weight 10 cookie websrv2 check



backend bk_ws                                                    
  balance roundrobin

## websocket protocol validation
  acl hdr_connection_upgrade hdr(Connection)                 -i upgrade
  acl hdr_upgrade_websocket  hdr(Upgrade)                    -i websocket
  acl hdr_websocket_key      hdr_cnt(Sec-WebSocket-Key)      eq 1
  acl hdr_websocket_version  hdr_cnt(Sec-WebSocket-Version)  eq 1
  http-request deny if ! hdr_connection_upgrade ! hdr_upgrade_websocket ! hdr_websocket_key ! hdr_websocket_version

## ensure our application protocol name is valid 
## (don't forget to update the list each time you publish new applications)
  acl ws_valid_protocol hdr(Sec-WebSocket-Protocol) echo-protocol
  http-request deny if ! ws_valid_protocol

## websocket health checking
  option httpchk GET / HTTP/1.1\r\nHost:\ ws.domain.com\r\nConnection:\ Upgrade\r\nUpgrade:\ websocket\r\nSec-WebSocket-Key:\ haproxy\r\nSec-WebSocket-Version:\ 13\r\nSec-WebSocket-Protocol:\ echo-protocol
  http-check expect status 101

  server websrv1 192.168.10.11:8080 maxconn 30000 weight 10 cookie websrv1 check
  server websrv2 192.168.10.12:8080 maxconn 30000 weight 10 cookie websrv2 check

Note that HAProxy could also be used to select different websocket applications based on the Sec-WebSocket-Protocol header of the setup phase (I’ll write an article about it later).
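As a hedged sketch of what such a selection could look like in the frontend above (the second protocol and both backend names are made up for the example), re-using the Connection/Upgrade ACLs already defined:

  acl ws_echo hdr(Sec-WebSocket-Protocol) -i echo-protocol
  acl ws_chat hdr(Sec-WebSocket-Protocol) -i chat-protocol
  use_backend bk_ws_echo if hdr_connection_upgrade hdr_upgrade_websocket ws_echo
  use_backend bk_ws_chat if hdr_connection_upgrade hdr_upgrade_websocket ws_chat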
