Tag Archives: haproxy

Asymmetric routing, multiple default gateways on Linux with HAProxy

Why might we need multiple default gateways?

Nowadays, Application Delivery Controllers (aka Load-Balancers) have become the entry point for all the applications hosted in a company or administration.
That said, many different types of users may access the applications:
  * internal users from the LAN
  * partners through MPLS or VPNs
  * external users from internet

On the other side, applications could be hosted on different VLANs in the architecture:
  * internal LAN
  * external DMZ

The diagram below shows the “big picture” of this type of architecture:
[Diagram: multiple_default_gateways]

Routing in the Linux network stack

I’m not going to explain in depth how it works, sorry… It would deserve a complete blog post 🙂
That said, any device connected to an IP network needs an IP address to talk to other devices in its LAN. It also needs a default gateway to reach devices located outside its LAN.
The Linux kernel uses a single default gateway at a time, but thanks to the metric you can configure several of them.
When needed, the kernel parses the default gateway table and uses the entry with the lowest metric. By default, when no metric is configured, the kernel assigns a metric of 0.
Each default gateway must have a unique metric in your kernel IP stack.
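
As a rough illustration of the selection rule described above, the sketch below models the default gateway table as a plain list and picks the entry with the lowest metric. The DefaultRoute type, the function name and the sample addresses are hypothetical; this is only a model of the rule, not how the kernel is implemented.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DefaultRoute:
    gateway: str
    device: str
    metric: int = 0  # metric 0 is assumed when none is configured


def pick_default_route(routes: List[DefaultRoute]) -> DefaultRoute:
    """Return the default route with the lowest metric."""
    if not routes:
        raise ValueError("no default route configured")
    return min(routes, key=lambda route: route.metric)


# Hypothetical table: three default gateways with distinct metrics.
routes = [
    DefaultRoute("10.0.1.1", "eth1", metric=1),
    DefaultRoute("10.0.0.1", "eth0"),
    DefaultRoute("10.0.2.1", "eth2", metric=2),
]
print(pick_default_route(routes).device)  # prints "eth0"
```

Because each metric is unique, there is never a tie to break, which is why the model can rely on a simple minimum.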

How can HAProxy help in such a situation?


Users access applications through an HAProxy bind. The bind can be hosted on any IP address, whether or not it is configured on the server (play with your sysctl for this purpose).
By default, the traffic comes into HAProxy through this bind and HAProxy lets the kernel choose the most appropriate default gateway to forward the answer to the client. As we’ve seen above, the most appropriate default gateway from the kernel’s point of view is the one with the lowest metric, usually 0.

That said, HAProxy is smart enough to tell the kernel which network interface to use to forward the response to the client. Just add the statement interface ethX (where X is the id of the interface you want to use) on the HAProxy bind line.
With this parameter, HAProxy forces the kernel to use the default gateway associated with the network interface ethX if it exists; otherwise, the default gateway with the lowest metric is used.

Security concern


From a security point of view, some security managers would say that it is absolutely insecure to plug a device into multiple DMZs or VLANs. They are right. But usually, the business of this type of company is very important and they can afford one load-balancer per DMZ or LAN.
That said, there is no security breach with the setup introduced here. HAProxy is a reverse-proxy, so you don’t need to allow ip_forward between all interfaces for this solution to work.
I mean that nobody can use the load-balancer as a default gateway to reach another subnet, bypassing the firewall…
The only traffic allowed to pass through is the load-balanced traffic!

Configuration

The configuration below applies to the ALOHA Loadbalancer. Just update the content to match your Linux distribution configuration syntax.
The configuration is also related to the diagram above.

Network configuration


In your ALOHA, go to the Services tab, then edit the Network configuration.
To keep it simple, I’m not going to add any VRRP configuration.

service network eth0
    ########## eth0.
    auto on
    ip   address 10.0.0.2/24
    ip   route   default 10.0.0.1

service network eth1
    ########## eth1.
    auto on
    ip   address 10.0.1.2/24
    ip   route   default 10.0.1.1 metric 1

service network eth2
    ########## eth2.
    auto on
    ip   address 10.0.2.2/24
    ip   route   default 10.0.2.1 metric 2

service network eth3
    ########## eth3.
    auto on
    ip   address 10.0.3.2/24
    ip   route   default 10.0.3.1 metric 3

service network eth4
    ########## eth4.
    auto on
    ip   address 10.0.4.2/24
    ip   route   default 10.0.4.1 metric 4

The routing table from the ALOHA looks like:

default via 10.0.0.1 dev eth0
default via 10.0.1.1 dev eth1  metric 1
default via 10.0.2.1 dev eth2  metric 2
default via 10.0.3.1 dev eth3  metric 3
default via 10.0.4.1 dev eth4  metric 4

HAProxy configuration for Corporate website or ADFS proxies


These services are used by internet users only.

frontend ft_www
 bind 10.0.0.2:80
[...]

There is no need to specify any interface here: since the traffic comes from the internet, HAProxy can let the kernel use the default gateway which points in that direction (here eth0).

HAProxy configuration for Exchange 2010 or 2013


This service is used by both internal and internet users.

frontend ft_exchange
 bind 10.0.0.3:443
 bind 10.0.2.3:443 interface eth2
[...]

The responses to internet users will go through eth0, while those for internal LAN users will use the default gateway configured on eth2 (10.0.2.1).

HAProxy configuration for Sharepoint 2010 or 2013


This service is used by MPLS/VPN users and internal users.

frontend ft_sharepoint
 bind 10.0.1.4:443 interface eth1
 bind 10.0.2.4:443 interface eth2
[...]

The responses to MPLS/VPN users will go through the eth1 default gateway (10.0.1.1), while those for internal LAN users will use the default gateway configured on eth2 (10.0.2.1).

Howto transparent proxying and binding with HAProxy and ALOHA Load-Balancer

Transparent Proxy

HAProxy works as a reverse-proxy, which means it maintains 2 connections when allowing a client to cross it:
  – 1 connection between HAProxy and the client
  – 1 connection between HAProxy and the server

HAProxy then manipulates buffers between these two connections.
One of the drawbacks of this mode is that HAProxy lets the kernel establish the connection to the server. The kernel uses a local IP address to do this.
Because of this, HAProxy “hides” the client IP behind its own: this can be an issue in some cases.
Here comes the transparent proxy mode: HAProxy can be configured to spoof the client IP address when establishing the TCP connection to the server. That way, the server thinks the connection comes from the client directly (of course, the server must answer back to HAProxy and not to the client, otherwise it can’t work: the client would get an acknowledgement from the server IP while it has established the connection to HAProxy’s IP).

Transparent binding

By default, when we want HAProxy to receive traffic, we have to tell it to bind to an IP address and a port.
The IP address must exist on the operating system (unless you have set the sysctl net.ipv4.ip_nonlocal_bind) and the OS must announce its availability to the other devices on the network through the ARP protocol.
Well, in some cases we want HAProxy to be able to catch traffic on the fly without configuring any IP address or VRRP or whatever…
This is where transparent binding comes in: HAProxy can be configured to catch traffic on the fly even if the destination IP address is not configured on the server.
These IP addresses will never be pingable, but they’ll deliver the services configured in HAProxy.

HAProxy and the Linux Kernel

Unfortunately, HAProxy can’t do transparent binding or proxying alone. It relies on a properly compiled and tuned Linux kernel and operating system.
Below, I’ll explain how to do this in a standard Linux distribution.
Here is the checklist:
  1. appropriate HAProxy compilation option
  2. appropriate Linux Kernel compilation option
  3. sysctl settings
  4. iptables rules
  5. ip route rules
  6. HAProxy configuration

HAProxy compilation requirements


First of all, HAProxy must be compiled with the TPROXY option enabled (USE_LINUX_TPROXY).
It is enabled by default when you use the target linux26 or linux2628.

Linux Kernel requirements

You have to ensure your kernel has been compiled with the following options:
  – CONFIG_NETFILTER_TPROXY
  – CONFIG_NETFILTER_XT_TARGET_TPROXY

Of course, iptables must be enabled as well in your kernel 🙂

sysctl settings

The following sysctls must be enabled:
  – net.ipv4.ip_forward
  – net.ipv4.ip_nonlocal_bind

iptables rules


You must set up the following iptables rules:

iptables -t mangle -N DIVERT
iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
iptables -t mangle -A DIVERT -j MARK --set-mark 1
iptables -t mangle -A DIVERT -j ACCEPT

The purpose is to mark packets which match a socket bound locally (by HAProxy).

IP route rules


Then, tell the Operating System to forward packets marked by iptables to the loopback where HAProxy can catch them:

ip rule add fwmark 1 lookup 100
ip route add local 0.0.0.0/0 dev lo table 100

HAProxy configuration


Finally, you can configure HAProxy.
  * Transparent binding can be configured like this:

[...]
frontend ft_application
  bind 1.1.1.1:80 transparent
[...]

  * Transparent proxying can be configured like this:

[...]
backend bk_application
  source 0.0.0.0 usesrc clientip
[...]

Transparent mode in the ALOHA Load-Balancer


Now, the same steps in the ALOHA Load-balancer, which is an HAProxy based load-balancing appliance:
  1-5. not required, the ALOHA kernel is deeply tuned for this purpose
  6. HAProxy configuration

LB Admin tab (AKA click mode)

  * Transparent binding can be configured like this, when editing a Frontend listener:
[Screenshot: frontend_listener_transparent]

  * Transparent proxying can be configured like this when editing a farm:
[Screenshot: backend_transparent]

LB Layer 7 tab (vi in a browser mode)


  * Transparent binding can be configured like this:

[...]
frontend ft_application
  bind 1.1.1.1:80 transparent
[...]

  * Transparent proxying can be configured like this:

[...]
backend bk_application
  source 0.0.0.0 usesrc clientip
[...]

SSL Client certificate information in HTTP headers and logs

HAProxy and SSL

HAProxy has many nice SSL features, even though SSL support was introduced only recently.

One of those features is client-side certificate management, which has already been discussed on the blog.
One thing was missing from that article, since HAProxy did not have the feature when I first wrote it: the ability to insert client certificate information into HTTP headers and to report it in the log line as well.

Fortunately, the devs at HAProxy Technologies keep on improving HAProxy and it is now available (well, it has been for some time, but I did not have time to write the article until now).

OpenSSL commands to generate SSL certificates

Well, just take the script from the HAProxy Technologies github, follow the instructions and you’ll have an environment set up in a few seconds.
Here is the script: https://github.com/exceliance/haproxy/tree/master/blog/ssl_client_certificate_management_at_application_level

Configuration

The configuration below shows a frontend and a backend with SSL offloading and with insertion of client certificate information into HTTP headers. As you can see, this is pretty straightforward.

frontend ft_www
 bind 127.0.0.1:8080 name http
 bind 127.0.0.1:8081 name https ssl crt ./server.pem ca-file ./ca.crt verify required
 log-format %ci:%cp [%t] %ft %b/%s %Tq/%Tw/%Tc/%Tr/%Tt %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs {%[ssl_c_verify],%{+Q}[ssl_c_s_dn],%{+Q}[ssl_c_i_dn]} %{+Q}r
 http-request set-header X-SSL                  %[ssl_fc]
 http-request set-header X-SSL-Client-Verify    %[ssl_c_verify]
 http-request set-header X-SSL-Client-DN        %{+Q}[ssl_c_s_dn]
 http-request set-header X-SSL-Client-CN        %{+Q}[ssl_c_s_dn(cn)]
 http-request set-header X-SSL-Issuer           %{+Q}[ssl_c_i_dn]
 http-request set-header X-SSL-Client-NotBefore %{+Q}[ssl_c_notbefore]
 http-request set-header X-SSL-Client-NotAfter  %{+Q}[ssl_c_notafter]
 default_backend bk_www

backend bk_www
 cookie SRVID insert nocache
 server server1 127.0.0.1:8088 maxconn 1

To observe the result, I just faked a server using netcat and looked at the headers sent by HAProxy:

X-SSL: 1
X-SSL-Client-Verify: 0
X-SSL-Client-DN: "/C=FR/ST=Ile de France/L=Jouy en Josas/O=haproxy.com/CN=client1/emailAddress=ba@haproxy.com"
X-SSL-Client-CN: "client1"
X-SSL-Issuer: "/C=FR/ST=Ile de France/L=Jouy en Josas/O=haproxy.com/CN=haproxy.com/emailAddress=ba@haproxy.com"
X-SSL-Client-NotBefore: "130613144555Z"
X-SSL-Client-NotAfter: "140613144555Z"

And the associated log line which has been generated:

Jun 13 18:09:49 localhost haproxy[32385]: 127.0.0.1:38849 [13/Jun/2013:18:09:45.277] ft_www~ bk_www/server1 
1643/0/1/-1/4645 504 194 - - sHNN 0/0/0/0/0 0/0 
{0,"/C=FR/ST=Ile de France/L=Jouy en Josas/O=haproxy.com/CN=client1/emailAddress=ba@haproxy.com",
"/C=FR/ST=Ile de France/L=Jouy en Josas/O=haproxy.com/CN=haproxy.com/emailAddress=ba@haproxy.com"} "GET /" 

NOTE: I have inserted a few CRLF to make it easily readable.

Now, my HAProxy can deliver the following information to my web server:
  * ssl_fc: did the client use a secured connection (1) or not (0)
  * ssl_c_verify: the result code of the client certificate verification (0 means no error)
  * ssl_c_s_dn: the full Distinguished Name of the certificate presented by the client
  * ssl_c_s_dn(cn): same as above, but extracts only the Common Name
  * ssl_c_i_dn: the full Distinguished Name of the issuer of the certificate presented by the client
  * ssl_c_notbefore: start date of the certificate presented by the client, as a formatted string YYMMDDhhmmss[Z]
  * ssl_c_notafter: end date of the certificate presented by the client, as a formatted string YYMMDDhhmmss[Z]
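
On the application side, these headers arrive as plain strings, so the web server has to decode them. Here is a minimal sketch, assuming the quoted header values shown earlier (dates ending with Z, DN components separated by slashes). The function names are hypothetical, and the naive split on "/" would break on a DN whose values contain a slash.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_cert_date(value: str) -> datetime:
    """Parse a YYMMDDhhmmssZ header value into a timezone-aware UTC datetime."""
    raw = value.strip('"')
    return datetime.strptime(raw, "%y%m%d%H%M%SZ").replace(tzinfo=timezone.utc)


def extract_cn(dn: str) -> Optional[str]:
    """Return the CN component of a slash-separated DN, or None if absent."""
    for part in dn.strip('"').split("/"):
        if part.startswith("CN="):
            return part[len("CN="):]
    return None


def is_within_validity(not_before: str, not_after: str, now: datetime) -> bool:
    """Check that 'now' falls inside the certificate validity window."""
    return parse_cert_date(not_before) <= now <= parse_cert_date(not_after)


dn = '"/C=FR/ST=Ile de France/L=Jouy en Josas/O=haproxy.com/CN=client1/emailAddress=ba@haproxy.com"'
print(extract_cn(dn))                      # prints "client1"
print(parse_cert_date('"140613144555Z"'))  # prints "2014-06-13 14:45:55+00:00"
```

Note that HAProxy has already verified the certificate here (verify required on the bind line), so such a check on the server would only be an extra safety net or a source of information for the application.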

Configure syslog-ng to log readable HTTP URL from HAProxy

This tip is provided by Exosec.
Exosec provides a very good monitoring product called POM, based on Nagios with strong added value such as very simple administration, application monitoring, etc…
For some of their projects, they use either HAProxy or the ALOHA Load-Balancer (heh, what else???) and they export log entries into syslog-ng for storage and later analysis.

HAProxy’s log


HAProxy’s logs are very powerful since they provide a lot of information about the request and the status of the platform at the moment of the request.
For a readable description of HAProxy’s logs, please consult the ALOHA memo dedicated to HAProxy’s HTTP log line description.

One of the weaknesses of the log line is that it logs only the path and the query string of each URL: no server name and no protocol information.
Well, HAProxy allows us to log the Host header, which is fine, and there is a tilde ‘~’ after the frontend name when the connection is made over SSL.

Using syslog-ng’s flexible configuration, we can re-order things to log the URL exactly as it was sent by the client, like:
“http://www.domain.tld/url/path”

Configuration

HAProxy configuration


Note: this is a very minimalistic configuration, not recommended in production 🙂

global
 log 127.0.0.1:514 local2

frontend ft_http
 bind 127.0.0.1:8080
 bind 127.0.0.1:8081 ssl crt /etc/haproxy/haproxy.pem
 option http-server-close
 mode http
 log global
 option httplog
 # Mandatory to build the URL:
 capture request header Host       len 32
 # Optional, just for statistics:
 capture request header User-Agent len 200
 default_backend bk_http

backend bk_http
 option http-server-close
 mode http
 log global
 option httplog

 server srv1 10.0.0.1:80

Syslog-ng configuration


The configuration below will reproduce an HAProxy log line, but will replace the URL part by something more readable.

Note: if you capture a different number of HTTP headers in HAProxy (current example contains 2 captured headers), you may have to update the parser p_haproxy_headers_req and the destination d_haproxy_full.

source s_loopback { syslog(ip(127.0.0.1) port(514) transport("udp")); };

destination d_haproxy_full {
     file("/var/log/haproxy.$YEAR-$MONTH-$DAY.log"
          template("$DATE $FULLHOST $PROGRAM: ${HAPROXY.CLIENT_IPPORT} [${HAPROXY.DATE}] ${HAPROXY.FRONTEND} ${HAPROXY.BACKEND} ${HAPROXY.TIME} ${HAPROXY.STATUS_CODE} ${HAPROXY.BYTES_READ} ${HAPROXY.COOKIE_REQ} ${HAPROXY.COOKIE_RESP} ${HAPROXY.TERM_STATE} ${HAPROXY.RUN_STATE} ${HAPROXY.QUEUE_STATE} {${HAPROXY.HEADERS_REQ}} \"${HAPROXY.METHOD} ${HAPROXY.FRONTEND_PROTOCOL}://${HAPROXY.HOST}${HAPROXY.URL} ${HAPROXY.HTTP_VERSION}\"\n")
          group(adm) perm(0640) dir_perm(0750) template_escape(no)
         );
};

filter f_haproxy { program("haproxy"); };
filter f_frontend_ssl { match("~ "); };

rewrite r_set_frontend_protocol {
  set("http", value("HAPROXY.FRONTEND_PROTOCOL") condition(filter(f_haproxy)));
  set("https", value("HAPROXY.FRONTEND_PROTOCOL") condition(filter(f_frontend_ssl)));
};

parser p_haproxy {
  csv-parser(columns("HAPROXY.CLIENT_IPPORT", "HAPROXY.DATE",
                     "HAPROXY.FRONTEND", "HAPROXY.BACKEND",
                     "HAPROXY.TIME", "HAPROXY.STATUS_CODE",
                     "HAPROXY.BYTES_READ", "HAPROXY.COOKIE_REQ",
                     "HAPROXY.COOKIE_RESP", "HAPROXY.TERM_STATE",
                     "HAPROXY.RUN_STATE", "HAPROXY.QUEUE_STATE",
                     "HAPROXY.HEADERS_REQ", "HAPROXY.REQUEST")
             flags(escape-double-char,strip-whitespace)
             delimiters(" ")
             quote-pairs('""[]{}'));
};

parser p_haproxy_request {
  csv-parser(columns("HAPROXY.METHOD", "HAPROXY.URL",
                     "HAPROXY.HTTP_VERSION")
             delimiters(" ")
             flags(escape-none)
             template("${HAPROXY.REQUEST}"));
};

parser p_haproxy_headers_req {
  csv-parser(columns("HAPROXY.HOST", "HAPROXY.USER_AGENT")
             delimiters("|")
             flags(escape-none)
             template("${HAPROXY.HEADERS_REQ}"));
};

log {
  source(s_loopback);
  filter(f_haproxy);
  parser(p_haproxy);
  parser(p_haproxy_request);
  parser(p_haproxy_headers_req);
  rewrite(r_set_frontend_protocol);
  destination(d_haproxy_full);
};

This configuration is downloadable from HAProxy Technologies github: https://raw.github.com/exceliance/haproxy/master/logs/syslog-ng_full_http_url.conf

And the end of each logged line would look like this:

[...] "GET http://test.domain.tld/blah HTTP/1.1"
[...] "GET https://test.domain.tld/blah HTTP/1.1"
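
The rewriting rule itself is tiny. Here is a sketch of the same logic outside syslog-ng, assuming the three fields (frontend name, captured Host header, logged path) have already been extracted from the log line. The function name is hypothetical.

```python
def rebuild_url(frontend: str, host: str, path: str) -> str:
    """Rebuild the full URL from fields of an HAProxy HTTP log line.

    A trailing '~' on the frontend name means the connection came in over SSL.
    """
    scheme = "https" if frontend.endswith("~") else "http"
    return f"{scheme}://{host}{path}"


print(rebuild_url("ft_http", "test.domain.tld", "/blah"))   # prints "http://test.domain.tld/blah"
print(rebuild_url("ft_http~", "test.domain.tld", "/blah"))  # prints "https://test.domain.tld/blah"
```

The Host capture is mandatory for this to work, which is why it is flagged as such in the HAProxy configuration.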

Mitigating the SSL Beast attack using the ALOHA Load-Balancer / HAProxy

The BEAST attack on SSL isn’t new, but we have not yet published an article explaining how to mitigate it with the ALOHA or HAProxy.
First of all, to mitigate this attack, you must use the Load-Balancer as the SSL endpoint, then just append the following parameter to your HAProxy SSL frontend:
  * For the ALOHA Load-Balancer:

bind 10.0.0.9:443 name https ssl crt domain ciphers RC4:HIGH:!aNULL:!MD5

  * For HAProxy OpenSource:

bind 10.0.0.9:443 name https ssl crt /path/to/domain.pem ciphers RC4:HIGH:!aNULL:!MD5

As you may have understood, the most important part is the ciphers RC4:HIGH:!aNULL:!MD5 directive, which forces the cipher used during the connection to be strong enough to resist the attack.

HAProxy, high mysql request rate and TCP source port exhaustion

Synopsis


At HAProxy Technologies, we provide professional services around HAProxy: this includes HAProxy itself, of course, but also tuning of the underlying OS, advice and recommendations about the architecture, and sometimes we also help customers troubleshoot application layer issues.
We don’t fix issues for the customer, but using information provided by HAProxy, we are able to narrow the investigation area, saving the customer time and money.
The story I’m relating today comes from one of these engagements.

One of our customers is a hosting company which hosts some very busy PHP / MySQL websites. They successfully used HAProxy in front of their application servers.
They used to have a single MySQL server, which was a SPOF and had to handle several thousand requests per second.
Sometimes, they had issues with this DB: it was as if the clients (hence the web servers) hung when using the DB.

So they decided to use MySQL replication and build an active/passive cluster. They also decided to split reads (SELECT queries) and writes (DELETE, INSERT, UPDATE queries) at the application level.
Then they were able to move the MySQL servers behind HAProxy.

Enough for the introduction 🙂 Today’s article will discuss HAProxy and MySQL at high request rates, and an error some of you may already have encountered: TCP source port exhaustion (the famous high number of sockets in TIME_WAIT).

Diagram


So basically, we have here a standard web platform which involves HAProxy to load-balance MySQL:
[Diagram: haproxy_mysql_replication]

The MySQL master server receives the WRITE requests, and the READ requests are load-balanced by weight across all the MySQL servers (the slaves have a higher weight than the master).

MySQL scalability

One way of scaling MySQL is to use replication: one MySQL server is designated as master and must manage all the write operations (DELETE, INSERT, UPDATE, etc…). For each operation, it notifies all the MySQL slave servers. We can use the slaves for reading only, offloading these types of requests from the master.
IMPORTANT NOTE: Replication only scales the read part, so if your application requires many more writes, then this is not the method for you.

Of course, a MySQL slave server can be promoted to master when the master fails! This also ensures MySQL high-availability.

So, where is the problem ???

This type of platform works very well for the majority of websites. The problem occurs when you start having a high rate of requests. By high, I mean several thousands per second.

TCP source port exhaustion

HAProxy works as a reverse-proxy and so uses its own IP address to connect to the server.
Any system has around 64K TCP source ports available to connect to a remote IP:port. Once a combination of “source IP:port => dst IP:port” is in use, it can’t be re-used.
First lesson: you can’t have more than 64K open connections from a HAProxy box to a single remote IP:port pair. I think only people load-balancing MS Exchange RPC services or SharePoint with NTLM may one day reach this limit…
(well, it is possible to work around this limit using some hacks we’ll explain later in this article)

Why does TCP port exhaustion occur with MySQL clients???


As I said, the MySQL request rate was a few thousand per second, so we never reached this limit of 64K simultaneously open connections to the remote service…
What’s up then???
Well, there is an issue with the MySQL client library: when a client sends its “QUIT” sequence, it performs a few internal operations and then immediately shuts down the TCP connection, without waiting for the server to do it. A basic tcpdump will show it to you easily.
Note that you won’t be able to reproduce this issue on a loopback interface, because the server answers fast enough… You must use a LAN connection and 2 different servers.

Basically, here is the sequence currently performed by a MySQL client:

Mysql Client ==> "QUIT" sequence ==> Mysql Server
Mysql Client ==>       FIN       ==> MySQL Server
Mysql Client <==     FIN ACK     <== MySQL Server
Mysql Client ==>       ACK       ==> MySQL Server

Which leads the client connection to remain unavailable for twice the MSL (Maximum Segment Lifetime), which means 2 minutes.
Note: this type of close has no negative impact when the connection is made over a UNIX socket.

Explanation of the issue (much better than I could explain it myself):
“There is no way for the person who sent the first FIN to get an ACK back for that last ACK. You might want to reread that now. The person that initially closed the connection enters the TIME_WAIT state; in case the other person didn’t really get the ACK and thinks the connection is still open. Typically, this lasts one to two minutes.” (Source)

Since the source port is unavailable to the system for 2 minutes, this means that above roughly 533 MySQL requests per second you’re in danger of TCP source port exhaustion: 64000 (available ports) / 120 (number of seconds in 2 minutes) = 533.333.
This TCP port exhaustion appears on the MySQL client server itself, but also on the HAProxy box because it forwards the client traffic to the server… And since we have many web servers, it happens much faster on the HAProxy box!!!!
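
The arithmetic above generalizes to any port range and TIME_WAIT duration. Here is a small sketch of it, where the function name is hypothetical and the figures are the ones used in this article (64000 ports, 120 seconds):

```python
def max_sustained_rate(available_ports: int, time_wait_seconds: float) -> float:
    """Highest rate of new connections per second, towards a single
    destination IP:port, that never needs a port still held in TIME_WAIT."""
    if time_wait_seconds <= 0:
        raise ValueError("time_wait_seconds must be positive")
    return available_ports / time_wait_seconds


# Figures from the article: about 64000 ports, each held for 2 minutes.
print(round(max_sustained_rate(64000, 120), 1))  # prints 533.3
```

Any steady rate above that value consumes ports faster than TIME_WAIT releases them, so exhaustion is only a matter of time.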

Remember: at peak traffic, my customer had a few thousand requests/s….

How to avoid TCP source port exhaustion?


Here comes THE question!!!!
First, a “clean” sequence should be:

Mysql Client ==> "QUIT" sequence ==> Mysql Server
Mysql Client <==       FIN       <== MySQL Server
Mysql Client ==>     FIN ACK     ==> MySQL Server
Mysql Client <==       ACK       <== MySQL Server

Actually, this sequence happens when both the MySQL client and server are hosted on the same box and use the loopback interface. That’s why I said earlier that if you want to reproduce the issue you must add “latency” between the client and the server, and so use 2 boxes over the LAN.
So, until MySQL rewrites the code to follow the sequence above, there won’t be any improvement here!!!!

Increasing source port range


By default, on a Linux box, you have around 28K source ports available (for a single destination IP:port):

$ sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768    61000

In order to get 64K source ports, just run:

$ sudo sysctl net.ipv4.ip_local_port_range="1025 65000"

And don’t forget to update your /etc/sysctl.conf file!!!

Note: this should definitely also be applied on the web servers….
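
To double check how many source ports a given setting really provides, the sysctl output can be parsed directly. A minimal sketch, assuming the "name = low high" output format shown above; the function name is hypothetical.

```python
def count_local_ports(sysctl_line: str) -> int:
    """Count the ports in a 'net.ipv4.ip_local_port_range = LOW HIGH' line.

    Both bounds are counted as part of the range.
    """
    _, _, value = sysctl_line.partition("=")
    low, high = (int(field) for field in value.split())
    if low > high:
        raise ValueError("invalid port range")
    return high - low + 1


print(count_local_ports("net.ipv4.ip_local_port_range = 32768    61000"))  # prints 28233
print(count_local_ports("net.ipv4.ip_local_port_range = 1025 65000"))      # prints 63976
```

So the default range gives about 28K ports and the extended one about 64K, matching the figures quoted in this section.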

Allow usage of source port in TIME_WAIT


A few sysctls can be used to tell the kernel to reuse connections in TIME_WAIT faster:

net.ipv4.tcp_tw_reuse
net.ipv4.tcp_tw_recycle

tw_reuse can be used safely, but be careful with tw_recycle… It can have side effects: some people behind a NAT might not be able to connect to the same device. So only use it if your HAProxy is fully dedicated to your MySQL setup.

Anyway, these sysctls were already properly set (value = 1) on both HAProxy and the web servers.

Note: tw_reuse should definitely also be applied on the web servers….

Using multiple IPs to get connected to a single server


In the HAProxy configuration, you can specify on the server line the source IP address to use to connect to a server, so just add more server lines with different IPs.
In the example below, the IPs 10.0.0.100 and 10.0.0.101 are configured on the HAProxy box:

[...]
  server mysql1     10.0.0.1:3306 check source 10.0.0.100
  server mysql1_bis 10.0.0.1:3306 check source 10.0.0.101
[...]

This allows us to open up to 128K source TCP ports…
The kernel is responsible for assigning a new TCP port when HAProxy requests one. Despite improving things a bit, we still hit source port exhaustion… We could not get over 80K connections in TIME_WAIT with 4 source IPs…

Let HAProxy manage TCP source ports


You can let HAProxy decide which source port to use when opening a new TCP connection, instead of the kernel. For this purpose, HAProxy has built-in functions which make it more efficient than a regular kernel.

Let’s update the configuration above:

[...]
  server mysql1     10.0.0.1:3306 check source 10.0.0.100:1025-65000
  server mysql1_bis 10.0.0.1:3306 check source 10.0.0.101:1025-65000
[...]

We managed to get 170K+ connections in TIME_WAIT with 4 source IPs… and no source port exhaustion anymore!!!!

Use a memcache


Fortunately, the devs from this customer are skilled and write flexible code 🙂
So they managed to move some requests from the MySQL DB to a memcache, opening far fewer connections.

Use MySQL persistent connections


This could prevent fine Load-Balancing on the Read-Only farm, but it would be very efficient on the MySQL master server.

Conclusion

  • If you see some SOCKERR information messages in HAProxy logs (mainly on health checks), you may be running out of TCP source ports.
  • Have skilled developers who write flexible code, where moving from one DB to another is made easy
  • This kind of issue can happen only with protocols or applications which make the client close the connection first
  • This issue can’t happen on HAProxy in HTTP mode, since it lets the server close the connection before sending a TCP RST

Websockets load-balancing with HAProxy

Why Websocket ???


The HTTP protocol is connection-less and only the client can request information from a server. In no case can a server contact a client… HTTP is purely half-duplex. Furthermore, a server can answer a client request only once.
Some websites or web applications require the server to update the client from time to time. There were a few ways to do so:

  • the client requests the server at a regular interval to check whether new information is available
  • the client sends a request to the server and the server answers as soon as it has information to provide to the client (also known as long polling)

But those methods have many drawbacks due to HTTP limitations.
So a new protocol has been designed: websockets, which allows two-way (full duplex) communication between a client and a server over a single TCP connection. Furthermore, a websocket re-uses the HTTP connection it was initialized on, which means it uses the standard TCP port.

How does websocket work ???

Basically, a websocket starts with an HTTP request like the one below:

GET / HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: avkFOZvLE0gZTtEyrZPolA==
Host: localhost:8080
Sec-WebSocket-Protocol: echo-protocol

The most important part is the “Connection: Upgrade” header, which lets the client tell the server it wants to switch to another protocol, whose name is provided by the “Upgrade: websocket” header.
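
Detecting such a request boils down to checking these two headers. Here is a sketch of that check, assuming the headers are available as a simple name/value mapping. The function name is hypothetical and this is not HAProxy’s own code. Header names and values are compared case-insensitively, and the Connection header may carry several comma-separated tokens.

```python
from typing import Mapping


def is_websocket_upgrade(headers: Mapping[str, str]) -> bool:
    """Tell whether request headers ask for a switch to the websocket protocol."""
    lowered = {name.lower(): value for name, value in headers.items()}
    connection_tokens = {
        token.strip().lower() for token in lowered.get("connection", "").split(",")
    }
    upgrade = lowered.get("upgrade", "").strip().lower()
    return "upgrade" in connection_tokens and upgrade == "websocket"


request_headers = {
    "Upgrade": "websocket",
    "Connection": "Upgrade",
    "Sec-WebSocket-Version": "13",
    "Host": "localhost:8080",
}
print(is_websocket_upgrade(request_headers))              # prints True
print(is_websocket_upgrade({"Connection": "keep-alive"}))  # prints False
```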

When a server with websocket capability receives the request above, it answers with a response like the one below:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: tD0l5WXr+s0lqKRayF9ABifcpzY=
Sec-WebSocket-Protocol: echo-protocol

The most important part is the status code 101, which acknowledges the protocol switch (from HTTP to websocket), as well as the “Connection: Upgrade” and “Upgrade: websocket” headers.
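
For the curious, the Sec-WebSocket-Accept value is not random: according to RFC 6455, the server derives it from the Sec-WebSocket-Key sent by the client. A minimal sketch of that derivation follows. The function name is hypothetical, and the sample key is the one given as an example in the RFC, not the one from the capture above.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(key: str) -> str:
    """Compute Sec-WebSocket-Accept: base64(SHA-1(key + GUID))."""
    digest = hashlib.sha1((key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key taken from the RFC 6455 handshake example.
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

This lets the client check that the answer really comes from a websocket-aware server and not from a cache or a plain HTTP server.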

From now on, the TCP connection used for the HTTP request/response exchange is used for the websocket: whenever a peer wants to interact with the other peer, it can use it.

The websocket ends as soon as one peer decides to close it or the TCP connection is closed.

HAProxy and Websockets


As seen above, there are 2 protocols embedded in websockets:

  1. HTTP: for the websocket setup
  2. TCP: websocket data exchange

HAProxy must be able to support websockets on these two protocols without breaking the TCP connection at any time.
There are 2 things to take care of:

  1. being able to switch a connection from HTTP to TCP without breaking it
  2. smartly manage timeouts for both protocols at the same time

Fortunately, HAProxy embeds all you need to properly load-balance websockets and can meet the 2 requirements above.
It can even route regular HTTP traffic and websocket traffic to different backends and perform websocket-aware health checks (setup phase only).

The diagram below shows how things happen and which HAProxy timeouts are involved in each phase:
[Diagram: websocket]

During the setup phase, HAProxy can work in HTTP mode, processing layer 7 information. It automatically detects the Connection: Upgrade exchange and is ready to switch to tunnel mode if the upgrade negotiation succeeds. During this phase, there are 3 timeouts involved:

  1. timeout client: client inactivity
  2. timeout connect: allowed TCP connection establishment time
  3. timeout server: time allowed for the server to process the request

If everything goes well, the websocket is established, then HAProxy switches to tunnel mode and no data is analyzed anymore (and anyway, websocket does not speak HTTP). There is a single timeout involved:

  1. timeout tunnel: takes precedence over the client and server timeouts

timeout connect is not used since the TCP connection is already established 🙂
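
As a rough illustration of the two phases, the fragment below maps each timeout to the phase where it applies. It is only a sketch: the values are arbitrary, and placing timeout tunnel in a dedicated backend (named bk_ws here, as an assumption) rather than in the defaults section is one possible layout, not a requirement.

```haproxy
# Setup phase (HTTP mode): these three timeouts apply
defaults
  mode http
  timeout client   25s   # client inactivity
  timeout connect   5s   # TCP connection establishment to the server
  timeout server   25s   # time allowed for the server to process the request

# Tunnel phase: once the upgrade has succeeded, only timeout tunnel applies
backend bk_ws
  timeout tunnel 3600s   # takes precedence over timeout client and timeout server
  server websrv1 192.168.10.11:8080 check
```

Setting timeout tunnel in the websocket backend only keeps the long inactivity timeout away from backends that never serve tunnels.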

Testing websocket with node.js

node.js is a platform which can host applications. A websocket module is available for it, and we’ll use that module in the test below.

Here is the procedure to install node.js and the websocket module on Debian Squeeze.
The example code comes from https://github.com/Worlize/WebSocket-Node, at the bottom of the page.

So basically, I’ll have 2 servers, each one hosting web pages on Apache and a websocket echo application hosted by node.js. High availability and routing are managed by HAProxy.

Configuration


Simple configuration


In this configuration, the websocket and the web server are on the same application.
HAProxy switches automatically from HTTP to tunnel mode when the client requests a websocket.

defaults
  mode http
  log global
  option httplog
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  backlog 10000
  timeout client          25s
  timeout connect          5s
  timeout server          25s
# timeout tunnel available in ALOHA 5.5 or HAProxy 1.5-dev10 and higher
  timeout tunnel        3600s
  timeout http-keep-alive  1s
  timeout http-request    15s
  timeout queue           30s
  timeout tarpit          60s
  default-server inter 3s rise 2 fall 3
  option forwardfor

frontend ft_web
  bind 192.168.10.3:80 name http
  maxconn 10000
  default_backend bk_web

backend bk_web                      
  balance roundrobin
  server websrv1 192.168.10.11:8080 maxconn 10000 weight 10 cookie websrv1 check
  server websrv2 192.168.10.12:8080 maxconn 10000 weight 10 cookie websrv2 check

Advanced Configuration


The configuration below routes requests based on either the Host header (if you have a dedicated host for your websocket calls) or the Connection and Upgrade headers (required to switch to websocket).
In the backend dedicated to websocket, HAProxy validates the setup phase and also ensures the user is requesting a valid application name.
HAProxy also performs a websocket health check, sending a Connection upgrade request and expecting a 101 response status code. We can’t go further on the health check for now.
Optional: the web server is hosted on Apache, but could be hosted by node.js as well.

defaults
  mode http
  log global
  option httplog
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  backlog 10000
  timeout client          25s
  timeout connect          5s
  timeout server          25s
# timeout tunnel available in ALOHA 5.5 or HAProxy 1.5-dev10 and higher
  timeout tunnel        3600s
  timeout http-keep-alive  1s
  timeout http-request    15s
  timeout queue           30s
  timeout tarpit          60s
  default-server inter 3s rise 2 fall 3
  option forwardfor



frontend ft_web
  bind 192.168.10.3:80 name http
  maxconn 60000

## routing based on Host header
  acl host_ws hdr_beg(Host) -i ws.
  use_backend bk_ws if host_ws

## routing based on websocket protocol header
  acl hdr_connection_upgrade hdr(Connection)  -i upgrade
  acl hdr_upgrade_websocket  hdr(Upgrade)     -i websocket

  use_backend bk_ws if hdr_connection_upgrade hdr_upgrade_websocket
  default_backend bk_web



backend bk_web                                                   
  balance roundrobin                                             
  option httpchk HEAD /                                          
  server websrv1 192.168.10.11:80 maxconn 100 weight 10 cookie websrv1 check
  server websrv2 192.168.10.12:80 maxconn 100 weight 10 cookie websrv2 check



backend bk_ws                                                    
  balance roundrobin

## websocket protocol validation
  acl hdr_connection_upgrade hdr(Connection)                 -i upgrade
  acl hdr_upgrade_websocket  hdr(Upgrade)                    -i websocket
  acl hdr_websocket_key      hdr_cnt(Sec-WebSocket-Key)      eq 1
  acl hdr_websocket_version  hdr_cnt(Sec-WebSocket-Version)  eq 1
## deny the request as soon as one of the mandatory headers is missing
  http-request deny if ! hdr_connection_upgrade || ! hdr_upgrade_websocket || ! hdr_websocket_key || ! hdr_websocket_version

## ensure our application protocol name is valid 
## (don't forget to update the list each time you publish new applications)
  acl ws_valid_protocol hdr(Sec-WebSocket-Protocol) echo-protocol
  http-request deny if ! ws_valid_protocol

## websocket health checking
  option httpchk GET / HTTP/1.1\r\nHost:\ ws.domain.com\r\nConnection:\ Upgrade\r\nUpgrade:\ websocket\r\nSec-WebSocket-Key:\ haproxy\r\nSec-WebSocket-Version:\ 13\r\nSec-WebSocket-Protocol:\ echo-protocol
  http-check expect status 101

  server websrv1 192.168.10.11:8080 maxconn 30000 weight 10 cookie websrv1 check
  server websrv2 192.168.10.12:8080 maxconn 30000 weight 10 cookie websrv2 check

Note that HAProxy could also be used to select different websocket applications based on the Sec-WebSocket-Protocol header of the setup phase (I’ll write an article about it later).
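
A minimal sketch of such a protocol-based selection is shown below. It is an assumption-laden example, not a tested setup: chat-protocol, bk_ws_echo and bk_ws_chat are hypothetical names, and each backend would need its own server lines and checks, as in bk_ws above.

```haproxy
frontend ft_web
  bind 192.168.10.3:80 name http

## detect a websocket setup request
  acl hdr_connection_upgrade hdr(Connection) -i upgrade
  acl hdr_upgrade_websocket  hdr(Upgrade)    -i websocket

## one ACL per application protocol (chat-protocol is a hypothetical name)
  acl ws_proto_echo hdr(Sec-WebSocket-Protocol) -i echo-protocol
  acl ws_proto_chat hdr(Sec-WebSocket-Protocol) -i chat-protocol

## pick the backend matching the requested protocol
  use_backend bk_ws_echo if hdr_connection_upgrade hdr_upgrade_websocket ws_proto_echo
  use_backend bk_ws_chat if hdr_connection_upgrade hdr_upgrade_websocket ws_proto_chat
  default_backend bk_web
```

Since hdr() treats commas as value delimiters, a client offering several protocols in a single header would match the first use_backend rule whose protocol appears in its list.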
