Category Archives: optimization

HAProxy and gzip compression

Synopsis

Compression is a technique that reduces object size in order to cut delivery time for objects served over HTTP.
Until now, HAProxy did not include such a feature, but the guys at HAProxy Technologies worked hard on it (mainly David Du Colombier and @wlallemand).
HAProxy can now be considered a new option to compress HTTP streams, alongside nginx, Apache or IIS, which already do it.

Note that this is in early beta, so use it with care.

Compilation


Get the latest HAProxy git version by running a “git pull” in your HAProxy git directory.
If you don’t have such a directory yet, clone the repository first:

git clone http://git.1wt.eu/git/haproxy.git

Once your HAProxy sources are updated, then you can compile HAProxy:

make TARGET=linux26 USE_ZLIB=yes

Configuration

This is a very simple test configuration:

listen ft_web
 option http-server-close
 mode http
 bind 127.0.0.1:8090 name http
 default_backend bk_web

backend bk_web
 option http-server-close
 mode http
 compression algo gzip
 compression type text/html text/plain text/css
 server localhost 127.0.0.1:80

Compression test

On my localhost, I have an Apache server with compression disabled and a style.css object whose size is 16302 bytes.

Download without compression requested

curl -o/dev/null -D - "http://127.0.0.1:8090/style.css" 
HTTP/1.1 200 OK
Date: Fri, 26 Oct 2012 08:55:42 GMT
Server: Apache/2.2.16 (Debian)
Last-Modified: Sun, 11 Mar 2012 17:01:39 GMT
ETag: "a35d6-3fae-4bafa944542c0"
Accept-Ranges: bytes
Content-Length: 16302
Content-Type: text/css

100 16302  100 16302    0     0  5722k      0 --:--:-- --:--:-- --:--:-- 7959k

Download with compression requested

 curl -o/dev/null -D - "http://127.0.0.1:8090/style.css" -H "Accept-Encoding: gzip"
HTTP/1.1 200 OK
Date: Fri, 26 Oct 2012 08:56:28 GMT
Server: Apache/2.2.16 (Debian)
Last-Modified: Sun, 11 Mar 2012 17:01:39 GMT
ETag: "a35d6-3fae-4bafa944542c0"
Accept-Ranges: bytes
Content-Type: text/css
Transfer-Encoding: chunked
Content-Encoding: gzip

100  4036    0  4036    0     0  1169k      0 --:--:-- --:--:-- --:--:-- 1970k

In this example, the object size dropped from 16302 bytes to 4036 bytes, roughly a 75% reduction.
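As a quick sanity check of the numbers above (the byte counts come from the two curl transfers):

```python
# Compression savings from the two curl transfers above.
uncompressed = 16302  # bytes: Content-Length without compression
compressed = 4036     # bytes: size of the gzip-compressed transfer

savings = 1 - compressed / uncompressed
print(f"{savings:.1%} smaller")  # -> 75.2% smaller
```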

Have fun !!!!

When marketing spends money that doesn’t go into innovation

Load-balancing is not easy!

I usually have great respect for our competitors, whom I would rather call peers, and at least two of whom I have even privately helped in the past. As the original author of the most widespread open-source load-balancer, which gave birth to the Aloha, I am very well placed to know how difficult this task is. So I feel a certain admiration for anyone who successfully takes this path, and especially for those who manage to enrich their products over the long term, something even harder than building one from scratch.

That is why I sometimes remind my colleagues that we must never mock our peers when something very unpleasant happens to them, such as leaving a root SSH key lying around on appliances, because it happens even to the biggest players, we are not immune to a similar blunder despite all the care put into each new release, and faced with such a situation we would be just as embarrassed.

However, one of them does not seem to know these rules of good conduct, probably because he does not know the value of the research and development work put into his competitors’ products. I will not name him; that would give him free publicity, and publicity is his only specialty.

Indeed, ever since this competitor received $16M in funding, he has kept badmouthing our products, as well as those of some of our other peers, to our partners, completely gratuitously and without any basis (what the Anglo-Saxons call “FUD”), just to try to win deals.

I think their primary motivation probably comes from the bitterness of seeing their product systematically eliminated by our customers during comparative tests on the criteria of ease of integration, performance and stability. That is the most likely cause, given that this competitor attacks several vendors at once, and that we ourselves meet peers in the field offering quality products, such as Cisco and F5 for hardware, or load-balancer.org for software. And of course there is also this aggressive competitor whom I will not name.

It certainly cannot be pleasant for him to lose the tests every time, but when we lose a test against a peer, which happens to us as it does to everyone, we strive to improve our product on the criterion that failed us, with the goal of winning next time, instead of investing heavily in disinformation campaigns against the winner.

In my opinion, this competitor does not know where to start to improve his solution, which explains why he attacks several competitors at the same time. I think it would raise the level of the debate a little to give him some basics for improving his solutions and giving himself a better chance of winning deals.

For a start, he would save time by looking at how our product works and drawing inspiration from it. I am really not in the habit of copying the competition, and I much prefer innovation. But for them it should not be a problem, since they have already chosen the same hardware as our mid-range, except that they changed its color, preferring that of cuckolds. The poor souls did not know that good hardware is not everything, and that the software and the quality of the integration count enormously (otherwise all VMs would be the same). That is how they ended up with an offset in their product line compared to ours: they systematically need the bigger box to reach a comparable level of performance (I am talking about real performance measured in the field with customers’ applications and customers’ test methodologies, not the figures on the datasheet found on their website, which customers do not care about).

And yes, gentlemen, you should also take a look at the software side, a vendor’s real know-how. Using a generic desktop-oriented Linux distribution to build a network-optimized appliance was not very clever to begin with, but skipping system tuning altogether is just laziness. Instead of wasting your time looking for optimization guides on the Internet, take the parameters straight from our Aloha; you know they are good and you will save time. Some settings probably will not exist on your side, since we constantly add features to better meet needs, but it will be a good start. Do not count on us to wait for you, though: while you copy us, we innovate, and we will always keep that head start :-). But at least you will look less ridiculous in pre-sales and will avoid embarrassing your partners in front of the customer with a product that still does not work after 6 hours spent on a simple test.

I am deliberately publishing this article in French. It will give them a little translation exercise, which will be useful for establishing themselves in the French market, where customers are very demanding about the use of the French language, which their support still does not speak, by the way.

Ah, one last point: I invite all readers of this article to search for “Exceliance” on Google, for example by clicking on this link: http://www.google.com/search?q=exceliance

You will notice that our favorite competitor has even gone as far as paying for Google AdWords to display his ads when people search for our name; he must really hold a grudge against us! He is the only one putting so much effort into trying to overshadow us, as if it were absolutely strategic for him. You will not see that from A10, Brocade, F5 or Cisco (nor from Exceliance, of course): each of these products has its strengths in the field and does not need to resort to such methods to exist. Remember to click on their ad link; it will make them happy, it costs them a little with each click, and it will give you the chance to admire their fine products :-).

Use GeoIP database within HAProxy

Introduction

Sometimes we need to know the country of the user using the application, for different purposes:

  • Automatically select the most appropriate language
  • Send a 302 to redirect the user to the closest POP from his location
  • Allow only a single country to browse the site, for legal reasons
  • Block some countries we don’t do business with and which are the source of most web attacks

IP based location


To achieve this, the most “reliable” piece of information we have about a user is his IP address.
Well, not as reliable as we could hope:

  • not “reliable” because it’s easy to use a proxy installed in a foreign country to fake your IP address…
  • not “reliable” because GeoIP databases are not accurate.
  • not “reliable” because GeoIP databases rely on information provided by the ISPs.
  • not “reliable” because any subnet can be routed from anywhere on earth.

When an ISP requests a new subnet from its local RIR, it has to declare the country where the subnet will be used.
This country is supplied as a two-letter code normalized by ISO 3166.

The whois tool can be used to find out the country code of an IP address:

whois 1.1.1.1
[...]
country:        AU
[...]

geolocation definition

Well, it’s quite easy to understand: geolocation is the process of linking a third party to a geographical location. In simpler words: knowing the country of a client IP address.
On the Internet, such a database is called a GeoIP database.

geolocation databases


There are a few GeoIP databases available on the Internet; most of them use IP ranges to link an IP address to its country code.
An IP range is simply a pair of IP addresses representing the first and the last address of a range.
NOTE: it might correspond to a real subnet, but in most cases it doesn’t ;).

For example:

"1.1.2.0","1.1.63.255","16843264","16859135","CN","China"

What’s the issue with HAProxy then???


HAProxy can only use CIDR notation, with real subnets.
It means we’ll have to turn the IP ranges into CIDR notation. This is not an easy task, since you must split each IP range into multiple subnets…
Once done, we’ll be able to configure HAProxy to use them in ACLs and do anything an ACL can do.

For example, the range above should be translated to the following subnets:

1.1.2.0/23 "CN"
1.1.4.0/22 "CN"
1.1.8.0/21 "CN"
1.1.16.0/20 "CN"
1.1.32.0/19 "CN"

Now you can understand why GeoIP databases use IP ranges: it takes fewer lines 🙂
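As a side note, the same range-to-CIDR split can be reproduced with Python’s standard ipaddress module, which is handy for double-checking this kind of conversion:

```python
import ipaddress

# Split the IP range "1.1.2.0" - "1.1.63.255" (country code CN, from the
# example above) into the minimal list of CIDR subnets covering it.
first = ipaddress.IPv4Address("1.1.2.0")
last = ipaddress.IPv4Address("1.1.63.255")

for net in ipaddress.summarize_address_range(first, last):
    print(f'{net} "CN"')
```

The output matches the five subnets listed above, from 1.1.2.0/23 to 1.1.32.0/19.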

iprange tool

To ease this job, Willy released a tool called iprange in the contrib directory of the HAProxy sources.
You can see it here, in HAProxy’s git: http://www.haproxy.org/git/?p=haproxy.git;a=tree;f=contrib/iprange
It can be used to extract CIDR subnets from an IP range.

iprange installation


Just download both Makefile and iprange.c, then run make:

make
gcc -s -O3 -o iprange iprange.c

Not too complicated 🙂

iprange usage


iprange takes a single input format, composed of 3 columns separated by commas:

  1. first IP
  2. Last IP
  3. Country code

For example:

"1.1.2.0","1.1.63.255","CN"

NOTE: in the examples below, we’ll work with the MaxMind GeoLite Country database.

The database looks like:

$ head GeoIPCountryWhois.csv
"1.0.0.0","1.0.0.255","16777216","16777471","AU","Australia"
"1.0.1.0","1.0.3.255","16777472","16778239","CN","China"
"1.0.4.0","1.0.7.255","16778240","16779263","AU","Australia"
"1.0.8.0","1.0.15.255","16779264","16781311","CN","China"
"1.0.16.0","1.0.31.255","16781312","16785407","JP","Japan"
"1.0.32.0","1.0.63.255","16785408","16793599","CN","China"
"1.0.64.0","1.0.127.255","16793600","16809983","JP","Japan"
"1.0.128.0","1.0.255.255","16809984","16842751","TH","Thailand"
"1.1.0.0","1.1.0.255","16842752","16843007","CN","China"
"1.1.1.0","1.1.1.255","16843008","16843263","AU","Australia"

In order to make it compatible with the iprange tool, use cut:

$ cut -d, -f1,2,5 GeoIPCountryWhois.csv | head
"1.0.0.0","1.0.0.255","AU"
"1.0.1.0","1.0.3.255","CN"
"1.0.4.0","1.0.7.255","AU"
"1.0.8.0","1.0.15.255","CN"
"1.0.16.0","1.0.31.255","JP"
"1.0.32.0","1.0.63.255","CN"
"1.0.64.0","1.0.127.255","JP"
"1.0.128.0","1.0.255.255","TH"
"1.1.0.0","1.1.0.255","CN"
"1.1.1.0","1.1.1.255","AU"

Now, you can use it with iprange:

$ cut -d, -f1,2,5 GeoIPCountryWhois.csv | head | ./iprange 
1.0.0.0/24 "AU"
1.0.1.0/24 "CN"
1.0.2.0/23 "CN"
1.0.4.0/22 "AU"
1.0.8.0/21 "CN"
1.0.16.0/20 "JP"
1.0.32.0/19 "CN"
1.0.64.0/18 "JP"
1.0.128.0/17 "TH"
1.1.0.0/24 "CN"
1.1.1.0/24 "AU"

Country codes and HAProxy ACLs


Now we’re ready to turn the IP ranges into subnets associated with a country code.
We still need to make the result usable by HAProxy.
The easiest way is to write all the subnets concerning a country code into a single file.

$ cut -d, -f1,2,5 GeoIPCountryWhois.csv | ./iprange | sed 's/"//g' \
  | awk -F' ' '{ print $1 >> $2".subnets" }'

And the result is nice:

$ ls *.subnets
A1.subnets  AX.subnets  BW.subnets  CX.subnets  FJ.subnets  GR.subnets  IR.subnets  LA.subnets  ML.subnets  NF.subnets  PR.subnets  SI.subnets  TK.subnets  VE.subnets
A2.subnets  AZ.subnets  BY.subnets  CY.subnets  FK.subnets  GS.subnets  IS.subnets  LB.subnets  MM.subnets  NG.subnets  PS.subnets  SJ.subnets  TL.subnets  VG.subnets
AD.subnets  BA.subnets  BZ.subnets  CZ.subnets  FM.subnets  GT.subnets  IT.subnets  LC.subnets  MN.subnets  NI.subnets  PT.subnets  SK.subnets  TM.subnets  VI.subnets
AE.subnets  BB.subnets  CA.subnets  DE.subnets  FO.subnets  GU.subnets  JE.subnets  LI.subnets  MO.subnets  NL.subnets  PW.subnets  SL.subnets  TN.subnets  VN.subnets
AF.subnets  BD.subnets  CC.subnets  DJ.subnets  FR.subnets  GW.subnets  JM.subnets  LK.subnets  MP.subnets  NO.subnets  PY.subnets  SM.subnets  TO.subnets  VU.subnets
AG.subnets  BE.subnets  CD.subnets  DK.subnets  GA.subnets  GY.subnets  JO.subnets  LR.subnets  MQ.subnets  NP.subnets  QA.subnets  SN.subnets  TR.subnets  WF.subnets
AI.subnets  BF.subnets  CF.subnets  DM.subnets  GB.subnets  HK.subnets  JP.subnets  LS.subnets  MR.subnets  NR.subnets  RE.subnets  SO.subnets  TT.subnets  WS.subnets
AL.subnets  BG.subnets  CG.subnets  DO.subnets  GD.subnets  HN.subnets  KE.subnets  LT.subnets  MS.subnets  NU.subnets  RO.subnets  SR.subnets  TV.subnets  YE.subnets
AM.subnets  BH.subnets  CH.subnets  DZ.subnets  GE.subnets  HR.subnets  KG.subnets  LU.subnets  MT.subnets  NZ.subnets  RS.subnets  ST.subnets  TW.subnets  YT.subnets
AN.subnets  BI.subnets  CI.subnets  EC.subnets  GF.subnets  HT.subnets  KH.subnets  LV.subnets  MU.subnets  OM.subnets  RU.subnets  SV.subnets  TZ.subnets  ZA.subnets
AO.subnets  BJ.subnets  CK.subnets  EE.subnets  GG.subnets  HU.subnets  KI.subnets  LY.subnets  MV.subnets  PA.subnets  RW.subnets  SY.subnets  UA.subnets  ZM.subnets
AP.subnets  BM.subnets  CL.subnets  EG.subnets  GH.subnets  ID.subnets  KM.subnets  MA.subnets  MW.subnets  PE.subnets  SA.subnets  SZ.subnets  UG.subnets  ZW.subnets
AQ.subnets  BN.subnets  CM.subnets  EH.subnets  GI.subnets  IE.subnets  KN.subnets  MC.subnets  MX.subnets  PF.subnets  SB.subnets  TC.subnets  UM.subnets
AR.subnets  BO.subnets  CN.subnets  ER.subnets  GL.subnets  IL.subnets  KP.subnets  MD.subnets  MY.subnets  PG.subnets  SC.subnets  TD.subnets  US.subnets
AS.subnets  BR.subnets  CO.subnets  ES.subnets  GM.subnets  IM.subnets  KR.subnets  ME.subnets  MZ.subnets  PH.subnets  SD.subnets  TF.subnets  UY.subnets
AT.subnets  BS.subnets  CR.subnets  ET.subnets  GN.subnets  IN.subnets  KW.subnets  MG.subnets  NA.subnets  PK.subnets  SE.subnets  TG.subnets  UZ.subnets
AU.subnets  BT.subnets  CU.subnets  EU.subnets  GP.subnets  IO.subnets  KY.subnets  MH.subnets  NC.subnets  PL.subnets  SG.subnets  TH.subnets  VA.subnets
AW.subnets  BV.subnets  CV.subnets  FI.subnets  GQ.subnets  IQ.subnets  KZ.subnets  MK.subnets  NE.subnets  PM.subnets  SH.subnets  TJ.subnets  VC.subnets

Which makes subnets available for 246 country codes!

For example, the subnets associated with Australia (AU) are:

$ cat AU.subnets 
1.0.0.0/24
1.0.4.0/22
1.1.1.0/24
[...]

The bash loop below prepares the ACLs in a file called haproxy.cfg:

$ for f in `ls *.subnets` ; do echo $f | 
awk -F'.' '{ print "acl "$1" src -f "$0 >> "haproxy.cfg" }' ; done
$ head haproxy.cfg 
acl A1 src -f A1.subnets
acl A2 src -f A2.subnets
acl AD src -f AD.subnets
acl AE src -f AE.subnets
acl AF src -f AF.subnets
acl AG src -f AG.subnets
acl AI src -f AI.subnets
acl AL src -f AL.subnets
acl AM src -f AM.subnets
acl AN src -f AN.subnets

That makes a lot of countries 🙂
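If awk is not your thing, the same haproxy.cfg generation can be sketched in Python (a sketch: it assumes the per-country *.subnets files created above sit in the current directory):

```python
import glob

# One "acl <CC> src -f <CC>.subnets" line per country file.
lines = []
for path in sorted(glob.glob("*.subnets")):
    cc = path.split(".")[0]  # country code, e.g. "AU" from "AU.subnets"
    lines.append(f"acl {cc} src -f {path}")

with open("haproxy.cfg", "w") as cfg:
    cfg.write("\n".join(lines) + "\n")
```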

Continent codes and HAProxy ACLs


Fortunately, we can summarize it by continent. Copy and paste into a file the country-code-to-continent mapping from the MaxMind website: http://www.maxmind.com/app/country_continent.

The script below creates one file per continent, named after the continent code and containing the country codes belonging to it:

$ for c in `fgrep -v '-' country_continents.txt | sort -t',' -k 2` ; 
do echo $c | awk -F',' '{ print $1 >> $2".continent" }' ; done

We have now 7 new files:

$ ls *.continent
AF.continent  AN.continent  AS.continent  EU.continent  
NA.continent  OC.continent  SA.continent

Let’s have a look at the countries in South America:

$ cat SA.continent 
AR
BO
BR
CL
CO
EC
FK
GF
GY
PE
PY
SR
UY
VE

Let’s aggregate subnets for each country in a continent into a single file:

$ for f in `ls *.continent` ; do for c in $(cat $f) ; 
do cat ${c}.subnets >> ${f%%.*}.subnets ; done ;  done

Now we can generate the HAProxy configuration file to use them:

$ for c in AF AN AS EU NA OC SA ; do 
echo acl $c src -f $c.subnets >> "haproxy.conf" ; done

Usage in HAProxy


Coming soon: an article giving some examples of how to use the files generated above to improve the performance and security of your platforms.

Enhanced SSL load-balancing with Server Name Indication (SNI) TLS extension

Synopsis

Some time ago, we wrote an article explaining how to load-balance SSL services while maintaining affinity using the SSLID.
The main limitation of this kind of architecture is that you must dedicate a public IP address and port per service.
If you’re hosting web or mail services, you could quickly run out of public IP addresses.

The TLS protocol was extended in 2003 (RFC 3546) with an extension called SNI, Server Name Indication, which allows a client to announce in clear text the name of the server it is contacting.

NOTE: two RFCs have since obsoleted the one above; the latest one is RFC 6066.

The Aloha Load-balancer can use this information to choose a backend or a server.
This allows people to share a single VIP for several services.

Of course, we can use SNI switching with SSLID affinity to build a smart and reliable SSL load-balanced platform.

NOTE: Server Name information is sent with each SSL Handshake, whether you’re establishing a new session or you’re resuming an old one.

SNI is independent from the protocol used at layer 7. So basically, it will work with IMAP, HTTP, SMTP, POP, etc…

Limitation

Bear in mind that, in 2012, not all clients are compatible with SNI.
Concerning web browsers, a few of those in use in 2012 are still not compatible with this TLS protocol extension.

We strongly recommend that you read the Wikipedia Server Name Indication page, which lists all the limitations of this extension.

  • Only HAProxy nightly snapshots from the 8th of April onwards are compatible with it (with no known bugs).
  • Concerning the Aloha, it will be available starting with Aloha Load-balancer firmware 5.0.2.

Diagram

The picture below shows a platform with a single VIP which hosts services for 2 applications:
sni_loadbalancing

We can use SNI information to choose a backend, then, inside a backend, we can use SSLID affinity.

Configuration

Choose a backend using SNI TLS extension


The configuration below matches the names provided by the SNI extension and chooses a farm accordingly.
Inside the farm, it provides SSLID affinity.
If no SNI extension is sent, then we redirect the user to a server farm which can be used to tell the user to upgrade his software.

# Adjust the timeout to your needs
defaults
  timeout client 30s
  timeout server 30s
  timeout connect 5s

# Single VIP with sni content switching
frontend ft_ssl_vip
  bind 10.0.0.10:443
  mode tcp

  tcp-request inspect-delay 5s
  tcp-request content accept if { req_ssl_hello_type 1 }
  
  acl application_1 req_ssl_sni -i application1.domain.com
  acl application_2 req_ssl_sni -i application2.domain.com

  use_backend bk_ssl_application_1 if application_1
  use_backend bk_ssl_application_2 if application_2

  default_backend bk_ssl_default

# Application 1 farm description
backend bk_ssl_application_1
  mode tcp
  balance roundrobin

  # maximum SSL session ID length is 32 bytes.
  stick-table type binary len 32 size 30k expire 30m

  acl clienthello req_ssl_hello_type 1
  acl serverhello rep_ssl_hello_type 2

  # use tcp content accepts to detects ssl client and server hello.
  tcp-request inspect-delay 5s
  tcp-request content accept if clienthello

  # no timeout on response inspect delay by default.
  tcp-response content accept if serverhello

  stick on payload_lv(43,1) if clienthello

  # Learn on response if server hello.
  stick store-response payload_lv(43,1) if serverhello

  option ssl-hello-chk
  server server1 192.168.1.1:443 check
  server server2 192.168.1.2:443 check

# Application 2 farm description
backend bk_ssl_application_2
  mode tcp
  balance roundrobin

  # maximum SSL session ID length is 32 bytes.
  stick-table type binary len 32 size 30k expire 30m

  acl clienthello req_ssl_hello_type 1
  acl serverhello rep_ssl_hello_type 2

  # use tcp content accepts to detects ssl client and server hello.
  tcp-request inspect-delay 5s
  tcp-request content accept if clienthello

  # no timeout on response inspect delay by default.
  tcp-response content accept if serverhello

  stick on payload_lv(43,1) if clienthello

  # Learn on response if server hello.
  stick store-response payload_lv(43,1) if serverhello

  option ssl-hello-chk
  server server1 192.168.2.1:443 check
  server server2 192.168.2.2:443 check

# Sorry backend which should invite the user to update its client
backend bk_ssl_default
  mode tcp
  balance roundrobin
  
  # maximum SSL session ID length is 32 bytes.
  stick-table type binary len 32 size 30k expire 30m

  acl clienthello req_ssl_hello_type 1
  acl serverhello rep_ssl_hello_type 2

  # use tcp content accepts to detects ssl client and server hello.
  tcp-request inspect-delay 5s
  tcp-request content accept if clienthello

  # no timeout on response inspect delay by default.
  tcp-response content accept if serverhello

  stick on payload_lv(43,1) if clienthello

  # Learn on response if server hello.
  stick store-response payload_lv(43,1) if serverhello

  option ssl-hello-chk
  server server1 10.0.0.11:443 check
  server server2 10.0.0.12:443 check
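A note on the magic offset in “stick on payload_lv(43,1)”: byte 43 of the TCP payload is the session-ID length field of the TLS ClientHello (5 bytes of record header, 4 of handshake header, 2 of client version and 32 of random come first), and the session ID itself follows. The Python sketch below builds a minimal, hypothetical ClientHello prefix just to illustrate that layout:

```python
import os

session_id = os.urandom(32)          # a 32-byte SSL session ID

body = (
    b"\x03\x01"                      # client_version: TLS 1.0
    + os.urandom(32)                 # 32 bytes of client random
    + bytes([len(session_id)])       # session_id length (1 byte)
    + session_id                     # session_id itself
)
# Handshake header: type 1 = ClientHello, then 3-byte length
handshake = b"\x01" + len(body).to_bytes(3, "big") + body
# Record header: type 22 = handshake, version, then 2-byte length
record = b"\x16\x03\x01" + len(handshake).to_bytes(2, "big") + handshake

# Byte 43 of the payload is the session-ID length: exactly what
# payload_lv(43,1) reads before extracting the ID that follows it.
assert record[43] == 32
assert record[44:44 + 32] == session_id
```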

Choose a server using SNI: aka SSL routing


The configuration below matches the names provided by the SNI extension and chooses a server accordingly.
If no SNI is provided, or if the expected name can’t be found, the traffic is forwarded to server3, which can be used to tell the user to upgrade his software.

# Adjust the timeout to your needs
defaults
  timeout client 30s
  timeout server 30s
  timeout connect 5s

# Single VIP 
frontend ft_ssl_vip
  bind 10.0.0.10:443
  mode tcp

  tcp-request inspect-delay 5s
  tcp-request content accept if { req_ssl_hello_type 1 }

  default_backend bk_ssl_default

# Using SNI to take routing decision
backend bk_ssl_default
  mode tcp

  acl application_1 req_ssl_sni -i application1.domain.com
  acl application_2 req_ssl_sni -i application2.domain.com

  use-server server1 if application_1
  use-server server2 if application_2
  use-server server3 if !application_1 !application_2

  option ssl-hello-chk
  server server1 10.0.0.11:443 check
  server server2 10.0.0.12:443 check
  server server3 10.0.0.13:443 check

load balancing, affinity, persistence, sticky sessions: what you need to know

Synopsis

To ensure the high availability and performance of web applications, it is now common to use a load-balancer.
While some people use layer 4 load-balancers, it can sometimes be recommended to use layer 7 load-balancers to deal more efficiently with the HTTP protocol.

NOTE: To understand better the difference between such load-balancers, please read the Load-Balancing FAQ.

A load-balancer in an infrastructure

The picture below shows how we usually install a load-balancer in an infrastructure:
reverse_proxy
This is a logical diagram. When working at layer 7 (aka Application layer), the load-balancer acts as a reverse proxy.
So, from a physical point of view, it can be plugged anywhere in the architecture:

  • in a DMZ
  • in the server LAN
  • in front of the servers, acting as the default gateway
  • far away, in another separate datacenter

Why is load-balancing a web application a problem?


Well, HTTP is not a connected protocol: the session is totally independent from the TCP connections.
Even worse, an HTTP session can be spread over several TCP connections…
When there is no load-balancer involved, there is no issue at all, since the single application server is aware of the session information of all users, and however many connections a client opens, they are all directed to that unique server.
When using several application servers, the problem occurs: what happens when a user sends requests to a server which is not aware of his session?
The user will get back to the login page, since the application server can’t access his session: he is considered a new user.

To avoid this kind of problem, there are several ways:

  • Use a clustered web application server, where the sessions are available to all the servers
  • Share users’ session information in a database or a file system on the application servers
  • Use IP level information to maintain affinity between a user and a server
  • Use application layer information to maintain persistence between a user and a server

NOTE: you can mix the different techniques listed above.

Building a web application cluster

Only a few products on the market allow administrators to create a cluster (like WebLogic, Tomcat, JBoss, etc…).
I’ve never configured any of them, but from the administrators I talk to, it does not seem to be an easy task.
By the way, for web applications, clustering does not mean scaling. Later, I’ll write an article explaining why, even if you’re clustering, you may still need a load-balancer in front of your cluster to build a robust and scalable application.

Sharing user’s session in a database or a file system

This technique applies to application servers which have no clustering features, or when you don’t want to enable the clustering feature.
It is pretty simple: you choose a way to share the sessions, usually a file system like NFS or CIFS, a database like MySQL or SQL Server, or a memcached, then you configure each application server with the proper parameters to share the sessions and access them when required.
I’m not going to give any details on how to do it here; just google the proper keywords and you’ll get answers very quickly.

IP source affinity to server

An easy way to maintain affinity between a user and a server is to use the user’s IP address: this is called source IP affinity.
There are a lot of issues with doing that, and I’m not going to detail them right now (TODO++: another article to write).
The only thing you have to know is that source IP affinity should be the last method to use when you want to “stick” a user to a server.
Well, it’s true that it will solve our issue, but only as long as the user uses a single IP address and never changes it during the session.

Application layer persistence

Since a web application server has to identify each user individually, to avoid serving one user’s content to another, we may use this information, or at least try to reproduce the same behavior in the load-balancer, to maintain persistence between a user and a server.
The information we’ll use is the session cookie, either set by the load-balancer itself or one set up by the application server.

What is the difference between Persistence and Affinity

Affinity: when we use information from a layer below the application layer to maintain a client’s requests on a single server
Persistence: when we use application layer information to stick a client to a single server
Sticky session: a session maintained by persistence

The main advantage of persistence over affinity is that it’s much more accurate, but sometimes persistence is not doable, so we must rely on affinity.

Using persistence, we mean that we’re 100% sure that a user will get redirected to a single server.
Using affinity, we mean that the user may be redirected to the same server…

What is the interaction with load-balancing?


In a load-balancer, you can choose between several algorithms to pick a server from a web farm to forward your client requests to.
Some algorithms are deterministic, which means they use client-side information to choose the server, and always send the owner of this information to the same server. This is where you usually do affinity 😉 e.g.: “balance source”
Some algorithms are not deterministic, which means they choose the server based on internal information, whatever the client sent. This is where you do neither affinity nor persistence 🙂 e.g.: “balance roundrobin” or “balance leastconn”
I don’t want to go too deep into details here; this could be the purpose of a new article about load-balancing algorithms…
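To illustrate the difference (with hypothetical helper code, not HAProxy internals): a deterministic algorithm such as “balance source” derives the server from client-side data, while “balance roundrobin” only consults internal state:

```python
import hashlib
from itertools import cycle

servers = ["s1", "s2", "s3"]

def balance_source(client_ip: str) -> str:
    """Deterministic: the same client IP always maps to the same server."""
    h = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return servers[h % len(servers)]

# Non-deterministic with respect to the client: it ignores who is asking
# and just walks through the farm.
round_robin = cycle(servers)

assert balance_source("203.0.113.7") == balance_source("203.0.113.7")
print(next(round_robin), next(round_robin), next(round_robin))  # s1 s2 s3
```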

You may be wondering: “we have not yet spoken about persistence in this chapter”. That’s right, let’s do it.
As we saw previously, persistence means that the server can be chosen based on application layer information.
This means that persistence is another way to choose a server from a farm, just as a load-balancing algorithm does.

Actually, session persistence takes precedence over the load-balancing algorithm.
Let’s show this on a diagram:

      client request
            |
            V
    HAProxy Frontend
            |
            V
     backend choice
            |
            V
    HAproxy Backend
            |
            V
Does the request contain
persistence information ---------
            |                   |
            | NO                |
            V                   |
    Server choice by            |  YES
load-balancing algorithm        |
            |                   |
            V                   |
   Forwarding request <----------
      to the server

Which means that when doing session persistence in a load balancer, the following happens:

  • the first user request comes in without session persistence information
  • the request bypasses the session persistence server choice, since it carries no session persistence information
  • the request passes through the load-balancing algorithm, where a server is chosen and assigned to the client
  • the server answers back, setting its own session information
  • depending on its configuration, the load-balancer can either use this session information or set up its own before sending the response back to the client
  • the client sends a second request, now with the session information it learnt during the first request
  • the load-balancer chooses the server based on the client-side information
  • the request DOES NOT PASS THROUGH the load-balancing algorithm
  • the server answers the request

and so on…

At HAProxy Technologies we say that “Persistence is an exception to load-balancing”.
And the demonstration is just above.
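The precedence rule above can be sketched in a few lines of Python. This is a toy illustration, not HAProxy internals: the session ids, server names and the way the association is learned are all simplified for the example.

```python
# Toy model: persistence information, when present, bypasses the
# load-balancing algorithm entirely.
import itertools

servers = ["s1", "s2"]
round_robin = itertools.cycle(servers)   # stands in for "balance roundrobin"
persistence = {}                         # session id -> server

def pick_server(session_id=None):
    # 1. Persistence first: a known session id short-circuits the choice.
    if session_id in persistence:
        return persistence[session_id]
    # 2. Otherwise the load-balancing algorithm picks a server, and the
    #    association is remembered for subsequent requests.
    server = next(round_robin)
    if session_id is not None:
        persistence[session_id] = server
    return server
```

Once a session id has been seen, every later request carrying it reaches the same server, no matter what the round-robin counter says in the meantime.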

Affinity configuration in HAProxy / Aloha load-balancer

The configuration below shows how to do affinity within HAProxy, based on client IP information:

frontend ft_web
  bind 0.0.0.0:80
  default_backend bk_web

backend bk_web
  balance source
  hash-type consistent # optional
  server s1 192.168.10.11:80 check
  server s2 192.168.10.21:80 check
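What “balance source” does can be sketched as follows. This is a rough illustration, not HAProxy's actual hash function: the point is only that a stable hash of the source IP keeps a given client on the same server as long as the farm does not change.

```python
# Sketch of source-IP affinity: same client IP -> same server,
# as long as the server list is unchanged.
import hashlib

servers = ["192.168.10.11", "192.168.10.21"]  # the two servers above

def pick_by_source(client_ip: str) -> str:
    digest = hashlib.sha1(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

The plain modulo above remaps almost every client when a server is added or removed; that is what the optional “hash-type consistent” line mitigates, by keeping most clients mapped to their previous server.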

Web application persistence

In order to provide persistence at the application layer, we usually use cookies.
As explained previously, there are two ways to provide persistence using cookies:

  • Let the load-balancer set up a cookie for the session.
  • Using application cookies, such as ASP.NET_SessionId, JSESSIONID, PHPSESSID, or any other chosen name

Session cookie setup by the Load-Balancer

The configuration below shows how to configure HAProxy / Aloha load balancer to inject a cookie in the client browser:

frontend ft_web
  bind 0.0.0.0:80
  default_backend bk_web

backend bk_web
  balance roundrobin
  cookie SERVERID insert indirect nocache
  server s1 192.168.10.11:80 check cookie s1
  server s2 192.168.10.21:80 check cookie s2

Two things to notice:

  1. the line “cookie SERVERID insert indirect nocache”:
    This line tells HAProxy to set up a cookie called SERVERID only if the user did not come with one. HAProxy will also append a “Cache-Control: private” header, since this type of traffic is supposed to be personal and we don’t want any shared cache on the internet to store it
  2. the statement “cookie XXX” on the server line definition:
    It provides the value of the cookie inserted by HAProxy. When the client comes back, then HAProxy knows directly which server to choose for this client.

So what happens?

  1. At the first response, HAProxy will send the client the following header, if the server chosen by the load-balancing algorithm is s1:
    Set-Cookie: SERVERID=s1
  2. For the second request, the client will send a request containing the header below:
    Cookie: SERVERID=s1

Basically, this kind of configuration is compatible with active/active Aloha load-balancer cluster configuration.
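The insert/indirect behaviour can be modelled in a few lines of Python. This is an illustration only, not HAProxy code, and header handling is heavily simplified (a single cookie, a dict of headers):

```python
# Toy model of "cookie SERVERID insert indirect nocache": the proxy
# injects its own cookie on the way out and consumes it on the way in,
# so the server never sees it.
def handle_response(server, response_headers):
    # "insert": add the persistence cookie naming the chosen server.
    response_headers["Set-Cookie"] = f"SERVERID={server}"
    # "nocache": mark the response as non-shareable by intermediate caches.
    response_headers.setdefault("Cache-Control", "private")
    return response_headers

def route_request(request_headers, fallback_server):
    cookie = request_headers.get("Cookie", "")
    if cookie.startswith("SERVERID="):
        server = cookie.split("=", 1)[1]
        # "indirect": the cookie is consumed by the proxy, not forwarded.
        request_headers.pop("Cookie")
        return server
    # No persistence cookie: let the load-balancing algorithm decide.
    return fallback_server
```

A first request goes through the load-balancing algorithm (the fallback), while any request presenting SERVERID is routed straight to the named server.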

Using application session cookie for persistence

The configuration below shows how to configure HAProxy / Aloha load balancer to use the cookie setup by the application server to maintain affinity between a server and a client:

frontend ft_web
  bind 0.0.0.0:80
  default_backend bk_web

backend bk_web
  balance roundrobin
  cookie JSESSIONID prefix nocache
  server s1 192.168.10.11:80 check cookie s1
  server s2 192.168.10.21:80 check cookie s2

Just replace JSESSIONID with your application cookie name. It can be anything, like the default ones from PHP and IIS: PHPSESSID and ASP.NET_SessionId.

So what happens?

  1. At the first response, the server will send the client the following header
    Set-Cookie: JSESSIONID=i12KJF23JKJ1EKJ21213KJ
  2. When passing through HAProxy, the cookie is modified like this:
    Set-Cookie: JSESSIONID=s1~i12KJF23JKJ1EKJ21213KJ


    Note that the Set-Cookie header has been prefixed by the server cookie value (“s1” in this case) and a “~” is used as a separator between this information and the cookie value.

  3. For the second request, the client will send a request containing the header below:
    Cookie: JSESSIONID=s1~i12KJF23JKJ1EKJ21213KJ
  4. HAProxy will clean it up on the fly, restoring the original value:
    Cookie: JSESSIONID=i12KJF23JKJ1EKJ21213KJ

Basically, this kind of configuration is compatible with active/active Aloha load-balancer cluster configuration.
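The prefix/strip rewriting in steps 1 to 4 above can be sketched as follows (an illustration, not HAProxy code; a single cookie per header is assumed):

```python
# Toy model of "cookie JSESSIONID prefix": the server's cookie value is
# prefixed with "<server>~" towards the client, and the prefix is
# stripped before forwarding the request back to the server.
def prefix_set_cookie(set_cookie_value: str, server: str) -> str:
    # "JSESSIONID=abc" -> "JSESSIONID=s1~abc"
    name, value = set_cookie_value.split("=", 1)
    return f"{name}={server}~{value}"

def parse_client_cookie(cookie_value: str):
    # "JSESSIONID=s1~abc" -> ("s1", "JSESSIONID=abc")
    name, value = cookie_value.split("=", 1)
    if "~" in value:
        server, original = value.split("~", 1)
        return server, f"{name}={original}"
    # No prefix: first visit, no server learned yet.
    return None, cookie_value
```

The application keeps seeing exactly the cookie it set; only the client-side copy carries the routing hint.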

What happens when my server goes down???

When doing persistence, if a server goes down, HAProxy will redispatch the user to another server.
Since the user gets connected to a new server, this one may not be aware of the session, so the user may be redirected to the login page.
But this is not a load-balancer problem: it is related to the application server farm.

Web traffic limitation

Synopsis

For various reasons, we may want to limit the number of connections or the number of requests we allow to a web farm.
For example:

  • give more capacity to authenticated users compared to anonymous ones
  • limit web farm users per virtualhost
  • protect your website from spiders
  • etc…

Basically, we’ll manage two web farms: one with as much capacity as we need, and another one to which we’ll redirect the users we want to slow down.
The routing decision can be taken using a header, a cookie, a part of the URL, the source IP address, etc…

Configuration

The configuration below would do the job.

There are only two web servers in the farm, but we want to slow down some virtual hosts and some old, almost never used applications, in order to protect the regular traffic and leave it more capacity.

You can play with the inspect-delay time to be more or less aggressive.

frontend www
  bind :80
  mode http
  acl spiderbots hdr_cnt(User-Agent) eq 0
  acl personnal hdr(Host) www.personnalwebsite.tld www.oldname.tld
  acl oldies path_beg /old /foo /bar
  use_backend limited_www if spiderbots or personnal or oldies
  default_backend www

backend www
 mode http
 server be1  192.168.0.1:80 check maxconn 100
 server be2  192.168.0.2:80 check maxconn 100

backend limited_www
 mode http
 acl too_fast be_sess_rate gt 10
 acl too_many be_conn gt 10
 tcp-request inspect-delay 3s
 tcp-request content accept if ! too_fast or ! too_many
 tcp-request content accept if WAIT_END
 server be1  192.168.0.1:80 check maxconn 100
 server be2  192.168.0.2:80 check maxconn 100

Results

With the configuration above, an apache bench was able to go up to 3600 req/s on the regular farm but only 9 req/s on the limited one.
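The admission logic of the limited_www backend can be sketched as follows. This is a simplified model, not HAProxy internals: it only shows how the combination of the two ACLs and WAIT_END turns into "accept now" or "accept after the inspect-delay".

```python
# Toy model of the throttling in "backend limited_www".
INSPECT_DELAY = 3.0  # seconds, matches "tcp-request inspect-delay 3s"

def admission_delay(sess_rate, conns, rate_limit=10, conn_limit=10):
    too_fast = sess_rate > rate_limit   # acl too_fast be_sess_rate gt 10
    too_many = conns > conn_limit       # acl too_many be_conn gt 10
    # "accept if ! too_fast or ! too_many": accept at once unless BOTH
    # limits are exceeded...
    if not too_fast or not too_many:
        return 0.0
    # ...otherwise only "accept if WAIT_END" matches: the request is
    # held for the inspect-delay, which caps the effective request rate.
    return INSPECT_DELAY
```

Holding each excess request for 3 seconds is what brings the limited farm down to a handful of requests per second while the regular farm is untouched.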


Scaling out SSL

Synopsis

We’ve seen recently how we could scale up SSL performance.
But what about scaling out SSL performance?
Well, thanks to Aloha and HAProxy, it’s easy to smartly manage a farm of SSL accelerator servers, using persistence based on the SSL Session ID.
This way of load-balancing is smart, but in case of an SSL accelerator failure, the other servers in the farm would suffer a CPU overhead while generating SSL Session IDs for the sessions re-balanced by the Aloha.

After a talk with (the famous) emericbr, HAProxy Technologies dev team leader, he decided to write a patch for stud to add a new feature: sharing SSL sessions between different stud processes.
That way, in case of an SSL accelerator failure, the servers receiving the re-balanced sessions would not have to generate new SSL sessions.

Emericbr’s patch is available here: https://github.com/bumptech/stud/pull/50
At the end of this article, you’ll learn how to use it.

Stud SSL Session shared caching

Description

As we’ve seen in our article on SSL performance, a good way to improve SSL performance is to use an SSL Session ID cache.

The idea here is to keep using this cache, while also sending updates into a shared cache that any process can consult to get an SSL Session ID and the data associated to it.

As a consequence, there are 2 levels of cache:

      * Level 1: the local process cache, holding the currently used SSL sessions
      * Level 2: the shared cache, holding the SSL sessions from all local caches

Way of working

The protocol understands 3 types of requests:

      * New: when a process generates a new session, it updates its local cache, then the shared cache
      * Get: when a client tries to resume a session that the receiving process is not aware of, the process tries to get it from the shared cache
      * Del: when a session has expired, or a client sent a bad SSL ID, the process deletes the session from the shared cache
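The three operations map naturally onto a key/value store. The sketch below is only an illustration of the protocol semantics, with a plain dict standing in for the father's shared cache:

```python
# Minimal model of the New/Get/Del shared-cache operations.
shared_cache = {}

def cache_new(session_id, session_data):
    # New: a process generated a session; publish it to the shared cache.
    shared_cache[session_id] = session_data

def cache_get(session_id):
    # Get: a process sees an unknown session id on resume; ask the
    # shared cache, which may or may not have it.
    return shared_cache.get(session_id)

def cache_del(session_id):
    # Del: expired or bogus session id; drop it from the shared cache.
    shared_cache.pop(session_id, None)
```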

Who does what?

Stud has a Father/Son architecture.
The Father starts up, then spawns the Sons. The Sons bind the external TCP ports, load the certificate and process the SSL requests.
Each Son manages its local cache and sends its updates to the shared cache. The Father manages the shared cache, receiving the changes and keeping it up to date.

How are the updates exchanged?

Updates are sent in unicast or multicast, on a specified UDP port.
Updates work over both IPv4 and IPv6.
Each packet is signed using the SSL certificate, to avoid cache poisoning.

What does a packet look like?


SSL Session ID   ASN-1 of SSL Session structure   Timestamp   Signature
[32 bytes]       [max 512 bytes]                  [4 bytes]   [20 bytes]

Note: the SSL Session ID field is padded with 0 if required
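Packing such a packet is straightforward. In the sketch below the signature scheme is an assumption: the real patch derives it from the SSL certificate, while here an HMAC-SHA1 keyed with a placeholder secret merely stands in for it (it conveniently produces 20 bytes, like the Signature field):

```python
# Illustrative packing of the update packet layout above.
import hashlib
import hmac
import struct
import time

SECRET = b"derived-from-certificate"  # hypothetical stand-in key

def build_update(session_id: bytes, asn1_session: bytes) -> bytes:
    assert len(session_id) <= 32 and len(asn1_session) <= 512
    payload = (
        session_id.ljust(32, b"\x00")          # SSL Session ID, 0-padded
        + asn1_session                         # ASN-1 session structure
        + struct.pack("!I", int(time.time()))  # 4-byte timestamp
    )
    sig = hmac.new(SECRET, payload, hashlib.sha1).digest()  # 20 bytes
    return payload + sig
```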

Diagram

Let’s show this on a nice picture where each potato represents each process memory area.
[Diagram: stud shared cache]
Here, the Son on host 1 got a new SSL connection to process. Since it could not find the session in its local cache nor in the shared cache, it generated the asymmetric key, then pushed it to its shared cache and to the Father on host 2, which updates the shared cache of that host.
That way, if this user is later routed to any stud Son process, that process will not have to compute the asymmetric key again.

Let’s try Stud shared cache

Installation:

git clone https://github.com/EmericBr/stud.git
cd stud
wget http://1wt.eu/tools/ebtree/ebtree-6.0.6.tar.gz
tar xvzf ebtree-6.0.6.tar.gz
ln -s ebtree-6.0.6 ebtree
make USE_SHARED_CACHE=1

Generate a key and a certificate, then concatenate them into a single PEM file (cert.pem below).

Now you can run stud:

sudo ./stud -n 2 -C 10000 -U 10.0.3.20,8888 -P 10.0.0.17 -f 10.0.3.20,443 -b 10.0.0.3,80 cert.pem

and run a test:

curl --noproxy '*' --insecure -D - https://10.0.3.20:443/

And you can watch the synchronization packets:

$ sudo tcpdump -n -i any port 8888
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes

17:47:10.557362 IP 10.0.3.20.8888 > 10.0.0.17.8888: UDP, length 176
17:49:04.592522 IP 10.0.3.20.8888 > 10.0.0.17.8888: UDP, length 176
17:49:05.476032 IP 10.0.3.20.8888 > 10.0.0.17.8888: UDP, length 176
