Scaling out SSL


We’ve seen recently how we could scale up SSL performance.
But what about scaling out SSL performance?
Well, thanks to Aloha and HAProxy, it’s easy to manage smartly a farm of SSL accelerator servers, using persistence based on the SSL Session ID.
This way of load-balancing is smart, but in case of SSL accelerator failure, other servers in the farm would have a CPU overhead to generate SSL Session IDs for sessions re-balanced by the Aloha.

After a talk with (the famous) emericbr, HAProxy Technologies dev team leader, he decided to write a patch for stud to add a new feature: sharing SSL session between different stud processes.
That way, in case of SSL accelerator failure, the servers getting re-balanced sessions would not have to generate a new SSL session.

Emericbr’s patch is available here:
At the end of this article, you’ll learn how to use it.

Stud SSL Session shared caching


As we’ve seen in our article on SSL performance, a good way to improve performance on SSL is to use a SSL Session ID cache.

The idea here, is to use this cache as well as sending updates into a shared cache one can consult to get the SSL Session ID and the data associated to it.

As a consequence, there are 2 levels of cache:

      * Level 1: local process cache, with the currently used SSL session
      * Level 2: shared cache, with the SSL session from all local cache

Way of working

The protocol understand 3 types of request:

      * New: When a process generates a new session, it updates its local cache then the shared cache
      * Get: When a client tries to resume a session and the process receiving it is not aware of it, then the process tries to get it from the shared cache
      * Del: When a session has expired or there is a bad SSL ID sent by a client, then the process will delete the session from the shared cache

Who does what?

Stud has a Father/Son architecture.
The Father starts up then starts up Sons. The Sons bind the TCP external ports, load the certificate and process the SSL requests.
Each son manages its local cache and send the updates to the shared cache. The Father manages the shared cache, receiving the changes and maintaining it up to date.

How are the updates exchanged?

Updates are sent either on Unicast or Multicast, on a specified UDP port.
Updates are compatible both IPv4 and IPv6.
Each packet are signed by an encrypted signature using the SSL certificate, to avoid cache poisoning.

What does a packet look like?

SSL Session ID ASN-1 of SSL Session structure Timestamp Signature
[32 bytes] [max 512 bytes] [4 bytes] [20 bytes]

Note: the SSL Session ID field is padded with 0 if required


Let’s show this on a nice picture where each potato represents each process memory area.
Here, the son on host 1 got a new SSL connection to process, since he could not find it in its cache and in the shared cache, he generated the asymmetric key, then push it to his shared cache and the father on host 2 which updates the shared cache for this host.
That way, if this user is routed to any stud son process, he would not have to compute again its asymmetric key.

Let’s try Stud shared cache


git clone
cd stud
tar xvzf ebtree-6.0.6.tar.gz
ln -s ebtree-6.0.6 ebtree

Generate a key and a certificate, add them in a single file.

Now you can run stud:

sudo ./stud -n 2 -C 10000 -U,8888 -P -f,443 -b,80 cert.pem

and run a test:

curl --noproxy * --insecure -D -

And you can watch the synchronization packets:

$ sudo tcpdump -n -i any port 8888
[sudo] password for bassmann: 
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes

17:47:10.557362 IP > UDP, length 176
17:49:04.592522 IP > UDP, length 176
17:49:05.476032 IP > UDP, length 176

Related links