Tag Archives: ecommerce

What is a slow POST Attack and how to turn HAProxy into your first line of Defense?

One of the biggest security challenges that companies face today is the slow POST attack. Unlike a more traditional “Denial-of-Service” attack, a slow POST attack targets a server’s logical resources, which makes it particularly powerful when executed.

What is a slow POST Attack?


In a slow POST attack, the attacker begins by sending a legitimate HTTP POST header to a web server, exactly as a normal client would. The header specifies the exact size of the message body that will follow. However, that message body is then sent at an alarmingly low rate, sometimes as slow as 1 byte every two minutes or so. Because the request is technically correct, the targeted server obeys the rules and keeps waiting for the rest of the body, which, as you would expect, can take quite a while. If an attacker establishes hundreds or even thousands of these connections simultaneously, they quickly use up all server resources and make legitimate connections impossible.
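
To get a feel for the numbers, here is a minimal Python sketch of the arithmetic. The function names and the figures (a 1,000-byte body, 1,000 connections) are illustrative assumptions, not measurements.

```python
# Rough arithmetic for a slow POST; all figures are illustrative assumptions.

def hold_time_seconds(content_length: int, seconds_per_byte: float) -> float:
    """Time one connection stays busy if the body trickles in at a fixed rate."""
    return content_length * seconds_per_byte


def attacker_bytes_per_second(connections: int, seconds_per_byte: float) -> float:
    """Body bandwidth the attacker needs to keep that many connections busy."""
    return connections / seconds_per_byte


# A 1,000-byte body sent at 1 byte every 120 seconds
print(f"{hold_time_seconds(1000, 120) / 3600:.1f} hours per connection")
# 1,000 such connections kept alive in parallel
print(f"{attacker_bytes_per_second(1000, 120):.2f} bytes/s of body traffic")
```

Under these assumptions each connection is held for more than a day, while the attacker sends only a few bytes per second in total.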

How can HAProxy protect against slow POST attacks?


Because POST attacks can be incredibly powerful, it’s always important to have a tool in place to identify these types of issues when they’re still in their nascent stages to prevent them from becoming much larger, more serious issues down the road. Because HAProxy was designed as an application delivery controller to manage Web application high availability and performance, it is already in an ideal position to stop these types of POST attacks in their tracks.

HAProxy Configuration Example

Because of HAProxy’s structure and configuration flexibility, many professionals use it as a security tool. Case in point: with the following configuration example, you can help protect your servers against slow POST attacks and prevent attackers from clogging resources and ultimately harming not only your equipment but your entire organization.

frontend ft_myapp
 [...]
 option http-buffer-request
 timeout http-request 10s

As you can see, with just a few simple modifications, HAProxy can remove slow POST attacks from the list of things you have to worry about on a daily basis for your mission-critical business applications and APIs.
The option http-buffer-request instructs HAProxy to wait for the whole request body before forwarding the request to a server, and timeout http-request 10s sets how much time HAProxy gives a client to send the whole POST.
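
The effect of these two settings can be sketched in Python as a toy simulation. This is only an illustration of the idea (buffer the body, enforce one overall deadline); the function name and the chunk format are assumptions, and details such as buffer size limits are ignored.

```python
def receive_request(chunks, content_length, timeout):
    """Simulate a proxy that buffers a body and enforces one overall deadline.

    chunks: iterable of (arrival_time_in_seconds, number_of_bytes) pairs.
    Returns "forward" once the whole body arrived in time, else "timeout".
    """
    received = 0
    for arrival, size in chunks:
        if arrival > timeout:
            return "timeout"        # deadline passed before the body completed
        received += size
        if received >= content_length:
            return "forward"        # complete request, hand it to a server
    return "timeout"                # client stopped sending; deadline expires


normal_client = [(0.1, 500), (0.2, 500)]
slow_client = [(i * 120, 1) for i in range(1000)]   # 1 byte every 2 minutes

print(receive_request(normal_client, 1000, timeout=10))
print(receive_request(slow_client, 1000, timeout=10))
```

In this model the slow client never reaches a server: its connection is dropped at the proxy once the 10 second deadline is exceeded.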

Thanks to its functionality as a security tool, a reverse proxy and more in addition to its intended functionality as a load balancer, it’s easy to see why HAProxy is used by some of the largest sites on the Internet including Reddit, Tumblr, GitHub and more on a daily basis.

This function is available in the following versions of HAProxy:

HAProxy and HTTP Strict Transport Security (HSTS) header in HTTP redirects

SSL/TLS and HSTS

SSL everywhere is on its way.
Unfortunately, many applications were written for HTTP only, and switching to HTTPS is not an easy or straightforward path. Read more here about the impact of TLS offloading (when a third-party tool performs TLS in front of your web application servers).

A mechanism called HTTP Strict Transport Security (HSTS) was introduced in RFC 6797.

The main purpose of HSTS is to let the application server instruct the client that it must use only an encrypted and secured HTTPS connection when browsing the application.
Of course, this means that both the client and the server must support it…
That way, the application cookie is protected on its way from the client’s browser to the remote TLS endpoint (either the load-balancer or the application server). No cookie hijacking is possible on the wire.

HAProxy configuration for Strict-Transport-Security HTTP header

HSTS header insertion in server responses

To insert the header in every server response, you can use the following HAProxy directive, in HAProxy 1.5:

 # 16000000 seconds: a bit more than 6 months
 http-response set-header Strict-Transport-Security max-age=16000000;\ includeSubDomains;\ preload;

With the upcoming HAProxy 1.6, and thanks to William’s work, we can now get rid of these ugly backslashes:

 # 16000000 seconds: a bit more than 6 months
 http-response set-header Strict-Transport-Security "max-age=16000000; includeSubDomains; preload;"
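
As a quick check of the header value and of the “a bit more than 6 months” comment, here is a small Python sketch. The helper name hsts_value is hypothetical; it only assembles the same three directives used in the configuration above.

```python
SECONDS_PER_DAY = 86_400


def hsts_value(max_age, include_subdomains=True, preload=True):
    """Build a Strict-Transport-Security header value."""
    parts = [f"max-age={max_age}"]
    if include_subdomains:
        parts.append("includeSubDomains")
    if preload:
        parts.append("preload")
    return "; ".join(parts)


print(hsts_value(16_000_000))
print(f"{16_000_000 / SECONDS_PER_DAY:.0f} days")   # a bit more than 6 months
```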

Inserting HSTS header in HTTP redirects


When HAProxy has to perform an HTTP redirect, it does so at the moment of the client request, through the http-request rules.
Since we want to insert a header in the response, we can use the http-response rules. Unfortunately, these rules are only evaluated when HAProxy gets a response from a backend server.
Here is the trick: we perform the http-request redirect rule in a dedicated frontend, to which we route the traffic. That way, our application backend or frontend can perform HSTS insertion.

A simple configuration snippet usually explains it better:

frontend fe_myapp
 bind :443 ssl crt /path/to/my/cert.pem
 bind :80
 use_backend be_dummy if !{ ssl_fc }
 default_backend be_myapp

backend be_myapp
 http-response set-header Strict-Transport-Security max-age=16000000;\ includeSubDomains;\ preload;
 server s1 10.0.0.1:80

backend be_dummy
 server haproxy_fe_dummy_ssl_redirect 127.0.0.1:8000

frontend fe_dummy
 bind 127.0.0.1:8000
 http-request redirect scheme https

HAProxy’s load-balancing algorithm for static content delivery with Varnish

HAProxy’s load-balancing algorithms

HAProxy supports many load-balancing algorithms, which may be used in many different types of cases.
That said, cache servers, which most of the time deliver the static content of your web applications, may require some specific load-balancing algorithms.

HAProxy stands in front of your cache server for some good reasons:

  • SSL offloading (read PHK’s feeling about SSL, Varnish and HAProxy)
  • HTTP content switching capabilities
  • advanced load-balancing algorithms

The main purpose of this article is to show how HAProxy can be used to aggregate the memory storage of several Varnish servers in some kind of “JBOD” mode (as in “Just a Bunch Of Disks”).
The examples delivered here aim to optimize the resources of the caches, mainly their memory, in order to improve the HIT rate. This will also improve your application response time and make your site top ranked on Google 🙂

Content Switching in HAProxy

This has been covered many times on this blog.
As a quick introduction for readers who are not familiar with HAProxy, let’s explain how it works.

Clients get connected to HAProxy through a frontend. HAProxy then routes traffic to a backend (server farm), where the load-balancing algorithm is used to choose a server.
A frontend can point to multiple backends, and the choice of a backend is made through ACLs and use_backend rules.
ACLs can be formed using fetches. A fetch is a directive which instructs HAProxy where to get content from.

Enough theory, let’s make a practical example: splitting static and dynamic traffic using the following rules:

  • Static content is hosted on domain names starting with ‘static.’ and ‘images.’
  • Static content file extensions are ‘.jpg’ ‘.png’ ‘.gif’ ‘.css’ ‘.js’
  • Static content can match any of the rules above
  • Anything which is not static is considered dynamic

The configuration snippet below should be integrated into the HAProxy frontend. It matches the rules above to do traffic splitting. The Varnish servers will stand in the bk_static farm.

frontend ft_public
 <frontend settings>
 acl static_domain  hdr_beg(Host)     -i static. images.
 acl static_content path_end          -i .jpg .png .gif .css .js
 use_backend bk_static if static_domain or static_content
 default_backend bk_dynamic
   
backend bk_static
 <parameters related to static content delivery>

The configuration above creates two named ACLs, ‘static_domain’ and ‘static_content’, which are used by the use_backend rule to route the traffic to the Varnish servers.
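
The decision taken by these rules can be mirrored in a few lines of Python. This is only a sketch of the logic, not of how HAProxy evaluates ACLs; the function name route_static and the sample hostnames are assumptions.

```python
STATIC_DOMAIN_PREFIXES = ("static.", "images.")
STATIC_EXTENSIONS = (".jpg", ".png", ".gif", ".css", ".js")


def route_static(host: str, path: str) -> str:
    """Return the backend name for a request, mirroring the two ACLs."""
    # -i in the ACLs means the match is case-insensitive
    static_domain = host.lower().startswith(STATIC_DOMAIN_PREFIXES)
    static_content = path.lower().endswith(STATIC_EXTENSIONS)
    return "bk_static" if static_domain or static_content else "bk_dynamic"


print(route_static("static.example.com", "/index.php"))
print(route_static("www.example.com", "/img/logo.PNG"))
print(route_static("www.example.com", "/cart"))
```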

HAProxy and hash based load-balancing algorithm


Later in this article, we’ll make heavy use of HAProxy’s hash-based load-balancing algorithms.
So here is a little information (non-exhaustive; the topic would deserve a long blog article of its own) which will be useful for people wanting to understand what happens deep inside HAProxy.

The following parameters are taken into account when computing the hash:

  • number of servers in the farm
  • weight of each server in the farm
  • status of the servers (UP or DOWN)

If any of the parameters above changes, the whole hash computation also changes, hence a request may hit another server, one which has not cached the object yet. This may have a negative impact on the response time of the application (during a short period of time).
Fortunately, HAProxy allows ‘consistent’ hashing, which means that only the traffic related to the change will be impacted.
That’s why you’ll see a lot of hash-type consistent directives in the configuration samples below.
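
To illustrate the difference, here is a minimal Python sketch comparing a plain modulo-based pick with a generic consistent-hash ring when one of four servers disappears. It is not HAProxy’s implementation: the MD5-based hash, the 100 virtual nodes per server and the server names v1 to v4 are assumptions, and server weights are not modeled.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Stable 32-bit hash (MD5-based, chosen only for reproducibility)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


def modulo_pick(key: str, servers: list) -> str:
    """Non-consistent pick: the result depends on the number of servers."""
    return servers[h(key) % len(servers)]


class Ring:
    """Minimal consistent-hash ring with virtual nodes per server."""

    def __init__(self, servers, vnodes=100):
        self.points = sorted((h(f"{s}#{i}"), s) for s in servers for i in range(vnodes))
        self.hashes = [p for p, _ in self.points]

    def pick(self, key: str) -> str:
        # first point clockwise from the key's hash, wrapping around the ring
        idx = bisect.bisect(self.hashes, h(key)) % len(self.points)
        return self.points[idx][1]


def moved(keys, before, after):
    """Count keys whose server changes between two picking functions."""
    return sum(1 for k in keys if before(k) != after(k))


keys = [f"/img/{i}.jpg" for i in range(1000)]
full, reduced = ["v1", "v2", "v3", "v4"], ["v1", "v2", "v3"]
ring_full, ring_reduced = Ring(full), Ring(reduced)

print("modulo:", moved(keys, lambda k: modulo_pick(k, full), lambda k: modulo_pick(k, reduced)))
print("consistent:", moved(keys, ring_full.pick, ring_reduced.pick))
```

With the ring, only the URLs that were served by the removed server are remapped; with the modulo pick, most URLs change server and the caches have to be refilled.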

Load-Balancing varnish cache server

Now, let’s focus on the magic we can add in the bk_static server farm.

Hashing the URL

HAProxy can hash the URL to pick a server. With this load-balancing algorithm, we guarantee that a single URL will always hit the same Varnish server.

hashing the URL path only


In the example below, HAProxy hashes the URL path, which is from the first slash ‘/’ character up to the question mark ‘?’:

backend bk_static
  balance uri
  hash-type consistent

hashing the whole url, including the query string


In some cases, the query string contains variables that change the response, which means we must include the query string in the hash:

backend bk_static
  balance uri whole
  hash-type consistent

Query string parameter hash


That said, in some cases (API, etc…), hashing the whole URL is not enough. We may want to hash only on a particular query string parameter.
This applies well in cases where the client builds the URL itself and the parameters may be randomly ordered.
The configuration below tells HAProxy to apply the hash to the query string parameter named ‘id’ (e.g. /image.php?width=512&id=12&height=256)

backend bk_static
  balance url_param id
  hash-type consistent

hash on a HTTP header


HAProxy can apply the hash to a specific HTTP header field.
The example below applies it to the Host header. This can be useful for people hosting many domain names with only a few pages each, such as users’ dedicated pages.

backend bk_static
  balance hdr(Host)
  hash-type consistent

Compose your own hash: concatenation of Host header and URL


Nowadays, HAProxy is becoming more and more flexible, and we can use this flexibility in its configuration.
Imagine that, in your Varnish configuration, the storage hash key is based on the concatenation of the Host header and the URI. You may then want to apply the same logic to load-balancing in HAProxy, to optimize your caches.

The configuration below creates a new HTTP header field named X-LB which contains the Host header (converted to lowercase) concatenated with the request URI (converted to lowercase too).

backend bk_static
  http-request set-header X-LB %[req.hdr(Host),lower]%[url,lower]
  balance hdr(X-LB)
  hash-type consistent
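
To compare the variants above, the following Python sketch returns the string that each one would feed to the hash for a given request. The function hash_key and its mode labels are hypothetical, and edge cases (such as a missing parameter) are not modeled the way HAProxy handles them.

```python
from urllib.parse import urlsplit, parse_qs


def hash_key(mode: str, host: str, url: str, param: str = "id") -> str:
    """Return the string a given balance mode would feed to the hash."""
    parts = urlsplit(url)
    if mode == "uri":            # path only, query string ignored
        return parts.path
    if mode == "uri whole":      # path plus query string
        return url
    if mode == "url_param":      # one query string parameter
        return parse_qs(parts.query).get(param, [""])[0]
    if mode == "hdr(Host)":      # Host header
        return host
    if mode == "host+uri":       # custom key: lowercase Host + lowercase URI
        return host.lower() + url.lower()
    raise ValueError(f"unknown mode: {mode}")


a = "/image.php?width=512&id=12&height=256"
b = "/image.php?id=12&height=256&width=512"   # same parameters, other order

for mode in ("uri", "uri whole", "url_param", "host+uri"):
    print(mode, "->", hash_key(mode, "www.example.com", a))
```

The two URLs a and b produce different keys with the whole URL, but the same key when only the ‘id’ parameter is hashed.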

Conclusion


HAProxy and Varnish work very well together. Each piece of software can benefit from the performance and flexibility of the other.

high performance WAF platform with Naxsi and HAProxy

Synopsis

I’ve already described WAFs in a previous article, where I spoke about WAF scalability with Apache and modsecurity.
One of the main issues with Apache and modsecurity is performance. To address this issue, an alternative exists: Naxsi, a Web Application Firewall module for nginx.

So using Naxsi and HAProxy as a load-balancer, we’re able to build a platform which meets the following requirements:

  • Web Application Firewall: achieved by nginx and Naxsi
  • High-availability: application server and WAF monitoring, achieved by HAProxy
  • Scalability: ability to adapt capacity to the upcoming volume of traffic, achieved by HAProxy
  • DDOS protection: blind and brutal attacks protection, slowloris protection, achieved by HAProxy
  • Content-Switching: ability to route only dynamic requests to the WAF, achieved by HAProxy
  • Reliability: ability to detect capacity overusage, achieved by HAProxy
  • Performance: deliver response as fast as possible, achieved by the whole platform

The picture below provides a better overview:

The LAB platform is composed of 6 boxes:

  • 2 ALOHA Load-Balancers (could be replaced by HAProxy 1.5-dev)
  • 2 WAF servers: CentOS 6.0, nginx and Naxsi
  • 2 Web servers: Debian + apache + PHP + dokuwiki

Nginx and Naxsi installation on CentOS 6

The purpose of this article is not to provide such a procedure, so please read this wiki article, which summarizes how to install nginx and Naxsi on CentOS 6.0.

Diagram

The diagram below shows the platform with HAProxy frontends (prefixed by ft_) and backends (prefixed by bk_). Each farm is composed of 2 servers.

Configuration

Nginx and Naxsi


Configure nginx as a reverse-proxy which listens for the traffic coming from bk_waf and forwards it to ft_web. In the meantime, Naxsi is there to analyze the requests.

server {
 proxy_set_header Proxy-Connection "";
 listen       192.168.10.15:81;
 access_log  /var/log/nginx/naxsi_access.log;
 error_log  /var/log/nginx/naxsi_error.log debug;

 location / {
  include    /etc/nginx/test.rules;
  proxy_pass http://192.168.10.2:81/;
 }

 error_page 403 /403.html;
 location = /403.html {
  root /opt/nginx/html;
  internal;
 }

 location /RequestDenied {
  return 403;
 }
}

HAProxy Load-Balancer configuration


The configuration below allows the following advanced features:

  • DDOS protection on the frontend
  • abuser or attacker detection in bk_waf and blocking on the public interface (ft_waf)
  • Bypassing the WAF when it is overloaded or unavailable

######## Default values for all entries till next defaults section
defaults
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  timeout connect 5s
  timeout http-keep-alive 1s
  # Slowloris protection
  timeout http-request 15s
  timeout queue 30s
  timeout tarpit 1m          # tarpit hold time
  backlog 10000

# public frontend where users get connected to
frontend ft_waf
  bind 192.168.10.2:80 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 10000

  # DDOS protection
  # Use General Purpose Counter (gpc) 0 in SC1 as a global abuse counter
  # Monitors the number of request sent by an IP over a period of 10 seconds
  stick-table type ip size 1m expire 1m store gpc0,http_req_rate(10s),http_err_rate(10s)
  tcp-request connection track-sc1 src
  tcp-request connection reject if { sc1_get_gpc0 gt 0 }
  # Abuser means more than 100reqs/10s
  # (the stick-table is declared in this frontend, ft_waf)
  acl abuse sc1_http_req_rate(ft_waf) ge 100
  acl flag_abuser sc1_inc_gpc0(ft_waf)
  tcp-request content reject if abuse flag_abuser

  acl static path_beg /static/ /dokuwiki/images/
  acl no_waf nbsrv(bk_waf) eq 0
  acl waf_max_capacity queue(bk_waf) ge 1
  # bypass WAF farm if no WAF available
  use_backend bk_web if no_waf
  # bypass WAF farm if it reaches its capacity
  use_backend bk_web if static waf_max_capacity
  default_backend bk_waf

# WAF farm where users' traffic is routed first
backend bk_waf
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor header X-Client-IP
  option httpchk HEAD /waf_health_check HTTP/1.0

  # If the source IP generated 10 or more HTTP errors over the defined period,
  # flag the IP as abuser on the frontend
  acl abuse sc1_http_err_rate(ft_waf) ge 10
  acl flag_abuser sc1_inc_gpc0(ft_waf)
  tcp-request content reject if abuse flag_abuser

  # Specific WAF checking: a DENY means everything is OK
  http-check expect status 403
  timeout server 25s
  default-server inter 3s rise 2 fall 3
  server waf1 192.168.10.15:81 maxconn 100 weight 10 check
  server waf2 192.168.10.16:81 maxconn 100 weight 10 check

# Traffic secured by the WAF arrives here
frontend ft_web
  bind 192.168.10.2:81 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 1000
  # route health check requests to a specific backend to avoid graph pollution in ALOHA GUI
  use_backend bk_waf_health_check if { path /waf_health_check }
  default_backend bk_web

# application server farm
backend bk_web
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  cookie SERVERID insert indirect nocache
  default-server inter 3s rise 2 fall 3
  option httpchk HEAD /
  # get connected on the application server using the user ip
  # provided in the X-Client-IP header set up by the bk_waf backend
  source 0.0.0.0 usesrc hdr_ip(X-Client-IP)
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 cookie server1 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 cookie server2 check

# backend dedicated to WAF checking (to avoid graph pollution)
backend bk_waf_health_check
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  default-server inter 3s rise 2 fall 3
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 check
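
The routing rules of the ft_waf frontend above can be summarized by the following Python sketch. It only mirrors the three ACLs and the order of the use_backend rules; the function name route_waf and its arguments are assumptions standing in for nbsrv(bk_waf), queue(bk_waf) and the request path.

```python
def route_waf(waf_servers_up: int, waf_queue: int, path: str) -> str:
    """Mirror the use_backend rules of ft_waf for one request."""
    static = path.startswith(("/static/", "/dokuwiki/images/"))
    no_waf = waf_servers_up == 0            # acl no_waf
    waf_max_capacity = waf_queue >= 1       # acl waf_max_capacity
    if no_waf:
        return "bk_web"                     # no WAF available: bypass it
    if static and waf_max_capacity:
        return "bk_web"                     # WAF saturated: bypass for static
    return "bk_waf"


print(route_waf(2, 0, "/doku.php"))          # normal case
print(route_waf(0, 0, "/doku.php"))          # whole WAF farm is down
print(route_waf(2, 5, "/static/logo.png"))   # WAF queueing, static content
print(route_waf(2, 5, "/doku.php"))          # WAF queueing, dynamic content
```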

Detecting attacks


On the load-balancer


The ft_waf frontend stick table tracks two pieces of information: http_req_rate and http_err_rate, which are respectively the HTTP request rate and the HTTP error rate generated by a single IP address.
HAProxy automatically blocks an IP which has generated 100 or more requests over a period of 10s, or 10 or more errors (WAF detection 403 responses included) in 10s. The user remains blocked for 1 minute, and for as long as he keeps on abusing.
Of course, you can set the values above to whatever you need: it is fully flexible.
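
The blocking logic described above can be sketched in Python as follows. This is a simplified model that stores exact timestamps, which is an assumption made for readability: HAProxy’s rate counters and stick-table expiry are implemented differently. The class name AbuseTracker and its parameters are hypothetical.

```python
from collections import deque


class AbuseTracker:
    """Flag a source once it exceeds a request or error rate in a window."""

    def __init__(self, window=10.0, max_requests=100, max_errors=10, ban=60.0):
        self.window, self.ban = window, ban
        self.max_requests, self.max_errors = max_requests, max_errors
        self.requests, self.errors = deque(), deque()
        self.banned_until = 0.0

    def _trim(self, events, now):
        # forget events that are older than the observation window
        while events and events[0] <= now - self.window:
            events.popleft()

    def hit(self, now: float, is_error: bool = False) -> bool:
        """Record one request at time `now`; return True if it is allowed."""
        if now < self.banned_until:
            self.banned_until = now + self.ban   # still abusing: ban is renewed
            return False
        self.requests.append(now)
        if is_error:
            self.errors.append(now)
        self._trim(self.requests, now)
        self._trim(self.errors, now)
        if len(self.requests) >= self.max_requests or len(self.errors) >= self.max_errors:
            self.banned_until = now + self.ban
            return False
        return True


tracker = AbuseTracker()
results = [tracker.hit(i * 0.01) for i in range(100)]   # 100 requests in 1 second
print(results.count(True), "allowed, last one allowed:", results[-1])
```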

To know the status of IPs in your load-balancer, just run the command below:

echo show table ft_waf | socat /var/run/haproxy.stat - 
# table: ft_waf, type: ip, size:1048576, used:1
0xc33304: key=192.168.10.254 use=0 exp=4555 gpc0=0 http_req_rate(10000)=1 http_err_rate(10000)=1

Note: The ALOHA Load-balancer does not provide the watch tool, but you can monitor the content of the table live with the command below:

while true ; do echo show table ft_waf | socat /var/run/haproxy.stat - ; sleep 2 ; clear ; done

On the Waf


Every Naxsi error is logged in /var/log/nginx/naxsi_error.log. For example:

2012/10/16 13:40:13 [error] 10556#0: *10293 NAXSI_FMT: ip=192.168.10.254&server=192.168.10.15&uri=/testphp.vulnweb.com/artists.php&total_processed=3195&total_blocked=2&zone0=ARGS&id0=1000&var_name0=artist, client: 192.168.10.254, server: , request: "GET /testphp.vulnweb.com/artists.php?artist=0+div+1+union%23foo*%2F*bar%0D%0Aselect%23foo%0D%0A1%2C2%2Ccurrent_user HTTP/1.1", host: "192.168.10.15:81"

A Naxsi log line is less obvious than a modsecurity one. The rule which matched is provided by the argument idX=abcde.
There was no false positive during the test; I had to build a request to make Naxsi match it 🙂 .
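
Since the NAXSI_FMT part of the line is formatted like a query string, it can be split into fields with a few lines of Python. This is a minimal sketch based only on the sample line above (its request field is shortened here); the function name parse_naxsi_fmt is hypothetical and other log layouts are not handled.

```python
from urllib.parse import parse_qs


def parse_naxsi_fmt(line: str) -> dict:
    """Extract the key=value pairs that follow 'NAXSI_FMT: ' in a log line."""
    payload = line.split("NAXSI_FMT: ", 1)[1].split(", client:", 1)[0]
    return {key: values[0] for key, values in parse_qs(payload).items()}


line = (
    "2012/10/16 13:40:13 [error] 10556#0: *10293 NAXSI_FMT: "
    "ip=192.168.10.254&server=192.168.10.15&uri=/testphp.vulnweb.com/artists.php"
    "&total_processed=3195&total_blocked=2&zone0=ARGS&id0=1000&var_name0=artist, "
    'client: 192.168.10.254, server: , request: "GET /...", host: "192.168.10.15:81"'
)

fields = parse_naxsi_fmt(line)
print(fields["id0"], fields["zone0"], fields["var_name0"])
```

Here the matching rule is the one identified by id0, found in the zone and variable given by zone0 and var_name0.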

Conclusion


Today, we saw it’s easy to build a scalable and well-performing WAF platform in front of any web application.
The WAF is able to tell HAProxy which IPs to blacklist automatically (through error rate monitoring), which is convenient since the attacker won’t bother the WAF for a certain amount of time 😉
The platform can detect WAF farm availability and bypass the farm in case of total failure. We even saw it is possible to bypass the WAF for static content if the farm is running out of capacity. The purpose is to deliver a good end-user experience without sacrificing too much security.
Note that it is possible to route all the static content to the web servers (or a static farm) directly, whatever the status of the WAF farm.
This makes me say that the platform is fully scalable and flexible.
Thanks to HAProxy, the architecture is very flexible: I could switch from apache + modsecurity to nginx + naxsi with no issues at all 🙂 The same could be done with any third-party WAF appliance.
Note that I did not try any of the advanced Naxsi features, like learning mode and the UI.

Application Delivery Controller and ecommerce websites

Synopsis

Today, almost every ecommerce website uses a load-balancer or an application delivery controller (ADC) in front of it, in order to improve its availability and reliability.
In today’s article, I’ll explain how we can take advantage of an ADC’s layer 7 features to improve the performance of an ecommerce website and give the best experience to end users, in order to increase revenue.
The points on which we can work are:

  • Network optimization
  • Traffic regulation
  • Overusage protection
  • User “tagging” based on cart content
  • User “tagging” based on purchase phase
  • Blackout prevention
  • SEO optimization
  • Partner slowness protection

Note: the list is not exhaustive and the given examples will be very simple. My purpose is not to create a very complicated configuration but to give the reader clues on how he can take advantage of our product.

Note2: I won’t discuss static content; there is already an article with a lot of details about it on this blog.

As usual, the configuration examples below apply to our ALOHA ADC appliance, but should work as well on HAProxy 1.5.

Network optimization

Client-side network latency has a negative impact on websites: the slower the user’s connectivity, the longer the connection remains open on the web server while the client downloads the object. This can last much longer if the client and server use HTTP keep-alive.
Basically, this is what happens with basic layer 4 load-balancers like LVS or some other appliance vendors, where the TCP connection is established directly between the client and the server.
Since HAProxy works as an HTTP reverse-proxy, it breaks the TCP connection in two and enables TCP buffering between both connections. It means HAProxy reads the response at the speed of the server and delivers it at the speed of the client.
Slow clients with high latency no longer have any impact on application servers, because HAProxy “hides” them behind its own latency to the server.
Another good point is that you can enable HTTP keep-alive on the client side and disable it on the server side: it allows a client to re-use a connection to download several objects, with no impact on server resources.
TCP buffering does not require any configuration, while enabling client-side HTTP keep-alive is achieved by the line option http-server-close.
The configuration is pretty simple:

# default options
defaults
  option http-server-close
  mode http
  log 10.0.0.1 local2
  option httplog
  timeout connect 5s
  timeout client 20s
  timeout server 15s
  timeout check 1s
  timeout http-keep-alive 1s
  timeout http-request 10s  # slowloris protection
  default-server inter 3s fall 2 rise 2 slowstart 60s

# main frontend
frontend ft_web
  bind 10.0.0.3:80
  default_backend bk_appsrv

# application server farm
backend bk_appsrv
  balance roundrobin
  # app servers must say if everything is fine on their side and 
  # they are ready to process traffic
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 check
  server s2 10.0.1.102:80 cookie s2 check

Traffic Regulation


Any server has a maximum capacity. The more requests it handles, the slower it processes each of them. And if it has too many requests to process, it can even crash and obviously won’t be able to answer anybody!
HAProxy can regulate request streams to servers in order to prevent them from crashing or even slowing down. Note that, when well set up, it allows you to use your servers at their maximum capacity without ever being in trouble.
Basically, HAProxy is able to manage request queues.
You can configure traffic regulation with the fullconn parameter in the backend and with the minconn and maxconn parameters on the server line.
Let’s update our server lines above with a simple maxconn parameter:

  server s1 10.0.1.101:80 cookie s1 check maxconn 250
  server s2 10.0.1.102:80 cookie s2 check maxconn 250

Note: there would be many, many things to say about queueing and the HAProxy parameters cited above, but this is not the purpose of the current article.
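
As a rough illustration of what a per-server maxconn means for the farm, here is a small Python sketch. It only does the arithmetic (requests above the total capacity wait in the queue); the function name distribute is hypothetical and timeouts such as timeout queue are not modeled.

```python
def distribute(concurrent_requests: int, servers: int, maxconn: int):
    """Return (requests being processed, requests waiting in the queue)."""
    capacity = servers * maxconn
    active = min(concurrent_requests, capacity)
    return active, concurrent_requests - active


# 2 servers with maxconn 250, as in the server lines above
print(distribute(300, servers=2, maxconn=250))   # everything is processed
print(distribute(600, servers=2, maxconn=250))   # 100 requests are queued
```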

Over usage protection

By over usage, I mean that you want to be able to handle an unexpected flow of users and to classify users into 2 categories:

  1. Those who have already been identified by the website and are using it
  2. Those who have just arrived and want to use it

The difference between both types of users can be made through the ecommerce CMS cookie: identified users own a cookie while brand new users don’t.
If you know your server farm has the capacity to manage 10000 users, then you don’t want to allow more than this number until you expand the farm.
Here is the configuration to protect against over-usage (the application cookie is “MYCOOK”):

# default options
defaults
  option http-server-close
  mode http
  log 10.0.0.2 local2
  option httplog
  timeout connect 5s
  timeout client 20s
  timeout server 15s
  timeout check 1s
  timeout http-keep-alive 1s
  timeout http-request 10s  # slowloris protection
  default-server inter 3s fall 2 rise 2 slowstart 60s

# main frontend
frontend ft_web
  bind 10.0.0.3:80
  # update the number below to the number of people you want to allow
  acl maxcapacity table_cnt(bk_appsrv) ge 10000
  acl knownuser hdr_sub(Cookie) MYCOOK
  # route any unknown user to the sorry page if we reached the maximum number
  # of allowed users and the request does not have a cookie
  use_backend bk_sorrypage if maxcapacity !knownuser
  default_backend bk_appsrv

# appsrv backend for dynamic content
backend bk_appsrv
  balance roundrobin
  # define a stick-table with at most 10K entries
  # cookie value would be cleared from the table if not used for 10 min
  stick-table type string len 32 size 10K expire 10m nopurge
  stick store-response set-cookie(MYCOOK)
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK)
  # app servers must say if everything is fine on their side and 
  # they are ready to process traffic
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 check maxconn 250
  server s2 10.0.1.102:80 cookie s2 check maxconn 250

# sorry page management
backend bk_sorrypage
  balance roundrobin
  server s1 10.0.1.103:80 check maxconn 1000
  server s2 10.0.1.104:80 check maxconn 1000

User tagging based on cart content

When your architecture has enough capacity, you don’t need to classify users. But imagine your platform runs out of capacity: you want to be able to reserve resources for users who have no article in the cart. That way, the website looks very fast to them and, hopefully, these users will buy some articles.
Just configure your ecommerce application to set a cookie with some information about the cart: the number of articles, the total value, etc…
In the example below, we’ll consider that the application creates a cookie named CART with the number of articles as its value.
Based on the information provided by this cookie, we’ll take a routing decision and choose between farms with different capacities.

# default options
defaults
  option http-server-close
  mode http
  log 10.0.0.2 local2
  option httplog
  timeout connect 5s
  timeout client 20s
  timeout server 15s
  timeout check 1s
  timeout http-keep-alive 1s
  timeout http-request 10s  # slowloris protection
  default-server inter 3s fall 2 rise 2 slowstart 60s

# main frontend
frontend ft_web
  bind 10.0.0.3:80
  # update the number below to the number of people you want to allow
  acl maxcapacity table_cnt(bk_appsrv) ge 10000
  acl knownuser hdr_sub(Cookie) MYCOOK
  acl empty_cart hdr_sub(Cookie) CART=0
  # route any unknown user to the sorry page if we reached the maximum number
  # of allowed users and the request does not own a cookie
  use_backend bk_sorrypage if maxcapacity !knownuser
  # Once the user has something in the cart, move them to a farm with less resources
  # only when there are too many users on the website
  use_backend bk_appsrv if maxcapacity !empty_cart 
  default_backend bk_appsrv_empty_cart

# Default farm when everything goes well
backend bk_appsrv_empty_cart
  balance roundrobin
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK) table bk_appsrv
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK) table bk_appsrv
  # app servers must say if everything is fine on their side
  # and they can process requests
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 check maxconn 200
  server s2 10.0.1.102:80 cookie s2 check maxconn 200

# Reserve resources for the few users which have something in their cart
backend bk_appsrv
  balance roundrobin
  # define a stick-table with at most 10K entries
  # cookie would be cleared from the table if not used for 10 min
  stick-table type string len 32 size 10K expire 10m nopurge
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK)
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK)
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 track bk_appsrv_empty_cart/s1 maxconn 50
  server s2 10.0.1.102:80 cookie s2 track bk_appsrv_empty_cart/s2 maxconn 50

backend bk_sorrypage
  balance roundrobin
  server s1 10.0.1.103:80 check maxconn 1000
  server s2 10.0.1.104:80 check maxconn 1000

User tagging based on purchase phase

The synopsis of this chapter is the same as the previous one: being able to classify users and to reserve resources.
But this time, we’ll identify users based on the phase they are in. Basically, we’ll consider two phases:

  1. browsing phase, when people add articles to the cart
  2. purchasing phase, when people have finished filling up the cart and start providing billing, delivery and payment information

In order to classify users, we’ll use the URL path. It starts with /purchase/ when the user is in the purchasing phase. Any other URL is considered browsing.
Based on the information provided by the requested URL, we’ll take a routing decision and choose between farms with different capacities.

# defaults options
defaults
  option http-server-close
  mode http
  log 10.0.0.2 local2
  option httplog
  timeout connect 5s
  timeout client 20s
  timeout server 15s
  timeout check 1s
  timeout http-keep-alive 1s
  timeout http-request 10s  # slowloris protection
  default-server inter 3s fall 2 rise 2 slowstart 60s

# main frontend
frontend ft_web
  bind 10.0.0.3:80
  # update the number below to the number of people you want to allow
  acl maxcapacity table_cnt(bk_appsrv) ge 10000
  acl knownuser hdr_sub(Cookie) MYCOOK
  acl purchase_phase path_beg /purchase/
  # route any unknown user to the sorry page if we reached the maximum number
  # of allowed users and the request does not own a cookie
  use_backend bk_sorrypage if maxcapacity !knownuser
  # Once the user is in the purchase phase, move it to a farm with less resources
  # only when there are too many users on the website
  use_backend bk_appsrv if maxcapacity purchase_phase 
  default_backend bk_appsrv_browse

# Default farm when everything goes well
backend bk_appsrv_browse
  balance roundrobin
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK) table bk_appsrv
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK) table bk_appsrv
  # app servers must say if everything is fine on their side
  # and they can process requests
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 check maxconn 200
  server s2 10.0.1.102:80 cookie s2 check maxconn 200

# Reserve resources for the few users in the purchase phase
backend bk_appsrv
  balance roundrobin
  # define a stick-table with at most 10K entries
  # cookie would be cleared from the table if not used for 10 min
  stick-table type string len 32 size 10K expire 10m nopurge
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK)
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK)
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 track bk_appsrv_browse/s1 maxconn 50
  server s2 10.0.1.102:80 cookie s2 track bk_appsrv_browse/s2 maxconn 50

backend bk_sorrypage
  balance roundrobin
  server s1 10.0.1.103:80 check maxconn 1000
  server s2 10.0.1.104:80 check maxconn 1000

Blackout prevention

A website blackout is the worst thing that could happen: something has crashed and the application no longer works, or none of the servers is reachable.
When this occurs, users commonly get 503 errors or a blank page after 30 seconds.
In both cases, end users are left with a negative impression of the website. At the very least, an apology page with an estimated recovery time would be appreciated. HAProxy can still communicate with end users even when none of the servers are available.
The configuration below shows how to do it:

# defaults options
defaults
  option http-server-close
  mode http
  log 10.0.0.2 local2
  option httplog
  timeout connect 5s
  timeout client 20s
  timeout server 15s
  timeout check 1s
  timeout http-keep-alive 1s
  timeout http-request 10s  # slowloris protection
  default-server inter 3s fall 2 rise 2 slowstart 60s

# main frontend
frontend ft_web
  bind 10.0.0.3:80
  # update the number below to the number of people you want to allow
  acl maxcapacity table_cnt(bk_appsrv) ge 10000
  acl knownuser hdr_sub(Cookie) MYCOOK
  acl purchase_phase path_beg /purchase/
  acl no_appsrv nbsrv(bk_appsrv_browse) eq 0
  acl no_sorrysrv nbsrv(bk_sorrypage) eq 0
  # worst case management
  use_backend bk_worst_case_management if no_appsrv no_sorrysrv
  # use sorry servers if available
  use_backend bk_sorrypage if no_appsrv !no_sorrysrv
  # route any unknown user to the sorry page if we reached the maximum number
  # of allowed users and the request does not carry a cookie
  use_backend bk_sorrypage if maxcapacity !knownuser
  # Once users are in the purchase phase, move them to a farm with fewer resources
  # only when there are too many users on the website
  use_backend bk_appsrv if maxcapacity purchase_phase
  default_backend bk_appsrv_browse

# Default farm when everything goes well
backend bk_appsrv_browse
  balance roundrobin
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK) table bk_appsrv
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK) table bk_appsrv
  # app servers must say if everything is fine on their side
  # and they can process requests
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 check maxconn 200
  server s2 10.0.1.102:80 cookie s2 check maxconn 200

# Reserve resources for the few users in the purchase phase
backend bk_appsrv
  balance roundrobin
  # define a stick-table with at most 10K entries
  # cookie would be cleared from the table if not used for 10 min
  stick-table type string len 32 size 10K expire 10m nopurge
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK)
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK)
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 track bk_appsrv_browse/s1 maxconn 50
  server s2 10.0.1.102:80 cookie s2 track bk_appsrv_browse/s2 maxconn 50

backend bk_sorrypage
  balance roundrobin
  server s1 10.0.1.103:80 check maxconn 1000
  server s2 10.0.1.104:80 check maxconn 1000

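backend bk_worst_case_management
  # no server is declared in this backend, so HAProxy answers by itself
  # with a 503, which is replaced by the content of the file below
  errorfile 503 /etc/haproxy/errors/503.txt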

And the content of the file /etc/haproxy/errors/503.txt could look like:

HTTP/1.0 200 OK
Cache-Control: no-cache
Connection: close
Content-Type: text/html
Content-Length: 246

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Maintenance</title>
</head>
<body>
<h1>Maintenance</h1>
We're sorry, ecommerce.com is currently under maintenance and will come back soon.
</body>
</html>

SEO optimization

Most search engines now take page response time into account.
The configuration below routes search engine bots to a dedicated server; if that server is not available, their requests are forwarded to the application farm instead. Bots are identified by their User-Agent header.

# defaults options
defaults
  option http-server-close
  mode http
  log 10.0.0.2 local2
  option httplog
  timeout connect 5s
  timeout client 20s
  timeout server 15s
  timeout check 1s
  timeout http-keep-alive 1s
  timeout http-request 10s  # slowloris protection
  default-server inter 3s fall 2 rise 2 slowstart 60s

# main frontend
frontend ft_web
  bind 10.0.0.3:80
  # update the number below to the number of people you want to allow
  acl maxcapacity table_cnt(bk_appsrv) ge 10000
  acl knownuser hdr_sub(Cookie) MYCOOK
  acl purchase_phase path_beg /purchase/
  acl bot hdr_sub(User-Agent) -i googlebot bingbot slurp
  acl no_appsrv nbsrv(bk_appsrv_browse) eq 0
  acl no_sorrysrv nbsrv(bk_sorrypage) eq 0
  acl no_seosrv nbsrv(bk_seo) eq 0
  # worst case management
  use_backend bk_worst_case_management if no_appsrv no_sorrysrv
  # use sorry servers if available
  use_backend bk_sorrypage if no_appsrv !no_sorrysrv
  # redirect bots
  use_backend bk_seo if bot !no_seosrv
  use_backend bk_appsrv if bot no_seosrv
  # route any unknown user to the sorry page if we reached the maximum number
  # of allowed users and the request does not carry a cookie
  use_backend bk_sorrypage if maxcapacity !knownuser
  # Once users are in the purchase phase, move them to a farm with fewer resources
  # only when there are too many users on the website
  use_backend bk_appsrv if maxcapacity purchase_phase
  default_backend bk_appsrv_browse

# Default farm when everything goes well
backend bk_appsrv_browse
  balance roundrobin
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK) table bk_appsrv
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK) table bk_appsrv
  # app servers must say if everything is fine on their side
  # and they can process requests
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 check maxconn 200
  server s2 10.0.1.102:80 cookie s2 check maxconn 200

# Reserve resources for the few users in the purchase phase
backend bk_appsrv
  balance roundrobin
  # define a stick-table with at most 10K entries
  # cookie would be cleared from the table if not used for 10 min
  stick-table type string len 32 size 10K expire 10m nopurge
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK)
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK)
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 track bk_appsrv_browse/s1 maxconn 50
  server s2 10.0.1.102:80 cookie s2 track bk_appsrv_browse/s2 maxconn 50

# Reserve resources for search engine bots
backend bk_seo
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  server s3 10.0.1.103:80 check

backend bk_sorrypage
  balance roundrobin
  server s1 10.0.1.103:80 check maxconn 1000
  server s2 10.0.1.104:80 check maxconn 1000

backend bk_worst_case_management
  errorfile 503 /etc/haproxy/errors/503.txt
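
To check that bots really land on bk_seo, one option is to record the User-Agent header in the logs. This is a sketch rather than part of the configuration above, and the length of 64 characters is an arbitrary assumption. With option httplog, each log line already contains the name of the backend that served the request, and captured request headers appear between braces.

```
frontend ft_web
  # fragment: add this line to the frontend above
  # log the first 64 characters of the User-Agent header (arbitrary length)
  capture request header User-Agent len 64
```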

Partner slowness protection

Some ecommerce websites rely on partners for some products or services. Unfortunately, if the partner’s webservice application slows down, then our own application will slow down too. Even worse, we may see sessions piling up and servers crashing due to lack of resources…
In order to prevent this, just configure your appserver to pass through HAProxy to reach your partners’ webservices. HAProxy can shut a session down if a partner is too slow to answer. If the partner complains that you don’t send them enough deals, just tell them to improve their platform, maybe using an ADC like HAProxy / ALOHA Load-Balancer 😉

frontend ft_partner1
  bind 10.0.0.3:8001
  use_backend bk_partner1

backend bk_partner1
  # the partner has 2 seconds to answer each request
  timeout server 2s
  # you can add a maxconn here if you're not supposed to open 
  # too many connections on the partner application
  server partner1 1.2.3.4:80 check
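
The maxconn mentioned in the comments above could look like the sketch below. The limit of 20 connections and the 1 second queue timeout are assumed values, to be agreed with the partner rather than copied as-is. In HTTP mode, a request that exceeds timeout server gets a 504 from HAProxy, and a request that waits in the queue longer than timeout queue gets a 503, so the appserver should be prepared to handle both.

```
backend bk_partner1
  # the partner has 2 seconds to answer each request
  timeout server 2s
  # assumed value: do not wait more than 1 second for a free connection slot
  timeout queue 1s
  # assumed value: never open more than 20 concurrent connections to the partner
  server partner1 1.2.3.4:80 check maxconn 20
```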
