Tag Archives: content switching

Application Delivery Controller and ecommerce websites

Synopsis

Today, almost any ecommerce website uses a load-balancer or an application delivery controller in front of it, in order to improve its availability and reliability.
In today’s article, I’ll explain how we can take advantage of an ADC’s layer 7 features to improve an ecommerce website’s performance and give end users the best possible experience, in order to increase revenue.
The points on which we can work are:

  • Network optimization
  • Traffic regulation
  • Over-usage protection
  • User “tagging” based on cart content
  • User “tagging” based on purchase phase
  • Blackout prevention
  • SEO optimization
  • Partner slowness protection

Note: the list is not exhaustive and the given examples are very simple. My purpose is not to create a very complicated configuration, but to give the reader clues on how to take advantage of our products.


Note2: I won’t discuss static content; there is already an article with a lot of details about it on this blog.


As usual, the configuration examples below apply to our ALOHA ADC appliance, but should work as well on HAProxy 1.5.

Network optimization

Client-side network latency has a negative impact on websites: the slower the user’s connectivity, the longer the connection remains open on the web server while the client downloads the object. This lasts even longer when the client and server use HTTP keep-alives.
Basically, this is what happens with basic layer 4 load-balancers like LVS or some other appliance vendors, where the TCP connection is established directly between the client and the server.
Since HAProxy works as an HTTP reverse proxy, it breaks the TCP connection and enables TCP buffering between both connections: HAProxy reads the response at the speed of the server and delivers it at the speed of the client.
Slow clients with high latency no longer have any impact on the application servers, because HAProxy “hides” them behind its own low latency to the server.
Another good point is that you can enable HTTP keep-alives on the client side and disable them on the server side: a client can re-use a connection to download several objects, with no impact on server resources.
TCP buffering does not require any configuration, while client-side-only HTTP keep-alive is enabled by the line option http-server-close.
The configuration is pretty simple:

# default options
defaults
  option http-server-close
  mode http
  log 10.0.0.1 local2
  option httplog
  timeout connect 5s
  timeout client 20s
  timeout server 15s
  timeout check 1s
  timeout http-keep-alive 1s
  timeout http-request 10s  # slowloris protection
  default-server inter 3s fall 2 rise 2 slowstart 60s

# main frontend
frontend ft_web
  bind 10.0.0.3:80
  default_backend bk_appsrv

# application server farm
backend bk_appsrv
  balance roundrobin
  # app servers must say if everything is fine on their side and 
  # they are ready to process traffic
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 check
  server s2 10.0.1.102:80 cookie s2 check

Traffic Regulation


Any server has a maximum capacity. The more requests it handles, the slower it processes each of them. And if it has too many requests to process, it can even crash and then obviously won’t be able to answer anybody!
HAProxy can regulate the request streams sent to the servers in order to prevent them from crashing or even slowing down. Note that, when well set up, this allows you to use your servers at their maximum capacity without ever getting into trouble.
Basically, HAProxy is able to manage request queues.
You can configure traffic regulation with fullconn and maxconn parameters in the backend and with minconn and maxconn parameters on the server line description.
Let’s update our server line description above with a simple maxconn parameter:

  server s1 10.0.1.101:80 cookie s1 check maxconn 250
  server s2 10.0.1.102:80 cookie s2 check maxconn 250

Note: there would be many things to say about queueing and the HAProxy parameters cited above, but this is not the purpose of the current article.
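
Still, here is a slightly more complete sketch of these queueing parameters (the values are purely illustrative): each server accepts from minconn up to maxconn concurrent connections, the effective limit growing with the backend load measured against fullconn, while exceeding requests wait in the queue for at most the queue timeout.

# illustrative sketch only: queueing parameters (values are examples)
backend bk_appsrv
  balance roundrobin
  # a queued request waits at most 30s before HAProxy gives up
  timeout queue 30s
  # per-server limit grows from minconn to maxconn as the total
  # backend load approaches fullconn
  fullconn 500
  server s1 10.0.1.101:80 cookie s1 check minconn 100 maxconn 250
  server s2 10.0.1.102:80 cookie s2 check minconn 100 maxconn 250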

Over-usage protection

By over-usage, I mean that you want to be able to handle an unexpected flow of users, classifying them into two categories:

  1. Those who have already been identified by the website and are using it
  2. Those who have just arrived and want to use it

The difference between both types of users can be made through the ecommerce CMS cookie: identified users own a cookie while brand new users don’t.
If you know your server farm has the capacity to manage 10000 users, then you don’t want to allow more than this number until you expand the farm.
Here is the configuration to protect against over-usage (the application cookie is “MYCOOK”):

# default options
defaults
  option http-server-close
  mode http
  log 10.0.0.2 local2
  option httplog
  timeout connect 5s
  timeout client 20s
  timeout server 15s
  timeout check 1s
  timeout http-keep-alive 1s
  timeout http-request 10s  # slowloris protection
  default-server inter 3s fall 2 rise 2 slowstart 60s

# main frontend
frontend ft_web
  bind 10.0.0.3:80
  # update the number below to the number of people you want to allow
  acl maxcapacity table_cnt(bk_appsrv) ge 10000
  acl knownuser hdr_sub(Cookie) MYCOOK
  # route any unknown user to the sorry page if we reached the maximum number
  # of allowed users and the request does not have a cookie
  use_backend bk_sorrypage if maxcapacity !knownuser
  default_backend bk_appsrv

# appsrv backend for dynamic content
backend bk_appsrv
  balance roundrobin
  # define a stick-table with at most 10K entries
  # the cookie value is cleared from the table if not used for 10 minutes
  stick-table type string len 32 size 10K expire 10m nopurge
  stick store-response set-cookie(MYCOOK)
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK)
  # app servers must say if everything is fine on their side and 
  # they are ready to process traffic
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 check maxconn 250
  server s2 10.0.1.102:80 cookie s2 check maxconn 250

# sorry page management
backend bk_sorrypage
  balance roundrobin
  server s1 10.0.1.103:80 check maxconn 1000
  server s2 10.0.1.104:80 check maxconn 1000
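
By the way, you can check how many visitors are currently accounted in the stick table through HAProxy’s stats socket, assuming it is enabled in your global section (on the ALOHA, it is available at /var/run/haproxy.stats):

echo "show table bk_appsrv" | socat stdio /var/run/haproxy.stats

The size and used fields in the output tell you how close you are to the configured capacity.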

User tagging based on cart content

When your architecture has enough capacity, you don’t need to classify users. But if your platform runs out of capacity, you want to be able to reserve resources for users who have no articles in their cart: that way the website feels very fast to them, and hopefully these users will buy some articles.
Just configure your ecommerce application to set a cookie with some information about the cart: the number of articles, the total value, etc…
In the example below, we’ll consider that the application creates a cookie named CART whose value is the number of articles.
Based on the information provided by this cookie, we’ll take routing decisions and choose different farms with different capacities.

# default options
defaults
  option http-server-close
  mode http
  log 10.0.0.2 local2
  option httplog
  timeout connect 5s
  timeout client 20s
  timeout server 15s
  timeout check 1s
  timeout http-keep-alive 1s
  timeout http-request 10s  # slowloris protection
  default-server inter 3s fall 2 rise 2 slowstart 60s

# main frontend
frontend ft_web
  bind 10.0.0.3:80
  # update the number below to the number of people you want to allow
  acl maxcapacity table_cnt(bk_appsrv) ge 10000
  acl knownuser hdr_sub(Cookie) MYCOOK
  acl empty_cart hdr_sub(Cookie) CART=0
  # route any unknown user to the sorry page if we reached the maximum number
  # of allowed users and the request does not have a cookie
  use_backend bk_sorrypage if maxcapacity !knownuser
  # Once the user has something in the cart, move them to a farm with fewer resources
  # only when there are too many users on the website
  use_backend bk_appsrv if maxcapacity !empty_cart 
  default_backend bk_appsrv_empty_cart

# Default farm when everything goes well
backend bk_appsrv_empty_cart
  balance roundrobin
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK) table bk_appsrv
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK) table bk_appsrv
  # app servers must say if everything is fine on their side
  # and they can process requests
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 check maxconn 200
  server s2 10.0.1.102:80 cookie s2 check maxconn 200

# Reserve resources for the few users who have something in their cart
backend bk_appsrv
  balance roundrobin
  # define a stick-table with at most 10K entries
  # the cookie is cleared from the table if not used for 10 minutes
  stick-table type string len 32 size 10K expire 10m nopurge
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK)
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK)
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 track bk_appsrv_empty_cart/s1 maxconn 50
  server s2 10.0.1.102:80 cookie s2 track bk_appsrv_empty_cart/s2 maxconn 50

backend bk_sorrypage
  balance roundrobin
  server s1 10.0.1.103:80 check maxconn 1000
  server s2 10.0.1.104:80 check maxconn 1000

User tagging based on purchase phase

The synopsis of this chapter is the same as the previous one: being able to classify users and to reserve resources.
But this time, we’ll identify users based on the phase they are in. Basically, we’ll consider two phases:

  1. browsing phase, when people add articles in the cart
  2. purchasing phase, when people have finished filling up the cart and start providing billing, delivery and payment information

In order to classify users, we’ll use the URL path: it starts with /purchase/ when the user is in the purchasing phase. Any other URL is considered as browsing.
Based on the information provided by the requested URL, we’ll take routing decisions and choose different farms with different capacities.

# defaults options
defaults
  option http-server-close
  mode http
  log 10.0.0.2 local2
  option httplog
  timeout connect 5s
  timeout client 20s
  timeout server 15s
  timeout check 1s
  timeout http-keep-alive 1s
  timeout http-request 10s  # slowloris protection
  default-server inter 3s fall 2 rise 2 slowstart 60s

# main frontend
frontend ft_web
  bind 10.0.0.3:80
  # update the number below to the number of people you want to allow
  acl maxcapacity table_cnt(bk_appsrv) ge 10000
  acl knownuser hdr_sub(Cookie) MYCOOK
  acl purchase_phase path_beg /purchase/
  # route any unknown user to the sorry page if we reached the maximum number
  # of allowed users and the request does not have a cookie
  use_backend bk_sorrypage if maxcapacity !knownuser
  # Once the user is in the purchase phase, move them to a farm with fewer resources
  # only when there are too many users on the website
  use_backend bk_appsrv if maxcapacity purchase_phase 
  default_backend bk_appsrv_browse

# Default farm when everything goes well
backend bk_appsrv_browse
  balance roundrobin
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK) table bk_appsrv
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK) table bk_appsrv
  # app servers must say if everything is fine on their side
  # and they can process requests
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 check maxconn 200
  server s2 10.0.1.102:80 cookie s2 check maxconn 200

# Reserve resources for the few users in the purchase phase
backend bk_appsrv
  balance roundrobin
  # define a stick-table with at most 10K entries
  # the cookie is cleared from the table if not used for 10 minutes
  stick-table type string len 32 size 10K expire 10m nopurge
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK)
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK)
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 track bk_appsrv_browse/s1 maxconn 50
  server s2 10.0.1.102:80 cookie s2 track bk_appsrv_browse/s2 maxconn 50

backend bk_sorrypage
  balance roundrobin
  server s1 10.0.1.103:80 check maxconn 1000
  server s2 10.0.1.104:80 check maxconn 1000

Blackout prevention

A website blackout is the worst thing that can happen: something has crashed and the application does not work anymore, or none of the servers are reachable.
When such a thing occurs, it is common to get 503 errors or a blank page after 30 seconds.
In both cases, end users get a negative feeling about the website; at the very least, an excuse page with an estimated recovery date would be appreciated. HAProxy makes it possible to communicate with end users even when none of the servers are available.
The configuration below shows how to do it:

# defaults options
defaults
  option http-server-close
  mode http
  log 10.0.0.2 local2
  option httplog
  timeout connect 5s
  timeout client 20s
  timeout server 15s
  timeout check 1s
  timeout http-keep-alive 1s
  timeout http-request 10s  # slowloris protection
  default-server inter 3s fall 2 rise 2 slowstart 60s

# main frontend
frontend ft_web
  bind 10.0.0.3:80
  # update the number below to the number of people you want to allow
  acl maxcapacity table_cnt(bk_appsrv) ge 10000
  acl knownuser hdr_sub(Cookie) MYCOOK
  acl purchase_phase path_beg /purchase/
  acl no_appsrv nbsrv(bk_appsrv_browse) eq 0
  acl no_sorrysrv nbsrv(bk_sorrypage) eq 0
  # worst case management
  use_backend bk_worst_case_management if no_appsrv no_sorrysrv
  # use sorry servers if available
  use_backend bk_sorrypage if no_appsrv !no_sorrysrv
  # route any unknown user to the sorry page if we reached the maximum number
  # of allowed users and the request does not have a cookie
  use_backend bk_sorrypage if maxcapacity !knownuser
  # Once the user is in the purchase phase, move them to a farm with fewer resources
  # only when there are too many users on the website
  use_backend bk_appsrv if maxcapacity purchase_phase 
  default_backend bk_appsrv_browse

# Default farm when everything goes well
backend bk_appsrv_browse
  balance roundrobin
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK) table bk_appsrv
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK) table bk_appsrv
  # app servers must say if everything is fine on their side
  # and they can process requests
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 check maxconn 200
  server s2 10.0.1.102:80 cookie s2 check maxconn 200

# Reserve resources for the few users in the purchase phase
backend bk_appsrv
  balance roundrobin
  # define a stick-table with at most 10K entries
  # the cookie is cleared from the table if not used for 10 minutes
  stick-table type string len 32 size 10K expire 10m nopurge
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK)
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK)
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 track bk_appsrv_browse/s1 maxconn 50
  server s2 10.0.1.102:80 cookie s2 track bk_appsrv_browse/s2 maxconn 50

backend bk_sorrypage
  balance roundrobin
  server s1 10.0.1.103:80 check maxconn 1000
  server s2 10.0.1.104:80 check maxconn 1000

backend bk_worst_case_management
  errorfile 503 /etc/haproxy/errors/503.txt

And the content of the file /etc/haproxy/errors/503.txt could look like:

HTTP/1.0 200 OK
Cache-Control: no-cache
Connection: close
Content-Type: text/html
Content-Length: 246

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Maintenance</title>
</head>
<body>
<h1>Maintenance</h1>
We're sorry, ecommerce.com is currently under maintenance and will come back soon.
</body>
</html>

SEO optimization

Most search engines now take page response time into account.
The configuration below routes search engine bots to a dedicated server; if that server is not available, the request is forwarded to the default farm. Bots are identified by their User-Agent header.

# defaults options
defaults
  option http-server-close
  mode http
  log 10.0.0.2 local2
  option httplog
  timeout connect 5s
  timeout client 20s
  timeout server 15s
  timeout check 1s
  timeout http-keep-alive 1s
  timeout http-request 10s  # slowloris protection
  default-server inter 3s fall 2 rise 2 slowstart 60s

# main frontend
frontend ft_web
  bind 10.0.0.3:80
  # update the number below to the number of people you want to allow
  acl maxcapacity table_cnt(bk_appsrv) ge 10000
  acl knownuser hdr_sub(Cookie) MYCOOK
  acl purchase_phase path_beg /purchase/
  acl bot hdr_sub(User-Agent) -i googlebot bingbot slurp
  acl no_appsrv nbsrv(bk_appsrv_browse) eq 0
  acl no_sorrysrv nbsrv(bk_sorrypage) eq 0
  acl no_seosrv nbsrv(bk_seo) eq 0
  # worst case management
  use_backend bk_worst_case_management if no_appsrv no_sorrysrv
  # use sorry servers if available
  use_backend bk_sorrypage if no_appsrv !no_sorrysrv
  # redirect bots
  use_backend bk_seo if bot !no_seosrv
  use_backend bk_appsrv if bot no_seosrv
  # route any unknown user to the sorry page if we reached the maximum number
  # of allowed users and the request does not have a cookie
  use_backend bk_sorrypage if maxcapacity !knownuser
  # Once the user is in the purchase phase, move them to a farm with fewer resources
  # only when there are too many users on the website
  use_backend bk_appsrv if maxcapacity purchase_phase 
  default_backend bk_appsrv_browse

# Default farm when everything goes well
backend bk_appsrv_browse
  balance roundrobin
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK) table bk_appsrv
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK) table bk_appsrv
  # app servers must say if everything is fine on their side
  # and they can process requests
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 check maxconn 200
  server s2 10.0.1.102:80 cookie s2 check maxconn 200

# Reserve resources for the few users in the purchase phase
backend bk_appsrv
  balance roundrobin
  # define a stick-table with at most 10K entries
  # the cookie is cleared from the table if not used for 10 minutes
  stick-table type string len 32 size 10K expire 10m nopurge
  # create the entry in the table when the server generates the cookie
  stick store-response set-cookie(MYCOOK)
  # Reset the TTL in the stick table each time a request comes in
  stick store-request cookie(MYCOOK)
  cookie SERVERID insert indirect nocache
  server s1 10.0.1.101:80 cookie s1 track bk_appsrv_browse/s1 maxconn 50
  server s2 10.0.1.102:80 cookie s2 track bk_appsrv_browse/s2 maxconn 50

# Reserve resources for search engine bots
backend bk_seo
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  server s3 10.0.1.103:80 check

backend bk_sorrypage
  balance roundrobin
  server s1 10.0.1.103:80 check maxconn 1000
  server s2 10.0.1.104:80 check maxconn 1000

backend bk_worst_case_management
  errorfile 503 /etc/haproxy/errors/503.txt

Partner slowness protection

Some ecommerce websites rely on partners for some products or services. Unfortunately, if a partner’s webservice slows down, then our own application slows down too. Even worse, we may see sessions piling up and servers crashing due to lack of resources…
In order to prevent this, just configure your appservers to reach your partners’ webservices through HAProxy. HAProxy can shut down a session if a partner is too slow to answer. If the partner complains that you don’t send them enough deals, just tell them to improve their platform, maybe using an ADC like HAProxy / the ALOHA Load-Balancer 😉

frontend ft_partner1
  bind 10.0.0.3:8001
  use_backend bk_partner1

backend bk_partner1
  # the partner has 2 seconds to answer each request
  timeout server 2s
  # you can add a maxconn here if you're not supposed to open 
  # too many connections on the partner application
  server partner1 1.2.3.4:80 check
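
Then simply configure your application servers to send their partner1 webservice calls to HAProxy (10.0.0.3:8001 in this example) instead of calling the partner directly.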


HAProxy, Varnish and the single hostname website

As explained in a previous article, HAProxy and Varnish are two great open source products which aim to improve the performance, resilience and scalability of web applications.
We also saw that these two pieces of software are not competitors: they can work well together, each one bringing the other its own features, making any web infrastructure more agile and robust at the same time.

In the current article, I’m going to explain how to use both of them on a web application hosted on a single domain name.

Main advantages of each product


As a reminder, here are the main features of each product.

HAProxy


HAProxy’s main features:

  • Real load-balancer with smart persistence
  • Request queueing
  • Transparent proxy

Varnish


Varnish’s main features:

  • Cache server with stale content delivery
  • Content compression
  • Edge Side Includes

Common features


HAProxy and Varnish both have the features below:

  • Content switching
  • URL rewriting
  • DDOS protection

So if we need any of them, we could use either HAProxy or Varnish.

Why a single domain

In a web application, there are two types of content: static and dynamic.

By dynamic, I mean content which is generated on the fly and which is dedicated to a single user based on their current browsing of the application. Anything which is not in this category can be considered static, even a page which is generated by PHP and whose content changes every minute or every few seconds (like the CMS WordPress or Drupal). I call these pages “pseudo-static”.

The biggest strength of Varnish is that it can cache static objects, delivering them on behalf of the server, offloading most of the traffic from the server.



An object is identified by a Host header and its URL. When you have a single domain name, you have a single Host header for all your requests: static, pseudo static or dynamic.

You can’t split your traffic: every request must arrive at a single type of device: the LB, the cache, etc…

A good practice to split dynamic and static content is to use one domain name per type of object: www.domain.tld for dynamic content and static.domain.tld for static content. That way, you could forward dynamic traffic to the LB and static traffic directly to the caches.



Now, I guess you understand that the web application host naming can have an impact on the platform you’re going to build.

In the current article, I’ll only focus on applications using a single domain name. We’ll see how we can route traffic to the right product despite the limitation of the single domain name.



Don’t worry, I’ll write another article later about the fun we could have when building a platform for an application hosted on multiple domain names.

Available architectures

If we summarize the “web application” as a single brick called “APPSERVER”, we have two main architectures available:

  1. CLIENT ==> HAPROXY ==> VARNISH ==> APPSERVER
  2. CLIENT ==> VARNISH ==> HAPROXY ==> APPSERVER

Pros and cons of HAProxy in front of Varnish


Pros:

  • Use HAProxy‘s smart load-balancing algorithm such as uri, url_param to make varnish caching more efficient and improve the hit rate
  • Make the Varnish layer scalable, since load-balanced
  • Protect Varnish during its startup ramp-up (related to thread pool creation)
  • HAProxy can protect against DDOS and slowloris
  • Varnish can be used as a WAF

Cons:

  • no easy way to do application layer persistence
  • HAProxy’s queueing system can hardly protect the application hidden behind Varnish
  • The client IP must be forwarded in the X-Forwarded-For header (or any header you want)

Pros and cons of Varnish in front of HAProxy


Pros:

  • Smart layer 7 persistence with HAProxy
  • HAProxy layer scalable (with persistence preserved) since load-balanced by Varnish
  • APPSERVER protection through HAProxy request queueing
  • Varnish can be used as a WAF
  • HAProxy can use the client IP address (provided by Varnish in an HTTP header) to do transparent proxying (getting connected to APPSERVER with the client IP)

Cons:

  • HAProxy can’t protect against DDOS; Varnish has to do it
  • Cache size must be big enough to store all objects
  • Varnish layer not scalable

Finally, which is the best architecture??


There is no need to choose which of the two architectures above is the least bad for you.

It would be better to build a platform where there are no negative points.

The Architecture


The diagram below shows the architecture we’re going to work on.
[diagram: haproxy_varnish]
Legend:

  • H: HAProxy Load-Balancers (could be the ALOHA Load-Balancer or any home-made HAProxy setup)
  • V: Varnish servers
  • S: Web application servers, whatever product is used here (Tomcat, JBoss, etc…)
  • C: Client or end user

Main roles of each layers:

  • HAProxy: layer 7 traffic routing, first row of protection against DDOS (SYN flood, slowloris, etc…), application request flow optimization
  • Varnish: Caching, compression. Could be used later as a WAF to protect the application
  • Server: hosts the application and the static content
  • Client: browse and use the web application

traffic flow


Basically, the client sends all its requests to HAProxy; then HAProxy, based on the URL or file extension, takes a routing decision:

  • If the request looks to be for a (pseudo) static object, then forward it to Varnish
    If Varnish misses the object, it will use HAProxy to get the content from the server.
  • Send all the other requests to the appserver. If we’ve done our job properly, there should be only dynamic traffic here.

I don’t want to use Varnish as the default option in the flow, because dynamic content could get cached, which could lead to somebody’s personal information being sent to everybody.

Furthermore, in case of massive misses or requests purposely built to bypass the caches, I don’t want the servers to be hammered by Varnish, so HAProxy protects them with a tight traffic regulation between Varnish and the appservers.

Dynamic traffic flow


The diagram below shows how the request requiring dynamic content should be ideally routed through the platform:
[diagram: haproxy_varnish_dynamic_flow]
Legend:

  1. The client sends its request to HAProxy
  2. HAProxy chooses a server based on cookie persistence, or on the load-balancing algorithm if there is no cookie.
    The server processes the request and sends the response back to HAProxy, which forwards it to the client

Static traffic flow


The diagram below shows how the request requiring static content should be ideally routed through the platform:
[diagram: haproxy_varnish_static_flow]

  1. The client sends its request to HAProxy, which sees that it asks for static content
  2. HAProxy forwards the request to Varnish. If Varnish has the object in cache (a HIT), it sends it directly back to HAProxy
  3. If Varnish doesn’t have the object in cache, or if the cached copy has expired (a MISS), then Varnish forwards the request to HAProxy
  4. HAProxy randomly chooses a server; the response goes back to the client through Varnish

In case of a MISS, the flow looks heavy 🙂 I wanted to do it that way in order to use HAProxy’s traffic regulation features to prevent Varnish from flooding the servers. Furthermore, since Varnish sees only static content, its HIT rate is over 98%, so the overhead is very low and the protection is improved.

Pros of such architecture

  • Use smart load-balancing algorithm such as uri, url_param to make varnish caching more efficient and improve the hit rate
  • Make the Varnish layer scalable, since load-balanced
  • Startup protection for Varnish and APPSERVER, allowing server reboot or farm expansion even under heavy load
  • HAProxy can protect against DDOS and slowloris
  • Smart layer 7 persistence with HAProxy
  • APPSERVER protection through HAProxy request queueing
  • HAProxy can use the client IP address to do Transparent proxying (getting connected on APPSERVER with the client ip)
  • Cache farm failure detection and routing to application servers (worst case management)
  • Can load-balance any type of TCP based protocol hosted on APPSERVER

Cons of such architecture


To be totally fair, there are a few “non-blocking” issues:

  • HAProxy layer is hardly scalable (must use 2 crossed Virtual IPs declared in the DNS)
  • Varnish can’t be used as a WAF since it will see only static traffic passing through. This can be updated very easily

Configuration

HAProxy Configuration

# On Aloha, the global section is already setup for you
# and the haproxy stats socket is available at /var/run/haproxy.stats
global
  stats socket ./haproxy.stats level admin
  log 10.0.1.10 local3

# default options
defaults
  option http-server-close
  mode http
  log global
  option httplog
  timeout connect 5s
  timeout client 20s
  timeout server 15s
  timeout check 1s
  timeout http-keep-alive 1s
  timeout http-request 10s  # slowloris protection
  default-server inter 3s fall 2 rise 2 slowstart 60s

# HAProxy's stats
listen stats
  bind 10.0.1.3:8880
  stats enable
  stats hide-version
  stats uri     /
  stats realm   HAProxy Statistics
  stats auth    admin:admin

# main frontend dedicated to end users
frontend ft_web
  bind 10.0.0.3:80
  acl static_content path_end .jpg .gif .png .css .js .htm .html
  acl pseudo_static path_end .php
  acl dynamic_path path_beg /dynamic/
  acl image_php path_beg /images.php
  acl varnish_available nbsrv(bk_varnish_uri) ge 1
  # Caches health detection + routing decision
  use_backend bk_varnish_uri if varnish_available static_content
  use_backend bk_varnish_uri if varnish_available pseudo_static !dynamic_path
  use_backend bk_varnish_url_param if varnish_available image_php
  # dynamic content or all caches are unavailable
  default_backend bk_appsrv

# appsrv backend for dynamic content
backend bk_appsrv
  balance roundrobin
  # app servers must say if everything is fine on their side
  # and they can process requests
  option httpchk GET /appcheck
  http-check expect rstring [oO][kK]
  cookie SERVERID insert indirect nocache
  # Transparent proxying using the client IP from the TCP connection
  source 10.0.1.1 usesrc clientip
  server s1 10.0.1.101:80 cookie s1 check maxconn 250
  server s2 10.0.1.102:80 cookie s2 check maxconn 250

# static backend with balance based on the uri, including the query string
# to avoid caching an object on several caches
backend bk_varnish_uri
  balance uri # in latest HAProxy version, one can add 'whole' keyword
  # Varnish must tell it's ready to accept traffic
  option httpchk HEAD /varnishcheck
  http-check expect status 200
  # client IP information
  option forwardfor
  # avoid request redistribution when the number of caches changes (crash or start up)
  hash-type consistent
  server varnish1 10.0.1.201:80 check maxconn 1000
  server varnish2 10.0.1.202:80 check maxconn 1000

# cache backend with balance based on the value of the URL parameter called "id"
# to avoid caching an object on several caches
backend bk_varnish_url_param
  balance url_param id
  # client IP information
  option forwardfor
  # avoid request redistribution when the number of caches changes (crash or start up)
  hash-type consistent
  server varnish1 10.0.1.201:80 maxconn 1000 track bk_varnish_uri/varnish1
  server varnish2 10.0.1.202:80 maxconn 1000 track bk_varnish_uri/varnish2

# frontend used by Varnish servers when updating their cache
frontend ft_web_static
  bind 10.0.1.3:80
  monitor-uri /haproxycheck
  # Tells Varnish to stop asking for static content when all servers are dead;
  # Varnish will then deliver stale content
  monitor fail if nbsrv(bk_appsrv_static) eq 0
  default_backend bk_appsrv_static

# appsrv backend used by Varnish to update their cache
backend bk_appsrv_static
  balance roundrobin
  # anything other than a status code 200 on the URL /staticcheck.txt
  # must be considered as an error
  option httpchk HEAD /staticcheck.txt
  http-check expect status 200
  # Transparent proxying using the client IP provided by X-Forwarded-For header
  source 10.0.1.1 usesrc hdr_ip(X-Forwarded-For)
  server s1 10.0.1.101:80 check maxconn 50 slowstart 10s
  server s2 10.0.1.102:80 check maxconn 50 slowstart 10s

Varnish Configuration

backend bk_appsrv_static {
        .host = "10.0.1.3";
        .port = "80";
        .connect_timeout = 3s;
        .first_byte_timeout = 10s;
        .between_bytes_timeout = 5s;
        .probe = {
                .url = "/haproxycheck";
                .expected_response = 200;
                .timeout = 1s;
                .interval = 3s;
                .window = 2;
                .threshold = 2;
                .initial = 2;
        }
}

acl purge {
        "localhost";
}

sub vcl_recv {
### Default options

        # Health Checking
        if (req.url == "/varnishcheck") {
                error 751 "health check OK!";
        }

        # Set default backend
        set req.backend = bk_appsrv_static;

        # grace period (stale content delivery while revalidating)
        set req.grace = 30s;

        # Purge request
        if (req.request == "PURGE") {
                if (!client.ip ~ purge) {
                        error 405 "Not allowed.";
                }
                return (lookup);
        }

        # Accept-Encoding header clean-up
        if (req.http.Accept-Encoding) {
                # use gzip when possible, otherwise use deflate
                if (req.http.Accept-Encoding ~ "gzip") {
                        set req.http.Accept-Encoding = "gzip";
                } elsif (req.http.Accept-Encoding ~ "deflate") {
                        set req.http.Accept-Encoding = "deflate";
                } else {
                        # unknown algorithm, remove accept-encoding header
                        unset req.http.Accept-Encoding;
                }

                # Microsoft Internet Explorer 6 is well known to be buggy with compression and css / js
                if (req.url ~ "\.(css|js)" && req.http.User-Agent ~ "MSIE 6") {
                        remove req.http.Accept-Encoding;
                }
        }

### Per host/application configuration
        # bk_appsrv_static
        # Stale content delivery
        if (req.backend.healthy) {
                set req.grace = 30s;
        } else {
                set req.grace = 1d;
        }

        # Cookie ignored in these static pages
        unset req.http.cookie;

### Common options
         # Static objects are first looked up in the cache
        if (req.url ~ ".(png|gif|jpg|swf|css|js)(?.*|)$") {
                return (lookup);
        }

        # if we arrive here, we look for the object in the cache
        return (lookup);
}

sub vcl_hash {
        hash_data(req.url);
        if (req.http.host) {
                hash_data(req.http.host);
        } else {
                hash_data(server.ip);
        }
        return (hash);
}

sub vcl_hit {
        # Purge
        if (req.request == "PURGE") {
                set obj.ttl = 0s;
                error 200 "Purged.";
        }

        return (deliver);
}

sub vcl_miss {
        # Purge
        if (req.request == "PURGE") {
                error 404 "Not in cache.";
        }

        return (fetch);
}

sub vcl_fetch {
        # Stale content delivery
        set beresp.grace = 1d;

        # Hide Server information
        unset beresp.http.Server;

        # Store compressed objects in memory
        # They would be uncompressed on the fly by Varnish if the client doesn't support compression
        if (beresp.http.content-type ~ "(text|application)") {
                set beresp.do_gzip = true;
        }

        # remove any cookie on static or pseudo-static objects
        unset beresp.http.set-cookie;

        return (deliver);
}

sub vcl_deliver {
        unset resp.http.via;
        unset resp.http.x-varnish;

        # could be useful to know if the object was in cache or not
        if (obj.hits > 0) {
                set resp.http.X-Cache = "HIT";
        } else {
                set resp.http.X-Cache = "MISS";
        }

        return (deliver);
}

sub vcl_error {
        # Health check
        if (obj.status == 751) {
                set obj.status = 200;
                return (deliver);
        }
}
  


Web traffic limitation

Synopsis

For different reasons, we may want to limit the number of connections or the number of requests we allow to a web farm.
For example:

  • give more capacity to authenticated users compared to anonymous ones
  • limit web farm users per virtualhost
  • protect your website from spiders
  • etc…

Basically, we’ll manage two web farms: one with as much capacity as we need, and another one where we’ll send the people we want to slow down.
The routing decision can be taken using a header, a cookie, a part of the URL, the source IP address, etc…

Configuration

The configuration below would do the job.

There are only two webservers in the farm, but we want to slow down some virtual hosts or old, almost never used applications, in order to protect the regular traffic and leave it more capacity.

You can play with the inspect-delay time to be more or less aggressive.

frontend www
  bind :80
  mode http
  acl spiderbots hdr_cnt(User-Agent) eq 0
  acl personnal hdr(Host) www.personnalwebsite.tld www.oldname.tld
  acl oldies path_beg /old /foo /bar
  use_backend limited_www if spiderbots or personnal or oldies
  default_backend www

backend www
 mode http
 server be1  192.168.0.1:80 check maxconn 100
 server be2  192.168.0.2:80 check maxconn 100

backend limited_www
 mode http
 acl too_fast be_sess_rate gt 10
 acl too_many be_conn gt 10
 tcp-request inspect-delay 3s
 tcp-request content accept if ! too_fast or ! too_many
 tcp-request content accept if WAIT_END
 server be1  192.168.0.1:80 check maxconn 100
 server be2  192.168.0.2:80 check maxconn 100

Results

With the configuration above, an Apache bench was able to reach up to 3600 req/s on the regular farm but only 9 req/s on the limited one.


Aloha Load-balancer as a reverse proxy

Synopsis

You own a small public subnet and want to be able to access multiple websites or applications behind a single public IP address.
Basically, you want to use your Aloha load-balancer as a reverse proxy.

Diagram

The diagram below shows how the reverse proxy works.
In our case, we have 2 domains pointing to the Aloha IP address.
Depending on the domain name, the Aloha will decide which farm to use.
[diagram: reverse_proxy]

Configuration

On the Aloha, the reverse-proxy configuration is achieved by HAProxy.
HAProxy configuration can be done in the “layer 7” tab of the GUI or through the CLI command “service haproxy edit”.

First, the frontend definition.
This is where HAProxy takes routing decisions based on layer 7 information.

frontend ft_websites
   mode http
   bind 0.0.0.0:80
   log global
   option httplog
# Capturing the Host header is important to know whether rules match or not
   capture request header host len 64
# site1 configuration
   acl site1 hdr_sub(host) site1.com
   acl site1 hdr_sub(host) site1.eu
   use_backend bk_site1 if site1
# site2 configuration
   acl site2 hdr_sub(host) site2.com
   acl site2 hdr_sub(host) site2.ie
   use_backend bk_site2 if site2
# default configuration
   default_backend bk_default

And now, we can define our backend sections for each website or application:

# First site backend configuration
backend bk_site1
   mode http
   balance roundrobin
   cookie SERVERID insert indirect nocache    # persistence cookie
   option forwardfor # add X-Forwarded-For
   option httpchk HEAD / HTTP/1.0\r\nHost:\ www.site1.com
   default-server inter 3s rise 2 fall 3 slowstart 0 # servers default parameters
   server srv1 192.168.10.11:80 cookie s1 weight 10 maxconn 1000 check
   server srv2 192.168.10.12:80 cookie s2 weight 10 maxconn 1000 check

# Second site backend configuration
backend bk_site2
   mode http
   balance roundrobin
   cookie SERVERID insert indirect nocache    # persistence cookie
   option forwardfor # add X-Forwarded-For
   option httpchk HEAD / HTTP/1.0\r\nHost:\ www.site2.com
   default-server inter 3s rise 2 fall 3 slowstart 0 # servers default parameters
   server srv1 192.168.10.13:80 cookie s1 weight 10 maxconn 1000 check
   server srv2 192.168.10.14:80 cookie s2 weight 10 maxconn 1000 check

And finally, the “garbage collector”: the default backend, which receives all the traffic that has not matched any other rule.
It may be important to watch the logs of this backend in order to ensure there is no misconfiguration.

backend bk_default
   mode http
   balance roundrobin
   option forwardfor # add X-Forwarded-For
   option httpchk HEAD /
   default-server inter 3s rise 2 fall 3 slowstart 0 # servers default parameters
   server srv1 192.168.10.8:80 weight 10 maxconn 1000 check
   server srv2 192.168.10.9:80 weight 10 maxconn 1000 check


layer 7 load-balancing proxy mode

This mode is sometimes called reverse-proxy.

The load-balancer is in the middle of all transactions between the user and the server.
It maintains two separate TCP connections:

  • One with the user: the load-balancer acts as a server. It accepts requests and forwards responses
  • One with the server: the load-balancer acts as a client: it forwards requests and gets responses

This mode is the least intrusive in an architecture: nothing to change on any server or router.

TCP connection overview

[diagram: layer7_reverse_proxy_tcp_connection]
The diagram clearly shows the two TCP connections maintained by the load-balancer.

Data flow

[diagram: layer7_reverse_proxy_data_flow]
As a consequence of the proxy mode, the server sees connections coming from the load-balancer’s IP address, and it can only get the client IP address at the application level, i.e. through the HTTP X-Forwarded-For header.
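
As an illustration, here is a minimal sketch (names and addresses are examples) showing how HAProxy can provide the client IP to the servers at the application level:

backend bk_app
  mode http
  # insert the client IP into the X-Forwarded-For header
  option forwardfor
  server srv1 192.168.10.11:80 check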

Pros and cons

Pros

  • not intrusive
  • more secure: servers aren’t reached directly
  • allows protocol inspection and validation
  • can load-balance services even if both client and server are in the same subnet

Cons

  • backend servers only see one IP address: the one from the LB
  • “slower” than layer 4 load-balancing (we speak about micro-seconds)

When to use this mode?

  • when you need application layer intelligence (content switching, etc…)
  • in order to protect an application
  • when the application does not care about the user IP (or can use the X-Forwarded-For header)


layer 7 load-balancing transparent proxy mode

The load-balancer is in the middle of all transactions between the user and the server.
It maintains two separate TCP connections:

  • With the user: the load-balancer acts as a server. It accepts requests and forwards responses
  • With the server: the load-balancer acts as a client: it forwards requests and gets responses

It is really close to the proxy mode, with one main difference: the load-balancer opens the connection to the server using the client IP address as the source IP.

Of course, the backend server default gateway must be the load-balancer.
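
As an illustration, here is a minimal HAProxy sketch (addresses are examples) of this mode; the usesrc keyword requires transparent proxy (TPROXY) support on the load-balancer:

backend bk_app
  mode http
  # connect to the server using the client's IP address as source
  source 0.0.0.0 usesrc clientip
  server srv1 192.168.10.11:80 check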

TCP connection overview

[diagram: layer7_transparent_proxy_tcp_connection]
The diagram clearly shows the two TCP connections maintained by the load-balancer.

Data flow

[diagram: layer7_transparent_proxy_data_flow]
Since the load-balancer opens the TCP connection to the server with the user’s IP address, the server must use the load-balancer as its default gateway.
Otherwise, the server would forward its responses directly to the client, and the client would drop them…

Pros and cons

Pros

  • servers see the client IP address at network layer
  • secure: servers aren’t reached directly
  • allows protocol inspection and validation

Cons

  • intrusive: must change the default gateway of the server.
  • “slower” than layer 4 load-balancing (we speak about micro-seconds)
  • clients and servers must be in two different subnets.

When to use this mode?

  • when the load-balanced service needs the client IP at the network layer (e.g. anti-spam services)
  • when you need application layer intelligence (content switching, etc…)
  • in order to protect an application


Smart content switching for news website

Purpose

Build a scalable architecture for a news website using the components below:

  • load balancer with content switching capability
  • cache server
  • application server

Definition

  • Content switching: the ability to route traffic based on the content of the HTTP request: URI, parameters, headers, etc…
    HAProxy is a good example of an open source reverse-proxy load-balancer with content switching capability.
  • cache server: a server able to quickly deliver static content.
    Squid, Varnish and Apache Traffic Server are open source reverse-proxy caches.
  • application server: the server which builds the pages for your news website.
    This can be Apache+PHP, Tomcat+Java, IIS+ASP.NET, etc…

Target Network Diagram

Architecture

Context

All the traffic passes through the Aloha load-balancer.
HAProxy, the layer 7 load-balancer included in the Aloha, does the content switching to route requests either to the cache servers or to the application servers.
If a cache server misses an object, it will get it from the application servers.

Haproxy configuration

Service configuration:

frontend public
	bind :80
	acl DYN path_beg /user
	acl DYN path_beg /profile
	acl DYN method POST
	use_backend APPLICATION if DYN
	default_backend CACHE

The content switching is achieved by the few lines beginning with the keyword acl.
If the URI starts with /user or /profile, or if the method is POST, then the traffic will be routed to the APPLICATION server pool; otherwise the CACHE pool will be used.

Application pool configuration:

backend APPLICATION
	balance roundrobin
	cookie PHPSESSID prefix
	option httpchk /health
	http-check expect string GOOD
	server APP1 1.1.1.1:80 cookie app1 check
	server APP2 1.1.1.2:80 cookie app2 check

We maintain backend server persistence using the cookie sent by the application server, named PHPSESSID in this example. You can change this cookie name to the one provided by your application, like JSESSIONID, ASP.NET_SessionId or anything else.
Note the health check URL: /health. The script executed on the backend checks server health (database availability, CPU usage, memory usage, etc…) and returns GOOD if everything looks fine and WRONG if not. With this, HAProxy will consider the server ready only if it returns GOOD.

Cache pool configuration

backend CACHE
	balance uri
	hash-type consistent
	option httpchk /
	http-check expect string KEYWORD
	reqidel ^Accept-Encoding unless { hdr_sub(Accept-Encoding) gzip }
	reqirep ^Accept-Encoding:\ .*gzip.* Accept-Encoding:\ gzip
	server CACHE1 1.1.1.3:80 check
	server CACHE2 1.1.1.4:80 check

Here, we balance requests according to the URL. The purpose of this metric is to “force” a single URL to always be retrieved from the same server.
Main benefits are:

  1. less objects in caches memory
  2. less requests to the application server for static and pseudo-static content

In order to lower the impact of the Vary: header on content encoding, we added the two reqidel / reqirep lines to normalize the Accept-Encoding header a bit.
