Tag Archives: transparent proxy

Howto transparent proxying and binding with HAProxy and ALOHA Load-Balancer

Transparent Proxy

HAProxy works has a reverse-proxy. Which means it maintains 2 connections when allowing a client to cross it:
  – 1 connection between HAProxy and the client
  – 1 connection between HAProxy and the server

HAProxy then manipulate buffers between these two connections.
One of the drawback of this mode is that HAProxy will let the kernel to establish the connection to the server. The kernel is going to use a local IP address to do this.
Because of this, HAProxy “hides” the client IP by its own one: this can be an issue in some cases.
Here comes the transparent proxy mode: HAProxy can be configured to spoof the client IP address when establishing the TCP connection to the server. That way, the server thinks the connection comes from the client directly (of course, the server must answer back to HAProxy and not to the client, otherwise it can’t work: the client will get an acknowledge from the server IP while it has established the connection on HAProxy‘s IP).

Transparent binding

By default, when one want HAProxy to get traffic, we have to tell it to bind an IP address and a port.
The IP address must exist on the operating system (unless you have setup the sysctl net.ipv4.ip_nonlocal_bind) and the OS must announce the availability to the other devices on the network through ARP protocol.
Well, in some cases we want HAProxy to be able to catch traffic on the fly without configuring any IP address or VRRP or whatever…
This is where transparent binding comes in: HAProxy can be configured to catch traffic on the fly even if the destination IP address is not configured on the server.
These IP addresses will never be pingable, but they’ll deliver the services configured in HAProxy.

HAProxy and the Linux Kernel

Unfortunately, HAProxy can’t do transparent binding or proxying alone. It must stand on a compiled and tuned Linux Kernel and operating system.
Below, I’ll explain how to do this in a standard Linux distribution.
Here is the check list to meet:
  1. appropriate HAProxy compilation option
  2. appropriate Linux Kernel compilation option
  3. sysctl settings
  4. iptables rules
  5. ip route rules
  6. HAProxy configuration

HAProxy compilation requirements


First of all, HAProxy must be compiled with the option TPROXY enabled.
It is enabled by default when you use the target LINUX26 or LINUX2628.

Linux Kernel requirements

You have to ensure your kernel has been compiled with the following options:
  – CONFIG_NETFILTER_TPROXY
  – CONFIG_NETFILTER_XT_TARGET_TPROXY

Of course, iptables must be enabled as well in your kernel 🙂

sysctl settings

The following sysctls must be enabled:
  – net.ipv4.ip_forward
  – net.ipv4.ip_nonlocal_bind

iptables rules


You must setup the following iptables rules:

iptables -t mangle -N DIVERT
iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
iptables -t mangle -A DIVERT -j MARK --set-mark 1
iptables -t mangle -A DIVERT -j ACCEPT

Purpose is to mark packets which matches a socket bound locally (by HAProxy).

IP route rules


Then, tell the Operating System to forward packets marked by iptables to the loopback where HAProxy can catch them:

ip rule add fwmark 1 lookup 100
ip route add local 0.0.0.0/0 dev lo table 100

HAProxy configuration


Finally, you can configure HAProxy.
  * Transparent binding can be configured like this:

[...]
frontend ft_application
  bind 1.1.1.1:80 transparent
[...]

  * Transparent proxying can be configured like this:

[...]
backend bk_application
  source 0.0.0.0 usesrc clientip
[...]

Transparent mode in the ALOHA Load-Balancer


Now, the same steps in the ALOHA Load-balancer, which is an HAProxy based load-balancing appliance:
  1-5. not required, the ALOHA kernel is deeply tuned for this purpose
  6. HAProxy configuration

LB Admin tab (AKA click mode)

  * Transparent binding can be configured like this, when editing a Frontend listener:
frontend_listener_transparent

  * Transparent proxying can be configured like this when editing a farm:
backend_transparent

LB Layer 7 tab (vi in a browser mode)


  * Transparent binding can be configured like this:

[...]
frontend ft_application
  bind 1.1.1.1:80 transparent
[...]

  * Transparent proxying can be configured like this:

[...]
backend bk_application
  source 0.0.0.0 usesrc clientip
[...]

Links

Scalable WAF protection with HAProxy and Apache with modsecurity

Greeting to Thomas Heil, from our German partner Olanis, for his help in Apache and modsecurity configuration assistance.

What is a Web Application Firewall (WAF)?

Years ago, it was common to protect networks using a firewall… Well known devices which filter traffic at layer 3 and 4…
Now, the failures have moved from the network stack to the application layer, making the old firewall useless and obsolete (for protection purpose I mean). We used then to deploy IDS or IPS, which tried to match attacks at a packet level. These products are usually very hard to tune.
Then the Web Application Firewall arrived: it’s a firewall aware of the layer 7 protocol in order to be more efficient when deciding to block requests (or responses).
This is because the attacks became more complicated, like the SQL Injection, Cross Site scripting, etc…

One of the most famous opensource WAF is mod_security, which works as a module on Apache webserver and IIS for a long time and has been announced recently for nginx too.
A very good alternative is naxsi, a module for nginx, still young but very promising.

On today’s article, I’ll focus on modsecurity for Apache. In a next article, I’ll build the same platform with naxsi and nginx.

Scalable WAF platform


The main problem with WAF, is that they require a lot of resources to analyse each requests headers and body. (it can even be configured to analyze the response). If you want to be able to protect all your upcoming traffic, then you must think scalability.
In the present article I’m going to explain how to build a reliable and scalable platform where WAF capacity won’t be an issue. I could add “and where WAF maintenance could be done during business hours).
Here are the basic purpose to achieve:

  • Web Application Firewall: achieved by Apache and modsecurity
  • High-availability: application server and WAF monitoring, achieved by HAProxy
  • Scalability: ability to adapt capacity to the upcoming volume of traffic, achieved by HAProxy

It would be good if the platform would achieve the following advanced features:

  • DDOS protection: blind and brutal attacks protection, slowloris protection, achieved by HAProxy
  • Content-Switching: ability to route only dynamic requests to the WAF, achieved by HAProxy
  • Reliability: ability to detect capacity overusage, this is achieved by HAProxy
  • Performance: deliver response as fast as possible, achieved by the whole platform

Web platform with WAF Diagram

The diagram below shows the platform with HAProxy frontends (prefixed by ft_) and backends (prefixed by bk_). Each farm is composed by 2 servers.

As you can see, at first, it seems all the traffic goes to the WAFs, then comes back in HAProxy before being routed to the web servers. This would be the basic configuration, meeting the following basic requirements: Web Application Firewall, High-Availability, Scalability.

Platform installation


As load-balancer, I’m going to use our well known ALOHA 🙂
The web servers are standard debian with apache and PHP, the application used on top of it is dokuwiki. I have no procedure for this one, this is very straight forward!
The WAF run on CentOS 6.3 x86_64, using modsecurity 2.5.8. The installation procedure is outside of the scope of this article, so I documented it on my personal wiki.
All of these servers are virtualized on my laptop using KVM, so NO, I won’t run performance benchmark, it would be ridiculous!

Configuration

WAF configuration


Basic configuration here, no tuning at all. The purpose is not to explain how to configure a WAF, sorry.

Apache Configuration


Modification to the file /etc/httpd/conf/httpd.conf:

Listen 192.168.10.15:81
[...]
LoadModule security2_module modules/mod_security2.so
LoadModule unique_id_module modules/mod_unique_id.so
[...]
NameVirtualHost 192.168.10.15:81
[...]
<IfModule mod_security2.c>
        SecPcreMatchLimit 1000000
        SecPcreMatchLimitRecursion 1000000
        SecDataDir logs/
</IfModule>
<VirtualHost 192.168.10.15:81>
        ServerName *
        AddDefaultCharset UTF-8

        <IfModule mod_security2.c>
                Include modsecurity.d/modsecurity_crs_10_setup.conf
                Include modsecurity.d/aloha.conf
                Include modsecurity.d/rules/*.conf

                SecRuleEngine On
                SecRequestBodyAccess On
                SecResponseBodyAccess On
        </IfModule>

        ProxyPreserveHost On
        ProxyRequests off
        ProxyVia Off
        ProxyPass / http://192.168.10.2:81/
        ProxyPassReverse / http://192.168.10.2:81/
</VirtualHost>

Basically, we just turned Apache into a reverse-proxy, accepting traffic for any server name, applying modsecurity rules before routing traffic back to HAProxy frontend dedicated to web servers.

Client IP


HAProxy works has a reverse proxy and so will use its own IP address to get connected on the WAF server. So you have to install mod_rpaf to get the client IP in the WAF for both tracking and logging.
To install mod_rpaf, follow these instructions: apache mod_rpaf installation.
Concerning its configuration, we’ll do it as below, edit the file /etc/httpd/conf.d/mod_rpaf.conf:

LoadModule rpaf_module modules/mod_rpaf-2.0.so

<IfModule rpaf_module>
        RPAFenable On
        RPAFproxy_ips 192.168.10.1 192.168.10.3
        RPAFheader X-Client-IP
</IfModule>

modsecurity custom rules

In the Apache configuration there is a directive which tells modsecurity to load a file called aloha.conf. The purpose of this file is to tell to modsecurity to deny the health check requests from HAProxy and to prevent logging them.
HAProxy will consider the WAF as operational only if it gets a 403 response to this request. (see HAProxy configuration below).
Content of the file /etc/httpd/modsecurity.d/aloha.conf:

SecRule REQUEST_FILENAME "/waf_health_check" "nolog,deny"

Load-Balancer (HAProxy) configuration for basic usage


The configuration below is the first shoot we do when deploying such platform, it is basic, simple and straight forward:

######## Default values for all entries till next defaults section
defaults
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  timeout connect 5s
  timeout http-keep-alive 1s
  # Slowloris protection
  timeout http-request 15s
  timeout queue 30s
  timeout tarpit 1m          # tarpit hold tim
  backlog 10000

# public frontend where users get connected to
frontend ft_waf
  bind 192.168.10.2:80 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 1000
  default_backend bk_waf

# WAF farm where users' traffic is routed first
backend bk_waf
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor header X-Client-IP
  option httpchk HEAD /waf_health_check HTTP/1.0
  # Specific WAF checking: a DENY means everything is OK
  http-check expect status 403
  timeout server 25s
  default-server inter 3s rise 2 fall 3
  server waf1 192.168.10.15:81 maxconn 100 weight 10 check
  server waf2 192.168.10.16:81 maxconn 100 weight 10 check

# Traffic secured by the WAF arrives here
frontend ft_web
  bind 192.168.10.2:81 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 1000
  # route health check requests to a specific backend to avoid graph pollution in ALOHA GUI
  use_backend bk_waf_health_check if { path /waf_health_check }
  default_backend bk_web

# application server farm
backend bk_web
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  cookie SERVERID insert indirect nocache
  default-server inter 3s rise 2 fall 3
  option httpchk HEAD /
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 cookie server1 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 cookie server2 check

# backend dedicated to WAF checking (to avoid graph pollution)
backend bk_waf_health_check
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  default-server inter 3s rise 2 fall 3
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 check

Advanced Load-Balancing (HAProxy) configuration


We’re going now to improve a bit the platform. The picture below shows which type of protection is achieved by the load-balancer and the WAF:

The configuration below adds a few more features:

  • DDOS protection on the frontend
  • abuser or attacker detection in bk_waf and blocking on the public interface (ft_waf)
  • Bypassing WAF when overusage or unavailable

Which will allow to meet the advanced requirements: DDOS protection, Content-Switching, Reliability, Performance.

######## Default values for all entries till next defaults section
defaults
  option  http-server-close
  option  dontlognull
  option  redispatch
  option  contstats
  retries 3
  timeout connect 5s
  timeout http-keep-alive 1s
  # Slowloris protection
  timeout http-request 15s
  timeout queue 30s
  timeout tarpit 1m          # tarpit hold tim
  backlog 10000

# public frontend where users get connected to
frontend ft_waf
  bind 192.168.10.2:80 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 10000

  # DDOS protection
  # Use General Purpose Couter (gpc) 0 in SC1 as a global abuse counter
  # Monitors the number of request sent by an IP over a period of 10 seconds
  stick-table type ip size 1m expire 1m store gpc0,http_req_rate(10s),http_err_rate(10s)
  tcp-request connection track-sc1 src
  tcp-request connection reject if { sc1_get_gpc0 gt 0 }
  # Abuser means more than 100reqs/10s
  acl abuse sc1_http_req_rate(ft_web) ge 100
  acl flag_abuser sc1_inc_gpc0(ft_web)
  tcp-request content reject if abuse flag_abuser

  acl static path_beg /static/ /dokuwiki/images/
  acl no_waf nbsrv(bk_waf) eq 0
  acl waf_max_capacity queue(bk_waf) ge 1
  # bypass WAF farm if no WAF available
  use_backend bk_web if no_waf
  # bypass WAF farm if it reaches its capacity
  use_backend bk_web if static waf_max_capacity
  default_backend bk_waf

# WAF farm where users' traffic is routed first
backend bk_waf
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor header X-Client-IP
  option httpchk HEAD /waf_health_check HTTP/1.0

  # If the source IP generated 10 or more http request over the defined period,
  # flag the IP as abuser on the frontend
  acl abuse sc1_http_err_rate(ft_waf) ge 10
  acl flag_abuser sc1_inc_gpc0(ft_waf)
  tcp-request content reject if abuse flag_abuser

  # Specific WAF checking: a DENY means everything is OK
  http-check expect status 403
  timeout server 25s
  default-server inter 3s rise 2 fall 3
  server waf1 192.168.10.15:81 maxconn 100 weight 10 check
  server waf2 192.168.10.16:81 maxconn 100 weight 10 check

# Traffic secured by the WAF arrives here
frontend ft_web
  bind 192.168.10.2:81 name http
  mode http
  log global
  option httplog
  timeout client 25s
  maxconn 1000
  # route health check requests to a specific backend to avoid graph pollution in ALOHA GUI
  use_backend bk_waf_health_check if { path /waf_health_check }
  default_backend bk_web

# application server farm
backend bk_web
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  cookie SERVERID insert indirect nocache
  default-server inter 3s rise 2 fall 3
  option httpchk HEAD /
  # get connected on the application server using the user ip
  # provided in the X-Client-IP header setup by ft_waf frontend
  source 0.0.0.0 usesrc hdr_ip(X-Client-IP)
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 cookie server1 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 cookie server2 check

# backend dedicated to WAF checking (to avoid graph pollution)
backend bk_waf_health_check
  balance roundrobin
  mode http
  log global
  option httplog
  option forwardfor
  default-server inter 3s rise 2 fall 3
  timeout server 25s
  server server1 192.168.10.11:80 maxconn 100 weight 10 check
  server server2 192.168.10.12:80 maxconn 100 weight 10 check

Detecting attacks


On the load-balancer


The ft_waf frontend stick table tracks two information: http_req_rate and http_err_rate which are respectively the http request rate and the http error rate generated by a single IP address.
HAProxy would automatically block an IP which has generated more than 100 requests over a period of 10s or 10 errors (WAF detection 403 responses included) in 10s. The user is blocked for 1 minute as long as he keeps on abusing.
Of course, you can setup above values to whatever you need: it is fully flexible.

To know the status of IPs in your load-balancer, just run the command below:

echo show table ft_waf | socat /var/run/haproxy.stat - 
# table: ft_waf, type: ip, size:1048576, used:1
0xc33304: key=192.168.10.254 use=0 exp=4555 gpc0=0 http_req_rate(10000)=1 http_err_rate(10000)=1

Note: The ALOHA Load-balancer does not provide watch, but you can monitor the content of the table in live with the command below:

while true ; do echo show table ft_waf | socat /var/run/haproxy.stat - ; sleep 2 ; clear ; done

On the Waf


I have not setup anything particular on WAF logging, so every errors appears in /var/log/httpd/error_log. IE:

[Fri Oct 12 10:48:21 2012] [error] [client 192.168.10.254] ModSecurity: Access denied with code 403 (phase 2). Pattern match "(?:(?:[\\;\\|\\`]\\W*?\\bcc|\\b(wget|curl))\\b|\\/cc(?:[\\'\"\\|\\;\\`\\-\\s]|$))" at REQUEST_FILENAME. [file "/etc/httpd/modsecurity.d/rules/modsecurity_crs_40_generic_attacks.conf"] [line "25"] [id "950907"] [rev "2.2.5"] [msg "System Command Injection"] [data "/cc-"] [severity "CRITICAL"] [tag "WEB_ATTACK/COMMAND_INJECTION"] [tag "WASCTC/WASC-31"] [tag "OWASP_TOP_10/A1"] [tag "PCI/6.5.2"] [hostname "mywiki"] [uri "/dokuwiki/lib/images/license/button/cc-by-sa.png"] [unique_id "UHfZVcCoCg8AAApVAzsAAAAA"]

Seems to be a false positive 🙂

Conclusion


Today, we saw it’s easy to build a scalable and well performing WAF platform in front of our web application.
The WAF is able to communicate to HAProxy which IPs to automatically blacklist (throuth error rate monitoring), which is convenient since the attacker won’t bother the WAF for a certain amount of time 😉
The platform allows to detect WAF farm availability and to bypass it in case of total failure, we even saw it is possible to bypass the WAF for static content if the farm is running out of capacity. Purpose is to deliver a good end-user experience without dropping too much the security.
Note that it is possible to route all the static content to the web servers (or a static farm) directly, whatever the status of the WAF farm.
This make me say that the platform is fully scallable and flexible.
Also, bear in mind to monitor your WAF logs, as shown in the example above, there was a false positive preventing an image to be loaded from dokuwiki.

Related links

Links

Preserve source IP address despite reverse proxies

What is a Reverse-Proxy?

A Reverse-proxy is a server which get connected on upstream servers on behalf of users.
Basically, it usually maintain two TCP connections: one with the client and one with the upstream server.
The upstream server can be either an application server, a load-balancer or an other proxy/reverse-proxy.

For more details, please consult the page about the proxy mode of the Aloha load balancer.

Why using a reverse-proxy?

It can be used for different purpose:

  • Improve security, performance and scalability
  • Prevent direct access from a user to a server
  • Share a single IP for multiple services

Reverse-proxies are commonly deployed in DMZs to give access to servers located in a more secured area of the infrastructure.
That way, the Reverse-Proxy hides the real servers and can block malicious requests, choose a server based on the protocol or application information (IE: URL, HTTP header, SNI, etc…), etc…
Of course, a Reverse-proxy can act as a load-balancer 🙂

Drawback when using a reverse-proxy?


The main drawback when using a reverse-proxy is that it will hide the user IP: when acting on behalf of the user, it will use its own IP address to get connected on the server.
There is a workaround: using a transparent proxy, but this usage can hardly pass through firewalls or other reverse-proxies: the default gateway of the server must be the reverse-proxy.

Unfortunately, it is sometimes very useful to know the user IP when the connections comes in to the application server.
It can be mandatory for some applications and it can ease troubleshooting.

Diagram

The Diagram below shows a common usage of a reverse-proxy: it is isolated into a DMZ and handles the users traffic. Then it gets connected to the LAN where an other reverse-proxy act as a load-balancer.

Here is the flow of the requests and responses:

  1. The client gets connected through the firewall to the reverse-proxy in the DMZ and send it its request.
  2. The Reverse-Proxy validates the request, analyzes it to choose the right farm then forward it to the load-balancer in the LAN, through the firewall.
  3. The Load-balancer choose a server in the farm and forward the request to it
  4. The server processes the request then answers to the load-balancer
  5. The load-balancer forward the response to the reverse-proxy
  6. The reverse-proxy forward the response to the client

Bascially, the source IP is modified twice on this kind of architecture: during the steps 2 and 3.
And of course, the more you chain load-balancer and reverse proxies, the more the source IP will be changed.

The Proxy protocol


The proxy protocol was designed by Willy Tarreau, HAProxy developper.
It is used between proxies (hence its name) or between a proxy and a server which could understand it.

The main purpose of the proxy protocol is to forward information from the client connection to the next host.
These information are:

  • L4 and L3 protocol information
  • L3 source and destination addresses
  • L4 source and destination ports

That way, the proxy (or server) receiving these information can use them exactly as if the client was connected to it.

Basically, it’s a string the first proxy would send to the next one when connecting to it.
In example, the proxy protocol applied to an HTTP request:

PROXY TCP4 192.168.0.1 192.168.0.11 56324 80rn
GET / HTTP/1.1rn
Host: 192.168.0.11rn
rn

Note: no need to change anything in the architecture, since the proxy protocol is just a string sent over the TCP connection used by the sender to get connected on the service hosted by the receiver.

Now, I guess you understand how we can take advantage of this protocol to pass through the firewall, preserving the client IP address between the reverse proxy (which is able to generate the proxy protocol string) and the load-balancer (which is able to analyze the proxy protocol string).

Stud and stunnel are two SSL reverse proxy software which can send proxy protocol information.

Configuration

Between the Reverse-Proxy and the Load-Balancer

Since both devices must understand the proxy protocol, we’re going to consider both the LB and the RP are Aloha Load-Balancers.

Configuring the Reverse-proxy to send proxy protocol information


In the reverse proxy configuration, just add the keyword “send-proxy” on the server description line.
In example:

server srv1 192.168.10.1 check send-proxy

Configuring the Load-balancer to receive proxy protocol information

In the load-balancer configuration, just add the keyword “accept-proxy” on the bind description line.
In example:

bind 192.168.1.1:80 accept-proxy

What’s happening?


The reverse proxy will open a connection to the address binded by the load-balancer (192.168.1.1:80). This does not change from a regular connection flow.
Once the TCP connection is established, then the reverse proxy would send a string with client connection information.
The load-balancer can now use the client information provided through the proxy protocol exactly as if the connection was opened directly by the client itself. For example, we can match the client IP address in ACLs for white/black listing, stick-tables, etc…
This would also make the “balance source” algorithm much more efficient 😉

Between the Load-Balancer and the server

Here we are!!!!
Since the LB knows the client IP, we can use it to get connected on the server. Yes, this is some kind of spoofing :).

In your HAProxy configuration, just use the source parameter in your backend:

backend bk_app
[...]
  source 0.0.0.0 usesrc clientip
  server srv1 192.168.11.1 check

What’s happening?


The Load-balancer uses the client IP information provided by the proxy protocol to get connected to the server. (the server sees the client ip as source ip)
Since the server will receive a connection from the client IP address, it will try to reach it through its default gateway.
In order to process the reverse-NAT, the traffic must pass through the load-balancer, that’s why the server’s default gateway must be the load-balancer.
This is the only architecture change.

Compatibility


The kind of architecture in the diagram will only work with Aloha Load-balancers or HAProxy.

Links