Using HAProxy with the Proxy Protocol to Better Secure Your Database

The Proxy protocol is a widely used invention of our CTO at HAProxy Technologies, Willy Tarreau, to solve the problem of TCP connection parameters being lost when relaying TCP connections through proxies. Its primary purpose is to chain proxies and reverse-proxies without losing client information, and it’s used and supported by AWS ELB, Apache, NGINX, Varnish, Citrix, and many more (here’s a longer list of Proxy protocol supported technologies).

Why use the Proxy protocol? Well, when relaying connections through proxies strips client information such as the IP address, you can no longer implement some pretty basic IP-level security and logging.

So you may not really know who you’re letting in to access your data:

Intruder-Revealed

In this post, I’ll show you how to use the Proxy protocol with HAProxy to enhance the security of your database.

Background

Let’s take a step back and explain a bit (Proxy protocol experts can skip this section). A proxy uses its IP stack to connect to remote servers, but this process normally loses the initial TCP connection data, including the source IP address, destination IP address and port number.

Proxy-protocol-figure-1

Traditional workarounds to this problem include Tproxy, which requires you to compile your kernel and set your proxy as your server’s default gateway. Furthermore, Tproxy can’t pass IP packets through firewalls that use NAT. You can also use HTTP X-Forwarded-For headers to retain TCP connection data, although this approach only works for HTTP. Many of these workarounds also require either a specific protocol or architectural changes, which prevents you from scaling up.

The Proxy protocol is protocol independent, meaning that it works with any layer-7 protocol. It doesn’t require infrastructure changes, works with NAT firewalls, and is scalable. The Proxy protocol’s only technical requirement is that both of the connection’s endpoints must be compatible with the Proxy protocol. These endpoints could include proxies, reverse-proxies, load-balancers, application servers and WAFs.
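To make this more concrete, here is a minimal Python sketch of the version 1 (human-readable) header that a Proxy protocol sender puts at the very start of a TCP connection. The function names are made up for illustration, the port 51234 is an arbitrary example value, and the parser only handles the TCP4/TCP6 forms; it is not HAProxy's implementation.

```python
import ipaddress


def build_proxy_v1(src_ip, dst_ip, src_port, dst_port):
    """Return a version 1 (text) Proxy protocol header as bytes."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if src.version != dst.version:
        raise ValueError("source and destination must use the same IP version")
    family = "TCP4" if src.version == 4 else "TCP6"
    # One ASCII line, terminated by CRLF, sent before any application data.
    return f"PROXY {family} {src} {dst} {src_port} {dst_port}\r\n".encode("ascii")


def parse_proxy_v1(header):
    """Parse a TCP4/TCP6 version 1 header into (src_ip, dst_ip, src_port, dst_port)."""
    line = header.decode("ascii")
    if not line.endswith("\r\n"):
        raise ValueError("header must end with CRLF")
    fields = line[:-2].split(" ")
    if len(fields) != 6 or fields[0] != "PROXY" or fields[1] not in ("TCP4", "TCP6"):
        raise ValueError("not a TCP4/TCP6 Proxy protocol v1 header")
    _, _, src_ip, dst_ip, src_port, dst_port = fields
    return src_ip, dst_ip, int(src_port), int(dst_port)


# Hypothetical connection: a client at 192.168.122.1 reaching a proxy at 192.168.122.64:3306.
header = build_proxy_v1("192.168.122.1", "192.168.122.64", 51234, 3306)
print(header)
print(parse_proxy_v1(header))
```

The receiving side reads this one line first, records the addresses it carries, and then treats the rest of the stream as the normal application protocol, which is why both endpoints have to agree on whether the header is present.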

Proxy-protocol-figure-2

The Database Security Problem

An HAProxy server running without the Proxy protocol can create a couple of security problems when you grant users access to the MySQL servers behind it.

First, slow query logs and “show full processlist” commands on a MySQL server behind an HAProxy server that isn’t running the Proxy protocol won’t show the correct IP addresses of the clients, making it more difficult to identify hosts which are sending unoptimized or incorrect queries, or queries injected via SQLi. Consider the following entry from a slow query log:


# Time: 2017-03-09T21:56:07.640875Z
# User@Host: test[test] @ centos7vert [192.168.122.64] Id: 11
# Schema: Last_errno: 0 Killed: 0
# Query_time: 7.026677 Lock_time: 0.025969 Rows_sent: 0 Rows_examined: 1 Rows_affected: 0
# Bytes_sent: 104
SET timestamp=1489096567;
SELECT * FROM test.test WHERE sleep(7);

The above example shows that a host on or behind the HAProxy server at 192.168.122.64 has issued a query that is suspicious or has performance issues, but we can’t identify the host. You could review the HAProxy logs for activity that occurred around the indicated time to identify the host, but this approach is usually impractical due to the delay between the timestamps of the HAProxy and MySQL logs.

The second security issue caused by seeing the wrong IP address is that MySQL grants can no longer restrict a given username to a single client. Any client that the firewall or ACLs allow to reach MySQL through the HAProxy server looks the same to MySQL, because every connection appears to come from the same IP address. With the Proxy protocol, you can maintain better privilege segregation between databases, even if an attacker manages to get the password of a MySQL user dedicated to another client.

Using the Proxy protocol for a more secure database

Now here’s the good part: how to install, configure and use the Proxy protocol with a MySQL database. Note: you can use either HAProxy Community Edition or HAProxy Enterprise Edition with these instructions.

1. Percona installation

The instructions on the Percona website provide the details for installing Percona Server.

2. HAProxy configuration

Add a section to the HAProxy configuration file like the following:

frontend fe_mysqld
    bind *:3306
    mode tcp
    log global
    option tcplog
    use_backend be_mysqld

backend be_mysqld
    mode tcp
    option mysql-check user haproxy post-41
    server percona_server 192.168.122.185:3306 check

Add a grant to the Percona server so it will allow health checks with a MySQL command as follows:

mysql> CREATE USER 'haproxy'@'192.168.122.64';

Restart HAProxy after adding the grant statements to the Percona server.

3. New grant statements

An HAProxy server that doesn’t use the Proxy protocol requires you to add grant statements just for the IP address of the HAProxy server. With the Proxy protocol, you instead add grants for the hosts that will connect through the HAProxy server, as if the proxy weren’t there. For example, assume that you want the host at IP address 192.168.122.1 to connect to the database. The following MySQL statement will add said grant:

mysql> CREATE USER 'test'@'192.168.122.1';

The HAProxy server will use these grants instead of the existing grants once the Proxy protocol has been enabled on the HAProxy and Percona servers.
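The effect on grants can be illustrated with a small Python model. This is only a sketch under simplifying assumptions: accounts are modelled as exact (user, host) pairs with no wildcards, the helper names are invented, and 192.168.122.99 is a hypothetical second client that has obtained the password of the “test” user.

```python
PROXY_IP = "192.168.122.64"  # the HAProxy server from this article


def address_seen_by_mysql(client_ip, proxy_protocol):
    """Without the Proxy protocol, MySQL only ever sees the proxy's address."""
    return client_ip if proxy_protocol else PROXY_IP


def is_allowed(accounts, user, client_ip, proxy_protocol):
    """Simplified account lookup: an exact match on (user, host)."""
    return (user, address_seen_by_mysql(client_ip, proxy_protocol)) in accounts


# Without the Proxy protocol, the grant has to name the proxy itself.
accounts_without = {("test", "192.168.122.64")}
# With the Proxy protocol, the grant names the real client.
accounts_with = {("test", "192.168.122.1")}

# Hypothetical other client using the same credentials.
print(is_allowed(accounts_without, "test", "192.168.122.99", proxy_protocol=False))
print(is_allowed(accounts_with, "test", "192.168.122.99", proxy_protocol=True))
print(is_allowed(accounts_with, "test", "192.168.122.1", proxy_protocol=True))
```

In this model, the first check succeeds because every client is indistinguishable from the proxy, while the second fails because the grant is tied to the legitimate client’s own address.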

4. Percona configuration

In the mysqld section of /etc/mysql/percona-server.conf.d/mysqld.cnf add the following line:

proxy_protocol_networks="192.168.122.64"

The above statement will accept a single IP address, multiple IP addresses separated by commas or a masked IP address like 192.168.122.0/24. An asterisk (“*”) for the IP address will cause the server to accept the Proxy protocol from any host, although this isn’t recommended for security reasons. These IPs must be trusted to send correct Proxy protocol information before adding them to the list.
Note that Percona won’t accept the Proxy protocol from 127.0.0.1 even if it is in proxy_protocol_networks, so this issue may rear its ugly head if you are running HAProxy on the same server as Percona.
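As a rough illustration of the three forms described above (single address, comma-separated list, masked network, or “*”), here is a Python sketch that checks whether a peer address falls inside such a setting. It only models the matching described in this section using the standard ipaddress module; the function names are invented, it is not how Percona implements the option, and it does not reproduce the 127.0.0.1 exception.

```python
import ipaddress


def parse_networks(value):
    """Parse a proxy_protocol_networks-style value; None means 'any host'."""
    value = value.strip()
    if value == "*":
        return None
    # A bare address such as 192.168.122.64 becomes a single-host network.
    return [
        ipaddress.ip_network(item.strip(), strict=False)
        for item in value.split(",")
        if item.strip()
    ]


def is_trusted(peer_ip, networks):
    """Return True if the connecting peer may send the Proxy protocol header."""
    if networks is None:
        return True
    address = ipaddress.ip_address(peer_ip)
    return any(address in network for network in networks)


print(is_trusted("192.168.122.64", parse_networks("192.168.122.64")))
print(is_trusted("192.168.122.65", parse_networks("192.168.122.0/24")))
print(is_trusted("10.0.0.1", parse_networks("192.168.122.64,192.168.122.0/24")))
```

The last check shows why a tight list matters: a peer outside the list is not trusted to describe who the client is.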

Add an entry for each HAProxy server that will handle traffic to a Percona server. You must enable the Proxy protocol for the Percona and HAProxy servers or disable it for both at the same time. If the Proxy protocol configurations of the HAProxy and Percona servers don’t match, the connections will fail with a “packets received out of order” error. You may also want to configure the firewall on the Percona server to only accept connections from an HAProxy server, which ensures that clients must use an HAProxy server to access the Percona server.

After you enable the proxy_protocol_networks statement on the Percona server, HAProxy will mark the Percona server as “down”, provided you haven’t manually disabled it and are following this guide in order (so send-proxy hasn’t been added yet). This happens because Percona now requires the Proxy protocol but HAProxy isn’t yet sending it. Health checks will keep failing, and regular connections through HAProxy will encounter the same error, until you add send-proxy in the next step.

5. Send-proxy

Append the send-proxy argument to the server line (the one for the Percona server) in the backend section of the HAProxy configuration:

server percona_server 192.168.122.185:3306 check send-proxy

The above statement tells HAProxy to send the Proxy protocol packet for both health checks and normal connections.

If you are chaining multiple HAProxy servers together, you may wish to use “send-proxy-v2” instead of “send-proxy”. It’s a newer version of the protocol that is binary rather than text, so it is faster to parse. The downsides are that servers need to support v2 of the protocol (Percona Server does, so you can just put send-proxy-v2 in and it will work as expected) and that debugging its parsing from a packet capture is a bit more difficult.
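To show what “binary rather than text” means in practice, here is a Python sketch that builds a minimal version 2 header for a TCP-over-IPv4 connection: a fixed 12-byte signature, a version/command byte, an address family/protocol byte, a length, and then the packed addresses and ports. The function name is invented and the port 51234 is an arbitrary example value; the sketch ignores IPv6 and the optional TLV extensions.

```python
import socket
import struct

# Fixed 12-byte signature that opens every version 2 header.
SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def build_proxy_v2_tcp4(src_ip, dst_ip, src_port, dst_port):
    """Return a minimal version 2 Proxy protocol header for TCP over IPv4."""
    addresses = struct.pack(
        "!4s4sHH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        src_port,
        dst_port,
    )
    # 0x21: protocol version 2, command PROXY. 0x11: AF_INET, STREAM (TCP).
    return SIGNATURE + struct.pack("!BBH", 0x21, 0x11, len(addresses)) + addresses


header = build_proxy_v2_tcp4("192.168.122.1", "192.168.122.64", 51234, 3306)
print(len(header), header.hex())
```

A receiver can read the fixed-size part and the length field, and then knows exactly how many more bytes to consume, with no text parsing. The flip side, as noted above, is that the header is no longer readable at a glance in a packet capture.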

With this added, HAProxy will also send the Proxy protocol for health checks, but Percona won’t use the result because it won’t accept 127.0.0.1; as such, your existing grants for health checks will continue to be used.

6. Rest easier with better security in place

The Proxy protocol makes a Percona server more secure and easier to manage, as shown by the now-correct client address on the User@Host line of the following Percona slow log entry:

# Time: 2017-03-09T22:41:20.195387Z
# User@Host: test[test] @ [192.168.122.1] Id: 540
# Schema: Last_errno: 0 Killed: 0
# Query_time: 7.039918 Lock_time: 0.038505 Rows_sent: 0 Rows_examined: 1 Rows_affected: 0
# Bytes_sent: 104
SET timestamp=1489099280;
SELECT * FROM test.test WHERE sleep(7);

The previous output is produced by the same query (and from the same node) that was shown in The Database Security Problem section, but we can now see that the client’s actual IP address is 192.168.122.1.
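If you want to pull the client address out of slow log entries automatically, a small Python sketch like the following can do it. The regular expression is an assumption based only on the two User@Host lines shown in this article, and the function name is invented, so you may need to adjust it for other log variants.

```python
import re

# Matches lines such as "# User@Host: test[test] @ centos7vert [192.168.122.64] Id: 11".
USER_HOST = re.compile(
    r"^# User@Host: (?P<user>[^\[\s]+)\[[^\]]*\] @ (?P<host>[^\s\[]*) *\[(?P<ip>[^\]]*)\]"
)


def client_address(line):
    """Return (hostname, ip) from a User@Host line, or None if it doesn't match."""
    match = USER_HOST.match(line)
    if match is None:
        return None
    return match.group("host"), match.group("ip")


# The two User@Host lines from the slow log entries in this article.
before = "# User@Host: test[test] @ centos7vert [192.168.122.64] Id: 11"
after = "# User@Host: test[test] @ [192.168.122.1] Id: 540"

print(client_address(before))
print(client_address(after))
```

Run against the first entry, it reports the HAProxy server’s address; run against the second, it reports the real client at 192.168.122.1.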

The IP addresses in the process list table are also more useful with the Proxy protocol as can be seen here:
mysql-processlist

You can also run grants on the MySQL server with the client’s real IP address instead of just the HAProxy server’s IP address. This change is primarily useful to better understand the items in the grant tables during later changes/audits and to ensure that users can only access select user accounts from select IP addresses.

So you go from letting in anyone who seems to have legitimate credentials (possibly with malicious intent):

Intruder-Accepted

To keeping out any users not coming from authorized IP addresses:

Intruder-Denied

Congratulations on your new level of data security!

Additional Resources

Want more information? Here’s another guide to using the Proxy protocol with a Percona server.

Load Balancing and Application Delivery for the Enterprise [Webinar]

Do you know what makes HAProxy Enterprise Edition different from HAProxy Community Edition?

HAPEE isn’t just HAProxy Community with paid support, and unlike some other products based on open source projects, HAPEE doesn’t strip away any of the capabilities of HAProxy Community.

You can learn what HAPEE is all about and how it can provide additional benefits to enterprises during our webinar on March 2nd.

In this webinar, we’ll present:

  • How HAProxy Enterprise Edition (HAPEE) is different from HAProxy Community Edition
  • Why HAPEE is the most up-to-date, secure, and stable version of HAProxy
  • How enterprises leverage HAPEE to scale-out environments in the cloud
  • How enterprises can increase their admin productivity using HAPEE
  • How HAPEE enables advanced DDOS protection and helps mitigate other attacks

Sign up here.

Welcome Tim!

Folks,

I’m realizing that it’s been a while since the last post. We’ve been quite busy working on tons of cool stuff that you’ll discover soon, and time flies fast... very fast in fact. In the meantime, our team was reinforced with Tim Hinds, who will help me animate the blog and communicate on more general subjects than just bits and bytes, so I now have great hopes that many of our regular readers will more naturally share some links without the fear of looking like an alien who reads haproxy directives all day 🙂

Welcome Tim, it’s really nice to have you on board, and stay tuned!

 

HAProxy Technologies offers free hardware load balancers to students / interns

How to start with load balancers?

All of us playing with load balancers started the same way. First you prepare a configuration to alternate between two web servers returning different contents, and you verify with your browser that you indeed get a different response on each reload. This is the most basic load balancing setup one can imagine (and the worst one at the same time). Then you start all these “what if” questions. “What if a server dies”. You manage to address this by enabling health checks, but you feel like you’re cheating when you stop the server by killing the process. “What if now the cable is pulled off while transferring data”. This becomes difficult: you need to start a VM, and you’re not completely sure you model the proper failure by stopping the VM, since maybe the hypervisor will send some resets or ICMP messages. Then “What if the load balancer itself fails”. Cheating with VMs that are started and stopped by hand doesn’t make you feel confident that what you’re doing will work in real production. “What if someone by mistake configures two servers with the same IP address”. “What if I enable VRRP to cover hardware failures and my checks are wrong and I end up with two masters”. “What if I provoke a multicast loop”. “What if I set up multiple VRRP instances in multiple VLANs”.

All of these questions may sound very advanced to newcomers but are in fact quite common. In the world of load balancing, you definitely need to imagine a lot of failure cases and how to deal with them. Playing with pure software helps you understand the basic concepts. Playing with VMs helps you go a little bit further but already becomes quite painful. Add VLANs to the mix, asymmetric routing, DSR, transparent proxying and you’re quickly screwed.

Most of us started by plugging and pulling off cables, both network and power supply. I personally tortured a lot of Alteon AD3s in 2000-2001, but I was extremely lucky to have access to a lab where there were plenty of them (and they were really great devices to learn load balancing, by the way).

But since it remains the best way to learn the concepts and to develop skills, HAProxy Technologies wanted to offer this opportunity to beginners again.

What if you could set up a complete load balancing lab on your desk, at school or at home?

Complete Load Balancing platform on a desk

The photo above shows a complete load balancing platform deployed on a desk, with real machines and a real switch. And it’s just an example of what can be achieved. In order to make this possible, we ported our flagship product, the ALOHA load balancer, to a miniature MIPS-based platform which forms the ALOHA Pocket. This platform is designed and built by maker GL-Inet and is originally a WiFi router. But it has quite decent specs (2 FastEthernet ports, a 400 MHz 32-bit CPU, 64 MB of RAM, 16 MB of flash, and powered over USB), which are quite sufficient to run an ALOHA. It is also very convenient to manipulate, rock solid and affordable. So it’s perfect to mass-produce an ALOHA that we can send around the world to interested participants. For now we have enough in stock to cover several tens of projects, and we may order new ones if the stock goes away too quickly.

A box of 50 ALOHA Pocket Load balancers!

With this ALOHA Pocket, our goal is very simple: we want to introduce more people to load balancing, because we believe the future is there (load balancing, application delivery, content switching, function chaining, call it what you want, all these concepts are tightly coupled). Thus we are willing to offer a couple of load balancers for free to any student or intern who can describe a project they are working on that involves load balancing, in exchange for their promise to regularly publish their progress, on a blog for example. We’re not going to verify who writes and when (though we welcome links); we bet that most participants will be honest and will respect their commitment. We estimate that the more people who show what can be achieved using a load balancer, the more people will be attracted and will fall into that addiction in turn. It is even possible that some participants will face some bugs or will suggest improvements, and we’re definitely willing to hear about this as well. Don’t be ashamed, suggest and criticize in your articles; we’re not asking people to send us flowers, just to be open! And maybe your comments will make some companies notice your skills and offer you a job after you finish your classes 🙂

This ALOHA Pocket is full-featured. It runs at quite a decent speed (around 450 connections per second with all features turned on, which is more than the vast majority of web sites need). We could not put our anti-DDoS protection, PacketShield, in it because it requires more memory than this device contains. But we don’t consider that this is important for beginners. The devices will be shipped with the latest version of our ALOHA platform, 8.0. We will intentionally not provide frequent software updates because we want to be sure that they will be used for educational purposes only and not to run production! But we’ll issue updates if users are facing bugs, because we want them to learn in comfortable conditions.

Now, if you are interested, please send an email to contact at haproxy dot com with the subject “ALOHA Pocket”. Introduce yourself and what you intend to do (e.g. load balance Apache/Nginx web servers, load balance Postfix mail servers, set up a cloud platform involving HAProxy, design a CDN involving HAProxy and Varnish, etc). Please provide enough details so that we may advise you if we detect some well-known traps in what you’re describing. Please indicate in what context you’re going to do this (at school, at university, for a company), and where you’re going to publish your progress. We don’t ask you to publicly disclose on your blog the name of the organization you’re working for; we know that some large ones still have problems with this. Indicate an estimated time frame for your project, and of course your shipping address. Try to be descriptive, as we consider that people not willing to write a few lines to get two free load balancers do not even deserve a response. By the way, if you believe you already have compatible hardware and you’d only need the software image and the procedure to flash it, contact us as well; we’ll welcome your request, as it means more people will be able to get one.

For logistics reasons, we’re going to send them in batches, so they will take a bit of time to arrive. Please bear with us, we’re handling all this by hand and installing all of them ourselves, just because we believe that this project is cool.

Let’s hope you’ll have as much fun using it as we had creating it 🙂

First HAProxy workshop at Zenika Paris was a success

HAProxy gets more and more contributors. That’s a good thing. There’s a side effect to this, which is that the maintainer (myself) spends quite some time reviewing submissions. I wanted to have the opportunity to exchange with various contributors to give them more autonomy, to present how haproxy works internally, how it’s maintained, what things are acceptable and which ones are not, and more generally to get their feedback as contributors.

Since I had never done this before, I didn’t want to force people to come from far away in case it would be a failure, so I wanted to contact only local contributors for this first round and to hold it in French so that everyone would be totally at ease. A few of them couldn’t attend, but no fewer than 8 people said they would come! Given that our meeting room in Jouy-en-Josas is too small for such a team, we started to consult a few partners. Zenika was kind enough to respond immediately (phone call in the evening, 3 proposals the next morning, who can beat that?).

So Baptiste, Emeric, William, Thierry, Cyril, Christopher, Emmanuel and I met there in one of Zenika’s training rooms in Paris last Friday. The place was obviously much better than our meeting room, large, fully equipped, silent, and we could spend the whole day there chatting and presenting stuff.

I talked a lot. I’m always said to talk a lot anyway, so I guess nobody was surprised. I presented the overall internal architecture. It was not in great detail, but I know the attendees are skilled enough to find their way through the code with these few entry points. What matters to me is that they know where to start from. Emeric talked a bit about the peers protocol. Cyril proposed that the HTML version of the doc be integrated into the official web site instead of as an external link. Then Christopher presented the filters, how they work, and the choices he had to make. William explained some limitations he faced with the current design, and there was a discussion on the best ways to overcome them. In short, some hooks need to be added to the filters, and probably an analyzer mask as well. Then Thierry talked about various stuff such as lunch, Lua, lunch, maps, lunch, stats and how he intends to try to exploit the possibilities offered by the new filters. He also talked about lunch. He explained how he managed to implement some inter-process stats aggregation in Lua, which may deserve a rewrite in C.

It was also interesting to discuss the opportunity to use filters to develop the small stupid RAM-based cache that has been present in the roadmap for a few years (the “favicon cache” as I often call it). Thierry explained his first attempt at doing such a thing in Lua and the shortcomings he faced in part due to the Lua implementation and in part due to the uselessness of such a cache which ignored the Vary header. Also he complained about the limits he reached with such a permissive language when it comes to refactoring some existing code.

Emmanuel explained that for his use case (haproxy serves as an SSL offloader in front of Varnish), even a small object cache would bring very limited benefit and that he would probably not use it this way, as he prefers to use it in plain TCP mode and deal with HTTP at a single place. It was suggested that he run a test with HTTP multiplexing enabled between haproxy and Varnish (possible since 1.6) to estimate any possible performance gains compared to raw TCP. Emmanuel also discussed the possibility of exporting some histogram information for some metrics (e.g. response sizes and times).

The question about how haproxy should make better use of the information it receives from the PROXY protocol header surfaced again, especially regarding SSL this time. It turns out that we almost froze the protocol some time ago and that everyone implemented it as it is specified, while haproxy skips the SSL parts. Something probably needs to be done; how is a different story.

The issue of external library dependencies was brought up, such as Lua 5.3 and SLZ, which are not packaged in mainstream distros. There wasn’t broad support for including them in the source tree; the preference was rather to see them packaged and shipped by distros, even if that’s in unofficial repos.

I explained how I intend to chain two layers of streams belonging to the same session with a protocol converter in the middle to implement HTTP/2 to HTTP/1 gatewaying, and some of the issues that will come from doing this.

We also discussed what is still missing to go multithreaded. In short, still a lot, but good practices are already mandatory if we want to make our life easier in the future.

Interestingly, for most users there, there are almost no more local patches except the usual few things that need to bake a bit before being submitted upstream. This is another proof that we need to make the code even easier to deal with for newcomers, to encourage users to develop their own code and submit it once they feel at ease with it.

Well, at the end of the day everyone seemed very satisfied and expressed interest in doing this again, if possible at the same place (the place is nice and easily accessible, and people were really nice to us).

We learned quite a bit for the next rounds. First, everyone must participate, and it seems that 10 people is the maximum for a workshop. We need to take breaks as well. Next time we do it, we’ll have to be better organized (though everyone was good at improvising). We should prepare some rough presentations and ensure everyone has enough time to present their stuff. It’s also possible that we’d need a first part with everyone and a second part split into small groups by area of interest.

So thanks again to Zenika for helping us set this up, thanks to all participants for coming, now looking forward to doing this again with more people.

ALOHA Pocket is coming…

Well, this project is not really a secret anymore and people are starting to ask about it, so let me present the beast:

This is the ALOHA Pocket. Probably the smallest load balancer you have ever seen from any vendor. It is a full-featured ALOHA with layer 4/7, SSL, VRRP, the complete web interface with templates, the logs… It consumes less than a watt (0.75W to be precise) and is powered over USB. It can run for about ten hours from a single 2200mAh battery. Still, it achieves more than a thousand connections per second and forwards 70 Mbps between the two ports. Yes, this is more than what some applications we’ve seen in the field deliver on huge servers consuming 1000 times this power and running with 4000 times its amount of RAM. This is made possible thanks to our highly optimized, lightweight products which are so energy efficient and need so few resources that they can run on almost anything (and of course, they are magnified when running on powerful hardware).

Obviously nobody wants to run their production on this, it would not look serious! But we found that this is the ideal format to bring your machine everywhere, for demos, for tests, to develop on the train, or even just to tease friends. And it’s so cool that I have several of them on my desk and others in my bag, and I am using them all day for various tests. And while using it I found that it was so much more convenient than a VM when explaining high availability to someone that we realized that it’s the format of choice for students discovering load balancing and high availability. Another nice thing is that since it has two ports, it’s perfect for plugging between your PC and the LAN to observe the HTTP communications between your browser and the application you’re developing.

So we decided to prepare one hundred of them that we’ll offer to students and interns working on a load balancing project, in exchange for their promise to blog about their project’s progress.  If they need we can even send them a cluster of two.  And who knows, maybe among these, someone will have a great idea and develop a worldwide successful project, and then we’ll be very proud to have provided the initial spark that made this possible. And if it helps students get a career around load balancing, we’ll be quite proud to transmit this passion as well!

We still have a few things to complete before it can go wild, such as a bit of documentation to explain how to start with it. But if you think you’re going to work on a load balancing project, or are joining a company as an intern and will be doing some stuff with web servers, this can be the perfect way to discover this new amazing world and to design solutions which withstand real failures caused by pulling off a cable, and not just the clean “power down” button pressed in a VM. Start thinking about it to reserve one (or a pair) when we launch it in the upcoming weeks. Conversely, if you absolutely want one, you just have to find a load balancing project to work on 🙂

In any case, don’t wait too long to think about your project, because the stock is limited, so if there is too much demand, we’ll have to be selective about the projects we’re going to support with the last ones.

Stay tuned!