HAProxy advanced Redis health check

Introduction

Redis is an open source NoSQL database working on a key/value model.
One interesting feature of Redis is that it can persist data to disk, and a master can replicate its data to many slaves.

HAProxy can load-balance Redis servers with no issues at all.
There is even a built-in health check for redis in HAProxy.
Unfortunately, there was no easy way for HAProxy to detect the status of a redis server: master or slave node. Hence, people usually had to hack around this part of the architecture.

As the title of this post says, we’ll learn today how to build a simple Redis infrastructure thanks to the newest HAProxy advanced send/expect health checks.
This feature is available in HAProxy 1.5-dev20 and above.

The purpose is to keep the Redis infrastructure as simple as possible and to ease failover for the web servers. HAProxy has to detect which node is the MASTER and route all connections to it.

Redis high availability diagram with HAProxy

Below is an ASCII art diagram of HAProxy load-balancing Redis servers:

+----+ +----+ +----+ +----+
| W1 | | W2 | | W3 | | W4 |   Web application servers
+----+ +----+ +----+ +----+
    \     |   |     /
     \    |   |    /
      \   |   |   /
        +---------+
        | HAProxy |
        +---------+
           /   \
       +----+ +----+
       | R1 | | R2 |           Redis servers
       +----+ +----+

The scenario is simple:
  * 4 web application servers need to store and retrieve data to/from a Redis database
  * one HAProxy server (two would be better) which load-balances redis connections
  * 2 (at least) redis servers in an active/standby mode with replication

Configuration

Below is the HAProxy configuration for the architecture described above:

defaults REDIS
 mode tcp
 timeout connect  4s
 timeout server  30s
 timeout client  30s

frontend ft_redis
 bind 10.0.0.1:6379 name redis
 default_backend bk_redis

backend bk_redis
 option tcp-check
 tcp-check send PING\r\n
 tcp-check expect string +PONG
 tcp-check send info replication\r\n
 tcp-check expect string role:master
 tcp-check send QUIT\r\n
 tcp-check expect string +OK
 server R1 10.0.0.11:6379 check inter 1s
 server R2 10.0.0.12:6379 check inter 1s

The health check sequence above allows HAProxy to consider only the Redis master server as UP in the farm and to redirect connections to it.
When the Redis master server fails, the remaining nodes elect a new one. HAProxy will detect it thanks to its health check sequence.
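
You can reproduce this dialogue by hand to see the exact strings HAProxy matches against. Below is a minimal sketch using netcat, assuming 10.0.0.11 is the master from the configuration above:

 # Send the same CRLF-terminated probes the health check sends:
 printf 'PING\r\nINFO replication\r\nQUIT\r\n' | nc 10.0.0.11 6379
 # On the master, the reply contains the fragments the checks expect:
 #   +PONG
 #   role:master
 #   +OK
 # A slave answers with 'role:slave' instead, so it is marked DOWN and
 # never receives traffic.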

This setup does not require any third-party tools and makes failover transparent.


13 Responses to HAProxy advanced Redis health check

  1. Koen says:

    You are missing the “tcp-check connect” statement that makes the connection. Took me a few minutes to figure that out.

  2. whaa says:

    funny how a haproxy 1.5.22 on ubuntu 12.04 worked -without- tcp-check connect, and the same haproxy on a centos 6.5 didn’t work without tcp-check connect.

    thanks for the tip.
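
    For reference, here is a minimal sketch of the backend from the article with the explicit connect added; everything else is unchanged:

    backend bk_redis
     option tcp-check
     tcp-check connect
     tcp-check send PING\r\n
     tcp-check expect string +PONG
     tcp-check send info replication\r\n
     tcp-check expect string role:master
     tcp-check send QUIT\r\n
     tcp-check expect string +OK
     server R1 10.0.0.11:6379 check inter 1s
     server R2 10.0.0.12:6379 check inter 1s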

  3. adichiru says:

    Hi,

    I have been testing a very similar scenario for quite a while and there is a problem that haproxy needs to handle properly. The main idea is that haproxy detects redis masters by querying each back-end server, which is expected for load-balancing but at the same time problematic for HA in the special case of redis. Redis has a master-slave sync option which, combined with Sentinel, works pretty well. However, haproxy does not get the current master details from sentinel, so if a net partition occurs, a slave is promoted to master while the old master is isolated; if the old master comes back, it comes back as master and haproxy will see it as a valid back-end, so it will send queries to it for a few good seconds until sentinel reconfigures it to be a slave.

    I believe this can be “easily” solved by making haproxy get the ip and port of the current master from sentinel, since sentinel is the authority in that redis infrastructure. Just playing with inter and rise to give sentinel enough time to fix this “two masters at the same time” problem is not reliable and inserts a huge delay into the fail-over scenario.

    What do you think about this? Is there something I am missing?

    • Serge says:

      Hi adichiru,

      You raise a valid point. I have been experiencing the same issue. Were you able to configure the haproxy check successfully with sentinel?

    • Ivan says:

      Thanks Adichiru,
      A very valid point; split brain is a nightmare, and you may lose a significant number of writes. Here we are facing the same problem. We could not find a solution in HAProxy, so we are asking sentinel directly for information about the master, which works ok if you have a smart client like Jedis in Java.
      Has anyone found a solution?

  4. Pingback: binary health check with HAProxy 1.5: php-fpm/fastcgi probe example | HAProxy Technologies – Aloha Load Balancer

  5. Kumar says:

    Adichiru brings up a valid concern. An option that we are experimenting with for such cases is to check for number of masters in the backend pool and do a tcp-request reject if it is not equal to one. I think this should work in most cases, as the safest choice in split brain situation is to not write anything to redis. If you do continue to write, it is hard to avoid data loss.

    For the above configuration, the required tweak would be:
    frontend ft_redis
     bind 10.0.0.1:6379 name redis
     acl single_master nbsrv(bk_redis) eq 1
     tcp-request connection reject if !single_master
     use_backend bk_redis if single_master

  6. James says:

    ## Check 3 sentinels to see if they think redisA is master
    backend check_master_redisA
    mode tcp
    option tcp-check
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    tcp-check send SENTINEL master myharedis\r\n
    tcp-check expect string 192.168.1.13
    tcp-check send QUIT\r\n
    tcp-check expect string +OK

    server sentinelA 192.168.1.10:26379 check inter 2s
    server sentinelB 192.168.1.11:26379 check inter 2s
    server sentinelC 192.168.1.12:26379 check inter 2s

    ## Check 3 sentinels to see if they think redisB is master
    backend check_master_redisB
    mode tcp
    option tcp-check
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    tcp-check send SENTINEL master myharedis\r\n
    tcp-check expect string 192.168.1.14
    tcp-check send QUIT\r\n
    tcp-check expect string +OK

    server sentinelA 192.168.1.10:26379 check inter 2s
    server sentinelB 192.168.1.11:26379 check inter 2s
    server sentinelC 192.168.1.12:26379 check inter 2s

    ## Check redisA to see if it thinks it is master
    backend redisA
    mode tcp
    option tcp-check
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    tcp-check send info replication\r\n
    tcp-check expect string role:master
    tcp-check send QUIT\r\n
    tcp-check expect string +OK

    server redisA 192.168.1.13:6379 check inter 2s

    ## Check redisB to see if it thinks it is master
    backend redisB
    mode tcp
    option tcp-check
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    tcp-check send info replication\r\n
    tcp-check expect string role:master
    tcp-check send QUIT\r\n
    tcp-check expect string +OK

    server redisB 192.168.1.14:6379 check inter 2s

    ## If at least 2 sentinels agree with the redis host that it is master, use it.
    listen redis_master :6379
    mode tcp
    use_backend redisA if { srv_is_up(redisA/redisA) } { nbsrv(check_master_redisA) ge 2 }
    use_backend redisB if { srv_is_up(redisB/redisB) } { nbsrv(check_master_redisB) ge 2 }

    • Hi James,

      Thanks for the tip about sentinel.
      I’ll write later a full article about redis/sentinel using your example.

      Baptiste

      • Mulloy says:

        Have you written this full article about redis/sentinel? I have a similar problem currently, with the added complexity that I’m deploying a redis/sentinel cluster using kubernetes, in which IPs are dynamic. I have a Sentinel service to discover Redis masters. However, is it possible to dynamically update the HAProxy configs with a new master IP?
        e.g. src/redis-cli -h kubelb.host.name.com -p 26379 SENTINEL masters

  7. Rich says:

    It appears that ‘info replication’ blocks while the master is saving, which can cause your HAProxy to mark the master as offline if you have a large dataset (there’s about 21GB in this one).

    redis-server.log (2.6.13) :

    [2102] 02 Apr 11:23:19.087 * 10 changes in 300 seconds. Saving…
    [2102] 02 Apr 11:23:26.915 * Background saving started by pid 30215

    Redis client :

    $ date; time echo -ne "INFO REPLICATION\r\nQUIT\r\n" | nc redis.master.host 6379
    Thu Apr 2 11:23:19 UTC 2015
    $116
    # Replication
    role:master
    connected_slaves:2

    +OK

    real 0m7.382s
    user 0m0.000s
    sys 0m0.008s

    Note that the command took 7 seconds, which is also the difference between the two redis log lines.

    I’m not saying that this is a bad way of doing automatic master failover, but it’s just something to be aware of: your HAProxy checks need to be tuned to cope with how long your bgsave fork takes.
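
    For illustration, here is a sketch of how the check timing could be relaxed so that a slow ‘info replication’ during a BGSAVE does not flag the master DOWN; the values are arbitrary examples and need to be sized to your own fork time:

    backend bk_redis
     option tcp-check
     timeout check 10s    # give 'info replication' time to answer during a BGSAVE fork
     tcp-check send PING\r\n
     tcp-check expect string +PONG
     tcp-check send info replication\r\n
     tcp-check expect string role:master
     tcp-check send QUIT\r\n
     tcp-check expect string +OK
     server R1 10.0.0.11:6379 check inter 1s fall 5
     server R2 10.0.0.12:6379 check inter 1s fall 5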
