It appears this configuration wasn’t working with HAProxy version 1.3.20 due to the “option nolinger” feature. I have removed it from the configuration and can confirm it works well with HAProxy v.1.3.15 to v.1.3.20. Because of this, you’ll also notice a significant increase in TIME_WAIT sessions, as well as ip_conntrack_count increasing from ~150 to ~925.

This post summarizes my reflections on failover, redundancy, and ultimately scaling MySQL databases using load-balancing software known as HAProxy.

haproxy-01

At my current employer, we have been using HAProxy to build very simple server clusters to help clients scale their databases. It works for most people assuming their application:

  • Has an acceptable ratio of reads/writes (i.e: 100:1)
  • Can separate reads and writes at the application level

If your read/write ratio is lower, that’s when you need to look into different scaling solutions such as sharding.

I’ve designed a slightly more complex HAProxy configuration file which load-balances requests to MySQL databases. It detects failures such as broken replication and offline servers, and adjusts the availability of servers accordingly.

Each database server is running an xinetd daemon. Port 9201 is used to monitor replication and port 9200 is used to monitor mysql status. These ports are monitored by HAProxy as you will see in the configuration file below.

HAProxy backend to monitor replication

  1. backend db01_replication  #第128行
  2.     mode tcp  
  3.     balance roundrobin  
  4.     option tcpka  
  5.     option httpchk  
  6.     server db01 172.16.0.60:3306 check port 9201 inter 1s rise 1 fall 1 

HAProxy backend to monitor mysql status

  1. backend db01_status  #第168行
  2.     mode tcp  
  3.     balance roundrobin  
  4.     option tcpka  
  5.     option httpchk  
  6.     server db01 172.16.0.60:3306 check port 9200 inter 1s rise 2 fall 2 

I modified the mysqlchk_status.sh script found at SysBible with my own.

The mysqlchk_replication.sh script is similar to the one above, except it checks a few other variables such as Slave_IO_Running, Slave_SQL_Running and Seconds Behind Master. Success will always return a ‘200 OK’ and failures will always return a ‘503 Service Unavailable’.

servers-01My test setup

  • 2 HAProxy load-balancers in Active-Passive mode (VRRP using Keepalived)
  • 2 MySQL database servers with Master-Master replication in Active-Passive mode
  • 3 MySQL database servers with slave replication (read-only)

Failure scenarios

Based on a small set of failure scenarios, we’re able to determine how the load balancers should direct traffic. We obviously don’t want read requests from a database server who’s not replicating its master. We also don’t want to send writes to a server who’s offline. The examples below describe how HAProxy will react in those scenarios.

1. Replication breaks, lags, or stops working on DB02

haproxy-02

servers-02

  • DB01 becomes the master database server.
  • HAProxy stops sending requests to DB02 and DB05 (its slave).
  • Despite this, DB01 and DB05 are still able to receive replicated data from DB02.

2. Replication breaks, lags, or stops working on DB01

haproxy-03

servers-03

  • DB02 becomes the master database server.
  • HAProxy stops sending requests to DB01, DB03 and DB04 (its slaves).
  • Despite this, DB02, DB03 and DB04 are still able to receive replicated data from DB01.

3. Replication breaks, lags, or stops working on DB01 & DB02

haproxy-04

servers-04

  • There is no writable master database server. Service is severely degraded and action should be taken to bring one master server back into replication.
  • This is a split-brain problem. Both servers are online, but they aren’t replicating each other.
  • HAProxy only sends read requests to DB01 and DB02
  • HAProxy stops sending requests to DB03, DB04 and DB05 (the slaves).
  • Despite this, DB03 and DB04 are still able to receive replicated data from DB01.
  • Despite this, DB05 is still able to receive replicated data from DB02.

4. DB02 is offline, due to a server crash or something similar

haproxy-05

servers-05

  • DB01 becomes the master database server.
  • HAProxy stops sending requests to DB02 and DB05 (its slave).
  • DB05 can’t receive replicated data from DB02.
  • DB01 goes into backup mode which can have different settings to support more concurrency, send alerts, etc.

5. DB01 is offline, due to a server crash or something similar

haproxy-06

servers-06

  • DB02 becomes the master database server.
  • HAProxy stops sending requests to DB01, DB03 and DB04 (its slaves).
  • DB03 and DB04 can’t receive replicated data from DB01.
  • DB02 goes into backup mode which can have different settings to support more concurrency, send alerts, etc.

6. DB01 and DB02 are offline, due to a server crash or something similar

haproxy-07

servers-07

  • There is no master database server.
  • HAProxy stops sending requests to all DB servers.
  • Call your sysadmin because your website is probably down.

Download

WARNING / DISCLAIMER
This configuration has not been tested in a production environment and should be used at your own risk.

Here are the scripts and config files, or scroll down to view the code:

The xinetd config file

  1. #  
  2. # /etc/xinetd.d/mysqlchk  
  3. #  
  4. service mysqlchk_write  
  5. {  
  6.         flags           = REUSE 
  7.         socket_type     = stream 
  8.         port            = 9200 
  9.         wait            = no 
  10.         user            = nobody 
  11.         server          = /opt/mysqlchk_status.sh  
  12.         log_on_failure  += USERID  
  13.         disable         = no 
  14.         only_from       = 172.16.0.0/24 # recommended to put the IPs that need  
  15.                                     # to connect exclusively (security purposes)  
  16. }  
  17.    
  18. service mysqlchk_read  
  19. {  
  20.         flags           = REUSE 
  21.         socket_type     = stream 
  22.         port            = 9201 
  23.         wait            = no 
  24.         user            = nobody 
  25.         server          = /opt/mysqlchk_replication.sh  
  26.         log_on_failure  += USERID  
  27.         disable         = no 
  28.         only_from       = 172.16.0.0/24 # recommended to put the IPs that need  
  29.                                     # to connect exclusively (security purposes)  

The mysqlchk_status script

  1. #!/bin/bash  
  2. #  
  3. # /opt/mysqlchk_status.sh  
  4. #  
  5. # This script checks if a mysql server is healthy running on localhost. It will  
  6. # return:  
  7. #  
  8. # "HTTP/1.x 200 OK\r" (if mysql is running smoothly)  
  9. #  
  10. # - OR -  
  11. #  
  12. # "HTTP/1.x 500 Internal Server Error\r" (else)  
  13. #  
  14. # The purpose of this script is make haproxy capable of monitoring mysql properly  
  15. #  
  16. # Author: Unai Rodriguez  
  17. #  
  18. # It is recommended that a low-privileged-mysql user is created to be used by  
  19. # this script. Something like this:  
  20. #  
  21. # mysql> GRANT SELECT on mysql.* TO 'mysqlchkusr'@'localhost' \  
  22. #     -> IDENTIFIED BY '257retfg2uysg218' WITH GRANT OPTION;  
  23. # mysql> flush privileges;  
  24. #  
  25. # Script modified by Alex Williams - August 4, 2009  
  26. #       - removed the need to write to a tmp file, instead store results in memory  
  27.    
  28. MYSQL_HOST="172.16.0.60" 
  29. MYSQL_PORT="3306" 
  30. MYSQL_USERNAME="replication_user" 
  31. MYSQL_PASSWORD="replication_pass" 
  32.    
  33. #  
  34. # We perform a simple query that should return a few results :-p  
  35.    
  36. ERROR_MSG=`/usr/bin/mysql --host=$MYSQL_HOST --port=$MYSQL_PORT --user=$MYSQL_USERNAME --password=$MYSQL_PASSWORD -e "show databases;" 2>/dev/null`  
  37.    
  38. #  
  39. # Check the output. If it is not empty then everything is fine and we return  
  40. # something. Else, we just do not return anything.  
  41. #  
  42. if [ "$ERROR_MSG" != "" ]  
  43. then  
  44.         # mysql is fine, return http 200  
  45.         /bin/echo -e "HTTP/1.1 200 OK\r\n"  
  46.         /bin/echo -e "Content-Type: Content-Type: text/plain\r\n"  
  47.         /bin/echo -e "\r\n"  
  48.         /bin/echo -e "MySQL is running.\r\n"  
  49.         /bin/echo -e "\r\n"  
  50. else  
  51.         # mysql is fine, return http 503  
  52.         /bin/echo -e "HTTP/1.1 503 Service Unavailable\r\n"  
  53.         /bin/echo -e "Content-Type: Content-Type: text/plain\r\n"  
  54.         /bin/echo -e "\r\n"  
  55.         /bin/echo -e "MySQL is *down*.\r\n"  
  56.         /bin/echo -e "\r\n"  
  57. fi 

The HAProxy config file

WARNING / DISCLAIMER
This configuration has not been tested in a production environment and should be used at your own risk.

  1. configuration - haproxy-db.cfg  
  2.    
  3. ##  
  4. ## FRONTEND ##  
  5. ##  
  6.    
  7. # Load-balanced IPs for DB writes and reads  
  8. #  
  9. frontend db_write  
  10.     bind 172.16.0.50:3306  
  11.     default_backend cluster_db_write  
  12.    
  13. frontend db_read  
  14.     bind 172.16.0.51:3306  
  15.     default_backend cluster_db_read  
  16.    
  17. # Monitor DB server availability  
  18. #  
  19. frontend monitor_db01  
  20.     #  
  21.     # set db01_backup to 'up' or 'down'  
  22.     #  
  23.     bind 127.0.0.1:9301  
  24.     mode http  
  25.     #option nolinger  
  26.    
  27.     acl no_repl_db01 nbsrv(db01_replication) eq 0  
  28.     acl no_repl_db02 nbsrv(db02_replication) eq 0  
  29.     acl no_db01 nbsrv(db01_status) eq 0  
  30.     acl no_db02 nbsrv(db02_status) eq 0  
  31.    
  32.     monitor-uri /dbs  
  33.     monitor fail unless no_repl_db01 no_repl_db02 no_db02  
  34.     monitor fail if no_db01 no_db02  
  35.    
  36. frontend monitor_db02  
  37.     #  
  38.     # set db02_backup to 'up' or 'down'  
  39.     #  
  40.     bind 127.0.0.1:9302  
  41.     mode http  
  42.     #option nolinger  
  43.    
  44.     acl no_repl_db01 nbsrv(db01_replication) eq 0  
  45.     acl no_repl_db02 nbsrv(db02_replication) eq 0  
  46.     acl no_db01 nbsrv(db01_status) eq 0  
  47.     acl no_db02 nbsrv(db02_status) eq 0  
  48.    
  49.     monitor-uri /dbs  
  50.     monitor fail unless no_repl_db01 no_repl_db02 no_db01  
  51.     monitor fail if no_db01 no_db02  
  52.    
  53. frontend monitor_db03  
  54.     #  
  55.     # set db03 read-only slave to 'down'  
  56.     #  
  57.     bind 127.0.0.1:9303  
  58.     mode http  
  59.     #option nolinger  
  60.    
  61.     acl no_repl_db03 nbsrv(db03_replication) eq 0  
  62.     acl no_repl_db01 nbsrv(db01_replication) eq 0  
  63.     acl db02 nbsrv(db02_status) eq 1  
  64.    
  65.     monitor-uri /dbs  
  66.     monitor fail if no_repl_db03  
  67.     monitor fail if no_repl_db01 db02  
  68.    
  69. frontend monitor_db04  
  70.     #  
  71.     # set db04 read-only slave to 'down'  
  72.     #  
  73.     bind 127.0.0.1:9304  
  74.     mode http  
  75.     #option nolinger  
  76.    
  77.     acl no_repl_db04 nbsrv(db04_replication) eq 0  
  78.     acl no_repl_db01 nbsrv(db01_replication) eq 0  
  79.     acl db02 nbsrv(db02_status) eq 1  
  80.    
  81.     monitor-uri /dbs  
  82.     monitor fail if no_repl_db04  
  83.     monitor fail if no_repl_db01 db02  
  84.    
  85. frontend monitor_db05  
  86.     #  
  87.     # set db05 read-only slave to 'down'  
  88.     #  
  89.     bind 127.0.0.1:9305  
  90.     mode http  
  91.     #option nolinger  
  92.    
  93.     acl no_repl_db05 nbsrv(db05_replication) eq 0  
  94.     acl no_repl_db02 nbsrv(db02_replication) eq 0  
  95.     acl db01 nbsrv(db01_status) eq 1  
  96.    
  97.     monitor-uri /dbs  
  98.     monitor fail if no_repl_db05  
  99.     monitor fail if no_repl_db02 db01  
  100.    
  101. # Monitor for split-brain syndrome  
  102. #  
  103. frontend monitor_splitbrain  
  104.     #  
  105.     # set db01_splitbrain and db02_splitbrain to 'up'  
  106.     #  
  107.     bind 127.0.0.1:9300  
  108.     mode http  
  109.     #option nolinger  
  110.    
  111.     acl no_repl01 nbsrv(db01_replication) eq 0  
  112.     acl no_repl02 nbsrv(db02_replication) eq 0  
  113.     acl db01 nbsrv(db01_status) eq 1  
  114.     acl db02 nbsrv(db02_status) eq 1  
  115.    
  116.     monitor-uri /dbs  
  117.     monitor fail unless no_repl01 no_repl02 db01 db02  
  118.    
  119. ##  
  120. ## BACKEND ##  
  121. ##  
  122.    
  123. # Check every DB server replication status  
  124. #   - perform an http check on port 9201 (replication status)  
  125. #   - set to 'down' if response is '503 Service Unavailable'  
  126. #   - set to 'up' if response is '200 OK'  
  127. #  
  128. backend db01_replication  
  129.     mode tcp  
  130.     balance roundrobin  
  131.     option tcpka  
  132.     option httpchk  
  133.     server db01 172.16.0.60:3306 check port 9201 inter 1s rise 1 fall 1  
  134.    
  135. backend db02_replication  
  136.     mode tcp  
  137.     balance roundrobin  
  138.     option tcpka  
  139.     option httpchk  
  140.     server db02 172.16.0.61:3306 check port 9201 inter 1s rise 1 fall 1  
  141.    
  142. backend db03_replication  
  143.     mode tcp  
  144.     balance roundrobin  
  145.     option tcpka  
  146.     option httpchk  
  147.     server db03 172.16.0.63:3306 check port 9201 inter 1s rise 1 fall 1  
  148.    
  149. backend db04_replication  
  150.     mode tcp  
  151.     balance roundrobin  
  152.     option tcpka  
  153.     option httpchk  
  154.     server db04 172.16.0.64:3306 check port 9201 inter 1s rise 1 fall 1  
  155.    
  156. backend db05_replication  
  157.     mode tcp  
  158.     balance roundrobin  
  159.     option tcpka  
  160.     option httpchk  
  161.     server db05 172.16.0.65:3306 check port 9201 inter 1s rise 1 fall 1  
  162.    
  163. # Check Master DB server mysql status  
  164. #   - perform an http check on port 9200 (mysql status)  
  165. #   - set to 'down' if response is '503 Service Unavailable'  
  166. #   - set to 'up' if response is '200 OK'  
  167. #  
  168. backend db01_status  
  169.     mode tcp  
  170.     balance roundrobin  
  171.     option tcpka  
  172.     option httpchk  
  173.     server db01 172.16.0.60:3306 check port 9200 inter 1s rise 2 fall 2  
  174.    
  175. backend db02_status  
  176.     mode tcp  
  177.     balance roundrobin  
  178.     option tcpka  
  179.     option httpchk  
  180.     server db02 172.16.0.61:3306 check port 9200 inter 1s rise 2 fall 2  
  181.    
  182. # DB write cluster  
  183. #   Failure scenarios:  
  184. #   - replication 'up' on db01 & db02   = writes to db01  
  185. #   - replication 'down' on db02        = writes to db01  
  186. #   - replication 'down' on db01        = writes to db02  
  187. #   - replication 'down' on db01 & db02 = go nowhere, split-brain, cluster FAIL!  
  188. #   - mysql 'down' on db02              = writes to db01_backup  
  189. #   - mysql 'down' on db01              = writes to db02_backup  
  190. #   - mysql 'down' on db01 & db02       = go nowhere, cluster FAIL!  
  191. #  
  192. backend cluster_db_write  
  193.     #  
  194.     # - max 1 db server available at all times  
  195.     # - db01 is preferred (top of list)  
  196.     # - db_backups set their 'up' or 'down' based on results from monitor_dbs  
  197.     #  
  198.     mode    tcp  
  199.     option  tcpka  
  200.     balance roundrobin  
  201.     option  httpchk GET /dbs  
  202.     server  db01 172.16.0.60:3306 weight 1 check port 9201 inter 1s rise 2 fall 1  
  203.     server  db02 172.16.0.61:3306 weight 1 check port 9201 inter 1s rise 2 fall 1 backup  
  204.     server  db01_backup 172.16.0.60:3306 weight 1 check port 9301 inter 1s rise 2 fall 2 addr 127.0.0.1 backup  
  205.     server  db02_backup 172.16.0.61:3306 weight 1 check port 9302 inter 1s rise 2 fall 2 addr 127.0.0.1 backup  
  206.    
  207. # DB read cluster  
  208. #   Failure scenarios  
  209. #   - replication 'up' on db01 & db02   = reads on db01, db02, all db_slaves  
  210. #   - replication 'down' on db02        = reads on db01, slaves of db01  
  211. #   - replication 'down' on db01        = reads on db02, slaves of db02  
  212. #   - replication 'down' on db01 & db02 = reads on db01_splitbrain and db01_splitbrain only  
  213. #   - mysql 'down' on db02              = reads on db01_backup, slaves of db01  
  214. #   - mysql 'down' on db01              = reads on db02_backup, slaves of db02  
  215. #   - mysql 'down' on db01 & db02       = go nowhere, cluster FAIL!  
  216. #  
  217. backend cluster_db_read  
  218.     #  
  219.     # - max 2 master db servers available at all times  
  220.     # - max N slave db servers available at all times except during split-brain  
  221.     # - dbs track 'up' and 'down' of dbs in the cluster_db_write  
  222.     # - db_backups track 'up' and 'down' of db_backups in the cluster_db_write  
  223.     # - db_splitbrains set their 'up' or 'down' based on results from monitor_splitbrain  
  224.     #  
  225.     mode    tcp  
  226.     option  tcpka  
  227.     balance roundrobin  
  228.     option  httpchk GET /dbs  
  229.     server  db01 172.16.0.60:3306 weight 1 track cluster_db_write/db01  
  230.     server  db02 172.16.0.61:3306 weight 1 track cluster_db_write/db02  
  231.     server  db01_backup 172.16.0.60:3306 weight 1 track cluster_db_write/db01_backup  
  232.     server  db02_backup 172.16.0.61:3306 weight 1 track cluster_db_write/db02_backup  
  233.     server  db01_splitbrain 172.16.0.60:3306 weight 1 check port 9300 inter 1s rise 1 fall 2 addr 127.0.0.1  
  234.     server  db02_splitbrain 172.16.0.61:3306 weight 1 check port 9300 inter 1s rise 1 fall 2 addr 127.0.0.1  
  235.     #  
  236.     #   Scaling & redundancy options  
  237.     #   - db_slaves set their 'up' or 'down' based on results from monitor_dbs  
  238.     #   - db_slaves should take longer to rise  
  239.     #  
  240.     server  db03_slave 172.16.0.63:3306 weight 1 check port 9303 inter 1s rise 5 fall 1 addr 127.0.0.1  
  241.     server  db04_slave 172.16.0.64:3306 weight 1 check port 9304 inter 1s rise 5 fall 1 addr 127.0.0.1  
  242.     server  db05_slave 172.16.0.65:3306 weight 1 check port 9305 inter 1s rise 5 fall 1 addr 127.0.0.1 
此文章由 flyinweb 于 2009-10-14 11:46:32 编辑

本日志由 flyinweb 于 2009-09-21 10:34:18 发表,目前已经被浏览 4961 次,评论 0 次;

作者添加了以下标签: HAProxyMySQL

引用通告:http://www.517sou.net/Article/249/Trackback.ashx

评论订阅:http://www.517sou.net/Article/249/Feeds.ashx

评论列表

    暂时没有评论
(必填)
(必填,不会被公开)