HAProxy - server goes UP without tcp-check when it resolves again
I have a problem with a haproxy instance (1.9.4) in front of a redis cluster (3 nodes), all running inside k8s.
I configured haproxy with a tcp-check as follows:
backend bk_redis
    option tcp-check
    tcp-check send AUTH\ RedisTest\r\n
    tcp-check expect string +OK
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    tcp-check send info\ replication\r\n
    tcp-check expect string role:master
    tcp-check send QUIT\r\n
    tcp-check expect string +OK
    default-server check resolvers kubedns inter 1s downinter 1s fastinter 1s fall 1 rise 30 maxconn 330 no-agent-check on-error mark-down
    server redis-0 redis-ha-server-0.redis-ha.redis-ha.svc.cluster.local:6379
    server redis-1 redis-ha-server-1.redis-ha.redis-ha.svc.cluster.local:6379
    server redis-2 redis-ha-server-2.redis-ha.redis-ha.svc.cluster.local:6379
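The `kubedns` resolvers section referenced by `default-server` is not included above. For context, here is a minimal sketch of what such a section might look like. The nameserver name `namesrv1` comes from the log below; the address 10.43.0.10:53 and all timeout and hold values are placeholders and assumptions, not taken from the real setup.

```
resolvers kubedns
    # Placeholder address (assumption): replace with the cluster DNS service IP
    nameserver namesrv1 10.43.0.10:53
    resolve_retries 3
    timeout resolve 1s
    timeout retry   1s
    # How long the last answer of each kind is kept before it is considered stale
    # (illustrative values only)
    hold valid    10s
    hold nx       30s
    hold other    30s
    hold refused  30s
    hold timeout  30s
    hold obsolete 30s
```

The `hold nx` period is likely the relevant one here: as far as I understand, it controls how long haproxy keeps the last known address after NXDOMAIN answers before the server is put into maintenance, which matches the "DNS NX status" line in the log.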
When the master node goes down, failover works fine: a replica is promoted to master and haproxy redirects traffic to it. The problem appears when the old master comes back with a new IP: haproxy does not check its role again and instead immediately marks the old node as UP.
This is the log:
[NOTICE] 058/125637 (1) : New worker #1 (6) forked
[WARNING] 058/125637 (6) : Health check for server bk_redis/redis-0 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 0ms, status: 1/1 UP.
[WARNING] 058/125639 (6) : Health check for server bk_redis/redis-1 failed, reason: Layer7 timeout, info: " at step 6 of tcp-check (expect string 'role:master')", check duration: 1001ms, status: 0/30 DOWN.
[WARNING] 058/125639 (6) : Server bk_redis/redis-1 is DOWN. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 058/125639 (6) : Health check for server bk_redis/redis-2 failed, reason: Layer7 timeout, info: " at step 6 of tcp-check (expect string 'role:master')", check duration: 1001ms, status: 0/30 DOWN.
[WARNING] 058/125639 (6) : Server bk_redis/redis-2 is DOWN. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 058/125657 (6) : Health check for server bk_redis/redis-0 failed, reason: Layer4 timeout, info: " at step 1 of tcp-check (send)", check duration: 1001ms, status: 0/30 DOWN.
[WARNING] 058/125657 (6) : Server bk_redis/redis-0 is DOWN. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 058/125657 (6) : backend 'bk_redis' has no server available!
[WARNING] 058/125706 (6) : Health check for server bk_redis/redis-2 failed, reason: Layer7 invalid response, info: "TCPCHK did not match content 'role:master' at step 6", check duration: 532ms, status: 0/30 DOWN.
[WARNING] 058/125706 (6) : Health check for server bk_redis/redis-1 failed, reason: Layer7 invalid response, info: "TCPCHK did not match content 'role:master' at step 6", check duration: 835ms, status: 0/30 DOWN.
[WARNING] 058/125707 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 2ms, status: 1/30 DOWN.
[WARNING] 058/125708 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 2ms, status: 2/30 DOWN.
[WARNING] 058/125708 (6) : Health check for server bk_redis/redis-1 failed, reason: Layer7 timeout, info: " at step 6 of tcp-check (expect string 'role:master')", check duration: 1001ms, status: 0/30 DOWN.
[WARNING] 058/125709 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 4ms, status: 3/30 DOWN.
[WARNING] 058/125710 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 4ms, status: 4/30 DOWN.
[WARNING] 058/125711 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 2ms, status: 5/30 DOWN.
[WARNING] 058/125712 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 2ms, status: 6/30 DOWN.
[WARNING] 058/125713 (6) : Server bk_redis/redis-0 was DOWN and now enters maintenance (DNS NX status).
[WARNING] 058/125713 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 2ms, status: 7/30 DOWN.
[WARNING] 058/125714 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 3ms, status: 8/30 DOWN.
[WARNING] 058/125715 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 3ms, status: 9/30 DOWN.
[WARNING] 058/125716 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 4ms, status: 10/30 DOWN.
[WARNING] 058/125717 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 3ms, status: 11/30 DOWN.
[WARNING] 058/125718 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 4ms, status: 12/30 DOWN.
[WARNING] 058/125719 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 2ms, status: 13/30 DOWN.
[WARNING] 058/125720 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 2ms, status: 14/30 DOWN.
[WARNING] 058/125721 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 2ms, status: 15/30 DOWN.
[WARNING] 058/125722 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 3ms, status: 16/30 DOWN.
[WARNING] 058/125723 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 3ms, status: 17/30 DOWN.
[WARNING] 058/125724 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 4ms, status: 18/30 DOWN.
[WARNING] 058/125725 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 3ms, status: 19/30 DOWN.
[WARNING] 058/125726 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 2ms, status: 20/30 DOWN.
[WARNING] 058/125727 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 2ms, status: 21/30 DOWN.
[WARNING] 058/125728 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 3ms, status: 22/30 DOWN.
[WARNING] 058/125729 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 3ms, status: 23/30 DOWN.
[WARNING] 058/125730 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 4ms, status: 24/30 DOWN.
[WARNING] 058/125731 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 4ms, status: 25/30 DOWN.
[WARNING] 058/125732 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 3ms, status: 26/30 DOWN.
[WARNING] 058/125733 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 3ms, status: 27/30 DOWN.
[WARNING] 058/125734 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 2ms, status: 28/30 DOWN.
[WARNING] 058/125735 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 2ms, status: 29/30 DOWN.
[WARNING] 058/125736 (6) : Health check for server bk_redis/redis-2 succeeded, reason: Layer7 check passed, code: 0, info: "(tcp-check)", check duration: 1ms, status: 1/1 UP.
[WARNING] 058/125736 (6) : Server bk_redis/redis-2 is UP. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
[WARNING] 058/125945 (6) : bk_redis/redis-0 changed its IP from 10.42.4.85 to 10.42.4.87 by kubedns/namesrv1.
[WARNING] 058/125945 (6) : Server bk_redis/redis-0 ('redis-ha-server-0.redis-ha.redis-ha.svc.cluster.local') is UP/READY (resolves again).
[WARNING] 058/125945 (6) : Server bk_redis/redis-0 administratively READY thanks to valid DNS answer.
[WARNING] 058/125947 (6) : Health check for server bk_redis/redis-0 failed, reason: Layer7 timeout, info: " at step 6 of tcp-check (expect string 'role:master')", check duration: 1000ms, status: 0/30 DOWN.
[WARNING] 058/125947 (6) : Server bk_redis/redis-0 is DOWN. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
If you look at the last lines, when bk_redis/redis-0 gets its new IP (but it WAS DOWN), it immediately goes UP without any tcp-check being run (the check only runs a second later and, of course, fails).
How can I avoid this? Is there a way to force the server, when its IP resolves again, to wait for the tcp-check to pass before going UP?