Problem:
The A1 controller takes a few minutes longer to start than the other components, but its liveness probe is configured for 60 seconds. In many deployments the liveness check is therefore likely to fail before the controller is up, and the container is restarted repeatedly.
NAME                            READY   STATUS    RESTARTS   AGE
a1controller-7cc5f467c6-669mt   1/1     Running   7          35m
Log:
Warning Unhealthy 4s (x3 over 24s) k8s-01 Readiness probe failed: dial tcp 10.42.1.121:8181: connect: connection refused
Warning Unhealthy 3s (x3 over 23s) k8s-01 Liveness probe failed: dial tcp 10.42.1.121:8181: connect: connection refused
Normal Killing 3s kubelet, k8s-01 Container container-a1controller failed liveness probe, will be restarted
Fix:
Increase the initialDelaySeconds of the liveness probe so that the A1 controller has time to start before the first check.
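As a rough illustration of the fix, the probe section of the container spec might look like the fragment below. This is a sketch under assumptions, not the merged change: the tcpSocket probe type and port 8181 are inferred from the "dial tcp ...:8181" events in the log, the 60 seconds mentioned in the problem statement is assumed to be the previous initialDelaySeconds, and the value 300 is a hypothetical placeholder that should be tuned to the startup time observed in the deployment.

```yaml
# Hypothetical fragment of the a1controller container spec (values are illustrative).
livenessProbe:
  tcpSocket:
    port: 8181               # port taken from the probe failure events in the log
  initialDelaySeconds: 300   # assumed new value; the text suggests the old setting was 60
  periodSeconds: 10          # Kubernetes default, consistent with "x3 over 24s" in the events
  failureThreshold: 3        # Kubernetes default; the container is killed after 3 failed checks
```

With these settings the kubelet does not run the first liveness check until 300 seconds after the container starts, so a slow startup no longer triggers a restart.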
Note: This is an intermittent issue and may occur in a few deployments, especially when the imagePullPolicy is set to Always.
# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
3994,1 | Bug Fix: Increased the initialdelay for liveness probe test | master | it/dep | MERGED | +2 | +1 |