kubernetes errors
April 23, 2023
Liveness probe failed and Readiness probe failed
Возникает из-за того что контейнер в поде не проходит проверку на живость и доступность.
Ошибки Readiness probe failed/Liveness probe failed и последующая за ним Back-off restarting failed container, говорят о том, что контейнер не прошёл проверку на живость и доступность. И kubelet перезапускает его в надежде оживить, т.к. отрабатывает политика restartPolicy.
В качестве варианта решения можно попробовать увеличить таймаут ожидания ответа от контейнера в два раза - timeoutSeconds: x2. Если это не поможет, то проблема в нехватке ресурсов у воркер нод. И прежде всего cpu.
Пример того, как выглядит блок events у такого пода:
Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 30m default-scheduler Successfully assigned ingress-nginx/ingress-nginx-controller-5578bd689b-lhrt4 to cl14vuhr5k8ulcdnhd42-uder Normal Pulling 30m kubelet Pulling image "k8s.gcr.io/ingress-nginx/controller:v1.1.1@sha256:0bc88eb15f9e7f84e8e56c14fa5735aaa488b840983f87bd79b1054190e660de" Normal Pulled 30m kubelet Successfully pulled image "k8s.gcr.io/ingress-nginx/controller:v1.1.1@sha256:0bc88eb15f9e7f84e8e56c14fa5735aaa488b840983f87bd79b1054190e660de" in 12.213938341s Normal Created 30m kubelet Created container controller Normal Started 30m kubelet Started container controller Normal RELOAD 30m nginx-ingress-controller NGINX reload triggered due to a change in configuration Normal Killing 28m kubelet Container controller failed liveness probe, will be restarted Warning Unhealthy 28m (x5 over 29m) kubelet Readiness probe failed: Get "http://10.113.160.12:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers) Normal Pulled 27m kubelet Container image "k8s.gcr.io/ingress-nginx/controller:v1.1.1@sha256:0bc88eb15f9e7f84e8e56c14fa5735aaa488b840983f87bd79b1054190e660de" already present on machine Normal RELOAD 27m nginx-ingress-controller NGINX reload triggered due to a change in configuration Normal RELOAD 24m nginx-ingress-controller NGINX reload triggered due to a change in configuration Normal RELOAD 20m nginx-ingress-controller NGINX reload triggered due to a change in configuration Normal RELOAD 17m nginx-ingress-controller NGINX reload triggered due to a change in configuration Normal RELOAD 13m nginx-ingress-controller NGINX reload triggered due to a change in configuration Warning Unhealthy 10m (x70 over 28m) kubelet Readiness probe failed: HTTP probe failed with statuscode: 500 Normal RELOAD 9m22s nginx-ingress-controller NGINX reload triggered due to a change in configuration Normal RELOAD 6m15s nginx-ingress-controller NGINX reload triggered due to a change in configuration Warning Unhealthy 5m27s (x36 over 29m) kubelet Liveness probe failed: Get "http://10.113.160.12:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers) Warning BackOff 20s (x18 over 3m31s) kubelet Back-off restarting failed container