There are currently two ways to run health checks in Rancher: HTTP and TCP. The former is relatively straightforward, but it requires a web server with an available route that responds with a 2xx or 3xx status.
TCP checks are a nice option for services without a web server, because the check only needs an open port to connect to, which keeps it lightweight.
We built a very simple TCP server that we use in a lot of our services.
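A server of this kind can be very small. The following is a minimal sketch, not the server described above: it assumes the health check only needs to complete a TCP connection, and the names `HealthServer`, `start_health_server`, and `tcp_check`, as well as port 8000, are hypothetical choices for illustration.

```python
import socket
import socketserver
import threading


class HealthHandler(socketserver.BaseRequestHandler):
    """Accept the connection and let it close; no payload is exchanged."""

    def handle(self):
        # Assumption: a completed TCP handshake is all the check needs.
        pass


class HealthServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True  # avoid "address already in use" on quick restarts
    daemon_threads = True       # handler threads never block interpreter exit


def start_health_server(host="0.0.0.0", port=8000):
    """Start the listener in a daemon thread and return the server object.

    Port 8000 is a placeholder; use whatever port the health check targets.
    """
    server = HealthServer((host, port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def tcp_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

`tcp_check` is only there to try the listener locally; it approximates what a TCP check does from the outside, under the same assumption that a successful connection counts as healthy.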
The container's entrypoint starts this server as a background process and then continues with whatever else it should do.

Health Checks in the Wild

This method has been particularly useful to us in the case of spot instances on AWS. Currently, when a host disappears, it stays in a 'reconnecting' state in Rancher, and services that exist on that host do not automatically move elsewhere.
Imagine the following scenario. You have a service with a scale of 2 that schedules onto spot instances.
Two new spot instances connect to Rancher, and your services schedule correctly.
Rancher now has 3 hosts: Host A, Host B, and Host C. Your service now has only 1 container running, on Host A.

If you deploy your services with health checks and the recreate strategy, then once the unhealthy threshold has been met, Rancher will try to recreate the second container, find that Host C is available, and schedule the container there.
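To make the role of the unhealthy threshold concrete, here is a toy model of it. It is an illustration only, not Rancher's implementation: it assumes the threshold counts consecutive failed checks, and the function name and default value are hypothetical.

```python
def first_unhealthy_check(results, unhealthy_threshold=3):
    """Return the 1-based index of the check at which the threshold is met.

    `results` is a sequence of booleans, one per health check (True = passed).
    Assumption: only consecutive failures count, so a pass resets the counter.
    Returns None if the threshold is never reached.
    """
    failures = 0
    for index, passed in enumerate(results, start=1):
        failures = 0 if passed else failures + 1
        if failures >= unhealthy_threshold:
            return index
    return None
```

Under this model, a container on a vanished host fails every check, so the recreate strategy would kick in after `unhealthy_threshold` checks.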