How to do Service Discovery in Kubernetes

The Role of Service Discovery

When you're running microservices on Kubernetes, one of the key things you need to sort out is service discovery: the way your services find and talk to each other inside the cluster.

The challenge here is that these environments are always changing. Services come and go, they scale up and down, and they move around. That's where Kubernetes shines. It's like the conductor of an orchestra, keeping everything in sync. But to really make your microservices work smoothly, you need to nail down how they discover and connect with each other.

Service discovery is like the phone book for your microservices. It lets one service ask, "Hey, where's the email service?" and get a quick, accurate answer so it can start sending emails.

Kubernetes Service Discovery via Environment Variables

Using Pod IPs directly is risky because they can change. If a pod dies and a new one takes its place, the IP changes. It's like trying to call a friend who keeps changing their phone number.

Kubernetes Services are the main way services discover each other. They act like a constant address for a service, even when the underlying pods change.

A client still needs a way to learn a Service's address, and environment variables are one way to do that. When a pod starts, Kubernetes automatically injects a set of environment variables for each Service that already exists in the pod's namespace.

For example, if you have an email-dispatcher service that is exposed on TCP port 8080 and has been allocated the cluster IP address 10.0.0.11, the following environment variables are created:

EMAIL_DISPATCHER_SERVICE_HOST=10.0.0.11
EMAIL_DISPATCHER_SERVICE_PORT=8080
EMAIL_DISPATCHER_PORT=tcp://10.0.0.11:8080
EMAIL_DISPATCHER_PORT_8080_TCP=tcp://10.0.0.11:8080
EMAIL_DISPATCHER_PORT_8080_TCP_PROTO=tcp
EMAIL_DISPATCHER_PORT_8080_TCP_PORT=8080
EMAIL_DISPATCHER_PORT_8080_TCP_ADDR=10.0.0.11

Using the above, you can connect to the service without hard-coding its IP or port. Here's an example in JavaScript:

// fetch needs an absolute URL, so the scheme must be included.
fetch(
  `http://${process.env.EMAIL_DISPATCHER_SERVICE_HOST}:${process.env.EMAIL_DISPATCHER_SERVICE_PORT}`
)
  .then(console.log)
  .catch(console.error);
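The naming convention behind those variables can be wrapped in a small helper. This is a minimal sketch, not an official API: `serviceEnvPrefix` and `serviceUrlFromEnv` are hypothetical names, and it assumes the service speaks plain HTTP and has an IPv4 cluster IP. It throws when the variables are missing instead of silently building a bad URL.

```javascript
// Convert a Service name to the prefix Kubernetes uses for its variables,
// e.g. "email-dispatcher" -> "EMAIL_DISPATCHER".
function serviceEnvPrefix(serviceName) {
  return serviceName.toUpperCase().replace(/-/g, "_");
}

// Build a base URL from the injected variables.
// `env` defaults to process.env but can be any plain object (handy in tests).
// Assumption: plain HTTP and an IPv4 host.
function serviceUrlFromEnv(serviceName, env = process.env) {
  const prefix = serviceEnvPrefix(serviceName);
  const host = env[`${prefix}_SERVICE_HOST`];
  const port = env[`${prefix}_SERVICE_PORT`];
  if (!host || !port) {
    throw new Error(`No service variables found for "${serviceName}"`);
  }
  return `http://${host}:${port}`;
}

// Example with the values from this article, passed in explicitly.
const sampleEnv = {
  EMAIL_DISPATCHER_SERVICE_HOST: "10.0.0.11",
  EMAIL_DISPATCHER_SERVICE_PORT: "8080",
};
console.log(serviceUrlFromEnv("email-dispatcher", sampleEnv));
// prints "http://10.0.0.11:8080"
```

Inside a pod you would call `serviceUrlFromEnv("email-dispatcher")` and pass the result to `fetch`.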

However, there are some limitations to using environment variables for service discovery.

You can only use environment variables to connect to services in the same namespace.
The service must be created before the pod that uses it, otherwise the environment variables won't be available.
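The example above reads only EMAIL_DISPATCHER_SERVICE_PORT, which holds just the first port, so it falls short if the service exposes multiple ports.
If the service is deleted and recreated, its IP may change, and the environment variables in pods that are already running will be out of date until those pods are restarted.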
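For the multi-port case, Kubernetes also injects one variable per named port, in the form `<SERVICE>_SERVICE_PORT_<PORT_NAME>`. The sketch below looks a port up by name. The function name `servicePortFromEnv` and the port names `http` and `metrics` are assumptions made for illustration; they are not part of the example service described above.

```javascript
// Look up a Service port from the injected variables.
// With a port name, reads <SERVICE>_SERVICE_PORT_<PORT_NAME>;
// without one, reads <SERVICE>_SERVICE_PORT (the first port only).
// Returns null when the variable is not set.
function servicePortFromEnv(serviceName, portName, env = process.env) {
  const toEnvName = (s) => s.toUpperCase().replace(/-/g, "_");
  const suffix = portName ? `_${toEnvName(portName)}` : "";
  const value = env[`${toEnvName(serviceName)}_SERVICE_PORT${suffix}`];
  return value === undefined ? null : Number(value);
}

// Hypothetical environment for a service with two named ports.
const multiPortEnv = {
  EMAIL_DISPATCHER_SERVICE_PORT: "8080",
  EMAIL_DISPATCHER_SERVICE_PORT_HTTP: "8080",
  EMAIL_DISPATCHER_SERVICE_PORT_METRICS: "9090",
};
console.log(servicePortFromEnv("email-dispatcher", "metrics", multiPortEnv));
// prints 9090
```

A `null` result is also what you would see when the service was created after the pod started, so treat it as "unknown" rather than "no such port".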

For these reasons, environment variables are not the preferred way to do service discovery in Kubernetes. It's better to use DNS, which we'll cover next.

Kubernetes Service Discovery via DNS

Kubernetes clusters run a DNS add-on (typically CoreDNS) that creates DNS records for all the services in the cluster. It lets you find services by name, and it's the recommended way to do service discovery in Kubernetes.

DNS names follow the pattern service-name.namespace.svc.cluster.local, where cluster.local is the default cluster domain. It's a structured way to ensure every service can be uniquely identified and found.

Let's say you have a service named email-dispatcher in the notifications namespace. The DNS name for that service would be email-dispatcher.notifications.svc.cluster.local, and you can reach it from any pod in the cluster. Pods in the notifications namespace can also use the short name email-dispatcher.

Here's an example in JavaScript:

// The port is written explicitly; without it, fetch would use port 80.
fetch("http://email-dispatcher.notifications.svc.cluster.local:8080")
  .then(console.log)
  .catch(console.error);

The limitation of using DNS instead of environment variables is that you must know which port the service is exposed on, which is usually not a problem.
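If you call several services, it can help to build these names in one place. The following is a minimal sketch; `serviceDnsName` and `serviceUrl` are hypothetical helpers, and it assumes plain HTTP and the default cluster domain `cluster.local` (some clusters are configured with a different one).

```javascript
// Build the fully qualified DNS name for a Service.
// Assumption: clusterDomain defaults to "cluster.local".
function serviceDnsName(serviceName, namespace, clusterDomain = "cluster.local") {
  return `${serviceName}.${namespace}.svc.${clusterDomain}`;
}

// Build a base URL; the port has to be supplied by the caller
// because DNS name lookup alone does not provide it.
function serviceUrl(serviceName, namespace, port) {
  return `http://${serviceDnsName(serviceName, namespace)}:${port}`;
}

console.log(serviceUrl("email-dispatcher", "notifications", 8080));
// prints "http://email-dispatcher.notifications.svc.cluster.local:8080"
```

Keeping the port as an explicit argument makes the one trade-off of DNS-based discovery visible at every call site.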

ProTip: Disable Environment Variables for Service Discovery

When you have a lot of services in a single namespace, the environment variables can become an issue, because every service adds several variables to every pod. Some platforms and frameworks limit the size of the environment, and a very large number of variables can cause performance issues.

DNS is usually the preferred way to do service discovery in Kubernetes, so you should avoid using environment variables unless you have a good reason to do so. If you're not using them, then you might as well disable them.

You can disable environment variables for service discovery by setting enableServiceLinks to false in the pod spec. Here's an example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      enableServiceLinks: false
      containers:
        - name: nginx
          image: nginx
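To check whether the setting took effect, you can inspect the environment from inside a running container. The sketch below lists the service prefixes it finds; `listServiceLinks` is a hypothetical helper. It is a heuristic: it matches any variable ending in `_SERVICE_HOST`, so variables you define yourself with that suffix would also show up. It skips `KUBERNETES`, because the variables for the Kubernetes API service are injected even when enableServiceLinks is false.

```javascript
// Return the prefixes of injected service variables,
// e.g. ["EMAIL_DISPATCHER"] for EMAIL_DISPATCHER_SERVICE_HOST.
function listServiceLinks(env = process.env) {
  const suffix = "_SERVICE_HOST";
  return Object.keys(env)
    .filter((name) => name.endsWith(suffix))
    .map((name) => name.slice(0, name.length - suffix.length))
    // The API server's own variables remain even with service links disabled.
    .filter((prefix) => prefix !== "KUBERNETES")
    .sort();
}

// Example with an explicit environment object.
console.log(
  listServiceLinks({
    KUBERNETES_SERVICE_HOST: "10.0.0.1",
    EMAIL_DISPATCHER_SERVICE_HOST: "10.0.0.11",
    PATH: "/usr/bin",
  })
);
// prints [ 'EMAIL_DISPATCHER' ]
```

With enableServiceLinks set to false, calling `listServiceLinks()` inside the pod should return an empty array.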

Conclusion

To wrap it up, service discovery is what lets different services in Kubernetes find and talk to each other. Using environment variables is one way, but it has some drawbacks: it only works within a single namespace, it depends on the service existing before the pod starts, and the values go stale if the service is recreated.

DNS is the way to go for making sure your services in Kubernetes can find each other easily, and we recommend using it instead of environment variables where possible.