Fix Endpoints Kubernetes Not Found Error – Pro Guide 2025

The Scenario: A Fresh Cluster Gone Wrong

Picture this: You’ve just finished setting up a new Kubernetes cluster using kubeadm on your development environment. Everything seemed to go smoothly during the installation, but when you run your first kubectl get svc command to verify the setup, something’s off. The default kubernetes service shows no endpoints, or worse, your CoreDNS pods are throwing errors like endpoints "kubernetes" not found in their logs.

This exact scenario happened to me last month while setting up a staging environment. Applications couldn’t resolve the API server through DNS, pods were failing to communicate with the control plane, and the entire cluster felt unstable. If you’re reading this, chances are you’re facing the same frustrating issue.

The endpoints kubernetes not found error is more common than you might think, especially in freshly installed clusters or after configuration changes. Let’s dive deep into understanding, debugging, and permanently fixing this critical Kubernetes issue.

[Figure: Kubernetes API Communication Flow]

What Does “Endpoints ‘kubernetes’ Not Found” Actually Mean?

Before jumping into fixes, it’s crucial to understand what this error represents in your Kubernetes ecosystem.

In every healthy Kubernetes cluster, there’s a default service called kubernetes that lives in the default namespace. This service acts as a stable endpoint for pods to communicate with the kube-apiserver. When you see kubernetes endpoints not found, it means:

  1. The kubernetes service has no healthy endpoints registered
  2. Pods cannot resolve the API server through DNS
  3. The control plane connectivity is compromised
  4. CoreDNS cannot locate the API server service

This isn’t just a minor inconvenience—it’s a sign that your cluster’s fundamental networking and service discovery mechanisms are broken. Applications relying on service discovery will fail, and your cluster’s overall stability is at risk.
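Service discovery hinges on predictable DNS names: CoreDNS resolves every service at `<name>.<namespace>.svc.<cluster-domain>`, so the default service answers at kubernetes.default.svc.cluster.local. A minimal sketch of that naming rule (the `svc_fqdn` helper is hypothetical, for illustration only):

```shell
# Sketch of Kubernetes service DNS naming; svc_fqdn is a hypothetical helper,
# not a real kubectl subcommand.
svc_fqdn() {
  # $1 = service name, $2 = namespace, $3 = cluster domain (defaults to cluster.local)
  echo "$1.$2.svc.${3:-cluster.local}"
}

svc_fqdn kubernetes default   # -> kubernetes.default.svc.cluster.local
```

When this error appears, that exact name stops resolving inside pods, which is why everything depending on in-cluster service discovery fails at once.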

Step-by-Step Debugging Process

Step 1: Verify the Kubernetes Service Exists

Start by confirming whether the default kubernetes service is present:

kubectl get svc

You should see output similar to:

NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   5d

If the service is missing entirely, that’s your first clue. If it exists but shows no endpoints, continue to the next step.

Step 2: Examine the Endpoint Objects

Check the actual endpoint object for the kubernetes service:

kubectl get endpoints kubernetes -o yaml

A healthy endpoint should look like:

apiVersion: v1
kind: Endpoints
metadata:
  name: kubernetes
  namespace: default
subsets:
- addresses:
  - ip: 192.168.1.10
  ports:
  - name: https
    port: 6443
    protocol: TCP

If the subsets section is empty or missing, your API server isn’t registering as a healthy endpoint.
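This check can be scripted. The helper below is a rough sketch of mine (not part of kubectl) that reads an Endpoints object as JSON on stdin, for example from `kubectl get endpoints kubernetes -o json`, and fails when no addresses are registered:

```shell
# Hypothetical helper: succeeds only if the Endpoints JSON on stdin
# contains at least one registered address block.
has_endpoints() {
  grep -q '"addresses"'
}

# Usage sketch (requires a reachable cluster):
#   kubectl get endpoints kubernetes -o json | has_endpoints || echo "subsets empty"
```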

Step 3: Check API Server Pod Status

Verify that the kube-apiserver pod is running and healthy:

kubectl get pods -n kube-system | grep apiserver

For kubeadm clusters, also check the static pod manifests:

sudo ls -la /etc/kubernetes/manifests/
sudo cat /etc/kubernetes/manifests/kube-apiserver.yaml

Step 4: Analyze CoreDNS Logs

CoreDNS logs often provide the clearest indication of the coredns endpoints error:

# Get CoreDNS pod names
kubectl get pods -n kube-system | grep coredns

# Check logs for errors
kubectl logs -n kube-system coredns-xxxxx-xxxxx

Look for error messages like:

[ERROR] plugin/kubernetes: endpoints "kubernetes" not found
[ERROR] plugin/ready: Still waiting on: "kubernetes"
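A quick way to scan for these symptoms across all CoreDNS replicas at once is to filter the combined logs. The `coredns_symptoms` function is a sketch, and the `k8s-app=kube-dns` label is the kubeadm default (verify yours before relying on it):

```shell
# Hypothetical filter for the two tell-tale CoreDNS log lines.
coredns_symptoms() {
  grep -E 'endpoints "kubernetes" not found|Still waiting on: "kubernetes"'
}

# Usage sketch (requires a reachable cluster):
#   kubectl logs -n kube-system -l k8s-app=kube-dns | coredns_symptoms
```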

Step 5: Verify Cluster Connectivity

Test basic cluster connectivity:

kubectl cluster-info
kubectl get nodes
curl -k "$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')/healthz"

Common Causes and Their Root Issues

1. API Server Pod Not Running or Unhealthy

Symptoms:

  • No kube-apiserver pods in kube-system namespace
  • API server pod stuck in CrashLoopBackOff or Pending state
  • Kube-apiserver endpoints not found errors

Root Causes:

  • Misconfigured static pod manifests
  • Certificate issues
  • Resource constraints
  • Port conflicts

2. Default Service Object Corruption

Symptoms:

  • Kubernetes default service missing from kubectl get svc
  • Service exists but has wrong cluster IP or ports
  • Endpoints object is missing or empty

Root Causes:

  • Accidental deletion of the default service
  • Corrupted etcd data
  • Service mesh interference

3. DNS and CoreDNS Misconfigurations

Symptoms:

  • CoreDNS pods failing readiness checks
  • DNS resolution failures cluster-wide
  • CoreDNS endpoints error in logs

Root Causes:

  • Incorrect CoreDNS ConfigMap
  • Missing or wrong upstream DNS servers
  • Firewall blocking DNS queries

4. CNI Plugin and Networking Issues

Symptoms:

  • Pod-to-pod communication failures
  • Service discovery not working
  • Network policy blocking traffic

Root Causes:

  • CNI plugin not installed or misconfigured
  • IP range conflicts
  • Missing network routes

Comprehensive Fixes with Examples

Fix 1: Restart Unhealthy API Server

For kubeadm clusters, restart the API server static pod:

# Check API server pod status
kubectl get pods -n kube-system | grep apiserver

# If using static pods, move the manifest out and back to force a restart
sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
sleep 20   # give the kubelet time to notice (default fileCheckFrequency is 20s)
sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/

# If the API server runs as a regular pod (some managed distributions),
# deleting it forces a restart; for kubeadm static pods, use the manifest move above
kubectl delete pod -n kube-system kube-apiserver-<node-name>

Fix 2: Recreate Missing Service/Endpoints

If the default kubernetes service is missing, the kube-apiserver normally recreates it automatically through its built-in reconciler. If that does not happen, recreate it manually:

# Create the default kubernetes service
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: kubernetes
  namespace: default
  labels:
    component: apiserver
    provider: kubernetes
spec:
  type: ClusterIP
  clusterIP: 10.96.0.1  # must be the first IP of your --service-cluster-ip-range
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: 6443
EOF
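The clusterIP in the manifest above is not arbitrary: it must be the first usable address of the cluster's --service-cluster-ip-range (10.96.0.0/12 is the kubeadm default, hence 10.96.0.1). A rough sketch that derives it, valid only for the common case where the CIDR's base address ends in a low octet (`first_svc_ip` is a hypothetical helper):

```shell
# Hypothetical helper: first usable IP of a service CIDR.
# Simple cases only; assumes the base address's last octet is below 255.
first_svc_ip() {
  echo "${1%/*}" | awk -F. '{printf "%d.%d.%d.%d\n", $1, $2, $3, $4 + 1}'
}

first_svc_ip 10.96.0.0/12   # -> 10.96.0.1
```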

Fix 3: Restart and Reconfigure CoreDNS

Restart CoreDNS deployment:

kubectl rollout restart deployment coredns -n kube-system

# Wait for rollout to complete
kubectl rollout status deployment coredns -n kube-system

# Verify CoreDNS configuration
kubectl get configmap coredns -n kube-system -o yaml
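When reviewing that ConfigMap, compare its Corefile against the kubeadm default shown below (exact contents vary slightly between versions). The `kubernetes cluster.local ...` stanza is what lets CoreDNS watch services and endpoints; if it is missing, or points at the wrong cluster domain, you get exactly this error:

```
.:53 {
    errors
    health {
       lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       fallthrough in-addr.arpa ip6.arpa
       ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
       max_concurrency 1000
    }
    cache 30
    loop
    reload
    loadbalance
}
```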

Fix 4: Reinstall CNI Plugin

For Flannel:

kubectl delete -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml

For Calico:

kubectl delete -f https://docs.projectcalico.org/manifests/calico.yaml
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

Fix 5: Check and Configure Firewall Rules

Ensure required ports are open:

# For API server
sudo iptables -A INPUT -p tcp --dport 6443 -j ACCEPT

# For CoreDNS
sudo iptables -A INPUT -p udp --dport 53 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 53 -j ACCEPT

# Save rules (Ubuntu/Debian, requires the iptables-persistent package)
sudo iptables-save | sudo tee /etc/iptables/rules.v4 > /dev/null

Troubleshooting Checklist Table

| Error Symptom | Likely Cause | Immediate Fix | Prevention |
|---|---|---|---|
| kubectl get svc shows no kubernetes service | Default service deleted | Recreate kubernetes service | Never delete default services |
| kubectl get endpoints kubernetes returns empty | API server unhealthy | Restart kube-apiserver pod | Monitor API server health |
| CoreDNS logs "endpoints kubernetes not found" | DNS configuration issue | Restart CoreDNS pods | Regular CoreDNS config validation |
| kubectl cluster-info fails | Network connectivity issue | Check CNI plugin status | Monitor network plugin health |
| API server pod in CrashLoopBackOff | Configuration/certificate issue | Review API server logs | Use proper certificate management |
| Service exists but no endpoints | API server not registering | Check API server readiness | Implement readiness probes |

Best Practices for Prevention

1. Never Delete Default Services

The kubernetes service in the default namespace is critical for cluster operations. Always exclude it from any cleanup scripts:

# Safe service cleanup example: skip the default kubernetes service
# (in --all-namespaces output, column 1 is the namespace, column 2 the name)
kubectl get svc --all-namespaces --no-headers \
  | awk '!($1 == "default" && $2 == "kubernetes") {print $2 " -n " $1}' \
  | xargs -L1 kubectl delete svc

2. Implement Monitoring and Alerting

Set up Prometheus monitoring for critical services:

# prometheus-rule.yaml
groups:
- name: kubernetes.rules
  rules:
  - alert: KubernetesServiceMissing
    expr: up{job="kubernetes-apiservers"} == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Kubernetes API server is down"
      description: "The kubernetes service has no healthy endpoints"

3. Use Readiness Probes for Control Plane

Ensure your API server has proper health checks:

# In kube-apiserver.yaml
readinessProbe:
  httpGet:
    host: 127.0.0.1
    path: /readyz
    port: 6443
    scheme: HTTPS
  initialDelaySeconds: 10
  periodSeconds: 1
  timeoutSeconds: 15
  failureThreshold: 3

4. Regular Network Validation

Create a post-deployment validation script:

#!/bin/bash
# cluster-validation.sh

echo "Validating Kubernetes cluster health..."

# Check default service
if ! kubectl get svc kubernetes &>/dev/null; then
    echo "ERROR: Default kubernetes service missing"
    exit 1
fi

# Check endpoints
if [ -z "$(kubectl get endpoints kubernetes -o jsonpath='{.subsets[*].addresses[*].ip}')" ]; then
    echo "ERROR: No endpoints for kubernetes service"
    exit 1
fi

# Check CoreDNS
if ! kubectl get pods -n kube-system | grep coredns | grep -q Running; then
    echo "ERROR: CoreDNS pods not running"
    exit 1
fi

echo "Cluster validation passed!"

Frequently Asked Questions

What does endpoints ‘kubernetes’ not found mean?

The “endpoints ‘kubernetes’ not found” error indicates that the default kubernetes service in your cluster has no healthy backend endpoints registered. This means pods cannot discover or communicate with the Kubernetes API server through the service DNS name, breaking service discovery and cluster functionality.

How do I fix endpoints ‘kubernetes’ not found error?

To fix this error: 1) Verify the kubernetes service exists with kubectl get svc, 2) Check endpoint registration with kubectl get endpoints kubernetes, 3) Ensure the API server pod is running with kubectl get pods -n kube-system, 4) Restart CoreDNS if needed with kubectl rollout restart deployment coredns -n kube-system, and 5) Recreate the service if missing.

Can CoreDNS cause endpoints not found errors?

Yes, CoreDNS misconfigurations can cause endpoints not found errors. If CoreDNS cannot properly query the Kubernetes API for service and endpoint information, it will log “endpoints ‘kubernetes’ not found” errors. This often happens due to incorrect CoreDNS ConfigMap settings, network policies blocking DNS queries, or CoreDNS pods being unable to reach the API server.

How to verify if the API server is healthy?

Verify API server health by running: kubectl cluster-info to check connectivity, kubectl get pods -n kube-system | grep apiserver to confirm pod status, curl -k https://API_SERVER_IP:6443/healthz to test the health endpoint, and kubectl get endpoints kubernetes -o wide to confirm the API server is registered as a healthy endpoint.

Why would the default kubernetes service disappear?

The default kubernetes service can disappear due to: accidental deletion during cleanup operations, corrupted etcd data affecting service objects, cluster upgrade issues that reset default services, service mesh installations that interfere with default services, or manual modifications to system namespaces that remove critical resources.

Conclusion: Systematic Approach to Cluster Health

The endpoints kubernetes not found error is a clear indicator of control plane or DNS misconfiguration in your Kubernetes cluster. While it can seem daunting initially, following a systematic debugging approach—checking services, endpoints, API server health, and CoreDNS functionality—will lead you to the root cause.

Remember that this error typically points to one of four areas: API server health, service object integrity, DNS configuration, or network connectivity. By methodically working through each possibility and implementing the fixes outlined above, you can restore your cluster to full functionality.

The key to preventing future occurrences lies in proper monitoring, avoiding modifications to default system services, and maintaining robust health checks for your control plane components. With these practices in place, your Kubernetes clusters will remain stable and reliable.

For deeper insights into Kubernetes troubleshooting, explore the official kubeadm troubleshooting documentation and consider implementing comprehensive CoreDNS monitoring and alerting strategies.

