What is the purpose of kubectl proxy?

ghz 1 year ago ⋅ 8847 views

Question

To access the Kubernetes dashboard, you have to run kubectl proxy on your local machine and then point your web browser at the proxy. Similarly, if you want to submit a Spark job, you again run kubectl proxy on your local machine and then run spark-submit against the localhost address.

My question is, why does Kubernetes have this peculiar arrangement? The dashboard service is running on the Kubernetes cluster, so why am I not pointing my web browser at the cluster directly? Why have a proxy? In some cases the need for a proxy is inconvenient. For example, I want to submit a Spark job from my web server. I can't do that directly: I have to run a proxy first, which ties me to a specific cluster, and I may have many Kubernetes clusters.

Why was Kubernetes designed such that you can only access it through a proxy?


Answer

You can access an application in the cluster in several ways:

  1. by using the API server as a proxy. You need to pass the API server's authentication and authorization stages.
  2. by using hostNetwork. When a pod is configured with hostNetwork: true, the applications running in that pod can directly see the network interfaces of the host machine where the pod was started.
  3. by using hostPort. The container port is exposed to the external network at hostIP:hostPort, where hostIP is the IP address of the Kubernetes node where the container is running and hostPort is the port requested by the user.
  4. by using Services with type: ClusterIP. ClusterIP Services are accessible only from pods in the cluster and from the cluster nodes.
  5. by using Services with type: NodePort. In addition to a ClusterIP, the Service gets a port from the range 30000-32767, either chosen at random or specified by the user. All cluster nodes listen on that port and forward all traffic to the corresponding Service.
  6. by using Services with type: LoadBalancer. This works only with supported cloud providers, or with MetalLB for on-premises clusters. In addition to opening a NodePort, Kubernetes creates a cloud load balancer that forwards traffic to NodeIP:NodePort for that Service.

So, basically, each Service type builds on the previous one: [[[ Kubernetes Service type: ClusterIP ] + NodePort ] + LoadBalancer ]
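To make the layering of Service types concrete, here is a minimal Python sketch that builds a Service manifest as a plain dict and prints it as JSON, which kubectl apply -f accepts. The helper name service_manifest, the Service name "web", and the app: <name> selector are assumptions made for illustration; the port check uses the 30000-32767 range mentioned above.

```python
import json

# NodePort range quoted above: 30000-32767 inclusive.
NODE_PORT_RANGE = range(30000, 32768)


def service_manifest(name, port, service_type="ClusterIP", node_port=None):
    """Build a Service manifest as a dict (hypothetical helper, illustrative names)."""
    if service_type not in ("ClusterIP", "NodePort", "LoadBalancer"):
        raise ValueError(f"unsupported Service type: {service_type}")

    port_spec = {"port": port, "targetPort": port}
    if node_port is not None:
        # Only NodePort and LoadBalancer Services open a port on the nodes.
        if service_type == "ClusterIP":
            raise ValueError("a ClusterIP Service has no node port")
        if node_port not in NODE_PORT_RANGE:
            raise ValueError("nodePort must be within 30000-32767")
        port_spec["nodePort"] = node_port

    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": service_type,
            # Assumption: the target pods carry the label app=<name>.
            "selector": {"app": name},
            "ports": [port_spec],
        },
    }


if __name__ == "__main__":
    # Print a NodePort Service that exposes port 80 on every node at 30080.
    print(json.dumps(service_manifest("web", 80, "NodePort", 30080), indent=2))
```

Changing only service_type from "ClusterIP" to "NodePort" to "LoadBalancer" widens who can reach the same Service, which is the nesting shown in the bracket notation above.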

  7. by using Ingress (ingress controller + Ingress object). The ingress controller is exposed by a NodePort or LoadBalancer Service and works as an L7 reverse proxy/load balancer for the cluster Services. It has access to ClusterIP Services, so you don't need to expose Services individually if you use Ingress. You can use it for SSL termination and for forwarding traffic based on URL path. The most popular ingress-controllers are:

Now, about kubectl proxy. It uses the first way to connect to the cluster. It reads the cluster configuration from the kubeconfig file (.kube/config in your home directory by default) and uses the credentials stored there to pass the API server's authentication and authorization stages. It then opens a communication channel from a port on the local machine to the API server, so you can send requests to the Kubernetes API through that local port without having to specify credentials for each request.
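The difference between going through kubectl proxy and talking to the API server directly can be sketched in Python with the standard library. This is only an illustration: it assumes kubectl proxy is running on its default address 127.0.0.1:8001, and the function names, the API server URL, and the token passed to direct() are hypothetical placeholders.

```python
import json
import urllib.request

# Default listen address of `kubectl proxy` (assumes no --port override).
PROXY_BASE = "http://127.0.0.1:8001"


def via_proxy(path, base=PROXY_BASE):
    """Request through kubectl proxy: no credentials here, the proxy supplies them."""
    return urllib.request.Request(base + path)


def direct(path, api_server, token):
    """Request sent straight to the API server: the caller must authenticate itself."""
    return urllib.request.Request(
        api_server + path, headers={"Authorization": f"Bearer {token}"}
    )


def service_proxy_path(namespace, service, port):
    """API server path that forwards a request to a Service inside the cluster."""
    return f"/api/v1/namespaces/{namespace}/services/{service}:{port}/proxy/"


def list_namespaces(base=PROXY_BASE):
    """Return namespace names; requires `kubectl proxy` to be running locally."""
    with urllib.request.urlopen(via_proxy("/api/v1/namespaces", base)) as resp:
        return [item["metadata"]["name"] for item in json.load(resp)["items"]]
```

With the proxy running, list_namespaces() needs no token, while direct("/api/v1/namespaces", "https://k8s.example.com:6443", token) shows what a client such as a web server would have to provide for each cluster if it skipped the proxy.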