Kubernetes - Jenkins slaves are offline

ghz 1years ago ⋅ 903 views

Question

I am trying to run jenkins with kubernetes. I am able to make a successful connection to kubernetes using jenkins kubernetes plugin. Now, I am running a pipeline example, but while running, I always get an error saying:

Still waiting to schedule task
‘default-amd64-cm2rx’ is offline

And it hangs there. If I check pods using kubectl get pods, I see that the pod default-amd64-cm2rx was running, then state changed to completed and then it was gone. Then another pod with similar name, started and finished and the cycle continues. The last state of these pods come as:

Normal  Created    10s   kubelet, xx.xx.xx.xx  Created container
Normal  Started    10s   kubelet, xx.xx.xx.xx  Started container

If I check the jenkins logs, I get an error as:

Mar 09, 2019 8:47:42 AM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
WARNING: Error in provisioning; agent=KubernetesSlave name: default-amd64-g5bgh, template=PodTemplate{inheritFrom='', name='default-amd64', namespace='', label='jenkins-latest-jenkins-slave-amd64', nodeSelector='beta.kubernetes.io/arch=amd64', nodeUsageMode=NORMAL, workspaceVolume=EmptyDirWorkspaceVolume [memory=false], volumes=[HostPathVolume [mountPath=/var/run/docker.sock, hostPath=/var/run/docker.sock]], containers=[ContainerTemplate{name='jnlp', image='myregistry;8500/jenkins-slave:latest', workingDir='/home/jenkins', command='/bin/sh -c', args='cat', resourceRequestCpu='200m', resourceRequestMemory='256Mi', resourceLimitCpu='200m', resourceLimitMemory='256Mi', livenessProbe=org.csanchez.jenkins.plugins.kubernetes.ContainerLivenessProbe@1e7ac0a6}], yaml=}
java.lang.IllegalStateException: Pod has terminated containers: default/default-amd64-g5bgh (jnlp)
    at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:149)
    at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:170)
    at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.await(AllContainersRunningPodWatcher.java:122)
    at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:121)
    at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:293)
    at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
    at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Here is my kuebrnetes plugin config:

enter image description
here

As, you can see the connection is successful, and the pod is spawning.

Any idea why it stays offline? TIA.


Answer

jnlp might be trying to use the same port for slave and Kubernetes, since it looks like your pod is terminating and you’re running this job on a slave. Set jnlp to use random port so that you can guarantee no collisions.

In Jenkins, it’s under Configure Security.

From Jenkins Documentation: https://jenkins.io/doc/book/managing/security/

JNLP TCP Port

Jenkins uses a TCP port to communicate with agents launched via the JNLP protocol, such as Windows-based agents. As of Jenkins 2.0, by default this port is disabled.

For administrators wishing to use JNLP-based agents, the two port options are:

Random: The JNLP port is chosen random to avoid collisions on the Jenkins master. The downside to randomized JNLP ports is that they’re chosen during the boot of the Jenkins master, making it difficult to manage firewall rules allowing JNLP traffic.

Fixed: The JNLP port is chosen by the Jenkins administrator and is consistent across reboots of the Jenkins master. This makes it easier to manage firewall rules allowing JNLP-based agents to connect to the master.