
[ZEPPELIN-6144] Add Helm chart for deploying Zeppelin on Kubernetes #4896

Open · wants to merge 18 commits into master
Conversation

ChenYi015

What is this PR for?

A few sentences describing the overall goals of the pull request's commits.
First time? Check out the contributing guide - https://zeppelin.apache.org/contribution/contributions.html

What type of PR is it?

Feature

What is the Jira issue?

https://issues.apache.org/jira/browse/ZEPPELIN-6144

How should this be tested?

  • Strongly recommended: add automated unit tests for any new or changed behavior
  • Outline any manual steps to test the PR here.

Screenshots (if appropriate)

Questions:

  • Do the license files need to be updated?
  • Are there breaking changes for older versions?
  • Does this need documentation?

@ChenYi015 ChenYi015 marked this pull request as ready for review November 13, 2024 13:32
@ChenYi015 ChenYi015 requested a review from Reamer November 13, 2024 13:32
envFrom:
- configMapRef:
    name: {{ include "zeppelin.envConfigMapName" . }}
volumeMounts:
Contributor:

Zeppelin also has a configuration directory where it stores system state. When you restart the pod, these files are reset to their original state. The chart needs to provide the ability to use a PVC to persist the configs.

Author:

@Armadik Yes, you can configure the helm chart to mount a PVC to the server pod in order to persist config files like shiro.ini and interpreter.json.

Contributor:

In my opinion, the shiro.ini should be mounted via ConfigMap, just like the logging configuration.
Only credentials.json, interpreter.json and notebook-authorization.json should be externalized. This can be done with ZEPPELIN_CONFIG_FS_DIR.

Other environment variables which should link to a pvc are ZEPPELIN_NOTEBOOK_DIR and ZEPPELIN_SEARCH_INDEX_PATH.

I currently use the following environment variables. The PVC is mounted under /data.

ZEPPELIN_CONFIG_FS_DIR=/data/zeppelin-config
ZEPPELIN_NOTEBOOK_DIR=/data/zeppelin-notebook
ZEPPELIN_SEARCH_INDEX_PATH=/data/lucene-index
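
Wiring these variables to a PVC in the chart could look roughly like the following values fragment. This is only a sketch: the exact key names under `server` (`env`, `volumes`, `volumeMounts`) and the PVC name `zeppelin-data` are assumptions for illustration, not necessarily the chart's API.

```yaml
# Hypothetical values.yaml fragment: persist Zeppelin state on a PVC mounted at /data.
server:
  env:
    - name: ZEPPELIN_CONFIG_FS_DIR
      value: /data/zeppelin-config
    - name: ZEPPELIN_NOTEBOOK_DIR
      value: /data/zeppelin-notebook
    - name: ZEPPELIN_SEARCH_INDEX_PATH
      value: /data/lucene-index
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: zeppelin-data   # pre-created PVC (assumed name)
  volumeMounts:
    - name: data
      mountPath: /data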

Signed-off-by: Yi Chen <[email protected]>
Author @ChenYi015:

@Reamer Please take another look when you have time, thank you! We have already used this Helm chart in a production environment for interactive analysis with Spark.

Contributor @Reamer left a comment:

I just had a quick look at the change and left you a few comments. I'm more of a fan of Kustomize, but I also know that Helm is more powerful.
The changes definitely look much better than before with the DNS and nginx.

export ZEPPELIN_K8S_SERVICE_NAME={{ include "zeppelin.server.serviceName" . }}
export ZEPPELIN_K8S_SPARK_CONTAINER_IMAGE={{ include "zeppelin.interpreter.spark.image" . }}

zeppelin-site.xml: |
Contributor:

Please configure this via environment variables instead. That configuration is easier to understand and does not require an additional XSL file.

Author:

I think it would be better to provide both methods: one can configure the Zeppelin conf with the server.conf value, or configure environment variables with the server.env / server.envFrom values, and let users choose whichever they prefer.
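
Side by side, the two styles might look like the following values fragment. The `server.conf`, `server.env`, and `server.envFrom` keys come from the comment above; the specific properties and the ConfigMap name are illustrative assumptions.

```yaml
# Hypothetical values.yaml fragment showing both configuration styles.
server:
  # Style 1: Zeppelin conf rendered into zeppelin-site.xml via server.conf
  conf:
    zeppelin.server.port: "8080"
  # Style 2: plain environment variables via server.env / server.envFrom
  env:
    - name: ZEPPELIN_PORT
      value: "8080"
  envFrom:
    - configMapRef:
        name: zeppelin-extra-env   # assumed ConfigMap name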

# limitations under the License.
#
#
# [name] [maven artifact] [description]
Contributor:

For which functionality is this file required?

limitations under the License.
*/ -}}

apiVersion: rbac.authorization.k8s.io/v1
Contributor:

Please note that the Role, ServiceAccount, and RoleBinding are only required for the Spark interpreter. How is this implemented?

Author @ChenYi015 (Dec 21, 2024):

This logic is implemented in the interpreter-spec.yaml file: we do not mount the service account token unless the interpreter group name is spark, see https://github.com/ChenYi015/zeppelin/blob/beea9837983b16e50e1be9ca56c49d828148ffeb/charts/zeppelin/templates/interpreter/configmap.yaml#L165-L170.
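
The gist of that conditional is roughly the following. This is a paraphrased sketch, not the literal chart source; Zeppelin substitutes placeholders such as the interpreter group name into interpreter-spec.yaml itself, and the exact templating syntax in the chart may differ.

```yaml
# Paraphrased sketch: mount the service account token only when the
# interpreter group is "spark"; all other interpreter groups run without it.
spec:
  {% if zeppelin.k8s.interpreter.group.name == "spark" %}
  automountServiceAccountToken: true
  {% else %}
  automountServiceAccountToken: false
  {% endif %}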

{{ $key }} {{ $value }}
{{- end }}

driver-pod-template.yaml: |
Contributor:

The driver pod is created directly by Zeppelin. I think this file has no effect on an already existing pod. Zeppelin starts the Driver Pod in client mode.

Author:

You are right, the driver pod template does not actually take effect.

{{- toYaml . | nindent 8 }}
{{- end }}

executor-pod-template.yaml: |
Contributor:

This configuration could be used in the Spark driver. I have not yet tried such a template. I currently configure the executors and drivers via the Spark-Config.

Author:

The executor pod template file exists because some fields of the executor pods (e.g. affinity, tolerations) cannot be configured through Spark configuration. Additionally, I believe it is more straightforward to configure the executor pods with a pod template than through Spark conf.
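
For example, an executor pod template along these lines can express tolerations and node affinity that plain `spark.executor.*` conf keys cannot (the toleration key, node label, and values below are illustrative assumptions):

```yaml
# Illustrative executor-pod-template.yaml; Spark picks it up when
# spark.kubernetes.executor.podTemplateFile points at this file.
apiVersion: v1
kind: Pod
spec:
  tolerations:
    - key: dedicated            # assumed taint key
      operator: Equal
      value: spark
      effect: NoSchedule
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-pool   # assumed node label
                operator: In
                values: ["spark-executors"]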

# -- Security context for Zeppelin interpreter containers.
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
Contributor:

Please do not enter a fixed UID. Openshift, for example, uses a random UID by default.
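
An OpenShift-friendly default would simply omit the UID, for example (a sketch, not the chart's current default):

```yaml
# Sketch: keep runAsNonRoot, but do not pin a UID so that platforms like
# OpenShift can inject an arbitrary one at runtime.
securityContext:
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  # runAsUser deliberately omitted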

Author:

I have not tried whether it works with a random UID. The UID 1000 comes from the Zeppelin server Dockerfile, ref:

USER 1000
EXPOSE 8080
ENTRYPOINT [ "/usr/bin/tini", "--" ]
WORKDIR ${ZEPPELIN_HOME}
CMD ["bin/zeppelin.sh"]
