
- Two GCS storage buckets for gazette and druid data.
- A GKE cluster with a minimum of two node pools: base pool and druid pool.
  - The base node pool should be labeled with `arize=true` and `arize-base=true`.
  - The druid node pool should be labeled with `arize=true` and `druid-historical=true`.
- A GCP service account attached to a role with the following permissions:
  - `bigquery.jobs.create`
  - `storage.objects.create`
  - `storage.objects.delete`
  - `storage.objects.get`
  - `storage.objects.list`
  - `artifactregistry.repositories.downloadArtifacts`
  - `aiplatform.endpoints.predict`
- If using Workload Identity (default):
  - The service account must grant permissions to these GKE namespace/service-account pairs with role Workload Identity User:
    - `arize/arize`
    - `arize-operator/arize-operator`
    - `arize-spark/spark`
- If not using Workload Identity:
  - A JSON key from the GCP service account is required.
- Storage classes `premium-rwo` and `standard-rwo` are preferred and used by default.
- A GCR or Docker registry is optional, as Arize pulls images from Arize AX’s central image registry by default.
- Namespaces `arize`, `arize-operator`, and `arize-spark` can be pre-existing or created later by the Helm chart.
- If using Workload Identity, the GCP service account must have role bindings to `<namespace>/<k8-service-account>` pairs.
- If not using Workload Identity, a JSON key from the service account is required.
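
The node pool labels and Workload Identity bindings listed above can be set up with `gcloud`. The following is a minimal sketch, not an official Arize procedure: it only prints the commands so they can be reviewed before running. The project ID, cluster name, service account email, and pool names (`base-pool`, `druid-pool`) are hypothetical placeholders, and location, machine type, and sizing flags are left out. It assumes the Workload Identity User role corresponds to `roles/iam.workloadIdentityUser` and that the cluster uses the standard `PROJECT_ID.svc.id.goog` workload pool.

```shell
#!/bin/sh
# All values below are hypothetical placeholders; substitute your own.
PROJECT_ID="my-project"
CLUSTER="arize-cluster"
GSA_EMAIL="arize-sa@${PROJECT_ID}.iam.gserviceaccount.com"

# Build the Workload Identity member string for one
# <namespace>/<k8-service-account> pair.
wi_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s]' "$1" "$2"
}

# Print one node pool creation command per pool, with the required labels.
# Location, machine type, and sizing flags are intentionally omitted.
print_node_pool_cmds() {
  printf 'gcloud container node-pools create %s --cluster=%s --node-labels=%s\n' \
    "base-pool" "$CLUSTER" "arize=true,arize-base=true"
  printf 'gcloud container node-pools create %s --cluster=%s --node-labels=%s\n' \
    "druid-pool" "$CLUSTER" "arize=true,druid-historical=true"
}

# Print one IAM binding command per namespace/service-account pair.
print_wi_binding_cmds() {
  for pair in arize/arize arize-operator/arize-operator arize-spark/spark; do
    printf 'gcloud iam service-accounts add-iam-policy-binding %s --role=roles/iam.workloadIdentityUser --member="%s"\n' \
      "$GSA_EMAIL" "$(wi_member "$PROJECT_ID" "$pair")"
  done
}

print_node_pool_cmds
print_wi_binding_cmds
```

Running the script produces five commands: two for the node pools and three for the bindings. Printing instead of executing keeps the sketch safe to run anywhere; once the placeholders are replaced, the output can be pasted into a shell that is authenticated against the target project.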
small1b or medium2b.
values.yaml: