Installation

Installing Kueue to a Kubernetes Cluster

Before you begin

Make sure the following conditions are met:

  • A Kubernetes cluster with version 1.25 or newer is running. Learn how to install the Kubernetes tools.
  • The SuspendJob feature gate is enabled. In Kubernetes 1.22 or newer, the feature gate is enabled by default.
  • (Optional) The JobMutableNodeSchedulingDirectives feature gate (available in Kubernetes 1.22 or newer) is enabled. In Kubernetes 1.23 or newer, the feature gate is enabled by default.
  • The kubectl command-line tool can communicate with your cluster.
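
The version requirement can be checked from a shell by comparing the version your cluster reports against 1.25. The following is a minimal sketch, not part of Kueue or kubectl: version_at_least is a hypothetical helper, and it assumes version strings of the form v<major>.<minor>..., such as the serverVersion.gitVersion value printed by kubectl version -o json.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kueue or kubectl).
# Succeeds when version $1 (e.g. "v1.27.3") is at least major.minor $2 (e.g. "1.25").
version_at_least() {
  have=${1#v}
  want=${2#v}
  have_major=${have%%.*}
  have_rest=${have#*.}
  have_minor=${have_rest%%.*}
  have_minor=${have_minor%%[!0-9]*}   # drop any non-numeric suffix such as "+"
  want_major=${want%%.*}
  want_rest=${want#*.}
  want_minor=${want_rest%%.*}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Example with a literal version string; substitute the value your cluster reports.
if version_at_least "v1.27.3" "1.25"; then
  echo "cluster version is new enough for Kueue"
fi
```

The comparison looks only at the major and minor components, which is all the requirement above depends on.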

Kueue publishes metrics to monitor its operators. You can scrape these metrics with Prometheus. Use kube-prometheus if you don’t have your own monitoring system.

The Kueue webhook server uses internal certificate management to provision certificates. To use a third-party solution instead, such as cert-manager, follow these steps:

  1. Set internalCertManagement.enable to false in the config file.
  2. Comment out the internalcert folder in config/default/kustomization.yaml.
  3. Enable cert-manager in config/default/kustomization.yaml and uncomment all sections with ‘CERTMANAGER’.

Install a released version

To install a released version of Kueue in your cluster, run the following command:

kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.9.1/manifests.yaml

To wait for Kueue to be fully available, run:

kubectl wait deploy/kueue-controller-manager -nkueue-system --for=condition=available --timeout=5m
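
If you install different releases regularly, the release tag can be factored out of the URL. This is a minimal sketch under stated assumptions: kueue_release_url and the KUEUE_VERSION variable are hypothetical names introduced here, and the URL pattern is simply the one used by the commands on this page.

```shell
#!/bin/sh
# Hypothetical helper: prints the download URL of a Kueue release asset.
# $1 = release tag (e.g. v0.9.1), $2 = asset name (default: manifests.yaml)
kueue_release_url() {
  printf 'https://github.com/kubernetes-sigs/kueue/releases/download/%s/%s\n' \
    "$1" "${2:-manifests.yaml}"
}

# KUEUE_VERSION is an illustrative variable name, not something Kueue reads.
KUEUE_VERSION=${KUEUE_VERSION:-v0.9.1}
echo "Manifest URL: $(kueue_release_url "$KUEUE_VERSION")"

# To install and wait, run the same commands as above with the computed URL:
#   kubectl apply --server-side -f "$(kueue_release_url "$KUEUE_VERSION")"
#   kubectl wait deploy/kueue-controller-manager -nkueue-system \
#     --for=condition=available --timeout=5m
```

The same helper covers other release assets, for example kueue_release_url v0.9.1 prometheus.yaml.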

Add metrics scraping for prometheus-operator

To allow prometheus-operator to scrape metrics from Kueue components, run the following command:

kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.9.1/prometheus.yaml

Add API Priority and Fairness configuration for the visibility API

See Configure API Priority and Fairness for more details.

Uninstall

To uninstall a released version of Kueue from your cluster, run the following command:

kubectl delete -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.9.1/manifests.yaml

Install a custom-configured released version

To install a custom-configured released version of Kueue in your cluster, execute the following steps:

  1. Download the release’s manifests.yaml file:
wget https://github.com/kubernetes-sigs/kueue/releases/download/v0.9.1/manifests.yaml
  2. With an editor of your preference, open manifests.yaml.
  3. In the kueue-manager-config ConfigMap manifest, edit the controller_manager_config.yaml data entry. The entry represents the default KueueConfiguration. The contents of the ConfigMap are similar to the following:
apiVersion: v1
kind: ConfigMap
metadata:
  name: kueue-manager-config
  namespace: kueue-system
data:
  controller_manager_config.yaml: |
    apiVersion: config.kueue.x-k8s.io/v1beta1
    kind: Configuration
    namespace: kueue-system
    health:
      healthProbeBindAddress: :8081
    metrics:
      bindAddress: :8080
      # enableClusterQueueResources: true
    webhook:
      port: 9443
    manageJobsWithoutQueueName: true
    internalCertManagement:
      enable: true
      webhookServiceName: kueue-webhook-service
      webhookSecretName: kueue-webhook-server-cert
    waitForPodsReady:
      enable: true
      timeout: 10m
    integrations:
      frameworks:
      - "batch/job"

The integrations.externalFrameworks field is available in Kueue v0.7.0 and later.

  4. Apply the customized manifests to the cluster:
kubectl apply --server-side -f manifests.yaml
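
For repeatable installs, the manual edit in the steps above could be scripted. The following is a rough sketch rather than a recommended tool: set_yaml_scalar is a hypothetical helper that does a line-based substitution with sed, so it is only suitable for keys that occur exactly once in the file (manageJobsWithoutQueueName in the example above), and it does not understand YAML structure.

```shell
#!/bin/sh
# Hypothetical helper: rewrites every "<key>: <value>" line in a YAML file.
# Only safe for keys that occur once in the file, such as manageJobsWithoutQueueName;
# a key like "enable" appears under several sections and would be changed everywhere.
# $1 = file, $2 = key, $3 = new value (must not contain "/")
set_yaml_scalar() {
  tmp=$(mktemp) || return 1
  sed -E "s/^([[:space:]]*)$2:.*/\1$2: $3/" "$1" > "$tmp" && mv "$tmp" "$1"
}

# Example on a scratch file; in practice the target would be the downloaded manifests.yaml.
demo=$(mktemp)
printf '    webhook:\n      port: 9443\n    manageJobsWithoutQueueName: true\n' > "$demo"
set_yaml_scalar "$demo" manageJobsWithoutQueueName false
cat "$demo"
rm -f "$demo"
```

For anything beyond a single unique key, editing the file by hand or using a YAML-aware tool is the safer choice.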

Install the latest development version

To install the latest development version of Kueue in your cluster, run the following command:

kubectl apply --server-side -k "github.com/kubernetes-sigs/kueue/config/default?ref=main"

The controller runs in the kueue-system namespace.

Uninstall

To uninstall Kueue, run the following command:

kubectl delete -k "github.com/kubernetes-sigs/kueue/config/default?ref=main"

Build and install from source

To build Kueue from source and install Kueue in your cluster, run the following commands:

git clone https://github.com/kubernetes-sigs/kueue.git
cd kueue
IMAGE_REGISTRY=registry.example.com/my-user make image-local-push deploy

Add metrics scraping for prometheus-operator

To allow prometheus-operator to scrape metrics from Kueue components, run the following command:

make prometheus

Uninstall

To uninstall Kueue, run the following command:

make undeploy

Install via Helm

To install and configure Kueue with Helm, follow the instructions.

Change the feature gates configuration

Kueue uses a similar mechanism to configure features as described in Kubernetes Feature Gates.

To change the default of a feature, edit the kueue-controller-manager deployment in the Kueue installation namespace and add the following to the manager container arguments:

--feature-gates=...,<FeatureName>=<true|false>

For example, to enable PartialAdmission, you should change the manager deployment as follows:

kind: Deployment
...
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: manager
        args:
        - --config=/controller_manager_config.yaml
        - --zap-log-level=2
+       - --feature-gates=PartialAdmission=true
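
The same change could be made without opening an editor by applying a JSON Patch with kubectl patch. This is a sketch under assumptions, not the documented procedure: feature_gate_patch is a hypothetical helper, it assumes the manager container is at index 0 in the pod spec, and it assumes the deployment has no existing --feature-gates argument (if it does, edit that argument instead of appending a second one).

```shell
#!/bin/sh
# Hypothetical helper: prints a JSON Patch that appends a --feature-gates argument.
# $1 = comma-separated gates (e.g. "PartialAdmission=true")
# $2 = index of the manager container in the pod spec (ASSUMED to be 0)
feature_gate_patch() {
  printf '[{"op":"add","path":"/spec/template/spec/containers/%s/args/-","value":"--feature-gates=%s"}]\n' \
    "${2:-0}" "$1"
}

# Print the patch for inspection before applying it.
feature_gate_patch "PartialAdmission=true"

# The patch could then be applied with kubectl (namespace and deployment name
# as used elsewhere on this page):
#   kubectl -n kueue-system patch deployment kueue-controller-manager \
#     --type=json -p "$(feature_gate_patch "PartialAdmission=true")"
```

Printing the patch first makes it easy to confirm the container index and gate list before the deployment is modified.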

The currently supported features are:

Feature                              Default  Stage       Since  Until
FlavorFungibility                    true     Beta        0.5
MultiKueue                           false    Alpha       0.6    0.8
MultiKueue                           true     Beta        0.9
MultiKueueBatchJobWithManagedBy      false    Alpha       0.8
PartialAdmission                     false    Alpha       0.4    0.4
PartialAdmission                     true     Beta        0.5
ProvisioningACC                      false    Alpha       0.5    0.6
ProvisioningACC                      true     Beta        0.7
QueueVisibility                      false    Alpha       0.5    0.9
QueueVisibility                      false    Deprecated  0.9
VisibilityOnDemand                   false    Alpha       0.6    0.8
VisibilityOnDemand                   true     Beta        0.9
PrioritySortingWithinCohort          true     Beta        0.6
LendingLimit                         false    Alpha       0.6    0.8
LendingLimit                         true     Beta        0.9
MultiplePreemptions                  false    Alpha       0.8    0.8
MultiplePreemptions                  true     Beta        0.9
TopologyAwareScheduling              false    Alpha       0.9
ConfigurableResourceTransformations  false    Alpha       0.9    0.9
ConfigurableResourceTransformations  true     Beta        0.10
WorkloadResourceRequestsSummary      false    Alpha       0.9    0.9
WorkloadResourceRequestsSummary      true     Beta        0.10
AdmissionCheckValidationRules        false    Deprecated  0.9    0.9
KeepQuotaForProvReqRetry             false    Deprecated  0.9    0.9

What’s next