Monitoring the DevWorkspace operator

This chapter describes how to configure an example monitoring stack to process metrics exposed by the DevWorkspace operator. You must enable the DevWorkspace operator to follow the instructions in this chapter. See Enabling DevWorkspace operator.

Collecting DevWorkspace operator metrics with Prometheus

This section describes how to use Prometheus to collect, store, and query metrics about the DevWorkspace operator.

Prerequisites

  • The devworkspace-controller-metrics service exposes metrics on port 8443.

  • The devworkspace-webhookserver service exposes metrics on port 9443, which is the default.

  • Prometheus 2.26.0 or later is running. The Prometheus console is running on port 9090 with a corresponding service and route. See First steps with Prometheus.
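
As an illustration of the last prerequisite, the following sketch shows what a Service and an OpenShift Route for the Prometheus console on port 9090 might look like. The resource names, the app: prometheus selector, and the monitoring namespace are assumptions made for this example and are not defined by the DevWorkspace operator; adapt them to your Prometheus deployment. On plain Kubernetes, an Ingress would take the place of the Route.

```yaml
# Hypothetical Service for a Prometheus Pod labeled app=prometheus.
apiVersion: v1
kind: Service
metadata:
  name: prometheus          # assumed name
  namespace: monitoring     # assumed namespace
spec:
  selector:
    app: prometheus         # assumed Pod label
  ports:
    - name: web
      port: 9090            # port of the Prometheus console
      targetPort: 9090
---
# Hypothetical OpenShift Route that exposes the Service above.
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: prometheus
  namespace: monitoring
spec:
  to:
    kind: Service
    name: prometheus
  port:
    targetPort: web         # refers to the named Service port
```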

Procedure

  1. Create a ClusterRoleBinding to bind the ServiceAccount associated with Prometheus to the devworkspace-controller-metrics-reader ClusterRole. Without the ClusterRoleBinding, you cannot access DevWorkspace metrics because they are protected with role-based access control (RBAC).

    Example 1. ClusterRole example
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: devworkspace-controller-metrics-reader
    rules:
    - nonResourceURLs:
      - /metrics
      verbs:
      - get

    Example 2. ClusterRoleBinding example
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: devworkspace-controller-metrics-binding
    subjects:
      - kind: ServiceAccount
        name: <ServiceAccount name associated with the Prometheus Pod>
        namespace: <Prometheus namespace>
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: devworkspace-controller-metrics-reader

  2. Configure Prometheus to scrape metrics from the 8443 port exposed by the devworkspace-controller-metrics service, and 9443 port exposed by the devworkspace-webhookserver service.

    Example 3. Prometheus configuration example
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: prometheus-config
    data:
      prometheus.yml: |-
        global:
          scrape_interval:     5s             # (1)
          evaluation_interval: 5s             # (2)
        scrape_configs:                       # (3)
          - job_name: 'DevWorkspace'
            scheme: https
            authorization:
              type: Bearer
              credentials_file: '/var/run/secrets/'
            tls_config:
              insecure_skip_verify: true
            static_configs:
              - targets: ['devworkspace-controller-metrics:8443']  # (4)
          - job_name: 'DevWorkspace webhooks'
            scheme: https
            authorization:
              type: Bearer
              credentials_file: '/var/run/secrets/'
            tls_config:
              insecure_skip_verify: true
            static_configs:
              - targets: ['devworkspace-webhookserver:9443']  # (5)

(1) Rate at which a target is scraped.
(2) Rate at which recording and alerting rules are re-checked.
(3) Resources that Prometheus monitors. In the default configuration, two jobs (DevWorkspace and DevWorkspace webhooks) scrape the time series data exposed by the devworkspace-controller-metrics and devworkspace-webhookserver services.
(4) Scrapes metrics from port 8443 of the devworkspace-controller-metrics service.
(5) Scrapes metrics from port 9443 of the devworkspace-webhookserver service.

Verification steps
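
One way to confirm that both jobs are being scraped, sketched here under the assumption that the job names from the example configuration are used unchanged, is to watch the built-in up series, which Prometheus sets to 1 for every target it scrapes successfully and to 0 otherwise. The rule file name, group name, alert name, and severity label below are illustrative only, and the file must be listed under rule_files in prometheus.yml to take effect.

```yaml
# Hypothetical rule file, for example devworkspace-alerts.yml
groups:
  - name: devworkspace-scrape-health        # assumed group name
    rules:
      - alert: DevWorkspaceMetricsTargetDown  # assumed alert name
        # up == 0 means the most recent scrape of the target failed.
        expr: up{job=~"DevWorkspace|DevWorkspace webhooks"} == 0
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Prometheus cannot scrape the {{ $labels.job }} job"
```

Entering the same expression without the == 0 comparison in the Prometheus console shows the current state of each target.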

DevWorkspace-specific metrics

This section describes the DevWorkspace-specific metrics exposed by the devworkspace-controller-metrics service.

Table 1. Metrics
| Name | Type | Description | Labels |
|  |  | Number of DevWorkspace starting events. | source, routingclass |
|  |  | Number of DevWorkspaces successfully entering the Running phase. | source, routingclass |
|  |  | Number of failed DevWorkspaces. | source, reason |
|  |  | Total time taken to start a DevWorkspace, in seconds. | source, routingclass |

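Table 2. Labels
| Name | Description | Values |
| source | The label of the DevWorkspace. |  |
| routingclass | The spec.routingclass of the DevWorkspace. |  |
| reason | The workspace startup failure reason. | BadRequest, InfrastructureFailure, Unknown |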

Table 3. Startup failure reasons
| Name | Description |
| BadRequest | Startup failure due to an invalid devfile used to create a DevWorkspace. |
| InfrastructureFailure | Startup failure due to the following errors: CreateContainerError, RunContainerError, FailedScheduling, FailedMount. |
| Unknown | Unknown failure reason. |

Viewing DevWorkspace operator metrics on Grafana dashboards

This section describes how to view DevWorkspace operator metrics on Grafana with the example dashboard. Grafana version 7.5.3 or later is required to support all panels in the example dashboard.

  1. Add the data source for the Prometheus instance. See Creating a Prometheus data source.

  2. Import the example grafana-dashboard.json dashboard.
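
If you prefer to provision the data source from a file rather than through the Grafana UI, a provisioning file along the following lines could be used. This is only a sketch: the file location, the data source name, and the http://prometheus:9090 URL are assumptions based on the Prometheus console port used in this chapter, not values defined by the example dashboard.

```yaml
# Hypothetical file: <Grafana provisioning directory>/datasources/prometheus.yml
apiVersion: 1
datasources:
  - name: Prometheus                 # assumed data source name
    type: prometheus
    access: proxy                    # the Grafana backend proxies the queries
    url: http://prometheus:9090      # assumed in-cluster Service URL
    isDefault: true
```

When you import the dashboard, select this data source if the import dialog asks for one.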

Verification steps

Grafana dashboards for the DevWorkspace operator

This section describes the example Grafana dashboard (grafana-dashboard.json), which displays metrics collected from the DevWorkspace operator.

Figure 1. The DevWorkspace-specific metrics panel

The DevWorkspace-specific metrics panel contains information related to DevWorkspace startup.

Average workspace start time

The average start time of a workspace.

Workspace starts

The number of successful and failed workspace starts.

Workspace startup duration

A heatmap that displays workspace startup duration.

DevWorkspace successes / failures

A comparison between successful and failed DevWorkspace startups.

DevWorkspace failure rate

The ratio of failed workspace startups to the total number of workspace startups.

DevWorkspace startup failure reasons

A pie chart that displays the distribution of workspace startup failures. The possible failure reasons are:

  • BadRequest

  • InfrastructureFailure

  • Unknown

Figure 2. The Operator metrics panel, part 1

Webhooks in flight

A comparison between the number of different webhook requests.

Work queue duration

A heatmap that displays how long the reconcile requests stay in the work queue before they are handled.

Webhooks latency (/mutate)

A heatmap that displays /mutate webhook latency.

Reconcile time

A heatmap that displays the reconcile duration.

Figure 3. The Operator metrics panel, part 2

Webhooks latency (/convert)

A heatmap that displays /convert webhook latency.

Work queue depth

The number of reconcile requests that are in the work queue.


Memory usage for the DevWorkspace controller and the DevWorkspace webhook server.

Reconcile counts (DWO)

The average number of reconciles per second for the DevWorkspace controller.
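
The queries behind these panels are not reproduced in this chapter. As a rough sketch, assuming the operator exposes the standard controller-runtime metrics (controller_runtime_reconcile_total, workqueue_depth, and workqueue_queue_duration_seconds_bucket) and that the DevWorkspace job name from the example Prometheus configuration is in use, recording rules like the following could approximate the reconcile count, work queue depth, and work queue duration panels. The group and rule names are made up for this example.

```yaml
groups:
  - name: devworkspace-operator-panels   # hypothetical group name
    rules:
      # Average number of reconciles per second over the last 5 minutes.
      - record: devworkspace:reconcile:rate5m
        expr: sum(rate(controller_runtime_reconcile_total{job="DevWorkspace"}[5m]))
      # Current number of reconcile requests waiting in the work queue.
      - record: devworkspace:workqueue_depth:sum
        expr: sum(workqueue_depth{job="DevWorkspace"})
      # 95th percentile of the time requests spend in the work queue, in seconds.
      - record: devworkspace:workqueue_queue_duration_seconds:p95
        expr: histogram_quantile(0.95, sum by (le) (rate(workqueue_queue_duration_seconds_bucket{job="DevWorkspace"}[5m])))
```

The dashboard shows the queue duration as a heatmap; the single percentile here is a simplification chosen to keep the rule short.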