Monitoring the DevWorkspace operator
This chapter describes how to configure an example monitoring stack to process metrics exposed by the DevWorkspace operator. You must enable the DevWorkspace operator to follow the instructions in this chapter. See Enabling DevWorkspace operator.
Collecting DevWorkspace operator metrics with Prometheus
This section describes how to use Prometheus to collect, store, and query metrics about the DevWorkspace operator.
Prerequisites
- The devworkspace-controller-metrics service is exposing metrics on port 8443.
- The devworkspace-webhookserver service is exposing metrics on port 9443, which is the default port for this service.
- Prometheus 2.26.0 or later is running. The Prometheus console is running on port 9090 with a corresponding service and route. See First steps with Prometheus.
Procedure
- Create a ClusterRoleBinding to bind the ServiceAccount associated with Prometheus to the devworkspace-controller-metrics-reader ClusterRole. Without the ClusterRoleBinding, you cannot access DevWorkspace metrics because they are protected with role-based access control (RBAC).

  Example 1. ClusterRole example

      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      metadata:
        name: devworkspace-controller-metrics-reader
      rules:
        - nonResourceURLs:
            - /metrics
          verbs:
            - get

  Example 2. ClusterRoleBinding example

      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRoleBinding
      metadata:
        name: devworkspace-controller-metrics-binding
      subjects:
        - kind: ServiceAccount
          name: <ServiceAccount name associated with the Prometheus Pod>
          namespace: <Prometheus namespace>
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: ClusterRole
        name: devworkspace-controller-metrics-reader

- Configure Prometheus to scrape metrics from port 8443, exposed by the devworkspace-controller-metrics service, and from port 9443, exposed by the devworkspace-webhookserver service.

  Example 3. Prometheus configuration example

      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: prometheus-config
      data:
        prometheus.yml: |-
          global:
            scrape_interval: 5s # (1)
            evaluation_interval: 5s # (2)
          scrape_configs: # (3)
            - job_name: 'DevWorkspace'
              authorization:
                type: Bearer
                credentials_file: '/var/run/secrets/kubernetes.io/serviceaccount/token'
              tls_config:
                insecure_skip_verify: true
              static_configs:
                - targets: ['devworkspace-controller-metrics:8443'] # (4)
            - job_name: 'DevWorkspace webhooks'
              authorization:
                type: Bearer
                credentials_file: '/var/run/secrets/kubernetes.io/serviceaccount/token'
              tls_config:
                insecure_skip_verify: true
              static_configs:
                - targets: ['devworkspace-webhookserver:9443'] # (5)

  (1) Rate at which a target is scraped.
  (2) Rate at which recording and alerting rules are re-checked.
  (3) Resources that Prometheus monitors. In the default configuration, two jobs (DevWorkspace and DevWorkspace webhooks) scrape the time series data exposed by the devworkspace-controller-metrics and devworkspace-webhookserver services.
  (4) Scrape metrics from port 8443.
  (5) Scrape metrics from port 9443.

- Use the Prometheus console to view targets and metrics. For more information, see Using the expression browser.
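
The target check in the last step can also be scripted against the Prometheus HTTP API, which lists scrape targets at /api/v1/targets. The following is a minimal sketch, not an official tool: it assumes the job names from the example configuration, and the helper names summarize_targets and fetch_targets, as well as the address http://prometheus:9090, are hypothetical.

```python
import json
import urllib.request

# Job names taken from the example Prometheus configuration.
EXPECTED_JOBS = ("DevWorkspace", "DevWorkspace webhooks")


def summarize_targets(payload, expected_jobs=EXPECTED_JOBS):
    """Map each expected job to the health Prometheus reports, or 'missing'."""
    health = {job: "missing" for job in expected_jobs}
    for target in payload.get("data", {}).get("activeTargets", []):
        job = target.get("labels", {}).get("job")
        if job in health:
            health[job] = target.get("health", "unknown")
    return health


def fetch_targets(base_url):
    """Call GET /api/v1/targets on a Prometheus server and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=10) as response:
        return json.load(response)


# Example call (hypothetical address, replace with your Prometheus route):
#   print(summarize_targets(fetch_targets("http://prometheus:9090")))
```

A job reported as missing usually points at the scrape configuration, while a job reported as down is more likely an RBAC or TLS problem on the target side.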
DevWorkspace-specific metrics
This section describes the DevWorkspace-specific metrics exposed by the devworkspace-controller-metrics service.
| Name | Type | Description | Labels |
|---|---|---|---|
| | Counter | Number of DevWorkspace starting events. | |
| | Counter | Number of DevWorkspaces successfully entering the | |
| | Counter | Number of failed DevWorkspaces. | |
| | Histogram | Total time taken to start a DevWorkspace, in seconds. | |
| Name | Description | Values |
|---|---|---|
| | The | |
| | The | |
| | The workspace startup failure reason. | |
| Name | Description |
|---|---|
| BadRequest | Startup failure due to an invalid devfile used to create a DevWorkspace. |
| InfrastructureFailure | Startup failure due to the following errors: |
| Unknown | Unknown failure reason. |
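
To illustrate how a labeled counter relates to the failure reasons above, the following sketch sums the samples of one counter by one label, reading the Prometheus text exposition format. It is a simplified parser written for this example only. The metric name example_workspace_failures_total, the label names source and reason, and the sample values are all placeholders, not the names or data exposed by the operator.

```python
import re
from collections import defaultdict

# One sample line: name, optional {labels}, then the value.
# Simplified: label values containing '}' or escaped quotes are not handled.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')

# Hypothetical scrape output; metric and label names are placeholders.
SAMPLE_TEXT = """\
# TYPE example_workspace_failures_total counter
example_workspace_failures_total{source="a",reason="BadRequest"} 2
example_workspace_failures_total{source="b",reason="BadRequest"} 1
example_workspace_failures_total{source="a",reason="InfrastructureFailure"} 5
example_workspace_failures_total{source="a",reason="Unknown"} 1
"""


def sum_by_label(text, metric, label):
    """Sum the samples of one metric, grouped by the value of one label."""
    totals = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        totals[labels.get(label, "")] += float(match.group("value"))
    return dict(totals)


print(sum_by_label(SAMPLE_TEXT, "example_workspace_failures_total", "reason"))
```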
Viewing DevWorkspace operator metrics on Grafana dashboards
This section describes how to view DevWorkspace operator metrics on Grafana with the example dashboard. Grafana version 7.5.3 or later is required to support all panels in the example dashboard.
Prerequisites
- Prometheus is collecting metrics. See Collecting DevWorkspace operator metrics with Prometheus.
- Grafana is running on port 3000 with a corresponding service and route. See Installing Grafana.
Procedure
- Add the data source for the Prometheus instance. See Creating a Prometheus data source.
- Import the example grafana-dashboard.json dashboard.
- Use the Grafana console to view the DevWorkspace operator metrics dashboard. See Grafana dashboards for the DevWorkspace operator.
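
The data source from the first step can also be created through the Grafana HTTP API (POST /api/datasources) instead of the console. The following is a minimal sketch that only builds the request and does not send it. The function name build_datasource_request, the data source name, the service addresses, and the use of an API token are assumptions for illustration.

```python
import json
import urllib.request


def build_datasource_request(grafana_url, prometheus_url, api_token):
    """Build (without sending) a POST /api/datasources request for Grafana."""
    payload = {
        "name": "Prometheus",   # display name, chosen for this example
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",      # the Grafana backend proxies the queries
    }
    return urllib.request.Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


# Example (hypothetical addresses and token):
#   request = build_datasource_request(
#       "http://grafana:3000", "http://prometheus:9090", "<API token>")
#   urllib.request.urlopen(request, timeout=10)
```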
Grafana dashboards for the DevWorkspace operator
This section describes the example Grafana dashboard, grafana-dashboard.json, which displays metrics collected from the DevWorkspace operator.

The DevWorkspace-specific metrics panel contains information related to DevWorkspace startup.
- Average workspace start time: The average start time of a workspace.
- Workspace starts: The number of successful and failed workspace starts.
- Workspace startup duration: A heatmap that displays workspace startup duration.
- DevWorkspace successes / failures: A comparison between successful and failed DevWorkspace startups.
- DevWorkspace failure rate: The ratio of failed workspace startups to the total number of workspace startups.
- DevWorkspace startup failure reasons: A pie chart that displays the distribution of workspace startup failures. The possible failure reasons are:
  - BadRequest
  - InfrastructureFailure
  - Unknown
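
The arithmetic behind the failure rate and average start time panels can be sketched as follows. This is an illustration under stated assumptions, not the dashboard's actual queries: the function names are hypothetical, and the average relies on the fact that a Prometheus histogram exposes a running sum of observed values and a count of observations.

```python
def failure_rate(failed_starts, total_starts):
    """Ratio of failed workspace startups to all workspace startups."""
    if total_starts == 0:
        return 0.0  # no startups yet, so report zero instead of dividing by zero
    return failed_starts / total_starts


def average_start_seconds(duration_sum_seconds, start_count):
    """Mean startup time from a histogram's running sum and observation count."""
    if start_count == 0:
        return None  # nothing observed yet
    return duration_sum_seconds / start_count


# Illustrative numbers only: 3 failures out of 12 starts, 90 s over 4 starts.
print(failure_rate(3, 12))
print(average_start_seconds(90.0, 4))
```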

- Webhooks in flight: A comparison between the number of different webhook requests.
- Work queue duration: A heatmap that displays how long the reconcile requests stay in the work queue before they are handled.
- Webhooks latency (/mutate): A heatmap that displays /mutate webhook latency.
- Reconcile time: A heatmap that displays the reconcile duration.

- Webhooks latency (/convert): A heatmap that displays /convert webhook latency.
- Work queue depth: The number of reconcile requests that are in the work queue.
- Memory: Memory usage for the DevWorkspace controller and the DevWorkspace webhook server.
- Reconcile counts (DWO): The average number of reconciles per second for the DevWorkspace controller.
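
A per-second average such as the one in the reconcile counts panel is derived from a counter that only ever increases. The following sketch shows the idea on a list of (timestamp, value) samples. It is a simplification with a hypothetical function name: unlike the PromQL rate() function, it does no extrapolation to the window edges.

```python
def per_second_rate(samples):
    """Average per-second increase of a counter over (timestamp, value) samples.

    A drop in value is treated as a counter reset (for example, after a
    process restart), in which case the new value counts as the increase.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        increase += (current - previous) if current >= previous else current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Illustrative samples: the counter grows by 60 over 60 seconds.
print(per_second_rate([(0, 10), (30, 40), (60, 70)]))
```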