How to monitor Kubernetes + Docker with Datadog for the Keep-Network project.

PurDasha
Oct 29, 2020

Collect Kubernetes and Docker metrics

First, you will need to deploy the Datadog Agent [https://github.com/DataDog/datadog-agent] to collect key resource metrics and events from Kubernetes and Docker for monitoring in Datadog. In this section, we will show you one way to install the containerized Datadog Agent as a DaemonSet on every node in your Kubernetes cluster. Or, if you only want to install it on a specific subset of nodes, you can add a nodeSelector field to your pod configuration.
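For example, adding a nodeSelector under the DaemonSet's pod template restricts the Agent to nodes that carry a matching label. The monitoring label below is hypothetical; apply it to the nodes you want with kubectl label nodes <NODE_NAME> monitoring=true:

spec:
  template:
    spec:
      nodeSelector:
        # Hypothetical label; only nodes labeled monitoring=true will run the Agent
        monitoring: "true"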

If your Kubernetes cluster uses role-based access control (RBAC), you can deploy the Datadog Agent’s RBAC manifest (rbac-agent.yaml) to grant it the necessary permissions to operate in your cluster. Doing this creates a ClusterRole, ClusterRoleBinding, and ServiceAccount for the Agent.

kubectl create -f "https://raw.githubusercontent.com/DataDog/datadog-agent/master/Dockerfiles/manifests/cluster-agent/rbac/rbac-agent.yaml"
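If you want to confirm that those objects exist before moving on (the datadog-agent names below assume the defaults in the linked manifest), you can list them with kubectl:

kubectl get clusterrole datadog-agent
kubectl get clusterrolebinding datadog-agent
kubectl get serviceaccount datadog-agent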

Next, copy the following manifest to a local file and save it as datadog-agent.yaml.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: datadog-agent
  namespace: default
spec:
  selector:
    matchLabels:
      app: datadog-agent
  template:
    metadata:
      labels:
        app: datadog-agent
      name: datadog-agent
    spec:
      serviceAccountName: datadog-agent
      containers:
      - image: datadog/agent:latest
        imagePullPolicy: Always
        name: datadog-agent
        ports:
        - containerPort: 3919
          name: dogstatsdport
          protocol: UDP
        - containerPort: 3920
          name: traceport
          protocol: TCP
        env:
        - name: DD_API_KEY
          value: <YOUR_API_KEY>
        - name: DD_COLLECT_KUBERNETES_EVENTS
          value: "true"
        - name: DD_LEADER_ELECTION
          value: "true"
        - name: KUBERNETES
          value: "true"
        - name: DD_HEALTH_PORT
          value: "5555"
        - name: DD_KUBELET_TLS_VERIFY
          value: "false"
        - name: DD_KUBERNETES_KUBELET_HOST
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: DD_APM_ENABLED
          value: "true"
        resources:
          requests:
            memory: "256Mi"
            cpu: "200m"
          limits:
            memory: "256Mi"
            cpu: "200m"
        volumeMounts:
        - name: dockersocket
          mountPath: /var/run/docker.sock
        - name: procdir
          mountPath: /host/proc
          readOnly: true
        - name: cgroups
          mountPath: /host/sys/fs/cgroup
          readOnly: true
        livenessProbe:
          httpGet:
            path: /health
            port: 5555
          initialDelaySeconds: 15
          periodSeconds: 15
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 3
      volumes:
      - hostPath:
          path: /var/run/docker.sock
        name: dockersocket
      - hostPath:
          path: /proc
        name: procdir
      - hostPath:
          path: /sys/fs/cgroup
        name: cgroups

Replace <YOUR_API_KEY> with an API key from your Datadog account. Then run the following command to deploy the Agent as a DaemonSet:

kubectl create -f datadog-agent.yaml
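A side note on the API key: rather than leaving it in plain text in the manifest, you can store it in a Kubernetes Secret and reference it from the DaemonSet. This is a minimal sketch; the Secret name datadog-secret and key api-key are arbitrary choices, not part of the original walkthrough:

kubectl create secret generic datadog-secret --from-literal=api-key=<YOUR_API_KEY>

Then swap the DD_API_KEY entry in datadog-agent.yaml for a secretKeyRef:

        - name: DD_API_KEY
          valueFrom:
            secretKeyRef:
              # References the hypothetical Secret created above
              name: datadog-secret
              key: api-key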

Now you can verify that the Agent is collecting Docker and Kubernetes metrics by running the Agent’s status command. To do that, you first need to get the list of running pods so you can run the command on one of the Datadog Agent pods:

# Get the list of running pods
$ kubectl get pods
NAME                  READY   STATUS    RESTARTS   AGE
datadog-agent-krrmd   1/1     Running   0          17d
...

# Use the pod name returned above to run the Agent's 'status' command
$ kubectl exec -it datadog-agent-krrmd -- agent status

In the output you should see sections resembling the following, indicating that Kubernetes and Docker metrics are being collected:

kubelet (4.1.0)
---------------
Instance ID: kubelet:d884b5186b651429 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/kubelet.d/conf.yaml.default
Total Runs: 35
Metric Samples: Last Run: 378, Total: 14,191
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 4, Total: 140
Average Execution Time : 817ms
Last Execution Date : 2020-10-22 15:20:37.000000 UTC
Last Successful Execution Date : 2020-10-22 15:20:37.000000 UTC

docker
------
Instance ID: docker [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/docker.d/conf.yaml.default
Total Runs: 35
Metric Samples: Last Run: 290, Total: 15,537
Events: Last Run: 1, Total: 4
Service Checks: Last Run: 1, Total: 35
Average Execution Time : 101ms
Last Execution Date : 2020-10-22 15:20:30.000000 UTC
Last Successful Execution Date : 2020-10-22 15:20:30.000000 UTC

Now you can glance at your built-in Datadog dashboards for Kubernetes and Docker to see what those metrics look like.

And if you’re running a large-scale production deployment, you can also install the Datadog Cluster Agent alongside the node-based Agent as a centralized, streamlined way to collect cluster-level data for deeper visibility into your infrastructure.
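If you go that route, one common installation path is Datadog's public Helm chart. This is a sketch; exact chart values can vary between chart versions:

helm repo add datadog https://helm.datadoghq.com
helm repo update

# Deploys the node-based Agent and, with this flag, the Cluster Agent as well
helm install datadog-agent datadog/datadog \
  --set datadog.apiKey=<YOUR_API_KEY> \
  --set clusterAgent.enabled=true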

Add more Kubernetes metrics with kube-state-metrics

By default, the Kubernetes Agent check reports a handful of basic system metrics to Datadog, covering CPU, network, disk, and memory usage. You can easily expand on the data collected from Kubernetes by deploying the kube-state-metrics [https://github.com/kubernetes/kube-state-metrics] add-on to your cluster, which provides much more detailed metrics on the state of the cluster itself.

kube-state-metrics listens to the Kubernetes API and generates metrics about the state of Kubernetes logical objects: node status, node capacity (CPU and memory), the number of desired/available/unavailable/updated replicas per deployment, pod status (e.g., waiting, running, ready), and so on. The full list of metrics that Datadog collects from kube-state-metrics is available in the Datadog documentation.
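For reference, here are a few of those metrics as they are named in Datadog (a small representative sample, not the full list):

kubernetes_state.node.cpu_capacity               # CPU capacity per node
kubernetes_state.deployment.replicas_available   # available replicas per deployment
kubernetes_state.pod.status_phase                # pod counts by phase (Pending, Running, ...)
kubernetes_state.container.restarts              # container restart counts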

To deploy kube-state-metrics as a Kubernetes service, use the standard manifests from the project repository [https://github.com/kubernetes/kube-state-metrics/tree/master/examples/standard]. Note that the Service manifest alone is not enough: the directory also contains the ServiceAccount, ClusterRole, ClusterRoleBinding, and Deployment that kube-state-metrics needs to run. One straightforward approach is to clone the repository and apply the whole directory:

git clone https://github.com/kubernetes/kube-state-metrics.git
kubectl apply -f kube-state-metrics/examples/standard/
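You can then check that everything came up. The standard manifests create the objects in the kube-system namespace with the label app.kubernetes.io/name=kube-state-metrics; adjust if yours differ:

kubectl get deployment,service -n kube-system -l app.kubernetes.io/name=kube-state-metrics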

Within minutes, you should see metrics with the prefix kubernetes_state. streaming into your Datadog account.
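You can also verify from the Agent side. The Agent's Autodiscovery picks up the kube-state-metrics endpoint and runs the kubernetes_state check against it, so the check should appear in the status output (using the pod name from the earlier example):

$ kubectl exec -it datadog-agent-krrmd -- agent status | grep -A 4 kubernetes_state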

I compiled these instructions from the Datadog website: https://www.datadoghq.com/
