Delegating CNI requests to a gRPC server for better tracing

By Hirotaka Yamamoto (@ymmt2005)

Coil v2 is a Kubernetes network plugin that implements the Container Network Interface (CNI) specification 0.4.0.

CNI defines plugins as executables. In Kubernetes, kubelet or a container runtime such as containerd executes the CNI plugin binary directly. Because a CNI plugin is an executable that exits as soon as it has handled a request, it is difficult to collect logs and metrics continuously.

To work around this problem, Coil delegates CNI requests to a long-running gRPC server. This article describes the benefits of handling CNI requests in a gRPC server and how to implement it in Go.

Table of contents

Benefits of using gRPC server to handle CNI requests
Protocol schema
Handling namespace files
Logging
Metrics
Advanced caching
Conclusion

Benefits of using gRPC server to handle CNI requests

gRPC is a protocol for remote procedure calls. It usually uses Protocol Buffers as its schema language. The gRPC and Protocol Buffers implementations for Go are particularly good. This is important because the CNI project provides libraries for implementing plugins only for Go.

The Go gRPC implementation can communicate over UNIX domain sockets, so the overhead incurred by gRPC is minimal. Unlike TCP, a connection over a UNIX domain socket needs no three-way handshake, and the kernel delivers the data reliably without passing it through the network stack.

The CNI plugin executable of Coil is named coil. It delegates CNI requests to a gRPC server named coild running on the same Node over a UNIX domain socket.

Figure: CNI request flow in Coil

Since coild is a long-running process, it can be run as a DaemonSet Pod. As such, it is easy to collect logs and export metrics related to CNI requests from the Pod.

Another advantage of having a long-running process is that it reduces the latency for processing CNI requests by caching the required data.

Protocol schema

The CNI project provides a Go package for handling requests: github.com/containernetworking/cni/pkg/skel.

We translated this API into a Protocol Buffers schema. For example, CmdArgs, which represents the input parameters for all types of CNI requests:

type CmdArgs struct {
    ContainerID string
    Netns       string
    IfName      string
    Args        string
    Path        string
    StdinData   []byte
}

... is translated into the following protocol buffer message:

message CNIArgs {
  string container_id = 1;
  string netns = 2;
  string ifname = 3;
  map<string,string> args = 4;  // Key-Value pairs parsed from Args
  string path = 5;
  bytes stdin_data = 6;
}

The complete schema is available as part of the Coil source code: cnirpc.proto. There is also a pre-compiled Go package of the schema: github.com/cybozu-go/coil/v2/pkg/cnirpc
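
The args field in CNIArgs is a map, while Args in CmdArgs is a single string. The CNI specification describes that string (the CNI_ARGS environment variable) as key-value pairs separated by semicolons, for example "FOO=BAR;ABC=123". The following is a minimal sketch of such a conversion. The function name parseCNIArgs and its error handling are assumptions made for illustration and are not taken from Coil.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCNIArgs splits a CNI_ARGS style string such as "FOO=BAR;ABC=123"
// into a map. Empty segments are skipped; a segment without "=" or with
// an empty key is reported as an error. (Hypothetical helper, not Coil's.)
func parseCNIArgs(s string) (map[string]string, error) {
	m := make(map[string]string)
	for _, kv := range strings.Split(s, ";") {
		if kv == "" {
			continue
		}
		k, v, ok := strings.Cut(kv, "=")
		if !ok || k == "" {
			return nil, fmt.Errorf("malformed CNI_ARGS segment %q", kv)
		}
		m[k] = v
	}
	return m, nil
}

func main() {
	// Kubernetes passes the Pod name and namespace through keys like these.
	m, err := parseCNIArgs("IgnoreUnknown=1;K8S_POD_NAMESPACE=test;K8S_POD_NAME=sample")
	if err != nil {
		panic(err)
	}
	fmt.Println(m["K8S_POD_NAMESPACE"], m["K8S_POD_NAME"])
}
```

Parsing once on the client side means the server can look up a key directly instead of re-parsing the string for every request.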

Handling namespace files

To set up networking for a Pod, CNI passes the filesystem path of the Pod's network namespace. This is the netns parameter in CNIArgs described in the previous section.

The path points to a special file called a namespace file (descriptor).

If the path is under /proc, netns points directly to the namespace file in the proc special filesystem. In this case, the coild Pod needs to share the host PID namespace.

If the path is under /run/netns, netns is a bind-mounted file. In this case, the coild Pod needs to mount the host's /run directory with mount propagation from host to container.

The Pod manifest of coild therefore looks like:

apiVersion: v1
kind: Pod
spec:
  hostNetwork: true
  hostPID: true         # for netns file under /proc
  containers:
  - name: coild
    volumeMounts:
    - name: run
      mountPath: /run
      mountPropagation: HostToContainer  # for netns file under /run/netns
  volumes:
  - name: run
    hostPath:
      path: /run
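
The two cases can be told apart from the path alone. The sketch below classifies a netns path and reports which of the two manifest settings it depends on. All names (netnsKind, classifyNetns, requirement) are hypothetical. It also assumes that /var/run is a symbolic link to /run, as on most current Linux distributions, so that a path under /var/run/netns is covered by the /run mount.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// netnsKind says where a network namespace file lives.
type netnsKind int

const (
	netnsUnknown   netnsKind = iota
	netnsProc                // e.g. /proc/1234/ns/net
	netnsBindMount           // e.g. /run/netns/cni-xxxx
)

// classifyNetns inspects only the path; it does not touch the filesystem.
func classifyNetns(path string) netnsKind {
	p := filepath.Clean(path)
	switch {
	case strings.HasPrefix(p, "/proc/"):
		return netnsProc
	case strings.HasPrefix(p, "/run/netns/"),
		strings.HasPrefix(p, "/var/run/netns/"): // assumes /var/run -> /run
		return netnsBindMount
	default:
		return netnsUnknown
	}
}

// requirement names the Pod setting that makes the file visible to coild.
func (k netnsKind) requirement() string {
	switch k {
	case netnsProc:
		return "hostPID: true"
	case netnsBindMount:
		return "/run mounted with mountPropagation: HostToContainer"
	default:
		return "unknown location"
	}
}

func main() {
	for _, p := range []string{
		"/proc/1234/ns/net",
		"/var/run/netns/cni-3bf363dd-6ad4-6a70-0bd5-a868cc11a150",
	} {
		fmt.Printf("%s needs %s\n", p, classifyNetns(p).requirement())
	}
}
```

Since the container runtime decides which form of path is passed, enabling both settings in the manifest covers either case.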

Logging

We used github.com/grpc-ecosystem/go-grpc-middleware to add various features to the gRPC server. Request logging is one of them.

The cool thing about go-grpc-middleware is that it can automatically extract data from gRPC requests and add it to every log entry. This feature is called Request Context Tags.

The following snippet from Coil extracts information such as the Pod name and namespace from the request and adds it to the logs.

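import (
    "go.uber.org/zap"
    "google.golang.org/grpc"

    grpc_middleware "github.com/grpc-ecosystem/go-grpc-middleware"
    grpc_zap "github.com/grpc-ecosystem/go-grpc-middleware/logging/zap"
    grpc_ctxtags "github.com/grpc-ecosystem/go-grpc-middleware/tags"

    "github.com/cybozu-go/coil/v2/pkg/cnirpc"
    "github.com/cybozu-go/coil/v2/pkg/constants"
)

func Start(logger *zap.Logger) {
    // The ctxtags interceptor comes first so that the tags it extracts
    // are already in the context when the zap interceptor writes the log.
    grpcServer := grpc.NewServer(grpc.UnaryInterceptor(
        grpc_middleware.ChainUnaryServer(
            grpc_ctxtags.UnaryServerInterceptor(grpc_ctxtags.WithFieldExtractor(fieldExtractor)),
            grpc_zap.UnaryServerInterceptor(logger),
        ),
    ))
    ...
}

// fieldExtractor returns the fields to attach to the log entry of a request.
// Requests that do not carry CNIArgs get no extra fields.
func fieldExtractor(fullMethod string, req interface{}) map[string]interface{} {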
    args, ok := req.(*cnirpc.CNIArgs)
    if !ok {
        return nil
    }
    ret := make(map[string]interface{})
    if name, ok := args.Args[constants.PodNameKey]; ok {
        ret["pod.name"] = name
    }
    if namespace, ok := args.Args[constants.PodNamespaceKey]; ok {
        ret["pod.namespace"] = namespace
    }
    ret["netns"] = args.Netns
    ret["ifname"] = args.Ifname
    ret["container_id"] = args.ContainerId
    return ret
}

The actual log contains the request context information as follows:

$ kubectl -n kube-system logs coild-sfp2v | tail -1 | jq .
{
  "level": "info",
  "ts": 1605246986.0687687,
  "logger": "grpc",
  "msg": "finished unary call with code OK",
  "grpc.start_time": "2020-11-13T05:56:26Z",
  "grpc.request.deadline": "2020-11-13T05:57:26Z",
  "system": "grpc",
  "span.kind": "server",
  "grpc.service": "pkg.cnirpc.CNI",
  "grpc.method": "Add",
  "grpc.request.netns": "/var/run/netns/cni-3bf363dd-6ad4-6a70-0bd5-a868cc11a150",
  "grpc.request.ifname": "eth0",
  "grpc.request.container_id": "3764257de70aaf34369dcb034d81869176c6c2938b3ee4c78f518ec96cc759d9",
  "grpc.request.pod.name": "sample-elasticsearch-es-master-nodes-1",
  "grpc.request.pod.namespace": "test",
  "peer.address": "@",
  "grpc.code": "OK",
  "grpc.time_ms": 34.23899841308594
}

Metrics

We used github.com/grpc-ecosystem/go-grpc-prometheus to export metrics for gRPC requests.

Since all CNI requests are delegated to the gRPC server, the gRPC metrics are effectively the metrics for CNI requests. Simple, eh?

Advanced caching

When a Node runs out of free IP addresses for Pods, Coil allocates a block of IP addresses to the Node. The address block is represented as a custom resource and stored in kube-apiserver.

coild watches kube-apiserver for the blocks allocated to its Node and caches them in memory. It also uses github.com/willf/bitset to manage, in memory, the assignment of IP addresses from those blocks to local Pods.

Conclusion

Delegating CNI requests to a gRPC server is entirely possible and has many advantages. We encourage all future CNI plugin developers to put this into practice.

Kintone Engineering Blog