By Hirotaka Yamamoto (@ymmt2005)
Coil v2 is a Kubernetes network plugin that implements the Container Network Interface (CNI) specification 0.4.0.
CNI defines plugins as executables. In Kubernetes, the kubelet or a container runtime such as containerd executes the CNI plugin binary directly. Because a CNI plugin exits as soon as it has handled a request, it is difficult to collect logs and metrics from it continuously.
To work around this problem, Coil delegates CNI requests to a long-running gRPC server. This article describes the benefits of using a gRPC server to handle CNI requests and how to implement one in Go.
Table of contents
- Benefits of using gRPC server to handle CNI requests
- Protocol schema
- Handling namespace files
- Logging
- Metrics
- Advanced caching
- Conclusion
Benefits of using gRPC server to handle CNI requests
gRPC is a network protocol for calling remote procedures, and it usually uses Protocol Buffers as its schema language. The gRPC and Protocol Buffers implementations for Go are particularly good. This matters because the CNI project provides libraries for implementing plugins only for Go.
The Go gRPC implementation can communicate over UNIX domain sockets, so the overhead of introducing gRPC is minimal. Unlike TCP, a UNIX domain socket requires no 3-way handshake and is inherently reliable.
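To illustrate, here is a minimal sketch of a gRPC server listening on a UNIX domain socket in Go. The socket path and the bare server setup are assumptions for this example, not Coil's actual code.

```go
package main

import (
	"net"
	"os"

	"google.golang.org/grpc"
)

func main() {
	// The socket path is an assumption for this sketch; Coil's actual path may differ.
	const sockPath = "/run/example-coild.sock"
	os.Remove(sockPath) // remove a stale socket left over from a previous run

	// Listening on a UNIX domain socket: no TCP handshake, and the server is
	// reachable only from processes on the same node.
	lis, err := net.Listen("unix", sockPath)
	if err != nil {
		panic(err)
	}

	grpcServer := grpc.NewServer()
	// The generated service would be registered here, e.g.
	// cnirpc.RegisterCNIServer(grpcServer, newServer())

	if err := grpcServer.Serve(lis); err != nil {
		panic(err)
	}
}
```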
The CNI plugin executable of Coil is named coil. It delegates CNI requests to a gRPC server named coild running on the same Node, over a UNIX domain socket.
Since coild is a long-running process, it can be run as a DaemonSet Pod. As such, it is easy to collect logs and export metrics related to CNI requests from the Pod. Another advantage of having a long-running process is that it reduces the latency of processing CNI requests by caching the required data.
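To make the delegation concrete, here is a rough sketch of what the coil side could look like: it receives the CNI request through the skel package and forwards it to coild over the UNIX domain socket. The socket path, the dialer setup, and the assumption that the generated client is CNIClient with an Add method taking CNIArgs (as the schema in the next section suggests) are for illustration only, not necessarily how Coil implements it.

```go
package main

import (
	"context"
	"net"
	"time"

	"github.com/containernetworking/cni/pkg/skel"
	"github.com/containernetworking/cni/pkg/version"
	"google.golang.org/grpc"

	"github.com/cybozu-go/coil/v2/pkg/cnirpc"
)

// The socket path is an assumption for this sketch.
const sockPath = "/run/example-coild.sock"

// dial connects to the local coild server over the UNIX domain socket.
func dial(ctx context.Context) (*grpc.ClientConn, error) {
	return grpc.DialContext(ctx, sockPath, grpc.WithInsecure(),
		grpc.WithContextDialer(func(ctx context.Context, addr string) (net.Conn, error) {
			return (&net.Dialer{}).DialContext(ctx, "unix", addr)
		}))
}

// cmdAdd forwards the ADD request to coild. Converting the Args string into the
// map expected by CNIArgs is omitted here (a possible conversion is sketched in
// the Protocol schema section below), and a real plugin must also print the CNI
// result returned by coild to stdout.
func cmdAdd(args *skel.CmdArgs) error {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	conn, err := dial(ctx)
	if err != nil {
		return err
	}
	defer conn.Close()

	client := cnirpc.NewCNIClient(conn)
	_, err = client.Add(ctx, &cnirpc.CNIArgs{
		ContainerId: args.ContainerID,
		Netns:       args.Netns,
		Ifname:      args.IfName,
		Path:        args.Path,
		StdinData:   args.StdinData,
	})
	return err
}

func cmdCheck(*skel.CmdArgs) error { return nil }
func cmdDel(*skel.CmdArgs) error   { return nil }

func main() {
	skel.PluginMain(cmdAdd, cmdCheck, cmdDel, version.All, "example delegating plugin")
}
```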
Protocol schema
The CNI project provides a Go package for handling requests: github.com/containernetworking/cni/pkg/skel. We translated this API into a Protocol Buffers schema.
For example, CmdArgs, which represents the input parameters for all types of CNI requests:
```go
type CmdArgs struct {
	ContainerID string
	Netns       string
	IfName      string
	Args        string
	Path        string
	StdinData   []byte
}
```
... is translated into the following protocol buffer message:
```protobuf
message CNIArgs {
  string container_id = 1;
  string netns = 2;
  string ifname = 3;
  map<string,string> args = 4;  // Key-Value pairs parsed from Args
  string path = 5;
  bytes stdin_data = 6;
}
```
The complete schema is available as part of the Coil source code: cnirpc.proto
There is also a pre-compiled Go package of the schema: github.com/cybozu-go/coil/v2/pkg/cnirpc
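Note that the Args field of skel.CmdArgs is a single string of semicolon-separated KEY=VALUE pairs (the CNI_ARGS format), while CNIArgs carries a map<string,string>. A minimal sketch of that conversion could look like the following; the helper name is hypothetical and the parsing is simplified compared to whatever Coil actually does.

```go
import "strings"

// parseArgs splits a CNI_ARGS style string such as
// "IgnoreUnknown=1;K8S_POD_NAMESPACE=test;K8S_POD_NAME=sample-0"
// into the key-value pairs carried by the args map of CNIArgs.
func parseArgs(args string) map[string]string {
	ret := make(map[string]string)
	for _, kv := range strings.Split(args, ";") {
		if kv == "" {
			continue
		}
		parts := strings.SplitN(kv, "=", 2)
		if len(parts) != 2 {
			continue // ignore malformed entries in this sketch
		}
		ret[parts[0]] = parts[1]
	}
	return ret
}
```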
Handling namespace files
In order to set up networking for a Pod, CNI passes the filesystem path to the network namespace of the Pod. This is the netns parameter of CNIArgs described in the previous section.
The path points to a special file called a namespace file (descriptor). If the path is under /proc, netns points directly to the namespace file in the proc special filesystem. In this case, the coild Pod needs to share the host PID namespace. If the path is under /run/netns, netns is a bind-mounted file. In this case, the coild Pod needs to mount the host's /run directory with mount propagation from the host to the container.
The Pod manifest of coild therefore looks like this:
```yaml
apiVersion: v1
kind: Pod
spec:
  hostNetwork: true
  hostPID: true  # for netns file under /proc
  containers:
  - name: coild
    volumeMounts:
    - name: run
      mountPath: /run
      mountPropagation: HostToContainer  # for netns file under /run/netns
  volumes:
  - name: run
    hostPath:
      path: /run
```
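Once coild can see the namespace file, it can enter the Pod's network namespace to configure interfaces. A common way to do this in Go is the ns package from github.com/containernetworking/plugins; the snippet below is a sketch of that pattern (the function name and the netlink usage are illustrative), not necessarily how Coil itself is implemented.

```go
import (
	"github.com/containernetworking/plugins/pkg/ns"
	"github.com/vishvananda/netlink"
)

// configurePod opens the namespace file passed as netns and runs interface
// configuration inside the Pod's network namespace.
func configurePod(netnsPath, ifname string) error {
	netNS, err := ns.GetNS(netnsPath)
	if err != nil {
		return err
	}
	defer netNS.Close()

	// Do executes the closure with the calling thread switched into the namespace.
	return netNS.Do(func(ns.NetNS) error {
		link, err := netlink.LinkByName(ifname)
		if err != nil {
			return err
		}
		return netlink.LinkSetUp(link)
	})
}
```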
Logging
We used github.com/grpc-ecosystem/go-grpc-middleware to add various features to the gRPC server. Request logging is one of them.
The cool thing about go-grpc-middleware is that it can automatically extract data from gRPC requests and add it to every log entry. This feature is called Request Context Tags. The following snippet from Coil extracts information such as the Pod name and namespace from the request and adds it to the logs:
```go
func Start(logger *zap.Logger) {
	grpcServer := grpc.NewServer(grpc.UnaryInterceptor(
		grpc_middleware.ChainUnaryServer(
			grpc_ctxtags.UnaryServerInterceptor(grpc_ctxtags.WithFieldExtractor(fieldExtractor)),
			grpc_zap.UnaryServerInterceptor(logger),
		),
	))
	...
}

func fieldExtractor(fullMethod string, req interface{}) map[string]interface{} {
	args, ok := req.(*cnirpc.CNIArgs)
	if !ok {
		return nil
	}

	ret := make(map[string]interface{})
	if name, ok := args.Args[constants.PodNameKey]; ok {
		ret["pod.name"] = name
	}
	if namespace, ok := args.Args[constants.PodNamespaceKey]; ok {
		ret["pod.namespace"] = namespace
	}
	ret["netns"] = args.Netns
	ret["ifname"] = args.Ifname
	ret["container_id"] = args.ContainerId
	return ret
}
```
The actual log contains the request context information as follows:
```console
$ kubectl -n kube-system logs coild-sfp2v | tail -1 | jq .
{
  "level": "info",
  "ts": 1605246986.0687687,
  "logger": "grpc",
  "msg": "finished unary call with code OK",
  "grpc.start_time": "2020-11-13T05:56:26Z",
  "grpc.request.deadline": "2020-11-13T05:57:26Z",
  "system": "grpc",
  "span.kind": "server",
  "grpc.service": "pkg.cnirpc.CNI",
  "grpc.method": "Add",
  "grpc.request.netns": "/var/run/netns/cni-3bf363dd-6ad4-6a70-0bd5-a868cc11a150",
  "grpc.request.ifname": "eth0",
  "grpc.request.container_id": "3764257de70aaf34369dcb034d81869176c6c2938b3ee4c78f518ec96cc759d9",
  "grpc.request.pod.name": "sample-elasticsearch-es-master-nodes-1",
  "grpc.request.pod.namespace": "test",
  "peer.address": "@",
  "grpc.code": "OK",
  "grpc.time_ms": 34.23899841308594
}
```
Metrics
We used github.com/grpc-ecosystem/go-grpc-prometheus to export metrics for gRPC requests. Since all the CNI requests are delegated to the gRPC server, these are effectively metrics for CNI requests. Simple, eh?
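For reference, wiring go-grpc-prometheus into a server and exposing the metrics typically looks like the sketch below; the listen address and the function name are assumptions, and in practice the interceptor would be chained with the logging interceptors from the previous section.

```go
import (
	"net/http"

	grpc_prometheus "github.com/grpc-ecosystem/go-grpc-prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"google.golang.org/grpc"
)

func newServerWithMetrics() *grpc.Server {
	// Count and time every unary RPC handled by the server.
	grpcServer := grpc.NewServer(
		grpc.UnaryInterceptor(grpc_prometheus.UnaryServerInterceptor),
	)

	// grpc_prometheus.Register initializes per-method metrics; call it after the
	// CNI service has been registered on grpcServer.
	grpc_prometheus.Register(grpcServer)

	// Expose the metrics over HTTP; the listen address is an assumption for this sketch.
	go http.ListenAndServe(":9384", promhttp.Handler())

	return grpcServer
}
```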
Advanced caching
When a Node runs out of free IP addresses for Pods, Coil allocates a block of IP addresses to the Node. The address block is represented as a custom resource and stored in kube-apiserver.
coild watches kube-apiserver to check the blocks allocated for the Node and caches them in memory. It also uses github.com/willf/bitset to manage local IP address assignments for Pods in memory.
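As an illustration of bitset-based assignment, the sketch below keeps one bit per address in a block and hands out the first free one; the types, names, and block handling are assumptions for the example, not Coil's actual data structures.

```go
import (
	"errors"
	"net"

	"github.com/willf/bitset"
)

// blockAllocator tracks which addresses in one address block are in use.
type blockAllocator struct {
	base net.IP         // first address of the block
	used *bitset.BitSet // one bit per address; set means assigned
}

func newBlockAllocator(base net.IP, size uint) *blockAllocator {
	return &blockAllocator{base: base, used: bitset.New(size)}
}

// allocate returns the first free address in the block, or an error if the block is full.
func (a *blockAllocator) allocate() (net.IP, error) {
	idx, ok := a.used.NextClear(0)
	if !ok || idx >= a.used.Len() {
		return nil, errors.New("block is full")
	}
	a.used.Set(idx)

	ip := make(net.IP, len(a.base))
	copy(ip, a.base)
	ip[len(ip)-1] += byte(idx) // fine for a small example block; real code must handle carries
	return ip, nil
}

// release marks an address index as free again.
func (a *blockAllocator) release(idx uint) {
	a.used.Clear(idx)
}
```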
Conclusion
Delegating CNI requests to a gRPC server is entirely possible and has many advantages. We encourage all future CNI plugin developers to put this into practice.