Building a bitcoin controller for Kubernetes part IV – garbage collection and status updates

In our short series on implementing a bitcoin controller for Kubernetes, we have reached the point where the controller is actually bringing up bitcoin nodes in our network. Today, we will extend it to also cover deletions, and we will start to add logic to monitor the state of our network.

Step 9: owner references and deletions

As already mentioned in the last post, Kubernetes has a built-in garbage collector which we can utilize to handle deletions so that our controller does not have to care about this.

The Kubernetes garbage collector is essentially a mechanism that is able to perform cascading deletes. Let us take a deployment as an example. When you create a deployment, the deployment will in turn create a replica set, and the replica set will bring up pods. When you delete the deployment, the garbage collector will make sure that the replica set is deleted, and the deletion of the replica set in turn will trigger the deletion of the pods. Thus we only have to maintain the top-level object of the hierarchy and the garbage collector will help us to clean up the dependent objects.

The order in which objects are deleted is controlled by the propagation policy which can be selected when deleting an object. If “Foreground” is chosen, Kubernetes will mark the object as pending deletion by setting its deletionTimestamp and delete all objects that are owned by this object before the object itself is eventually removed. For a “Background” deletion, the order is reversed – the owning object is deleted right away, and the cleanup of its dependents is performed afterwards.
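To make this a bit more concrete, here is a minimal sketch of how a deployment could be deleted with the foreground propagation policy using client-go (kubernetes is k8s.io/client-go/kubernetes, metav1 is k8s.io/apimachinery/pkg/apis/meta/v1; the sketch assumes the older client-go signature – newer releases additionally expect a context.Context and value-typed options).

// deleteWithForeground deletes a deployment and asks the API server for
// foreground cascading deletion, so that the dependent replica set and pods
// are cleaned up before the deployment itself disappears
func deleteWithForeground(clientset kubernetes.Interface, namespace, name string) error {
	policy := metav1.DeletePropagationForeground
	return clientset.AppsV1().Deployments(namespace).Delete(
		name,
		&metav1.DeleteOptions{PropagationPolicy: &policy},
	)
}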

How does the garbage collector identify the objects that need to be deleted when we delete, say, a deployment? The ownership relation underlying this logic is captured by the owner references in the object metadata. This structure contains all the information (API version, kind, name, UID) that Kubernetes needs to identify the owner of an object. An owner reference can conveniently be generated using the function NewControllerRef in the package k8s.io/apimachinery/pkg/apis/meta/v1.
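As a sketch (the name of the generated API package and its SchemeGroupVersion are assumptions based on this series), attaching such an owner reference to the stateful set created for a bitcoin network could look roughly like this, with bitcoinv1 denoting our generated API package and metav1 the apimachinery package mentioned above.

// newStatefulSetMeta sketches how the controller sets up the object metadata
// of the stateful set it creates for a bitcoin network, including the owner
// reference pointing back to the network
func newStatefulSetMeta(network *bitcoinv1.BitcoinNetwork) metav1.ObjectMeta {
	return metav1.ObjectMeta{
		Name:      network.Name,
		Namespace: network.Namespace,
		OwnerReferences: []metav1.OwnerReference{
			// NewControllerRef fills in API version, kind, name and UID of the
			// owning bitcoin network and marks the reference as controller
			*metav1.NewControllerRef(network, bitcoinv1.SchemeGroupVersion.WithKind("BitcoinNetwork")),
		},
	}
}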

Thus, to allow Kubernetes to clean up our stateful set and the service when we delete a bitcoin network, we need to make two changes to our code.

  • We need to make sure that we add the owner reference to the metadata of the objects that we create
  • When reconciling the status of a bitcoin network with its specification, we should ignore networks for which the deletion timestamp is already set, since otherwise we would recreate the stateful set while the deletion is in progress (see the sketch below)
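The second change amounts to a simple guard at the beginning of the reconciliation routine – a sketch, again assuming our generated bitcoinv1 API package.

// reconciliationNeeded returns false for networks that are already marked
// for deletion, so that we leave the cleanup entirely to the garbage
// collector instead of recreating the stateful set
func reconciliationNeeded(network *bitcoinv1.BitcoinNetwork) bool {
	// a non-nil deletionTimestamp means that the object is pending deletion
	return network.ObjectMeta.DeletionTimestamp == nil
}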

For the sake of simplicity, we can also remove the delete handler from our code completely, as it will not trigger any action anyway. When you now repeat the tests at the end of the last post and delete the bitcoin network, you will see that the stateful set, the service and the pods are deleted as well.

At this point, let us also implement an additional improvement. When a service or a stateful set changes, we have so far been relying on the periodic resynchronisation of the cache. To avoid long synchronization times, we can also add additional handlers to our code to detect changes to our stateful set and our headless service. To distinguish changes that affect our bitcoin networks from other changes, we can again use the owner reference mechanism, i.e. we can retrieve the owner reference from the stateful set to figure out to which – if any – bitcoin network the stateful set belongs. Following the design of the sample controller, we can put this functionality into a generic method handleObject that works for all objects.
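A sketch of this method is shown below; the lister and the enqueue helper on the controller struct are assumptions (the pattern itself is taken from the sample controller), while GetControllerOf is again provided by k8s.io/apimachinery/pkg/apis/meta/v1.

// handleObject is registered as event handler for stateful sets and services.
// It extracts the controlling owner reference and, if the object is owned by
// one of our bitcoin networks, enqueues that network for reconciliation
func (c *Controller) handleObject(obj interface{}) {
	object, ok := obj.(metav1.Object)
	if !ok {
		// ignore objects (or deletion tombstones) that we cannot interpret
		return
	}
	ownerRef := metav1.GetControllerOf(object)
	if ownerRef == nil || ownerRef.Kind != "BitcoinNetwork" {
		// not owned by a bitcoin network - nothing to do
		return
	}
	network, err := c.bitcoinNetworkLister.BitcoinNetworks(object.GetNamespace()).Get(ownerRef.Name)
	if err != nil {
		return
	}
	c.enqueueBitcoinNetwork(network)
}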

(Diagram: BitcoinControllerStructureII – updated structure of the bitcoin controller)

Strictly speaking, we do not really react upon changes of the headless service yet: the reconciliation routine only checks that it exists, not its properties, so changes to the headless service would currently go undetected. However, we add the event handler infrastructure for the sake of completeness.

Step 10: updating the status

Let us now try to add some status information to our bitcoin network which is updated regularly by the controller. As some of the status information that we are aiming at is not visible to Kubernetes (like the synchronization state of the blockchain), we will not add additional watches to capture for instance the pod status, but once more rely on the periodic updates that we do anyway.

The first step is to extend the API type that represents a bitcoin network to add some more status information. So let us add a list of nodes to our status. Each individual node is described by the following structure

type BitcoinNetworkNode struct {
	// a number from 0...n-1 in a deployment with n nodes, corresponding to
	// the ordinal in the stateful set
	Ordinal int32 `json:"ordinal"`
	// is this node ready, i.e. is the bitcoind RPC server ready to accept requests?
	Ready bool `json:"ready"`
	// the IP of the node, i.e. the IP of the pod running the node
	IP string `json:"ip"`
	// the name of the node
	NodeName string `json:"nodeName"`
	// the DNS name
	DNSName string `json:"dnsName"`
}

Correspondingly, we also need to update our definition of a BitcoinNetworkStatus – do not forget to re-run the code generation once this has been done.

type BitcoinNetworkStatus struct {
	Nodes []BitcoinNetworkNode `json:"nodes"`
}

The next question we have to clarify is how we determine the readiness of a bitcoin node. We want a node to appear as ready if the bitcoind representing the node is accepting JSON RPC requests. To achieve this, there is again a Kubernetes mechanism that we can utilize – readiness probes. In general, readiness probes can be defined by executing an arbitrary command or by running an HTTP request. As we are probing a server, using HTTP requests seems to be the way to go, but there is a little challenge: the bitcoind RPC server expects HTTP POST requests, and Kubernetes only allows us to configure a GET request as an HTTP readiness probe. Instead, we use the exec option of a readiness check and run the bitcoin CLI inside the container to determine whether the node is ready. Specifically, we execute the command

/usr/local/bin/bitcoin-cli -regtest -conf=/bitcoin.conf getnetworkinfo

Here we use the configuration file that contains, among other things, the user credentials that the CLI will use. As a YAML structure, this readiness probe would be set up as follows (but of course we do this in Go in our controller programmatically).

   readinessProbe:
      exec:
        command:
        - /usr/local/bin/bitcoin-cli
        - -regtest
        - -conf=/bitcoin.conf
        - getnetworkinfo
      failureThreshold: 3
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1

Note that we wait 10 seconds before doing the first probe, to give the bitcoind sufficient time to come up. It is instructive to test higher values of this, for instance 60 seconds – when the stateful set is created, you will see how the creation of the second pod is delayed for 60 seconds until the first readiness check for the first pod succeeds.
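In the controller itself, this probe is assembled programmatically as part of the container specification. Here is a sketch using k8s.io/api/core/v1 (note that in more recent versions of this package, the embedded Handler field of the Probe type has been renamed to ProbeHandler).

// readinessProbe builds the readiness probe for the bitcoind container,
// mirroring the YAML shown above
func readinessProbe() *corev1.Probe {
	return &corev1.Probe{
		Handler: corev1.Handler{
			Exec: &corev1.ExecAction{
				Command: []string{
					"/usr/local/bin/bitcoin-cli",
					"-regtest",
					"-conf=/bitcoin.conf",
					"getnetworkinfo",
				},
			},
		},
		InitialDelaySeconds: 10,
		PeriodSeconds:       10,
		TimeoutSeconds:      1,
		SuccessThreshold:    1,
		FailureThreshold:    3,
	}
}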

Now let us see how we can populate the status structure. Basically, we need to retrieve a list of all pods that belong to our bitcoin network. We could of course again use the owner references to find those pods, or use the labels that we need anyway to define our stateful set. But in our case, there is an even easier approach – as we use a stateful set, the names of the pods are completely predictable and we can easily retrieve them all by name. So to update the status information, we need to

  • Loop through the pods controlled by this stateful set, and for each pod
  • Find the status of the pod, using its conditions
  • Retrieve the pod's IP address from its status
  • Assemble the DNS name (which, as we know, is the combination of the name of the pod and the name of the headless service)
  • Put all this into the above structure
  • Post this using the UpdateStatus method of our client object, as outlined in the sketch below
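Put together, the status update could look roughly like the following sketch. The names of the generated clientset methods, the listers, the spec field holding the node count and the suffixes used for the stateful set and the headless service are assumptions (they depend on the generated code and the naming conventions of the previous posts), and newer generated clients additionally expect a context and update options.

// updateNetworkStatus assembles the list of nodes from the pods of the
// stateful set and posts it via the status subresource
func (c *Controller) updateNetworkStatus(network *bitcoinv1.BitcoinNetwork) error {
	nodes := []bitcoinv1.BitcoinNetworkNode{}
	for i := int32(0); i < network.Spec.Nodes; i++ {
		// pod names of a stateful set are predictable: <statefulset name>-<ordinal>
		podName := fmt.Sprintf("%s-sts-%d", network.Name, i)
		pod, err := c.podLister.Pods(network.Namespace).Get(podName)
		if err != nil {
			// pod not yet created - simply skip it in this round
			continue
		}
		// derive readiness from the pod conditions
		ready := false
		for _, cond := range pod.Status.Conditions {
			if cond.Type == corev1.PodReady && cond.Status == corev1.ConditionTrue {
				ready = true
			}
		}
		nodes = append(nodes, bitcoinv1.BitcoinNetworkNode{
			Ordinal:  i,
			Ready:    ready,
			IP:       pod.Status.PodIP,
			NodeName: podName,
			// the DNS name is the pod name followed by the name of the headless service
			DNSName: podName + "." + network.Name + "-svc",
		})
	}
	// never modify the cached object - work on a deep copy and only touch the status
	networkCopy := network.DeepCopy()
	networkCopy.Status.Nodes = nodes
	_, err := c.clientset.BitcoincontrollerV1().BitcoinNetworks(network.Namespace).UpdateStatus(networkCopy)
	return err
}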

Note that at this point, we follow the recommended best practice and update the status independently of the spec. It is instructive to extend the logging of the controller to log the generation of the bitcoin network and its resourceVersion. The generation (contained in the ObjectMeta structure) represents a version of the desired state, i.e. the spec, and is only updated (usually incremented by one) if we change the spec of the bitcoin network resource. In contrast to this, the resource version is updated for every change of the persisted state of the object and essentially represents etcd's internal sequence number.
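If you want to add this logging, a one-liner along the following lines is sufficient (using klog here as an example, as the sample controller does).

// log desired-state version (generation) and overall object version (resourceVersion)
klog.Infof("Network %s: generation %d, resourceVersion %s",
	network.Name, network.ObjectMeta.Generation, network.ObjectMeta.ResourceVersion)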

When you try to run our updated controller inside the cluster, however, you will find that there is again a problem – with the cluster role that we have created and to which our service account is linked, updating the status information of a bitcoin network is not allowed. Thus we have to explicitly allow this by granting the update right on the subresource status, which is done by adding the following rule to our cluster role.

- apiGroups: ["bitcoincontroller.christianb93.github.com"]
  resources: ["bitcoinnetworks/status"]
  verbs: ["update"] 

We can now run a few more tests to see that our status updates work. When we bring up a new bitcoin network and use kubectl with “-o json” to retrieve the status of the bitcoin network, we can see that the node list populates as the pods are brought up and the status of the nodes changes to “Ready” one by one.

I have again created a tag v0.4 in the GitHub repository for this series to persist the state of the code at this point in time, so that you have the chance to clone the code and play with it. In the next post, we will move on and add the code needed to make sure that our bitcoin nodes detect each other at startup and build a real bitcoin network.
