Kubernetes: Building a Custom Log Exporter with NodeJS

By Sebastian Günther

—

24th August, 2020

—

n Kubernetes, pods are dynamic and ephemeral: Health checks can fail, the scaling parameters are adapted, or you need to make changes to a node and remove all workload from it. In all of these cases, the pods are recreated - and so are all log messages that are stored in files. The solution is to stream all log messages to stdout. But how can you conveniently access these messages?

Log file retention is a best practice: When an application dies, or when error occur, you need means to check application behavior at that specific time. When you search for Logging and Kubernetes, only what I would call enterprise solutions come up: Either the ELK Stack, a combination of Elastic Search, Logstash and Kibana, or the EFK stack, which switches Logstash with fluendt. The basic idea of these stacks is: Collect log messages from pods running applications, forward them into a database, and provide a log file analysis tool. These solutions offer a lot of features, at the price of a complex installation setup and configuration to get log data in the right format from the right pods.

What if you need a simple solution? A solution that just collects the log messages printed on stdout from specific pods, and store them in files? I had this goal, and came up with the Kube Log Exporter tool.

Kube Log Exporter is run against a list of namespaces and pod names matching a certain pattern. You run it either locally, from a machine on which your kubeconfig resides, or as a CronJob in your Kubernetes cluster.

In this article, I want to walk through the design and implementation of the KubeLogExporter. We will cover all functionality that is needed for the execution from your local machine. In the next article, we will cover the additional functionality for running KubeLogExporter in your cluster.

Architecture & Workflow

The Kube Log Exporter is based on the official Kubernetes Node.js Client. This client makes the heavy work, the log exporter adds the following convenience functions:

Get the pod name from a list of pod objects
Finding all prods in a namespace
Filter all pods of a namespace that match a certain name

With this, we get a list of pod names as the final result. Then, log data from this pod is read and stored in a file with the same name as the pod. If a file already exists, the file content and the (new) log data are merged and stored in the file.

Now let’s detail each functionality.

Part 1: Configuration

The central kubeconfig configuration file is defined by the KUBECONFIG environment variable. The log exporter simply uses this file.

const configure = () => {
  try {
    kc.loadFromDefault()
    k8sApi = kc.makeApiClient(k8s.CoreV1Api)
  } catch (e) {
    console.log(e)
  }
}

Part 2: Accessing and Filtering Pods

Three methods are used to get a list of pod names:

podNames: From the list of pod objects, return only the names
getPodsInNamespace: Get all pods of the namespaces, and return only their names
getPodsByName: Get a list of pod names that match the specified regex pattern and in the provided namespace.

const podNames = podObjectList => {
  return podObjectList.items.map(item => item.metadata.name)
}

const getPodsInNamespace = async (namespace = 'default') => {
  podObjectList = (await k8sApi.listNamespacedPod(namespace)).body
  return podNames(podObjectList)
}

const getPodsByName = async (pattern, namespace = 'default') => {
  const pods = await getPodsInNamespace(namespace)
  return pods.filter(item => item.match(pattern))
}

Part 3: Log File Merge

To read the log files, I'm again wrapping a function from the client library.

const getLogsFromPod = async (podName, namespace = 'default') => {
  return (await k8sApi.readNamespacedPodLog(podName, namespace)).body
}

To write the log files, I'm reading the content of the already exisiting file.

Log messages are stored in files that match the pod name. When the pod messages are loaded, the existing pod log file is read and the (new) messages are merged into this file.

const updateLogsForPod = async (podName, namespace = 'default') => {
  let storedLogText = ''
  try {
    storedLogText = fs.readFileSync(`logs/${podName}.log`, 'utf8')
  } catch (e) {
    // Do nothing
  }
  const liveLogText = await getLogsFromPod(podName, namespace)
  fs.writeFileSync(
    `logs/${podName}.log`,
    mergeLogTexts(storedLogText, liveLogText)
  )
  return true
}

Merging is basic: Creating a set of log file lines from the file content and the new log data, and then eliminating all duplicates.

const mergeLogTexts = (log1, log2) => {
  const unified = [...new Set(log1.split('\n').concat(log2.split('\n')))]
  return unified.join('\n')
}

Part 4: Local Export

With exportToLocalDir we store the log files on the local drive. This function receives a mapping of namespaces to pod names and calls the updateLogsForPod function.

async function exportToLocalDir (namespacePodMapping) {
  for (let [namespace, podNames] of Object.entries(namespacePodMapping)) {
    podNames.forEach(async podName => {
      const names = await kubeLogExporter.getPodsByName(podName, namespace)
      names.forEach(podName =>
        kubeLogExporter.updateLogsForPod(podName, namespace)
      )
    })
  }
}

An example to execute this function is found in the following snippet:

kubeLogExporter.exportToLocalDir({ default: [/redis/, /lighthouse/] })

Conclusion

Log file retention is important to have quickly accessible information when your application produces errors. With the KubeLogExporter, we can simply get log files from a set of configured pods and store them locally. Since the log buffer of a Kubernetes pod is limited, the only way to get the complete log is to run this job regularly. And for this, in the next article I will show how to use the KubeLogExporter as a cron job running inside the Kubernetes Cluster to regularly fetch entries.

Previous: Kubernetes: Protect Resource Consumption with Limits and Health Checks

Next: Kubernetes API: How Custom Service Accounts Work