Kubernetes: Building a Custom Log Exporter with NodeJS
n Kubernetes, pods are dynamic and ephemeral: Health checks can fail, the scaling parameters are adapted, or you need to make changes to a node and remove all workload from it. In all of these cases, the pods are recreated - and so are all log messages that are stored in files. The solution is to stream all log messages to stdout. But how can you conveniently access these messages?
Log file retention is a best practice: When an application dies, or when error occur, you need means to check application behavior at that specific time. When you search for Logging and Kubernetes, only what I would call enterprise solutions come up: Either the ELK Stack, a combination of Elastic Search, Logstash and Kibana, or the EFK stack, which switches Logstash with fluendt. The basic idea of these stacks is: Collect log messages from pods running applications, forward them into a database, and provide a log file analysis tool. These solutions offer a lot of features, at the price of a complex installation setup and configuration to get log data in the right format from the right pods.
What if you need a simple solution? A solution that just collects the log messages printed on stdout from specific pods, and store them in files? I had this goal, and came up with the Kube Log Exporter tool.
Kube Log Exporter is run against a list of namespaces and pod names matching a certain pattern. You run it either locally, from a machine on which your kubeconfig resides, or as a CronJob in your Kubernetes cluster.
In this article, I want to walk through the design and implementation of the KubeLogExporter. We will cover all functionality that is needed for the execution from your local machine. In the next article, we will cover the additional functionality for running KubeLogExporter in your cluster.
Architecture & Workflow
The Kube Log Exporter is based on the official Kubernetes Node.js Client. This client makes the heavy work, the log exporter adds the following convenience functions:
- Get the pod name from a list of pod objects
- Finding all prods in a namespace
- Filter all pods of a namespace that match a certain name
With this, we get a list of pod names as the final result. Then, log data from this pod is read and stored in a file with the same name as the pod. If a file already exists, the file content and the (new) log data are merged and stored in the file.
Now let’s detail each functionality.
Part 1: Configuration
The central kubeconfig configuration file is defined by the KUBECONFIG
environment variable. The log exporter simply uses this file.
const configure = () => {
try {
kc.loadFromDefault()
k8sApi = kc.makeApiClient(k8s.CoreV1Api)
} catch (e) {
console.log(e)
}
}
Part 2: Accessing and Filtering Pods
Three methods are used to get a list of pod names:
podNames
: From the list of pod objects, return only the namesgetPodsInNamespace
: Get all pods of the namespaces, and return only their namesgetPodsByName
: Get a list of pod names that match the specified regexpattern
and in the providednamespace
.
const podNames = podObjectList => {
return podObjectList.items.map(item => item.metadata.name)
}
const getPodsInNamespace = async (namespace = 'default') => {
podObjectList = (await k8sApi.listNamespacedPod(namespace)).body
return podNames(podObjectList)
}
const getPodsByName = async (pattern, namespace = 'default') => {
const pods = await getPodsInNamespace(namespace)
return pods.filter(item => item.match(pattern))
}
Part 3: Log File Merge
To read the log files, I'm again wrapping a function from the client library.
const getLogsFromPod = async (podName, namespace = 'default') => {
return (await k8sApi.readNamespacedPodLog(podName, namespace)).body
}
To write the log files, I'm reading the content of the already exisiting file.
Log messages are stored in files that match the pod name. When the pod messages are loaded, the existing pod log file is read and the (new) messages are merged into this file.
const updateLogsForPod = async (podName, namespace = 'default') => {
let storedLogText = ''
try {
storedLogText = fs.readFileSync(`logs/${podName}.log`, 'utf8')
} catch (e) {
// Do nothing
}
const liveLogText = await getLogsFromPod(podName, namespace)
fs.writeFileSync(
`logs/${podName}.log`,
mergeLogTexts(storedLogText, liveLogText)
)
return true
}
Merging is basic: Creating a set of log file lines from the file content and the new log data, and then eliminating all duplicates.
const mergeLogTexts = (log1, log2) => {
const unified = [...new Set(log1.split('\n').concat(log2.split('\n')))]
return unified.join('\n')
}
Part 4: Local Export
With exportToLocalDir
we store the log files on the local drive. This function receives a mapping of namespaces to pod names and calls the updateLogsForPod
function.
async function exportToLocalDir (namespacePodMapping) {
for (let [namespace, podNames] of Object.entries(namespacePodMapping)) {
podNames.forEach(async podName => {
const names = await kubeLogExporter.getPodsByName(podName, namespace)
names.forEach(podName =>
kubeLogExporter.updateLogsForPod(podName, namespace)
)
})
}
}
An example to execute this function is found in the following snippet:
kubeLogExporter.exportToLocalDir({ default: [/redis/, /lighthouse/] })
Conclusion
Log file retention is important to have quickly accessible information when your application produces errors. With the KubeLogExporter, we can simply get log files from a set of configured pods and store them locally. Since the log buffer of a Kubernetes pod is limited, the only way to get the complete log is to run this job regularly. And for this, in the next article I will show how to use the KubeLogExporter as a cron job running inside the Kubernetes Cluster to regularly fetch entries.