Demystifying Kubernetes: From ‘Probably Overkill’ to ‘Can’t Live Without It’
Kubernetes sounded daunting, oversold, and probably overkill for our needs. Here’s how it became the foundation of everything we build for our insurance clients.

Forgive me if you’re already familiar with the joys of Kubernetes and using it in anger. If so, you’re well ahead of where we were a couple of years ago. Back then I was fully aware of the Kubernetes proposition, but to me it sounded:
- Likely overkill for our needs
- Probably a bit oversold by the geeks
- Being honest, a bit daunting
I was wrong on all three counts.
What is Kubernetes?
For the uninitiated, Kubernetes (often shortened to K8s) is an open-source platform for managing containerised applications. Originally developed by Google (based on their internal system called Borg, which runs essentially everything Google does), it was released to the open-source community in 2014 and has since become the de facto standard for running workloads in the cloud.
In simple terms, Kubernetes is an orchestrator. You tell it what you want to run, how many instances you need, what resources they require, and it figures out the rest. It decides where to place your workloads, keeps them running, restarts them if they fail, scales them up or down based on demand, and rolls out updates without downtime. Think of it as a very capable operations team that never sleeps.
If that still sounds abstract, here’s the concrete version: rather than manually provisioning virtual machines, writing bespoke deployment scripts, and hoping it all holds together, you describe your desired state in configuration files and Kubernetes makes it happen. Consistently. Every time.
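To make that concrete, here’s a minimal sketch of what one of those configuration files looks like (the image and replica count are illustrative placeholders):

```yaml
# A minimal Deployment manifest: "keep three copies of this container running".
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web
spec:
  replicas: 3                # the desired state: three running copies
  selector:
    matchLabels:
      app: hello-web
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
        - name: web
          image: nginx:1.27  # placeholder; any container image works here
          ports:
            - containerPort: 80
```

Apply it with kubectl apply -f and Kubernetes works continuously to make reality match the file: kill a copy, and another appears in its place.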
The Problem We Were Trying to Solve
When we started building out an open-source reporting and analytics stack for our insurance clients, we found ourselves needing three key things:
- A simple and repeatable way of deploying open-source tooling into cloud environments that didn’t involve reinventing the wheel.
- A rock-solid and scalable platform to run this tooling and our own workloads.
- A solution that was portable, agnostic of hosting provider, cost-effective, and easy to monitor and maintain.
Up until this point, we’d used a whole variety of methods and mechanisms to run tooling and workloads, from stand-alone VMs to Function-as-a-Service offerings like Azure Functions and AWS Lambda. As a consequence, we had a really complicated Infrastructure-as-Code configuration in Terraform, even more complicated CI/CD pipelines to deploy to all of this infrastructure, and stability and monitoring were ongoing headaches.
Furthermore, we were generally reinventing the wheel when it came to deploying open-source tooling. Of course, there were guides and documents to help, but our infrastructure and environments were unique and therefore our CI/CD pipelines had to be as well. Every new tool meant another bespoke deployment process, another set of scripts to maintain, another thing to worry about.
The prospect of deploying our complex new data stack the “old way” frankly filled us with dread. It felt like the whole thing would become a house of cards that we’d be constantly trying to prop up.
Why Kubernetes Was the Only Answer
We spent weeks reviewing our options. We looked at everything from managed PaaS offerings to doubling down on our existing Terraform approach. In the end, it was clear that the only platform that would meet all three of our requirements was Kubernetes. It was the only platform with the functionality, stability, and scalability to let us safely put all of our eggs in one basket.
Now, we’d never consider running our own Kubernetes cluster from scratch; that really would be overkill. This is where managed Kubernetes services come into play. The major cloud providers all offer managed K8s: Google has GKE, Azure has AKS, and AWS has EKS. These platforms handle the complex bits (the control plane, upgrades, networking) and let you focus on running your workloads.
We actually started with Minikube, a lightweight local Kubernetes environment you can run on your laptop, and it soon became clear that the platform wasn’t half as scary as we’d feared. For sure there’s a steep learning curve in terms of terminology and tooling, but if you jump into it in earnest for a week or two, you’ll soon feel comfortable and at home.
The Advantages
Open-Source Tooling Just Works
This was a game-changer for us. Any open-source tooling worth considering will play nicely with Kubernetes. The Helm charts and manifests needed to deploy the tooling are essentially off the shelf (see the Jargon Buster below if those terms are new to you).
Need to deploy Trino for distributed SQL queries? There’s a Helm chart for that. Prometheus for monitoring? Helm chart. ArgoCD for deployments? Helm chart. The pattern is consistent and repeatable.
Let’s not kid ourselves: you’ll still need to fine-tune configurations for your specific needs. But the difference between fine-tuning a well-maintained Helm chart and writing a bespoke deployment from scratch is enormous. What used to take days now takes hours.
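As a flavour of what that fine-tuning looks like in practice, here’s a minimal sketch of a Helm values file, assuming the community Trino chart; the keys and numbers shown are illustrative, so check the chart’s documented values:

```yaml
# values.yaml: overrides for the community Trino chart (illustrative).
# Typical install commands, shown as comments to keep this file self-contained:
#   helm repo add trino https://trinodb.github.io/charts
#   helm install analytics trino/trino -f values.yaml
server:
  workers: 3              # run three Trino worker pods
coordinator:
  jvm:
    maxHeapSize: "8G"     # example tuning knob for the coordinator's JVM
```

Everything else (the services, config maps, and pods) comes from the chart itself.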
Deploying Our Own Workloads Is a Doddle
Beyond open-source tooling, we run plenty of our own services, primarily Go and Python applications that handle data ingestion, transformation, and integration for our insurance clients. Deploying these on Kubernetes is straightforward.
You define your application in a container image, write a small amount of configuration describing how you want it to run, and Kubernetes takes care of the rest. Scaling, restarts, health checks, resource management. It’s all handled for you. The consistency is beautiful: every workload, whether it’s an off-the-shelf tool or our own code, is deployed and managed in exactly the same way.
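Here’s a sketch of what that small amount of configuration looks like for a hypothetical Python ingestion service (the name, image, and health endpoint are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ingest-api                # hypothetical ingestion service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ingest-api
  template:
    metadata:
      labels:
        app: ingest-api
    spec:
      containers:
        - name: ingest-api
          image: registry.example.com/ingest-api:1.4.2  # placeholder image
          ports:
            - containerPort: 8080
          readinessProbe:         # only route traffic once the app responds
            httpGet:
              path: /healthz
              port: 8080
          livenessProbe:          # restart the container if it stops responding
            httpGet:
              path: /healthz
              port: 8080
```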
Monitoring and Stability
The tooling ecosystem around Kubernetes is rich, modern, and highly functional. Tools like Prometheus and Grafana give you deep visibility into everything running on your cluster. You can set up alerts, dashboards, and automated responses that keep your platform healthy without constant manual intervention.
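As one small illustration, here’s a sketch of an alert rule; it assumes the Prometheus Operator (for example via the kube-prometheus-stack chart) is running, and the names and threshold are illustrative:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: workload-alerts
  labels:
    release: kube-prometheus-stack   # assumed label so the operator picks the rule up
spec:
  groups:
    - name: pod-health
      rules:
        - alert: PodCrashLooping
          expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
          for: 10m                   # only fire if restarts persist for 10 minutes
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.pod }} is restarting repeatedly"
```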
And the stability is exceptional; not surprising, given that Kubernetes now underpins a vast share of the world’s production workloads. Thank you, Google, for Borg.
It self-heals: if a workload crashes, Kubernetes restarts it. If a node fails, Kubernetes reschedules your workloads elsewhere. Rolling updates mean you can deploy new versions of your applications with zero downtime. For our insurance clients, where uptime matters enormously (think regulatory reporting deadlines, real-time claims processing, always-on analytics platforms), this reliability is critical.
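Zero-downtime rollouts, for instance, come down to a few lines of configuration. A sketch (this snippet slots into the spec of a Deployment like the ones above):

```yaml
# Never remove a replica until its replacement is up and passing its readiness probe.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1           # start one extra pod during the rollout
    maxUnavailable: 0     # never drop below the desired replica count
```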
It Powers Our Entire Data Stack
Everything we build for our clients runs on Kubernetes. Our Trino clusters for distributed SQL analytics, our dbt transformations, our API services, our monitoring. All of it, under one roof.
This is the real superpower: consistency. One platform, one deployment methodology, one monitoring approach. When something goes wrong (and things always go wrong eventually), there’s one place to look, one set of tools to use, one body of knowledge to draw on.
Cost-Effective
Kubernetes gives you fine-grained control over resource allocation. You specify exactly how much CPU and memory each workload needs, and Kubernetes packs your workloads efficiently across your cluster.
Better still, we run our Kubernetes clusters on a mix of on-demand and spot (preemptible) virtual machines. Spot VMs are spare cloud capacity available at a fraction of the normal price, typically 60-90% cheaper. Because Kubernetes handles node failures gracefully (it just reschedules workloads if a spot VM gets reclaimed), we can run the bulk of our compute on spot instances. The result? Enterprise-grade infrastructure at a fraction of the cost.
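Here’s a sketch of both halves of that: resource requests that let the scheduler bin-pack efficiently, and a toleration that allows the pod onto spot nodes. The taint shown is GKE’s spot taint, used illustratively; other providers use different labels and taints:

```yaml
# These fields sit inside a pod template's spec.
containers:
  - name: worker
    image: registry.example.com/worker:1.0   # placeholder image
    resources:
      requests:            # what the scheduler reserves for this pod
        cpu: "500m"
        memory: 512Mi
      limits:              # a hard ceiling the pod may not exceed
        cpu: "1"
        memory: 1Gi
tolerations:               # permit scheduling onto tainted spot nodes
  - key: cloud.google.com/gke-spot
    operator: Equal
    value: "true"
    effect: NoSchedule
```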
For insurance firms watching their technology spend, this is a significant advantage.
ArgoCD and Modern Deployment
One tool we have to mention is ArgoCD, a GitOps continuous delivery tool for Kubernetes. If you’re familiar with Terraform, think of ArgoCD as Terraform for your workloads. It’s the same declarative philosophy: you define what you want your workloads to look like, ArgoCD makes it happen, and then it keeps it that way. It continuously watches your Git repository and ensures that what’s running on your cluster matches what’s defined in your code. If something drifts, it corrects it. Want to change a configuration? Update it in Git and ArgoCD deploys it. Want to roll back? Revert the Git commit. It’s elegant, auditable, and reliable.
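As a taste of what that looks like, here’s a sketch of an ArgoCD Application; the repository URL, paths, and names are placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: reporting-stack
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/platform-config  # placeholder repo
    targetRevision: main
    path: apps/reporting            # the manifests ArgoCD should track
  destination:
    server: https://kubernetes.default.svc   # the cluster ArgoCD itself runs in
    namespace: reporting
  syncPolicy:
    automated:
      prune: true        # delete resources that are removed from Git
      selfHeal: true     # revert any manual drift back to the Git state
```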
But that’s a blog for another day.
The Honest Truth About the Learning Curve
Let’s not pretend Kubernetes is trivial to learn. There’s a steep initial curve: new terminology (Pods, Deployments, Services, Ingress, Namespaces), new tooling (kubectl, Helm, Kustomize), and new ways of thinking about infrastructure.
Have we had teething troubles? For sure. Nights spent figuring out why things weren’t scheduling as expected, wrestling with affinities, anti-affinities, and topology spread constraints. There were moments when we questioned whether it was all worth it.
It was.
The key insight is that this is a platform a small team can learn and comfortably live with. Commit to it for a couple of weeks and you’ll be productive. Within a month, you’ll be comfortable. Within a few months, you’ll wonder how you ever managed without it. Being able to run everything under one roof, consistently and with state-of-the-art deployment capabilities, means the cost of learning Kubernetes is far outweighed by the time saved on deployment and support.
Why This Matters for Insurance
Insurance firms face a specific set of technology challenges that Kubernetes addresses head-on:
- Regulatory reporting: Solvency II, Lloyd’s, and FCA reporting pipelines need to run reliably and on schedule. Kubernetes’ self-healing and monitoring ensure your critical reporting jobs don’t fail silently.
- Data platform reliability: When your actuaries and analysts depend on a data platform for pricing, reserving, and exposure management, downtime is not an option. Kubernetes keeps your platform running.
- Scaling for the cycle: Insurance has natural peaks: renewal seasons, catastrophe events, regulatory deadlines. Kubernetes scales up when you need it and scales back down when you don’t, so you’re not paying for idle infrastructure.
- Vendor independence: Many insurance firms, quite rightly, are cautious about vendor lock-in. Kubernetes runs on any major cloud provider, and migrating between them is significantly easier than with proprietary platforms.
- Security and compliance: Kubernetes provides robust network policies, role-based access control, and secrets management. For firms handling sensitive policyholder and claims data, these are essential.
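To make the last of those concrete, here’s a sketch of a NetworkPolicy that only lets an API tier reach the database pods; all names and the port are illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-api-only
  namespace: data                 # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      app: claims-db              # applies to the database pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: claims-api     # only the API pods may connect
      ports:
        - protocol: TCP
          port: 5432
```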
The Verdict
In short, we couldn’t, or perhaps more accurately wouldn’t, live without it now.
So if you’re struggling with hugely complex infrastructure and a myriad of different CI/CD pipelines and tools, give Kubernetes a try. It will simplify your life and put you back in control of your tooling and workloads. And it won’t break your mind, or your wallet, in the process.
Jargon Buster
If you’re new to the Kubernetes world, here are the key terms you’ll encounter:
- Container: A lightweight, standalone package that contains everything needed to run a piece of software: code, runtime, libraries, settings. Think of it as a consistent, portable box that runs the same way everywhere.
- Pod: The smallest unit in Kubernetes. A Pod runs one or more containers and is the basic building block of any K8s workload.
- Deployment: A configuration that tells Kubernetes how many copies of your Pod to run, how to update them, and how to handle failures.
- Service: A stable network endpoint that routes traffic to your Pods. Even as Pods come and go, the Service provides a consistent address.
- Helm Chart: A package of pre-configured Kubernetes resources. Think of it as an installer for a Kubernetes application: rather than writing all the configuration yourself, you use a chart that someone else has already built and tested.
- Manifest: A YAML file that describes a Kubernetes resource (a Pod, Service, Deployment, etc.). It’s the configuration that tells Kubernetes what you want.
- Namespace: A way to organise and isolate resources within a cluster. Think of it as folders for your workloads.
- Node: A machine (virtual or physical) in your Kubernetes cluster that runs your workloads.
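To see several of those terms working together, here’s a sketch of a Service (the names are illustrative) that routes traffic to the Pods created by a Deployment, matched by their labels:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: hello-web         # a stable address: other workloads reach it by this name
  namespace: demo         # a Namespace keeps the example isolated
spec:
  selector:
    app: hello-web        # matches the labels on the Deployment's Pods
  ports:
    - port: 80            # the port the Service exposes
      targetPort: 80      # the port the container listens on inside each Pod
```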
Building a modern data platform for your insurance business? We can help. Get in touch to discuss how Kubernetes, Trino, and open-source tooling can power your analytics.