
Why We Love Trino: The Query Engine That Just Works

Trino has become our go-to query engine for modern data platforms. Here's why it consistently delivers for our insurance clients.

By Ricky Horsley

At InfoBeacon, we’ve built data platforms for some of the most demanding environments in the insurance market. When you’re dealing with Lloyd’s syndicates, intermediaries, and insurers who need to run complex, heavily aggregated queries across multiple systems, your tooling choices matter. A lot.

That’s why we keep coming back to Trino.

What is Trino?

For the uninitiated, Trino (formerly PrestoSQL) is a distributed SQL query engine designed to query data where it lives. No ETL required. It began life at Facebook as Presto, built to power analytics at massive scale, and today it’s open source and works beautifully for organisations of any size.

Think of it as a universal translator for your data. Whether your data lives in cloud object storage, PostgreSQL, MySQL, or a data lake built on Apache Iceberg, Trino lets you query it all with standard SQL.
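Here’s a hedged sketch of what that looks like in practice. The catalog, schema, and table names (postgres, lake, and the insurance tables) are invented for illustration; they’d match whatever catalogs you’ve configured on your own cluster:

    -- Join a PostgreSQL policy admin system to an Iceberg claims table in a
    -- single query. No data is copied or staged first; Trino pushes work
    -- down to each source where it can.
    SELECT
        p.policy_id,
        p.line_of_business,
        sum(c.paid_amount) AS total_paid
    FROM postgres.public.policies AS p
    JOIN lake.claims.claim_transactions AS c
        ON c.policy_id = p.policy_id
    GROUP BY p.policy_id, p.line_of_business;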

Why We Love It

Lightning Fast Performance

Trino is fast. We’re talking about sub-second queries on datasets that would bring traditional databases to their knees.

The secret? Trino was built from the ground up for distributed query processing. It uses a pipelined execution model that processes data in memory wherever possible, parallelises operations across your cluster, and optimises query plans intelligently.

For our insurance clients, this means analysts can run complex joins across claims, underwriting portfolios, and market data without waiting around. When you’re aggregating across multiple dimensions, drilling into cohorts, or building analytical models, query performance directly impacts productivity. In an industry where speed to insight drives competitive advantage, that matters.
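To make “aggregating across multiple dimensions” concrete: Trino supports standard GROUPING SETS, so several aggregation levels come out of a single scan. This is a sketch with invented table and column names:

    -- One pass over the claims data produces totals by year, by class and
    -- territory, and an overall total. Trino parallelises the scan and the
    -- aggregation across workers.
    WITH claims AS (
        SELECT
            year(claim_date) AS claim_year,
            class_of_business,
            territory,
            incurred_amount
        FROM lake.claims.claim_transactions
    )
    SELECT
        claim_year,
        class_of_business,
        territory,
        sum(incurred_amount) AS total_incurred
    FROM claims
    GROUP BY GROUPING SETS (
        (claim_year),
        (class_of_business, territory),
        ()
    );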

Effortlessly Scalable

One of Trino’s superpowers is how it scales. Need to handle more data? Add more workers to your cluster. Need to handle more concurrent users? Same answer.

We’ve deployed Trino on Kubernetes clusters that take advantage of spot VMs to dramatically reduce compute costs. Because Trino handles worker node failures gracefully, we can run the bulk of the cluster on spot instances, delivering the same performance at a fraction of the cost. The elasticity is beautiful, and it means our clients get enterprise-grade analytics without enterprise-grade cloud bills.

There’s no complex partitioning strategy to manage, no “resharding” headaches, no capacity planning nightmares. You just add compute when you need it.
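A nice side effect is that cluster membership is itself queryable: Trino exposes runtime state through its built-in system catalog, so you can watch a scale-up land with plain SQL:

    -- List every node currently registered with the coordinator. After
    -- scaling up (say, a Kubernetes deployment), new workers appear here
    -- as soon as they announce themselves.
    SELECT node_id, http_uri, node_version, coordinator, state
    FROM system.runtime.nodes;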

Works Wonderfully with Apache Iceberg

If you’re building a modern data platform, you should be using Apache Iceberg. It’s an open table format that brings ACID transactions, schema evolution, time travel, and partition management to data lakes.

Trino and Iceberg are a match made in heaven.

Trino has first-class support for Iceberg tables. You get full DML support (INSERT, UPDATE, DELETE, MERGE), time travel queries, and schema evolution without the usual lakehouse pain. Iceberg’s hidden partitioning means your analysts write simple queries without worrying about partition predicates, and Trino’s query optimiser can still take advantage of that partitioning for blazing fast performance.
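Here’s a hedged sketch of those features through Trino’s Iceberg connector; the iceberg catalog name and the claims tables are invented for illustration:

    -- Hidden partitioning: the table is laid out by month of claim_date,
    -- but analysts never have to reference a partition column.
    CREATE TABLE iceberg.claims.claim_transactions (
        claim_id    varchar,
        policy_id   varchar,
        claim_date  date,
        paid_amount decimal(18, 2)
    )
    WITH (partitioning = ARRAY['month(claim_date)']);

    -- Full DML: correct a miskeyed payment in place, with Iceberg
    -- supplying the ACID guarantees.
    UPDATE iceberg.claims.claim_transactions
    SET paid_amount = 12500.00
    WHERE claim_id = 'CLM-001';

    -- Time travel: the table as it stood at a point in time.
    SELECT count(*)
    FROM iceberg.claims.claim_transactions
    FOR TIMESTAMP AS OF TIMESTAMP '2024-01-01 00:00:00 UTC';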

We’ve built entire analytics platforms on this stack: data lands in cloud object storage, gets written to Iceberg tables, and Trino serves it to analysts and BI tools. It’s reliable, performant, and cost-effective.

Plays Beautifully with dbt

If you’re doing analytics engineering (and you should be), you’re probably using dbt. So are we.

Trino has excellent dbt support through the dbt-trino adapter. We’ve built production dbt projects that transform hundreds of millions of rows of insurance data, running hundreds of models a day, all orchestrated through dbt Core or dbt Cloud.

The developer experience is fantastic. You write your transformations in SQL, test them with dbt’s built-in testing framework, document them, and let dbt handle the orchestration. Trino executes those transformations at scale, taking advantage of all that distributed compute power.

And because Trino supports modern SQL features (CTEs, window functions, complex joins), you can express sophisticated business logic cleanly in dbt models without resorting to Python scripts or Spark jobs.
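As an illustration, a dbt model targeting Trino is just a SQL file. Everything below (model names, columns) is invented, but the shape is what a real project looks like:

    -- models/marts/claims_running_position.sql
    -- dbt resolves {{ ref('stg_claims') }} to a fully qualified table and
    -- handles materialisation; Trino executes the compiled SQL at scale.
    {{ config(materialized='table') }}

    with claims as (
        select * from {{ ref('stg_claims') }}
    )

    select
        claim_id,
        policy_id,
        transaction_date,
        incurred_amount,
        -- running incurred position per policy
        sum(incurred_amount) over (
            partition by policy_id
            order by transaction_date
        ) as cumulative_incurred
    from claims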

Where It Fits in Our Stack

When we design data platforms for insurance clients, Trino typically sits at the serving layer. Here’s a typical architecture:

  1. Ingestion: Data flows from source systems (policy admin, claims, accounting) into the platform via event streaming or batch ingestion.
  2. Storage: Raw data lands in cloud object storage, organised into Iceberg tables.
  3. Transformation: dbt models running on Trino transform raw data into business-ready datasets (claim triangles, loss ratios, exposure analytics).
  4. Serving: Analysts query the platform through Trino, either directly via SQL clients or through BI tools like Metabase.

This architecture gives us:

  • Separation of compute and storage: Scale them independently, pay for what you use
  • Open standards: No vendor lock-in, everything is portable
  • Performance: Queries run fast, even on massive datasets
  • Flexibility: Query anything from anywhere with standard SQL

The Insurance Use Cases

Trino shines in insurance analytics:

  • Claims analysis: Query claims triangles across multiple years, territories, and lines of business in real time
  • Exposure management: Aggregate exposure across portfolios, model scenarios, calculate Lloyd’s Realistic Disaster Scenarios (RDS)
  • Regulatory reporting: Build Solvency II, Lloyd’s, and FCA reporting pipelines with confidence
  • Market data: Join internal underwriting data with external market data sources (cat models, pricing benchmarks)

All of this with SQL that your analysts already know.
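For instance, a simplified paid claims triangle (accident year down the rows, development year across the columns) is one pivot-style aggregation; the claim_payments table here is a hypothetical stand-in for a real claims feed:

    SELECT
        year(accident_date) AS accident_year,
        sum(CASE WHEN dev_year = 1 THEN paid_amount END) AS dev_1,
        sum(CASE WHEN dev_year = 2 THEN paid_amount END) AS dev_2,
        sum(CASE WHEN dev_year = 3 THEN paid_amount END) AS dev_3
    FROM (
        SELECT
            accident_date,
            paid_amount,
            year(payment_date) - year(accident_date) + 1 AS dev_year
        FROM lake.claims.claim_payments
    ) AS payments
    GROUP BY year(accident_date)
    ORDER BY accident_year;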

Is It Perfect?

No tool is perfect. Trino is an analytical query engine, not an operational database. You wouldn’t use it for transactional workloads or real-time writes. It’s designed for read-heavy, analytical workloads where you need to scan and aggregate lots of data quickly.

But for that use case? It’s brilliant.

Why Not Just Use Snowflake/Databricks/BigQuery?

Fair question. Those platforms are excellent, and we’ve used them all.

But Trino gives you:

  • No vendor lock-in: It’s open source, runs anywhere
  • Lower costs: Especially if you’re already running Kubernetes and have data in cloud object storage
  • More flexibility: Query anything, anywhere, with any connector
  • Control: You own the infrastructure, you control the configuration

For insurance firms that are cautious about vendor dependency, cost-conscious, and have existing infrastructure investments, Trino is often the right choice.

Getting Started

If you’re curious about Trino, the best way to learn is to try it:

  1. Spin up a local Trino cluster with Docker
  2. Connect it to some sample data (PostgreSQL, cloud storage, whatever you have)
  3. Run some queries (a starter query is sketched below)
  4. Marvel at the speed
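For step 3: as far as we know, the official trinodb/trino Docker image ships with sample catalogs in its default configuration, TPC-H among them, so there’s something to query before you wire up your own data. A first query might look like:

    -- Revenue by return flag and line status over the scale-factor-1
    -- TPC-H sample data bundled with the default image.
    SELECT
        returnflag,
        linestatus,
        round(sum(extendedprice * (1 - discount)), 2) AS revenue,
        count(*) AS order_lines
    FROM tpch.sf1.lineitem
    GROUP BY returnflag, linestatus
    ORDER BY returnflag, linestatus;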

The Trino documentation is excellent, and the community is friendly and responsive.

Final Thoughts

Trino has become a core part of our data platform toolkit. It’s fast, scalable, works beautifully with modern table formats like Iceberg, and integrates seamlessly with dbt.

If you’re building a data platform and haven’t evaluated Trino, you should. If you’re struggling with slow queries, high costs, or vendor lock-in with your current analytics stack, Trino might be the answer.

We’ve deployed it successfully across multiple insurance clients, from small intermediaries to Lloyd’s syndicates. It just works.


Building a modern data platform for your insurance business? We can help. Get in touch to discuss how Trino, Iceberg, and dbt can power your analytics.