Small teams and the monolith

I owe my past teams an apology. At Rubikloud, at Freckle, and again at the start of Superkey, I made the same mistake: I split a small team across too many services because I thought that’s what serious engineering looked like.

It isn’t. Here’s what I’ve learned.

The mistake

At Rubikloud we had maybe eight engineers when I decided we needed a microservices architecture. The reasoning felt bulletproof at the time:

“We need to scale each component independently” (we were processing a few hundred requests per second)
“We need to deploy services without coordinating” (we deployed once a week, together, in a call)
“We need clear ownership boundaries” (everyone owned everything because there were eight of us)
“This is how Netflix does it” (we were not Netflix)

So we split. The data ingestion pipeline became its own service. The ML training jobs became their own service. The API became its own service. The recommendation engine, the price optimizer, the reporting layer — each one got a repo, a Dockerfile, a CI pipeline, a set of environment variables, and an on-call rotation that rotated through the same eight people.

Within six months we had nine services, eight engineers, and a Kubernetes cluster that was more complex than the product it ran.

What it actually costs

The cost of microservices isn’t in the architecture diagrams. It’s in the daily friction that everyone learns to live with until they forget it’s abnormal.

Every cross-service change is a coordination problem. Need to add a field to the API response that comes from the ML service? That’s a change to the ML service’s output schema, a change to the API service’s input parsing, a migration in the shared database (if you’re lucky enough to share one — we weren’t), and a deploy of both services in the right order. In a monolith that’s one PR.
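
To make the "one PR" claim concrete, here is a minimal TypeScript sketch. All the names (Recommendation, scoreProduct, toApiResponse) are hypothetical and not taken from any real codebase. The idea is that in a monolith the code that produces a value and the code that serves it share one type, so adding a field is a single change the compiler checks end to end.

```typescript
// Hypothetical shared type: both the ML code and the API code import it.
interface Recommendation {
  productId: string;
  score: number;
  // The new field, added in the same PR as its producer and its consumer.
  confidence: number;
}

// Hypothetical stand-in for the ML side of the monolith.
function scoreProduct(productId: string): Recommendation {
  // Placeholder values; a real model would compute these.
  return { productId, score: 0.5, confidence: 0.9 };
}

// Hypothetical API-layer mapping. If `confidence` were missing from
// `Recommendation`, this would fail to compile instead of failing at runtime.
function toApiResponse(rec: Recommendation): { id: string; score: number; confidence: number } {
  return { id: rec.productId, score: rec.score, confidence: rec.confidence };
}
```

Across a service boundary the same change has no shared type to lean on: the schema lives in two places and the deploy order has to keep them compatible.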

Local development becomes a project. Running the full stack means starting 9 services, 3 databases, a message queue, and a service mesh. We wrote a docker-compose file that took 4 minutes to start and ate 12GB of RAM. New engineers spent their first day getting the stack running and their second day debugging why one service couldn’t talk to another. In a monolith you run one process.

Debugging crosses process boundaries. A bug that manifests in the API layer might originate in the recommendation engine, pass through the data pipeline, and surface three hops later as a wrong number in a JSON response. You’re reading logs from three services, correlating timestamps, and guessing at causality. In a monolith you set one breakpoint.

Shared code becomes a versioning problem. We had common libraries for auth, logging, and data models. Every service pinned its own version. Updating the auth library meant updating it in nine repos, testing nine services, and deploying nine times. Or — what actually happened — three services stayed on the old version indefinitely because nobody wanted to deal with it.

On-call becomes meaningless. When eight people rotate on-call across nine services, everybody is on call for everything all the time. The service boundaries that were supposed to create clear ownership instead create confusion: “Is the recommendation engine slow because the recommendation service is slow, or because the data pipeline service is feeding it stale data?” You page the recommendation on-call, who looks at their service, finds nothing wrong, and pages the data pipeline on-call, who looks at their service, finds a slow query, and fixes it. Two people woke up for one slow query.

When microservices make sense

I’m not arguing that microservices are always wrong. They make sense when:

You have independent teams that deploy on independent schedules with independent on-call rotations. If team A and team B never coordinate deploys and never share code, service boundaries reflect real organizational boundaries. Conway’s Law works in your favor.

You have genuinely different scaling requirements. If your API handles 100K requests/second but your ML training runs once a day, running them in the same process wastes resources. But “genuinely different” means orders of magnitude, not “the API is a little busier.”

You have genuinely different runtime requirements. A Python ML pipeline and a Go API server have legitimate reasons to be separate processes. But two Node.js services that share the same database and deploy at the same time? That’s a monolith with a network call in the middle.

The test is simple: if the same person changes both services in the same PR more than occasionally, they should be the same service.
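
One rough way to run that test against history is sketched below. It assumes both services live under path prefixes in a single repository and that you can export each PR's changed paths; the PullRequest shape and the coChangeRatio name are hypothetical. With separate repos you would have to pair up PRs by author and time instead, which this sketch does not attempt.

```typescript
// A pull request reduced to the fields this check needs (hypothetical shape).
interface PullRequest {
  author: string;
  changedPaths: string[];
}

// Of the PRs that touch either service, what fraction touch both?
// Prefixes should end in "/" so "services/api/" does not match "services/api-gateway/".
function coChangeRatio(prs: PullRequest[], prefixA: string, prefixB: string): number {
  const touches = (pr: PullRequest, prefix: string): boolean =>
    pr.changedPaths.some((path) => path.startsWith(prefix));

  const touchingEither = prs.filter((pr) => touches(pr, prefixA) || touches(pr, prefixB));
  if (touchingEither.length === 0) return 0;

  const touchingBoth = touchingEither.filter((pr) => touches(pr, prefixA) && touches(pr, prefixB));
  return touchingBoth.length / touchingEither.length;
}
```

What ratio counts as "more than occasionally" is a judgment call. The number only tells you how often the boundary gets crossed, not whether it should exist.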

What I’d build now

At Superkey we’re rebuilding from scratch. The architecture is deliberately boring:

One API server (Hono/Node.js). All business logic lives here. Routes, mutations, validations, state machines — one codebase, one process, one deploy.
One database (Postgres). One schema, one migration tool, one connection pool.
One frontend (Next.js). Server components where possible, client where necessary.
Background jobs run in the same codebase via a job queue (Inngest), not as separate services.
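
Here is a minimal sketch of why keeping background jobs in the same codebase matters. It deliberately uses no framework: the plain array stands in for a job queue, and nothing here is Hono's or Inngest's actual API. All names (normalizeEmail, handleSignup, runJobs) are hypothetical. The point is that the request handler and the job worker call the same function, so a business rule exists in exactly one place.

```typescript
// Hypothetical job shape for the in-memory stand-in queue.
type Job = { name: string; payload: { email: string } };

// Hypothetical business rule, defined once and used by both entry points.
function normalizeEmail(email: string): string {
  return email.trim().toLowerCase();
}

// Hypothetical HTTP-style handler: applies the rule, then enqueues follow-up work.
function handleSignup(body: { email: string }, queue: Job[]): { status: number; email: string } {
  const email = normalizeEmail(body.email);
  queue.push({ name: "send-welcome", payload: { email } });
  return { status: 201, email };
}

// Hypothetical worker: drains the queue in the same process, reusing the same rule.
// `sent` records the addresses a real implementation would email.
function runJobs(queue: Job[], sent: string[]): void {
  while (queue.length > 0) {
    const job = queue.shift();
    if (!job) break;
    if (job.name === "send-welcome") {
      sent.push(normalizeEmail(job.payload.email));
    }
  }
}
```

If the worker later needs dedicated compute, the same code can run as a second process started from the same repo, which is a much smaller step than a separate service with its own copy of the rules.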

The entire stack starts with pnpm dev and is up in 10 seconds. Every engineer can run the full product on their laptop. Debugging means setting a breakpoint in one process. Deploying means pushing one branch.

When something needs to scale independently — if we hit the point where the background job queue needs dedicated compute, or the API needs to be distributed across regions — we’ll split that piece out. With evidence. Not with speculation about future scale that may never arrive.

The monolith is not the compromise

There’s a cultural assumption in software engineering that monoliths are the beginner choice and microservices are the grown-up choice. That you start with a monolith because you’re small and graduate to microservices when you’re serious.

This is backwards. The monolith is the disciplined choice. It forces you to think about boundaries within a single codebase — module structure, interface design, dependency direction — without the escape hatch of “just make it a service.” When you can’t throw a network call at an abstraction problem, you have to actually solve the abstraction problem.
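
Dependency direction is the easiest of those to make mechanical. The sketch below assumes a hypothetical three-layer layout (domain, application, http) and a list of import edges you have extracted from the codebase by some other means; neither the layer names nor findViolations come from a real tool. It flags any import that points outward, such as domain code reaching into the HTTP layer.

```typescript
// Hypothetical module layers, ordered from innermost to outermost.
const LAYERS = ["domain", "application", "http"] as const;
type Layer = (typeof LAYERS)[number];

// One import statement, reduced to the layers on each side.
interface ImportEdge {
  from: Layer; // layer of the file doing the importing
  to: Layer;   // layer of the file being imported
}

// An import is allowed only if it points inward or stays within its layer.
// Anything that imports from a layer further out is returned as a violation.
function findViolations(edges: ImportEdge[]): ImportEdge[] {
  const rank = (layer: Layer): number => LAYERS.indexOf(layer);
  return edges.filter((edge) => rank(edge.to) > rank(edge.from));
}
```

Run as a CI check, something like this gives a module boundary some of the enforcement a network boundary provides, without the network.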

Microservices let you defer architectural decisions by turning them into infrastructure decisions. Can’t figure out the right module boundary? Make it a service boundary. Can’t agree on a shared data model? Give each service its own database. Can’t coordinate deploys? Don’t — just version everything and hope for the best.

These aren’t solutions. They’re the same problems with more YAML.

I’ve been CTO four times. Looking back across all of them, the architecture I’m proudest of is the one with the fewest moving parts. Not because simplicity is easy. It’s harder than complexity, because you can’t hide bad decisions behind infrastructure. But because when something goes wrong at 2am, and it will, you want to be debugging one thing, not nine.

Frank Thomas is CTO at Superkey Insurance.