The Cost of Keeping Options Open

There is a category of architectural decision that feels responsible in the moment and costs you dearly over time. It goes by several names: keeping options open, avoiding lock-in, building for flexibility. The instinct behind it is reasonable — you don’t want to make a decision today that you can’t reverse tomorrow. So you add an abstraction layer. You build an interface. You design for replaceability.

And then you spend the next two years maintaining the abstraction, explaining it to new developers, debugging through it, and never once exercising the optionality it was built to preserve.

I’ve made this mistake enough times that I’ve started treating the phrase “we should keep our options open” as a signal to slow down and ask what the options are actually worth.

What flexibility costs

Every abstraction has a carrying cost. It’s not the cost of building it — that’s usually small. It’s the cost of operating with it indefinitely.

A database abstraction layer written to allow swapping between PostgreSQL and MySQL means every developer on the team works at one remove from the database. They write queries against the abstraction, not against the database. When something performs poorly, they debug through the abstraction layer. When they want to use a database feature (a window function, a partial index, a generated column), they first have to check whether the abstraction supports it. Often it doesn’t, and they then either extend the abstraction or write a raw query that bypasses it, which defeats the purpose of the abstraction entirely.
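To make that friction concrete, here is a minimal sketch. It uses Python’s built-in sqlite3 module as a stand-in for a real database (SQLite 3.25 or newer is assumed, since that is when window functions arrived). PortableOrders, the orders table, and the sample rows are all hypothetical, invented for illustration; they do not represent any particular ORM.

```python
import sqlite3


class PortableOrders:
    """Hypothetical portable layer: exposes only what every backend supports."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def find_by_customer(self, customer: str) -> list[tuple]:
        return self.conn.execute(
            "SELECT id, customer, total FROM orders WHERE customer = ? ORDER BY id",
            (customer,),
        ).fetchall()


def running_totals_via_layer(orders: PortableOrders, customers: list[str]) -> list[tuple]:
    """The layer has no window functions, so the running total is rebuilt
    in application code, with one query per customer."""
    rows = []
    for customer in customers:
        running = 0
        for order_id, name, total in orders.find_by_customer(customer):
            running += total
            rows.append((order_id, name, running))
    return rows


def running_totals_direct(conn: sqlite3.Connection) -> list[tuple]:
    """Talking to the database directly: one query, using a window function."""
    return conn.execute(
        """
        SELECT id, customer,
               SUM(total) OVER (PARTITION BY customer ORDER BY id) AS running
        FROM orders
        ORDER BY customer, id
        """
    ).fetchall()


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("ada", 10), ("bob", 5), ("ada", 7), ("bob", 1)],
)

via_layer = running_totals_via_layer(PortableOrders(conn), ["ada", "bob"])
direct = running_totals_direct(conn)
```

Both paths produce the same rows. The difference is that the portable path pushes the work into application code and extra round trips, while the direct path is a single query that every reader can check against the database’s own documentation.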

The carrying cost is not dramatic. It’s one extra layer in the mental model every developer maintains. It’s the mild friction of one more thing to understand. But mild friction that affects every database interaction, every day, across the entire engineering team, for the life of the product, is not mild when you integrate it.

And for this you get: the ability to swap databases. Which you will almost certainly never do. PostgreSQL-to-MySQL migrations are so rare in production systems that they’re essentially theoretical. The teams that do switch databases are usually switching away from a bad early decision — from MongoDB to PostgreSQL, from MySQL to PostgreSQL — and when that happens, no abstraction layer makes the migration easy. The data model has to change. The queries have to change. The abstraction layer did not buy you what it promised.

The specific things we add flexibility for

It’s worth being concrete about the categories of lock-in that teams routinely build abstraction layers to avoid, because some of them are worth avoiding and most of them aren’t.

Database vendor lock-in. The ORM or abstraction layer that lets you swap databases. Worth it: almost never for the database layer. A database is not an interchangeable component. Your data model, your query patterns, your indexing strategy, your transaction semantics — all of these are coupled to the specific database you’re using. If you choose PostgreSQL, use PostgreSQL. Use its features. Accept the coupling. If you choose wrong and need to migrate in two years, that’s a concrete problem with a concrete cost that you’ll pay once. The abstraction layer is a diffuse cost you pay forever.

Cloud provider lock-in. The infrastructure abstraction that means you can move from AWS to GCP if you need to. Worth it: occasionally, in specific circumstances. If you’re building a product that genuinely needs to run in multiple cloud environments, such as a self-hosted enterprise product or a compliance-constrained application, cloud portability is a real requirement. If you’re building a SaaS product, the probability that you’ll migrate cloud providers is low enough that the abstraction is probably not earning its keep. More importantly, the most valuable cloud services are often the ones with no exact portable equivalent. The teams that use RDS, SQS, S3, and Lambda fluently build faster than the teams that wrap everything in cloud-agnostic interfaces. The wrapper doesn’t just add cost; it prevents you from using what you’re paying for.

Message queue vendor lock-in. The interface that means you can swap RabbitMQ for SQS for Kafka without changing application code. Worth it: more often than the database case, because message queue semantics are more similar across systems than database semantics are. But even here, the delivery guarantees, the ordering guarantees, and the consumer model differ enough between systems that a real migration will touch application code regardless of what the interface hides.

Framework lock-in. The abstraction layer that insulates your business logic from your web framework so you can swap Express for Fastify or Django for FastAPI without rewriting everything. Worth it: in theory. The problem is that framework abstractions have their own gravity. They tend to grow until they’re more complex than the framework they’re abstracting, and the business logic they were supposed to insulate is still coupled to the abstraction, which is coupled to the framework in ways that are subtler and harder to see than direct framework coupling would be.

The real risk being avoided

When a team says they want to avoid lock-in, they’re usually expressing a genuine anxiety, but the abstraction they build addresses a different risk than the one they’re actually worried about.

The risk they’re usually worried about: the vendor relationship goes bad. Prices increase dramatically. The service degrades. The company gets acquired and the product gets shut down. They want an escape hatch.

The risk the abstraction actually addresses: the code is tightly coupled to a specific API. Swapping the underlying system requires changing a lot of code.

These are related but not the same problem. A migration from SQS to RabbitMQ when your application code is directly coupled to the SQS API is weeks of work. A migration when your application code goes through an abstraction layer is… also weeks of work, because the semantics are different and the abstraction layer doesn’t actually handle the hard parts of the migration. It handles the syntax. The semantics — delivery guarantees, consumer groups, dead letter queues, retry behavior — you have to handle those regardless.

The escape hatch is narrower than it looks from the outside.
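A toy illustration of the syntax-versus-semantics gap, under loud assumptions: MessageQueue, WorkQueue, and LogQueue are hypothetical in-memory stand-ins written for this sketch, not real client libraries. WorkQueue loosely imitates redelivery-until-acknowledged behavior (in the style of SQS or RabbitMQ), and LogQueue loosely imitates an offset-committing log (in the style of Kafka). Both satisfy the same interface, and the same consumer code behaves differently on each.

```python
from collections import deque
from typing import Optional, Protocol


class MessageQueue(Protocol):
    def publish(self, body: str) -> None: ...
    def receive(self) -> Optional[str]: ...
    def ack(self) -> None: ...


class WorkQueue:
    """Toy work queue: a message is redelivered until it is acknowledged."""

    def __init__(self) -> None:
        self._messages: deque[str] = deque()

    def publish(self, body: str) -> None:
        self._messages.append(body)

    def receive(self) -> Optional[str]:
        # Peek only: the message stays at the head until ack() removes it.
        return self._messages[0] if self._messages else None

    def ack(self) -> None:
        if self._messages:
            self._messages.popleft()


class LogQueue:
    """Toy log: receiving advances a cursor; ack commits everything up to it."""

    def __init__(self) -> None:
        self._log: list[str] = []
        self._position = 0
        self._committed = 0

    def publish(self, body: str) -> None:
        self._log.append(body)

    def receive(self) -> Optional[str]:
        if self._position >= len(self._log):
            return None
        body = self._log[self._position]
        self._position += 1
        return body

    def ack(self) -> None:
        self._committed = self._position


def process_with_one_failure(queue: MessageQueue) -> list[str]:
    """Drain the queue; the first attempt at 'b' fails and is never acked."""
    handled: list[str] = []
    failed_once = False
    for _ in range(10):  # bounded loop so the sketch always terminates
        body = queue.receive()
        if body is None:
            break
        if body == "b" and not failed_once:
            failed_once = True  # simulate a handler crash: no ack is sent
            continue
        handled.append(body)
        queue.ack()
    return handled


work, log = WorkQueue(), LogQueue()
for queue in (work, log):
    for body in ("a", "b", "c"):
        queue.publish(body)

work_result = process_with_one_failure(work)  # ["a", "b", "c"]: "b" is redelivered
log_result = process_with_one_failure(log)    # ["a", "c"]: "b" is skipped
```

The interface is identical and the application code did not change, yet one backend retries the failed message and the other drops it once a later message is committed. That difference is the part of a migration no interface can absorb.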

The real mitigation for vendor risk is not abstraction — it’s avoiding single points of dependency for critical paths, having a credible migration plan that you’ve actually thought through, and making architectural decisions conservatively in the first place. If you’re deeply worried about being locked in to a vendor, the right question is whether you should be using that vendor at all, not how to build a layer that makes the lock-in feel less visible.

When abstraction is actually the answer

I want to be careful not to argue against abstraction in general, because abstraction is the fundamental tool of software design. The question is always: what problem does this abstraction solve, and is that problem real and imminent enough to justify the carrying cost?

There are cases where the answer is clearly yes.

When the underlying implementation is genuinely likely to change. Not theoretically, not “what if we need to scale,” but actually, concretely likely. You’re prototyping and the data store is provisional. You’re building an integration with a third-party service that has a history of API churn. You’re building a plugin system where the implementation is designed to be swapped by users. When the change is likely and the interface is well-understood, the abstraction earns its cost.

When testing requires it. Abstracting an external API behind an interface so you can inject a fake in tests is a real and valuable use of abstraction. The carrying cost is low because the interface has exactly two implementations, one for production and one for tests, and the test requirement continuously validates that the abstraction is coherent.

# This abstraction earns its keep
from typing import Protocol


class EmailSender(Protocol):
    async def send(
        self,
        to: str,
        subject: str,
        body: str,
    ) -> None: ...


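class SendgridEmailSender:
    # SendgridClient stands in for the vendor SDK's client class; it is
    # assumed to be imported elsewhere and is not defined in this snippet.
    def __init__(self, api_key: str):
        self.client = SendgridClient(api_key)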

    async def send(self, to: str, subject: str, body: str) -> None:
        await self.client.send_email(
            to=to,
            subject=subject,
            html_content=body,
        )


class FakeEmailSender:
    def __init__(self):
        self.sent: list[dict] = []

    async def send(self, to: str, subject: str, body: str) -> None:
        self.sent.append({"to": to, "subject": subject, "body": body})


# In tests (UserService is the application code under test, not shown here;
# an async-aware test runner such as pytest-asyncio is assumed):
async def test_welcome_email_sent_on_signup():
    email_sender = FakeEmailSender()
    service = UserService(email_sender=email_sender)

    await service.register(email="user@example.com", password="secure")

    assert len(email_sender.sent) == 1
    assert email_sender.sent[0]["to"] == "user@example.com"
    assert "welcome" in email_sender.sent[0]["subject"].lower()

The interface here solves a real problem — testing without sending real emails — and it solves it today, every time you run the tests. It’s not speculative. The flexibility is exercised constantly.
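For completeness, here is one way the consuming side could look. This is a sketch under assumptions: the UserService below is hypothetical (the original does not show it), and the subject line and in-memory user set are placeholders. The interface and fake are repeated only so the snippet runs on its own.

```python
import asyncio
from typing import Protocol


class EmailSender(Protocol):
    async def send(self, to: str, subject: str, body: str) -> None: ...


class FakeEmailSender:
    def __init__(self) -> None:
        self.sent: list[dict] = []

    async def send(self, to: str, subject: str, body: str) -> None:
        self.sent.append({"to": to, "subject": subject, "body": body})


class UserService:
    """Hypothetical service: depends on the EmailSender interface, not a vendor."""

    def __init__(self, email_sender: EmailSender) -> None:
        self.email_sender = email_sender
        self.registered: set[str] = set()

    async def register(self, email: str, password: str) -> None:
        # A real service would hash the password and persist the user;
        # this sketch only records the address in memory.
        self.registered.add(email)
        await self.email_sender.send(
            to=email,
            subject="Welcome aboard",
            body="Thanks for signing up.",
        )


fake = FakeEmailSender()
service = UserService(email_sender=fake)
asyncio.run(service.register(email="user@example.com", password="secure"))
```

The service never imports the vendor SDK. It accepts anything with a matching send method, which is what lets the fake slot in without patching or mocking.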

When you’re building infrastructure that others will use. Libraries, frameworks, and platform services should be more abstract than application code, because their users have requirements the authors didn’t anticipate. The carrying cost of abstraction is distributed across all users, and the optionality is exercised by users rather than the original authors.

The decision framework I actually use

When I’m about to add an abstraction for flexibility, I ask three questions before writing any code.

When did this last actually happen? Not “when could this happen” — when did a team actually swap the thing the abstraction is protecting? If the answer is “I’ve never seen it happen” or “I heard of a team that did it once,” the risk is theoretical and the abstraction is probably not worth building.

What does migration actually involve? Walk through the migration concretely. Not at the level of “we’d swap the implementation behind the interface” — at the level of what code changes, what data changes, what deployment steps are required, what downtime risk exists. If the abstraction reduces this from very hard to merely hard, it may not be worth the carrying cost. If it reduces it from impossible to possible, it might be.

What am I not building while I build this? Abstraction layers take time. That time has an opportunity cost. The feature you’re not building, the debt you’re not paying, the performance problem you’re not investigating — these are real costs that don’t appear on the balance sheet of the abstraction decision. Engineering time is the scarcest resource a team has. Spending it on speculative flexibility is a choice with a real cost.
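The three questions can be turned into rough arithmetic. The sketch below is a back-of-the-envelope model, not a formula from any source: the function names are made up for illustration and every number is a hypothetical placeholder to replace with your own estimates. The first question informs the migration probability, the second informs the two migration estimates, and the third is the build cost.

```python
def carrying_cost_hours(devs: int, minutes_per_dev_per_week: float, weeks: int) -> float:
    """Total friction from working through the layer, in hours."""
    return devs * minutes_per_dev_per_week * weeks / 60


def expected_saving_hours(
    p_migration: float, hours_without_layer: float, hours_with_layer: float
) -> float:
    """Migration hours the layer saves, weighted by how likely a migration is."""
    return p_migration * (hours_without_layer - hours_with_layer)


# All figures below are hypothetical placeholders.
build_hours = 80  # time spent building the layer instead of something else
carry = carrying_cost_hours(devs=8, minutes_per_dev_per_week=30, weeks=104)
saving = expected_saving_hours(
    p_migration=0.05, hours_without_layer=600, hours_with_layer=400
)

worth_it = saving > build_hours + carry
```

With these placeholder inputs, two years of friction comes to 416 hours against an expected saving of about 10 hours, so the layer does not pay for itself. The point of the exercise is less the exact result than being forced to write down a probability and a migration estimate at all.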

The version that actually keeps options open

Here’s the thing: the best way to keep your options open is usually not to build abstraction layers. It’s to make your system understandable.

A system where the code is clear, the dependencies are explicit, the data model is documented, and the reasoning behind key decisions is recorded — that system is genuinely flexible. Not because it has interfaces that can be swapped, but because a developer who needs to make a change can understand what they’re changing and why, quickly enough to actually do it.

The systems I’ve worked in that were hardest to change were not the ones with tight vendor coupling. They were the ones where the coupling was hidden by abstraction layers that nobody fully understood anymore, written to solve problems that had long since stopped being relevant, maintained because removing them felt riskier than keeping them.

The abstraction had not kept options open. It had preserved optionality along a dimension nobody exercised while closing off understanding, the dimension that mattered most.

Code you understand is more flexible than code you don’t, regardless of how many interfaces it has. A migration from a system you understand is faster than a migration from a system you don’t, even if the second system has better abstractions.

Write for clarity first. Add abstraction when it solves a problem you have, not a problem you might have. Accept the coupling that comes with using specific tools well, because using tools well is how you build things that work.

The options worth keeping open are the ones you’re actually going to exercise. Spend your abstraction budget on those.