Why we do not build with ChatGPT.

We get this question on almost every discovery call. Usually it comes from someone who has spent a few months using ChatGPT personally and found it impressive. They want to know why they need us to build something when they could just give their team a ChatGPT subscription.

It is a fair question. Here is the honest answer.

ChatGPT is a product. We use the underlying models.

ChatGPT is a consumer interface built on top of OpenAI's models. It is designed for personal, general-purpose use. You type a message, you get a response, the session ends.

The models underneath tools like it are something else. ChatGPT runs on OpenAI's GPT-4o; Anthropic's Claude Opus and Google's Gemini 1.5 Pro sit behind their own chat products. All of them are available via API, which means you can call them programmatically, pass them structured context, give them tools to use, connect them to your business data, and control exactly what they do and when.

That is what we use. Not the chat interface. The engine.
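
To make "structured context" and "tools" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not our production code and not any vendor's API: the request shape is deliberately provider-neutral, and `get_client_contract`, `build_request`, and `run_tool_call` are hypothetical names invented for this example. Each provider has its own request format, which you would map this onto.

```python
import json
from typing import Callable

# Hypothetical business lookup. A real system would query a CRM or database;
# the in-memory dict here is invented sample data for illustration only.
def get_client_contract(client_id: str) -> dict:
    contracts = {"acme": {"client_id": "acme", "payment_terms_days": 30}}
    return contracts.get(client_id, {})

# Registry of tools the model is allowed to request, keyed by name.
TOOLS: dict[str, Callable[..., dict]] = {
    "get_client_contract": get_client_contract,
}

def build_request(question: str, context: dict) -> dict:
    """Assemble a provider-neutral request: instructions, context, tool names."""
    return {
        "system": "You are an accounts assistant. Use tools; do not guess.",
        "context": context,        # structured business data, not pasted text
        "tools": sorted(TOOLS),    # names of the tools the model may call
        "user": question,
    }

def run_tool_call(tool_call: dict) -> str:
    """Run a tool the model asked for and return JSON to send back to it."""
    name = tool_call["name"]
    if name not in TOOLS:
        # Unknown tools are refused rather than guessed at.
        return json.dumps({"error": f"unknown tool: {name}"})
    return json.dumps(TOOLS[name](**tool_call["arguments"]))
```

The point of the registry is control: the model can only ask for actions you have explicitly listed, and your code decides whether and how each one runs.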

The three things a chat interface cannot do.

First: it cannot see your business. ChatGPT has no idea who your clients are, what your contracts say, how your invoicing works, or what happened in last Tuesday's call. Every conversation starts from zero. A digital employee starts from everything.

Second: it cannot take action. ChatGPT tells you things. A digital employee does things. It reads the invoice that arrived, matches it to the purchase order, posts it to your accounting system, and flags the discrepancy. That requires tools, integrations, and persistent context. A chat interface has none of that by design.
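
The invoice example can be sketched in a few lines of Python. This is a simplified illustration, assuming invoices carry a purchase order number and a single total; the `Invoice` and `PurchaseOrder` types, the `match_invoice` function, and the one-cent tolerance are all hypothetical choices made for the example. Posting to an accounting system is left out because it depends entirely on which system you use.

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class PurchaseOrder:
    po_number: str
    amount: Decimal

@dataclass(frozen=True)
class Invoice:
    invoice_id: str
    po_number: str
    amount: Decimal

def match_invoice(
    invoice: Invoice,
    purchase_orders: dict[str, PurchaseOrder],
    tolerance: Decimal = Decimal("0.01"),  # assumed rounding allowance
) -> dict:
    """Decide whether an invoice can be posted or must be flagged for review."""
    po = purchase_orders.get(invoice.po_number)
    if po is None:
        return {"action": "flag", "reason": "no matching purchase order"}
    difference = invoice.amount - po.amount
    if abs(difference) > tolerance:
        return {"action": "flag", "reason": f"amount differs by {difference}"}
    return {"action": "post", "reason": "matches purchase order"}

# Usage with invented sample data.
orders = {"PO-1": PurchaseOrder("PO-1", Decimal("100.00"))}
result = match_invoice(Invoice("INV-7", "PO-1", Decimal("120.00")), orders)
```

Amounts use `Decimal` rather than floats so that currency comparisons do not pick up binary rounding errors.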

Third: it cannot be trusted for sensitive work. When you paste client data into ChatGPT, that data travels to OpenAI's servers and is subject to their terms. For a general question on a Friday afternoon, that is probably fine. For a digital employee that touches your financial records, your client contracts, or your internal communications, it is not acceptable.

A chat interface is a window. A digital employee is a member of staff.

What "frontier models" actually means.

We use the most capable models available. Right now that means Claude Opus, GPT-4o, and Gemini 1.5 Pro, depending on the task. We do not use lighter, cheaper, faster models for work that requires real reasoning. We use the same models that the world's largest companies use for their most critical workflows.

The cost difference between a capable model and a cheaper one is small. The quality difference on complex tasks is not.

Why this matters for your business.

The gap between "good enough for a chat" and "good enough to run a business process" is large. We have watched it narrow over the past two years. But the consumer tools and the business tools are still different products, used differently, with different security postures and different integration patterns.

When someone tells you they can automate your business with a ChatGPT subscription and a few Zapier flows, they are not lying. They are underestimating the problem. Those automations will work for three weeks. Then the format will shift, the API will change, the context will get stale, and nobody will know why it broke.

A digital employee is built to understand what it is doing. That is what makes it durable. And durability is what makes it worth building.
