Stop Shipping ChatGPT Wrappers. Ship an Agent in TypeScript, or Don't Bother.

11 Jan 2026

Most "AI features" I see today are a chat box taped onto an app.

That's not a product. That's a demo with a fancy UI.

If your "assistant" can't do real work, with real constraints, it's just vibes. And vibes don't reconcile invoices.

I've built systems where "almost correct" equals "wrong." Fintech is not forgiving. Your on-call future self is not forgiving. So let's draw the line between what actually works and what just looks good in a demo.

What Actually Separates a Toy from an Agent

Here's the problem I keep seeing.

Someone builds a chat interface, connects it to the OpenAI API, and calls it an "AI agent." But it can't actually do anything.

A toy:

  • Talks nicely.
  • Hallucinates confidently.
  • Has no tools.
  • Has no audit trail.
  • Breaks silently.

An agent:

  • Uses tools with contracts.
  • Fails loudly.
  • Logs everything.
  • Can be stopped.
  • Can be rolled back.

If your thing is not the second one, don't call it an agent. Call it "chat." No shame. Just name it right.

The difference isn't cosmetic. It's architectural. A chat box responds. An agent acts. And action requires boundaries, validation, and control.
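To make the second list concrete, here's a minimal sketch of what "fails loudly, logs everything, can be stopped, can be rolled back" could look like in code. Every name in it (ActionRunner, Action, AuditEntry) is hypothetical and made up for illustration, and it assumes each action can supply its own undo step, which real systems often can't do this cleanly.

```typescript
// Hypothetical sketch: every name here is illustrative, not from a real library.
type AuditStatus = "ok" | "failed" | "refused" | "rolled_back";

type AuditEntry = {
  at: string;
  action: string;
  status: AuditStatus;
  detail: string | null;
};

type Action = {
  name: string;
  run: () => void; // performs the side effect
  undo: () => void; // reverses it (assumes the action is reversible)
};

class ActionRunner {
  readonly audit: AuditEntry[] = [];
  private readonly completed: Action[] = [];
  private stopped = false;

  // Can be stopped: after this call, every run() is refused.
  stop(): void {
    this.stopped = true;
  }

  run(action: Action): void {
    if (this.stopped) {
      this.record(action.name, "refused", "agent was stopped");
      throw new Error(`Refusing to run ${action.name}: agent was stopped`);
    }
    try {
      action.run();
    } catch (err) {
      // Fails loudly: log the failure, then rethrow instead of swallowing it.
      this.record(action.name, "failed", err instanceof Error ? err.message : String(err));
      throw err;
    }
    this.completed.push(action);
    this.record(action.name, "ok", null);
  }

  // Can be rolled back: undo completed actions, most recent first.
  rollback(): void {
    let action = this.completed.pop();
    while (action !== undefined) {
      action.undo();
      this.record(action.name, "rolled_back", null);
      action = this.completed.pop();
    }
  }

  // Logs everything: every outcome lands in the audit trail.
  private record(action: string, status: AuditStatus, detail: string | null): void {
    this.audit.push({ at: new Date().toISOString(), action, status, detail });
  }
}
```

None of this involves a model. That's the point: the control surface lives in code you own.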

Why Tools First, Prompts Second (Even When It Feels Slower)

People start with prompts because it feels fast.

I get it. London taught me to move quickly. Ship fast, iterate faster.

But Melbourne taught me something better: move quickly in the right direction. The right direction starts with a tool contract that your code owns, not your prompts.

Here's why this matters.

Prompting is content. It changes with every model update, every context shift, every edge case you discover. Tooling is architecture. It defines what your agent can actually do, and it doesn't change just because you switched from GPT-4 to Claude.

When you start with tools:

You force clarity. Before you write a single prompt, you must define what actions are possible. This exposes gaps in your thinking early.

You can test behaviors without the model. I can write unit tests for my tool validation logic. I can't write unit tests for "will the model understand this prompt correctly."

You can reason about security. If the model wants to delete a database row, it calls deleteDatabaseRow. That function has explicit permissions, logging, and validation. The model doesn't get to invent a new attack vector.

You can ship something that survives production traffic. Prompts fail in fuzzy, hard-to-reproduce ways. A tool call either passes validation and succeeds, or it returns an explicit error. That binary outcome makes debugging possible.
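Here's a rough sketch of what a tool contract might look like, and why it's testable without a model in the loop. The Tool interface, the Validation type, and refundTool are all assumptions invented for this example, not an API from any SDK.

```typescript
// Hypothetical tool contract: names and shapes are illustrative assumptions.
type Validation<T> = { ok: true; value: T } | { ok: false; error: string };

interface Tool<Input, Output> {
  name: string;
  description: string;
  // Turns untrusted input (whatever the model produced) into a typed value.
  validate(raw: unknown): Validation<Input>;
  // Only ever receives input that passed validate().
  execute(input: Input): Output;
}

type RefundInput = { invoiceId: string; amountCents: number };

const refundTool: Tool<RefundInput, string> = {
  name: "refundInvoice",
  description: "Refund part or all of an invoice, in cents.",
  validate(raw) {
    if (typeof raw !== "object" || raw === null) {
      return { ok: false, error: "input must be an object" };
    }
    const { invoiceId, amountCents } = raw as Record<string, unknown>;
    if (typeof invoiceId !== "string" || invoiceId.length === 0) {
      return { ok: false, error: "invoiceId must be a non-empty string" };
    }
    if (typeof amountCents !== "number" || !Number.isInteger(amountCents) || amountCents <= 0) {
      return { ok: false, error: "amountCents must be a positive integer" };
    }
    return { ok: true, value: { invoiceId, amountCents } };
  },
  execute(input) {
    // A real tool would call a payment service here; this one just reports.
    return `refunded ${input.amountCents} cents on ${input.invoiceId}`;
  },
};
```

Feeding refundTool.validate a negative amount, a missing field, or a bare string is an ordinary unit test. No API key, no prompt, no flakiness.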

I learned this the hard way.

I once built an agent that used prompts to "understand" what API endpoint to call. It worked great in testing. In production, it started calling endpoints that didn't exist. The prompt said "call the user endpoint" and the model interpreted that as "call user-delete-all." Not great.

Now I define the tools first. The model picks from a menu. No inventing actions. No hallucinations about what's possible.
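"Picks from a menu" can be as small as this. It's a sketch under assumptions: toolMenu, its two entries, and dispatch are hypothetical names, and the tools return strings only to keep the example short.

```typescript
// Hypothetical menu of tools: the only actions the agent can ever take.
const toolMenu = {
  getUser: (args: { userId: string }) => `user:${args.userId}`,
  listInvoices: (args: { userId: string }) => `invoices-for:${args.userId}`,
};

type ToolName = keyof typeof toolMenu;

// Own-property check, so inherited keys like "constructor" don't count as tools.
function isToolName(name: string): name is ToolName {
  return Object.prototype.hasOwnProperty.call(toolMenu, name);
}

// The model supplies requestedName as free text; anything off the menu throws.
function dispatch(requestedName: string, userId: string): string {
  if (!isToolName(requestedName)) {
    const allowed = Object.keys(toolMenu).join(", ");
    throw new Error(`Unknown tool "${requestedName}". Allowed: ${allowed}`);
  }
  return toolMenu[requestedName]({ userId });
}
```

With this in place, a model that asks for "user-delete-all" gets an error, not a deleted table.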

How TypeScript Becomes Your Safety Net (Not Just Nice-to-Have)

TypeScript is not "nice to have" here.

It's the only way I've found to keep agent integrations from becoming spaghetti. The agent is the least reliable part of your system. So your boundaries must be strict.

Here's what types give you that plain JavaScript, or TypeScript with "any" everywhere, cannot:

Hard edges between "model output" and "app actions." The model returns a string. That string might say "delete user 123." If your delete function only accepts a validated request type, TypeScript won't let that raw string reach it: the string has to go through a parser and validation first. There's no direct path from model output to dangerous operations.

Compile-time guard rails in code review. When I review agent code, I'm not just reading prompts. I'm seeing the type definitions. I can spot when someone tries to pass unvalidated input to a tool. The compiler catches it before it hits production.

A place to document reality, not hopes. Types are executable documentation. This tool expects a userId (string) and a reason (string), not "whatever the model thinks might work."

A sane refactor path when the tools inevitably evolve. When I need to add a new parameter to a tool, TypeScript shows me every place that calls it. I can't miss an update. In JavaScript, I'd discover the missing parameter when users complain.

If you're building agents in plain JavaScript, or in TypeScript with "any" everywhere, you're gambling.
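One way to get that hard edge, sketched under assumptions: a request class whose only constructor path is a parser. DeleteUserRequest and deleteUser are hypothetical names, and the rule that userId is a numeric string sent as JSON is invented for the example.

```typescript
// Hypothetical sketch: the only way to obtain a DeleteUserRequest is parse().
class DeleteUserRequest {
  // The private member makes the class nominal: a hand-built object literal
  // with the same public fields is not assignable to this type.
  private readonly validated = true;
  readonly userId: string;
  readonly reason: string;

  private constructor(userId: string, reason: string) {
    this.userId = userId;
    this.reason = reason;
  }

  // Assumes the model was asked to answer with JSON like
  // {"userId": "123", "reason": "..."}; anything else is rejected.
  static parse(modelOutput: string): DeleteUserRequest {
    let data: unknown;
    try {
      data = JSON.parse(modelOutput);
    } catch {
      throw new Error("model output is not valid JSON");
    }
    if (typeof data !== "object" || data === null) {
      throw new Error("expected a JSON object");
    }
    const { userId, reason } = data as Record<string, unknown>;
    if (typeof userId !== "string" || !/^\d+$/.test(userId)) {
      throw new Error("userId must be a numeric string");
    }
    if (typeof reason !== "string" || reason.trim() === "") {
      throw new Error("reason is required");
    }
    return new DeleteUserRequest(userId, reason);
  }
}

// The dangerous operation accepts only the validated type.
// deleteUser("delete user 123") and deleteUser({ userId: "123", reason: "x" })
// are both compile errors.
function deleteUser(request: DeleteUserRequest): string {
  // A real implementation would touch the database; this one just reports.
  return `deleted user ${request.userId} (${request.reason})`;
}
```

The compiler can't validate anything at runtime. What it can do is refuse to build code that skips the validator, and that's the guard rail you want in code review.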

For TypeScript best practices that prevent this, check out essential coding principles for better code quality.

The Agent Loop That Actually Works (With Why Each Part Matters)

Here's the shape I trust. Not because it's pretty. Because it's controllable.

Let me show you the code, then explain why each decision matters.

type ToolResult =
  | { ok: true