
Common deployment mistakes and a debugging strategy

ChatGPT Apps
Level 7, Lesson 4

1. Deployment anatomy: where things can break

It helps to see the entire chain first. Mentally, you can unfold the deployment of a ChatGPT App into a pipeline like this:

flowchart TD
  A[Your laptop: git commit] --> B[Git repository: GitHub/GitLab]
  B --> C[Vercel Build: npm run build]
  C --> D[Vercel Deploy: Preview/Prod]
  D --> E[HTTP endpoint: /mcp, /api/...]
  E --> F[ChatGPT / Dev Mode: tool calls, widgets]

An error can appear at any of these steps, but the symptoms in ChatGPT often look the same: "Error talking to app", "Network error", or just silence. Your goal is not to shoot in the dark but first to understand whether things failed at build time, at runtime, or whether ChatGPT is simply pointing at the wrong place.

It’s convenient to split problems into three big categories:

  • Build errors: Vercel couldn’t build the project at all. Production didn’t update—that’s “good”—but you see a red build.
  • Runtime errors: the build succeeded, but requests return 500/502, timeouts, or strange behavior.
  • Configuration drift: everything works locally and the Vercel logs look fine, but ChatGPT targets an old URL, uses a stale manifest, or runs with empty env variables.

We’ll go through these three layers and in parallel build a general debugging strategy.

2. Build errors: when the project doesn’t build

This is the first problem type from the introduction—build errors: Vercel can’t successfully build your Next.js 16 project at all.

Node and Next.js: different environment, different requirements

Locally you might (unfortunately) run an outdated Node, while Vercel will try to build your Next.js 16 project with a supported Node version (minimum 18.18.0). If package.json explicitly declares an incompatible version, the build may fail in production even though your dev server was running.

A simple protection is to explicitly specify "engines" in package.json:

{
  "engines": {
    "node": ">=18.18.0"
  }
}

Then both locally and in CI/on Vercel you’ll see in advance that Node is too old.
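
To catch this even earlier, a tiny pre-build script can compare the running Node version against the minimum and abort loudly. This is a sketch; the scripts/check-node.ts file name and the idea of wiring it into a prebuild npm script are assumptions:

```typescript
// scripts/check-node.ts — a small guard to run before `npm run build`
// (hypothetical file name; adapt to your project layout)
export function isSupported(version: string, min = "18.18.0"): boolean {
  const parse = (v: string) => v.split(".").map(Number);
  const [a, b, c] = parse(version);
  const [x, y, z] = parse(min);
  // compare [major, minor, patch] in order
  if (a !== x) return a > x;
  if (b !== y) return b > y;
  return (c ?? 0) >= (z ?? 0);
}

if (!isSupported(process.versions.node)) {
  console.error(`Node ${process.versions.node} is older than required 18.18.0`);
  process.exit(1);
}
```

Hooked up as something like `"prebuild": "tsx scripts/check-node.ts"` in package.json, the guard would run automatically before every build instead of relying on you remembering to check.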

“It works on my machine!” and forgotten dependencies

A classic: you installed a library with npm install some-lib but didn’t commit the updated package.json and package-lock.json, or some dependencies live only in a global install. On Vercel the app is built “from scratch”: it dutifully runs npm install against the committed manifest, but your cherished some-lib isn’t there—resulting in a build error.

What helps here is strict discipline:

  • any new dependency is added and committed immediately;
  • before pushing to main/production, you run npm run build locally. If the local build fails, it will only be worse on Vercel.

Case‑sensitive file system

Locally, many use macOS or Windows where the file system is case‑insensitive by default. On Vercel the build runs in a Linux environment, where Widget.tsx and widget.tsx are different files.

Typical bug:

// Import in code
import { AppWidget } from "@/components/Widget";

// But the repository has file components/widget.tsx

On your machine it works; on Vercel you get “Cannot find module '@/components/Widget'”. The fix is to make file names and import paths match exactly, including case.

Env variables at build time

Another source of surprises is using process.env.* in code that runs at build time (for example, in next.config.mjs or in modules imported during build). If you load .env.local locally but forget to set these variables for the build environment on Vercel, the build will either fail or—what’s worse—succeed with undefined and bake invalid values into the bundle.

For a ChatGPT App this is especially critical if, for example, you build the baseURL for the MCP endpoint or external API URLs right at build time.

A good practice is to explicitly validate critical env variables before the app starts (which we’ll discuss in a separate section), so the build fails loudly and predictably.
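
As a build-stage sketch of the same fail-fast idea: if the MCP base URL is assembled in the Next config, you can refuse to produce a production build without it. The variable name MCP_BASE_URL and the localhost fallback here are assumptions:

```typescript
// next.config.ts — fail-fast sketch: refuse a production build without MCP_BASE_URL
const mcpBaseUrl = process.env.MCP_BASE_URL;

if (process.env.NODE_ENV === "production" && !mcpBaseUrl) {
  // Throwing here turns a silent `undefined` into a loud, red Vercel build
  throw new Error("MCP_BASE_URL must be set at build time");
}

const nextConfig = {
  env: {
    // Baked into the bundle at build time; falls back to localhost for dev builds
    MCP_BASE_URL: mcpBaseUrl ?? "http://localhost:3000/mcp",
  },
};

export default nextConfig;
```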

3. Runtime errors: when everything built but doesn’t work

Now let’s move to the second layer from the introduction—runtime errors: the build passed, but at runtime things break.

The build succeeded, Vercel happily showed a green deploy, you switched your ChatGPT App to the prod URL—and then got "Error talking to app" in chat. That means problems have moved to the runtime level.

Null or empty env variables

Most production incidents in the ChatGPT App world start with undefined. Locally you have a neat .env.local with OPENAI_API_KEY, MCP_BASE_URL, etc., but on Vercel you forgot to add these variables or mixed up their names.

For example, you read:

const apiKey = process.env.OPENAI_API_KEY;

but on Vercel you created OPENAI_APIKEY or OPENAI_API_KEY_PROD. As a result, at the first MCP tool call, your route handler fails with an authentication error.

It’s far better when the app fails early and clearly. A good pattern is a dedicated module in your Next.js project that validates env variables on import:

// app/lib/env.ts
const required = ["OPENAI_API_KEY", "MCP_BASE_URL"] as const;

type RequiredKey = (typeof required)[number];

function getEnv(key: RequiredKey): string {
  const value = process.env[key];
  if (!value) {
    throw new Error(`Missing required env var: ${key}`);
  }
  return value;
}

export const env = {
  OPENAI_API_KEY: getEnv("OPENAI_API_KEY"),
  MCP_BASE_URL: getEnv("MCP_BASE_URL"),
};

Now if you forgot to set variables on Vercel, Next.js will fail on the first import of env, and logs will contain a human‑readable message "Missing required env var: ...".

Important: on Vercel, changes to env variables are not picked up automatically. After changing values you must do a new deploy (redeploy), otherwise the runtime will keep using old values.

Errors in route handlers and the MCP endpoint

In the official ChatGPT App template, the MCP server is usually implemented as app/mcp/route.ts. Inside you have code that parses the JSON‑RPC request, routes it to a tool, and returns a response. If somewhere in the chain you do a throw without handling it—the user in ChatGPT will get a 500.

It’s worth always wrapping the top level of the MCP handler in try/catch, logging the error, and returning a structured response:

// app/mcp/route.ts
import { NextRequest, NextResponse } from "next/server";

export const dynamic = "force-dynamic";
export const maxDuration = 30; // seconds

export async function POST(req: NextRequest) {
  try {
    const body = await req.json();
    // handleMcpRequest is your own routing function:
    // parse the JSON-RPC body and dispatch to the right tool
    const result = await handleMcpRequest(body);
    return NextResponse.json(result);
  } catch (error) {
    console.error("MCP route error", error);
    return NextResponse.json(
      { error: "Internal MCP error" },
      { status: 500 }
    );
  }
}

A couple of notes:

  • dynamic = "force-dynamic" helps avoid unexpected static generation and caching for MCP routes in Next.js 16.
  • maxDuration = 30 explicitly tells Vercel that the route handler can run for up to 30 seconds, which matters for long LLM requests.

Timeouts and “Network error” in ChatGPT

Vercel limits the execution time of serverless functions: on free tiers it’s usually around 10 seconds, on paid plans it can be higher (up to several minutes). If your MCP tool makes a long call to a database or an external API, it might not respond in time, and ChatGPT will get a "Network error" or a truncated stream.

If you use streaming (SSE) for partial results, it’s especially important to send the first bytes of the response before the timeout. Then the transfer itself may continue longer, but the platform won’t consider the function “stuck.”
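
Here is a minimal sketch of that pattern using a web ReadableStream, the streaming primitive available in Next.js route handlers; the route location and the slowToolWork helper are stand-ins:

```typescript
// A streaming route handler sketch: send the first bytes immediately,
// then continue with the slow work (slowToolWork is a hypothetical stand-in)
async function slowToolWork(): Promise<{ ok: boolean }> {
  // stand-in for a long database or external API call
  return { ok: true };
}

export async function GET(): Promise<Response> {
  const encoder = new TextEncoder();
  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      // An SSE comment line sent up front: the platform sees the response
      // has started, so the function isn't counted as "stuck"
      controller.enqueue(encoder.encode(": connected\n\n"));
      const result = await slowToolWork();
      controller.enqueue(encoder.encode(`data: ${JSON.stringify(result)}\n\n`));
      controller.close();
    },
  });
  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}
```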

Mini‑tip: measure tool invocation time and log it together with the tool name. Then logs will show that, for example, search_flights consistently takes 12 seconds and narrowly misses the limit.

export async function safeToolCall<TInput, TOutput>(
  name: string,
  handler: (input: TInput) => Promise<TOutput>,
  input: TInput
): Promise<TOutput> {
  const started = Date.now();
  try {
    const result = await handler(input);
    console.log("[tool] ok", name, { ms: Date.now() - started });
    return result;
  } catch (error) {
    console.error("[tool] fail", name, {
      ms: Date.now() - started,
      error,
    });
    throw error;
  }
}

Then instead of handler(args) you call safeToolCall("search_flights", handler, args).

Network and external services

Sometimes it’s just a trivial https:// vs http:// or an outdated baseURL. Especially if you first tested on a local machine with one URL, and in production you have a different domain or port.

It’s useful to move base URLs to configuration (environment‑dependent) and not hardcode them inside the tool code. Then when changing environments you update one env variable instead of remembering the five places in code where you had http://localhost:3001.
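
A small sketch of that idea; EXTERNAL_API_BASE is a hypothetical variable name, and building URLs with new URL() makes a malformed base fail loudly instead of producing broken links:

```typescript
// app/lib/urls.ts — hypothetical module: one env-driven base URL for external calls
export const EXTERNAL_API_BASE =
  process.env.EXTERNAL_API_BASE ?? "http://localhost:3001";

export function apiUrl(path: string): string {
  // new URL() normalizes slashes and throws early on a malformed base,
  // so a bad env value fails at the call site instead of deep inside a tool
  return new URL(path, EXTERNAL_API_BASE).toString();
}
```

Switching environments then means changing one env variable instead of hunting down every hardcoded http://localhost:3001 in the tool code.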

4. Configuration and environment drift

Finally, the third type from our diagram—configuration drift between environments.

Even if the build passed and runtime looks healthy in logs, ChatGPT can behave “as if a different app version is running.” This is exactly the case where the problem is less about code and more about configuration and environment consistency.

Dev Mode vs production

In Dev Mode, ChatGPT points to the Connector URL you set manually: usually a tunnel URL (https://myapp-dev.ngrok-free.app/mcp or something similar) or a staging URL on Vercel. In production (via the Store or org settings), the App should point to a stable production endpoint, for example https://myapp.vercel.app/mcp.

A mistake almost everyone makes: you deployed to Vercel, but the ChatGPT App settings still contain the old tunnel URL. The local server is off, the tunnel is long dead, and ChatGPT dutifully calls it and gets 502. In the UI this looks like "Error talking to app", and a developer starts fixing MCP code that isn’t even executing.

The cure is discipline: after any environment change (tunnel → staging, staging → prod), check which URL is configured in Dev Mode and in the App’s production configuration.

Stale manifest and ChatGPT cache

ChatGPT caches information about your App: the list of tools, their descriptions, metadata. So the situation “I changed a tool schema, but the model still thinks the argument has the old name” is real.

For significant tool changes, it helps to:

  • make sure you actually deployed the new version (check the commit hash in logs, print it in a startup log);
  • recreate or re‑connect the App in Dev Mode to force the platform to reload the manifest;
  • during debugging, use MCP Inspector where you can see the current list of tools and schemas for sure.

Env config: dev/staging/prod

We already discussed how env variables can break build and runtime. Here’s a top‑down look at dev/staging/prod and value consistency between them.

A common pain point: your .env.local is perfect, but Vercel environments are a mess. As a result:

  • locally you have one API key and one external service URL;
  • on staging—you have completely different values;
  • on prod—half the variables aren’t set.

A simple text file docs/env.md in the repository helps a lot, where you list: which variables are required, in which environments they’re mandatory, and example values. It may feel like bureaucracy, but during an incident such a checklist saves hours.

5. What errors look like on the ChatGPT side

Now let’s look at the situation through the eyes of a ChatGPT user. They only see the interface and know nothing about Vercel, Node, or MCP. And you, unfortunately, also don’t yet know what exactly broke.

Typical symptoms in ChatGPT:

  • a message "Error talking to [App Name]" right after attempting to use the app;
  • an infinite spinner with no visible error;
  • red text "I encountered an error while running the tool";
  • a widget doesn’t appear or appears empty.

Each of these symptoms usually maps to a certain level of failure:

  • if the App is completely unavailable (wrong URL, tunnel down, SSL error), ChatGPT can’t reach your MCP endpoint—check domain availability in a browser and look for 4xx/5xx codes in the Vercel logs;
  • if MCP returns a valid JSON‑RPC with an error field, ChatGPT will honestly say the tool returned an error—now it’s about business logic or argument validation;
  • if MCP responds successfully but the widget HTML is broken or there’s a JS error, then in the widget console (DevTools → the widget’s iframe) you’ll see what exactly failed.

So a good habit is: as soon as you see strange behavior in chat, note the timestamp (to the minute) and go to Vercel logs to find requests at that time.

6. Debugging strategy: don’t panic—act

Let’s assemble from everything above a small playbook—a sequence of actions for when something goes wrong. The goal is to replace running in circles with a calm algorithm.

Step 1: identify the problem type

If the Vercel build is red—celebrate: the error was caught before production. Open the build logs, find the first real error (not 200 lines of warnings), and reproduce locally with npm run build.

If the build is green but ChatGPT complains—that’s runtime or config. Check:

  • whether your App’s prod URL is reachable from a browser (does https://myapp.vercel.app/mcp return anything at all);
  • whether the MCP endpoint returns 200/500 or doesn’t resolve at all;
  • whether the URL in App settings matches the one you just checked.
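
Those checks are easy to script so you can rerun them during an incident. In this sketch the production URL is a placeholder, and the tools/list call is just one plausible way to poke an MCP endpoint:

```typescript
// check-endpoint.ts — hypothetical smoke test (run with `npx tsx check-endpoint.ts`)
export function classify(status: number): "ok" | "app-error" {
  // 2xx: the endpoint answered; anything else means the server was reached
  // but the handler (or routing) failed
  return status >= 200 && status < 300 ? "ok" : "app-error";
}

export async function checkEndpoint(url: string): Promise<void> {
  try {
    const res = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // tools/list is an MCP discovery call; adjust if your server expects more
      body: JSON.stringify({ jsonrpc: "2.0", id: 1, method: "tools/list" }),
    });
    console.log(url, res.status, classify(res.status));
  } catch {
    // DNS failure, dead tunnel, TLS error: the server was never reached at all
    console.log(url, "unreachable");
  }
}

// checkEndpoint("https://myapp.vercel.app/mcp"); // replace with your prod URL
```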

Step 2: read logs, not minds

Next stop—Vercel logs: server logs for the needed deploy and environment (Preview/Production).

Look for:

  • errors like Error: Missing required env var ...—then the problem is configuration;
  • a stack trace from the MCP handler—then business logic or input parsing is failing;
  • messages about timeouts or exceeding function duration.

In parallel, don’t forget about MCP Inspector. If you connect to the same MCP endpoint via the inspector and manually call tools, you’ll quickly see whether the problem is in MCP itself or in the ChatGPT ↔ MCP integration.

Step 3: fast rollback or hotfix?

If you see that the production deploy is clearly broken (for example, the MCP route constantly throws the same error on every request), and the previous deploy was healthy, the right move is to roll back. Vercel lets you quickly switch to a previous successful deploy without rebuilding—it’s essentially switching the active version.

This is better than trying to patch production “on the fly,” especially if you don’t fully understand the incident’s cause yet.

Once stabilized, investigate the cause calmly, write tests, fix the code, and only then ship the next version.

Step 4: codify the lessons in documentation

Any serious incident is a reason to update the internal README:

  • add a required env variable without which everything fails;
  • record which specific case led to the error (for example, “import with incorrect file name casing”);
  • describe the short sequence of actions that helped fix everything quickly.

It may seem boring, but in a couple of months you’ll thank yourself.

7. A few practical code techniques

Now let’s take a few steps from our playbook and reinforce them with small code techniques in our training app (ChatGPT App).

Unified configuration module

We’ve already written a simple env variable validator. You can extend it to distinguish environments:

// app/lib/config.ts
type NodeEnv = "development" | "test" | "production";

const nodeEnv = (process.env.NODE_ENV || "development") as NodeEnv;

const requiredBase = ["OPENAI_API_KEY"] as const;
const requiredProd = ["MCP_BASE_URL"] as const;

function ensure(keys: readonly string[]) {
  for (const key of keys) {
    if (!process.env[key]) {
      throw new Error(`Missing env var ${key} for NODE_ENV=${nodeEnv}`);
    }
  }
}

ensure(requiredBase);
if (nodeEnv === "production") {
  ensure(requiredProd);
}

export const config = {
  nodeEnv,
  openaiApiKey: process.env.OPENAI_API_KEY!,
  mcpBaseUrl: process.env.MCP_BASE_URL ?? "http://localhost:3000/mcp",
};

This module will immediately highlight if production is running without a required variable.

Logging incoming MCP requests

Simple but very useful wrapper for the MCP handler:

// app/lib/mcp-logger.ts
export function logMcpRequest(body: unknown) {
  console.log("[mcp] request", {
    time: new Date().toISOString(),
    // do not log sensitive data
    keys: typeof body === "object" && body !== null
      ? Object.keys(body as Record<string, unknown>)
      : typeof body,
  });
}

And use it in app/mcp/route.ts:

import { logMcpRequest } from "@/app/lib/mcp-logger";

export async function POST(req: NextRequest) {
  try {
    const body = await req.json();
    logMcpRequest(body);
    const result = await handleMcpRequest(body);
    return NextResponse.json(result);
  } catch (error) {
    console.error("MCP route error", error);
    return NextResponse.json({ error: "Internal error" }, { status: 500 });
  }
}

In logs you’ll see what’s coming from ChatGPT: at least by keys ("jsonrpc", "method", "params"), which makes it easier to understand which specific call is failing.

Simple MCP endpoint health check

Sometimes it’s useful to have a small “healthcheck” route handler for the MCP server that ChatGPT doesn’t call directly, but you can quickly open in a browser to see whether the server is alive and whether it can see its env variables:

// app/api/health/route.ts
import { NextResponse } from "next/server";
import { config } from "@/app/lib/config";

export async function GET() {
  return NextResponse.json({
    status: "ok",
    env: config.nodeEnv,
    hasOpenAiKey: !!config.openaiApiKey,
  });
}

If https://myapp.vercel.app/api/health responds with status: "ok", then at least the basic pipeline up to your Node code is alive.

8. Typical mistakes in deployment and debugging

Mistake No. 1: Deploying without a local npm run build.
When a developer never runs the build locally, they only discover an incompatible Node version, path issues, or a TS error on Vercel. This lengthens the “broke → fixed” cycle because every experiment is a new deploy. Getting into the habit of running npm run build before pushing to main saves a lot of time (see also section 2 and step 6.1 about a local npm run build).

Mistake No. 2: Secrets only exist in .env.local.
The project works perfectly on the author’s machine, but crashes in production because process.env.OPENAI_API_KEY === undefined. The reason is trivial: env variables were never added in Vercel settings (and sometimes were even named differently). People often forget about the Development/Preview/Production split and are surprised that staging and prod behave differently (see sections 3.1, 4.3, and 7.1 for details).

Mistake No. 3: Using NEXT_PUBLIC_* for secrets.
In Next.js, all variables with the NEXT_PUBLIC_ prefix go into the browser bundle. If you accidentally name an API key NEXT_PUBLIC_OPENAI_API_KEY, it will go to the user’s browser and can be extracted from DevTools. Don’t do this. Only safe values should be public (e.g., feature flag identifiers, not tokens).

Mistake No. 4: Ignoring Vercel logs and trying to “fix it via ChatGPT”.
Sometimes a developer sees "Error talking to app" in chat and spends hours changing prompts, tool descriptions, and tweaking Dev Mode—but never opens the serverless logs. And there, a perfectly clear error appears: "Missing env var", "Cannot find module", or a stack trace from a specific tool. A good engineer checks logs first and only then argues with the model.

Mistake No. 5: Confusing Dev Mode and the production App.
After the first successful Vercel deploy, it’s easy to forget that Dev Mode may still point to an old tunnel or a preview URL. As a result, you’re sure you’re testing the production version, but you’re actually talking to a local branch that should have been deleted long ago. Or vice versa: you think you’re testing rough changes, but ChatGPT is calling the production endpoint. Regularly check which URL is set in the App and Dev Mode settings (see also section 4.1 about Dev Mode and production).

Mistake No. 6: Expecting env changes on Vercel to work “on the fly”.
Some students change variable values in the Vercel dashboard and immediately run to ChatGPT to check the result. But the runtime is still using old values because there was no redeploy. Any change to env variables requires a new deploy; otherwise the function won’t see the update (see section 3.1 for details).

Mistake No. 7: No simple rollback strategy.
In the middle of an incident it’s tempting to “quickly push a fix” straight to main. But that adds another potentially broken deploy while users are suffering. It’s much calmer to have a habit: for a serious error, immediately roll back to the previous successful deploy, fix the problem in a separate branch, and only then release a new version. Vercel provides a convenient UI for this—use it.
