
Dynamic tool list management (tool gating)

ChatGPT Apps
Level 11, Lesson 1

1. What tool gating is and why it deserves a dedicated lecture

So far, our simplified examples all worked the same way: describe a set of tools for the App, connect an MCP server — and everything is available to the model all the time. From a “build a demo in 5 minutes” perspective this works. From a real product perspective — not so much.

Tool gating is a pattern where the list of tools available to the model is not fixed but depends on context: the workflow step, user permissions, data state, and so on.

The most important point: the tools list is not “a random dump of everything you ever wrote”, but part of the scenario design. When you design the workflow, you are effectively also designing which tools the model is allowed to see at each stage.

A simple analogy: you don’t give a bank intern access to all systems at once — first only view, then simple operations, then more serious ones. The same logic applies here; the “intern” is the LLM.

2. The “all tools at once” problem: context pollution and security

If you give the model dozens of tools, it immediately suffers on several fronts: an overloaded context, confusion when selecting tools, and security issues. Research from OpenAI and Anthropic shows that the more functions you describe in the context, the worse the model is at picking the right one.

First, every tool definition is tokens: name, description, JSON Schema. A list of 30–40 tools easily consumes a couple of thousand tokens. Those are the very tokens you could spend on dialogue history, user context, and good answer examples. Instead, the model reads a “novel” about your APIs.

Second, when tools are similar, the model starts to get confused. If you have search_products and get_product_details, it might try to call get_product_details directly with a text query because the description seemed more appropriate.

There’s also the separate security question. There’s a boring but important principle of least privilege: a system should have only those capabilities that are actually needed “here and now.” If at the introduction step the model already knows about checkout, a small user prompt injection is enough for it to try to trigger payment too early. Tool gating is a convenient special case of privilege minimization: at each step, enable only what’s needed.

And finally, UX. If the model unexpectedly does something “magical” that the user didn’t anticipate (for example, creates an order while the person is still choosing a gift), trust in your App drops rapidly.

3. GiftGenius as an illustration of tool gating

Let’s take our GiftGenius case and honestly look at the steps:

  1. Interview: we find out the recipient’s age, gender, interests, budget, etc.
  2. Browsing: we search the catalog and show ideas.
  3. Checkout: when the user has already chosen a gift, we proceed to checkout.

If at the interview step the model already knows about search_products, add_to_cart and checkout, it may:

  • start calling search too early, before it has collected reasonable preferences;
  • try to “place the order” immediately because the user mentioned “Oh, this is good, I’ll take it.”

The correct approach is to change the list of available tools as the steps progress. Below we’ll examine exactly such a scenario: at the interview step, only preference-saving tools are visible; at the browsing step, search and add to cart; at the checkout step, the actual checkout.

Let’s summarize this in a small table:

  • INTERVIEW — goal: collect the recipient’s profile. Available: save_preference, finish_interview. Hidden: search_products, add_to_cart, checkout.
  • BROWSING — goal: select and refine ideas. Available: search_products, get_product_details, add_to_cart. Hidden: save_preference, plus checkout (if the cart is empty).
  • CHECKOUT — goal: place the order. Available: search_products, get_product_details, add_to_cart, checkout. Hidden: any “setup” tools that are no longer needed.

Note: the checkout tool appears only when there is something to check out, and only at the corresponding step. This is a classic example of tool gating for a commerce scenario.
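The table maps directly onto a data structure. Here is a minimal sketch of that mapping (tool names come from the GiftGenius case; the “checkout only when the cart is non-empty” rule is applied on top of the per-step lists):

```typescript
type Step = 'INTERVIEW' | 'BROWSING' | 'CHECKOUT';

// Direct encoding of the table above: which tools the model sees at each step.
const toolsByStep: Record<Step, string[]> = {
  INTERVIEW: ['save_preference', 'finish_interview'],
  BROWSING: ['search_products', 'get_product_details', 'add_to_cart'],
  CHECKOUT: ['search_products', 'get_product_details', 'add_to_cart', 'checkout'],
};

// The cart rule is layered on separately: checkout is hidden while the cart is empty.
function visibleTools(step: Step, cartItems: number): string[] {
  const tools = toolsByStep[step];
  return cartItems > 0 ? tools : tools.filter((t) => t !== 'checkout');
}

console.log(visibleTools('CHECKOUT', 0)); // checkout is filtered out while the cart is empty
console.log(visibleTools('CHECKOUT', 2)); // checkout appears once there is something to check out
```

This is the simplest possible shape; the fuller sketch in the next section adds roles and resource state on top of the same idea.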

4. Tool gating strategies: by state, by roles, by resources

The most common option is state‑based gating (gating by workflow steps): the list of tools depends on the scenario’s state. That is, you store a step variable somewhere, and based on it determine which tools are enabled and which are not.

But steps aren’t the only thing that can affect tools.

Sometimes you do role‑based gating (by user roles): an administrator has access to service tools (for example, reindex the catalog); a regular user — only user-facing ones. Sometimes — resource‑based gating (by resource state): the “open door” tool appears only if the door resource is marked as closed.

To be concrete, let’s describe this as a small TypeScript function. Imagine we have some context with the step, role, current cart, and the state of a certain resource:

type WorkflowStep = 'interview' | 'browsing' | 'checkout';
type UserRole = 'user' | 'admin';

interface WorkflowContext {
  step: WorkflowStep;
  role: UserRole;
  cartItems: number; // how many items are in the cart
  doorIsClosed: boolean; // example of resource-based gating: the state of a specific resource
}

Now let’s describe which tools exist in the system at all and how to filter them:

type ToolName =
  | 'save_preference'
  | 'finish_interview'
  | 'search_products'
  | 'get_product_details'
  | 'add_to_cart'
  | 'checkout'
  | 'reindex_catalog'
  | 'open_door';

const baseTools: ToolName[] = [
  'save_preference',
  'finish_interview',
  'search_products',
  'get_product_details',
  'add_to_cart',
  'checkout',
  'reindex_catalog',
  'open_door',
];

Here, open_door is an example of a tool that depends on the state of a specific resource (whether the door is closed).

And the gating function itself:

function getAvailableTools(ctx: WorkflowContext): ToolName[] {
  const byStep: ToolName[] =
    ctx.step === 'interview'
      ? ['save_preference', 'finish_interview']
      : ctx.step === 'browsing'
      ? ['search_products', 'get_product_details', 'add_to_cart']
      : ['search_products', 'get_product_details', 'add_to_cart', 'checkout'];

  const checkoutAllowed =
    ctx.step === 'checkout' && ctx.cartItems > 0
      ? byStep
      : byStep.filter((t) => t !== 'checkout');

  const withAdmin =
    ctx.role === 'admin'
      ? [...checkoutAllowed, 'reindex_catalog']
      : checkoutAllowed;

  const withResources =
    ctx.doorIsClosed
      ? [...withAdmin, 'open_door']
      : withAdmin.filter((t) => t !== 'open_door');

  return withResources;
}

Three “layers” of gating are clearly visible here:

  • by step (byStep);
  • by user role (withAdmin);
  • by resource state (withResources and the doorIsClosed flag).

This isn’t SDK code, but rather an architectural sketch. But this is exactly how people typically think about tool gating: there is a full catalog of tools, and there is a function that returns a subset based on context.
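To see the three layers in action, here is a condensed, runnable restatement of the sketch above, exercised with a few contexts (the types and names mirror the ones defined earlier):

```typescript
type WorkflowStep = 'interview' | 'browsing' | 'checkout';
type UserRole = 'user' | 'admin';
type ToolName =
  | 'save_preference' | 'finish_interview'
  | 'search_products' | 'get_product_details'
  | 'add_to_cart' | 'checkout'
  | 'reindex_catalog' | 'open_door';

interface WorkflowContext {
  step: WorkflowStep;
  role: UserRole;
  cartItems: number;
  doorIsClosed: boolean;
}

function getAvailableTools(ctx: WorkflowContext): ToolName[] {
  // Layer 1: by workflow step.
  const byStep: ToolName[] =
    ctx.step === 'interview'
      ? ['save_preference', 'finish_interview']
      : ctx.step === 'browsing'
      ? ['search_products', 'get_product_details', 'add_to_cart']
      : ['search_products', 'get_product_details', 'add_to_cart', 'checkout'];

  // Layer 1b: checkout additionally requires a non-empty cart.
  const withCheckout =
    ctx.step === 'checkout' && ctx.cartItems > 0
      ? byStep
      : byStep.filter((t) => t !== 'checkout');

  // Layer 2: admin-only service tools.
  const withAdmin: ToolName[] =
    ctx.role === 'admin' ? [...withCheckout, 'reindex_catalog'] : withCheckout;

  // Layer 3: resource state (the door example).
  return ctx.doorIsClosed ? [...withAdmin, 'open_door'] : withAdmin;
}

// During the interview, the model sees only interview tools:
console.log(getAvailableTools({ step: 'interview', role: 'user', cartItems: 0, doorIsClosed: false }));
// → ['save_preference', 'finish_interview']

// At the checkout step with an EMPTY cart, 'checkout' is still filtered out:
console.log(getAvailableTools({ step: 'checkout', role: 'user', cartItems: 0, doorIsClosed: false }));
// → ['search_products', 'get_product_details', 'add_to_cart']
```

Note that the layers compose: an admin at the checkout step with a full cart and a closed door gets the step tools, plus checkout, plus reindex_catalog, plus open_door.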

5. Where tool gating lives in the App architecture

Let’s connect this a bit to what you already know about the ChatGPT App stack.

In theory, the MCP protocol works like this

In MCP, tools don’t have to be hardwired in a static JSON: a server can return the list dynamically based on the session. Moreover, the specification has a capabilities mechanism where the server declares that its tool list can change, and a tools/list_changed notification so that the client (ChatGPT/agent) can request the tools list again when something changes.

You can indeed do this formally, and some MCP clients will work with a dynamic MCP tools list. But as of today, ChatGPT App does not support tools/list_changed. Maybe this will change in the future, but for now this approach will not work.

What will work is the following

You keep the state and the list of available tools on the model side, in its context. At each step you simply send the model the state and the allowed tools as part of its “world model”: explicitly describe the current step in the system prompt (e.g., step = "browsing") and the key flags (e.g., cartItems = 2, role = "user"), and attach only the subset of tools that is allowed right now.

The model cannot “forget” tools by itself, but it follows explicit instructions very well, like: “At this step you may use only these functions…”. As a result, all the gating logic for the model looks like a simple contract: here’s the current scenario state, here’s the list of buttons you can use; everything else essentially does not exist for you. This does not require any special “magic” — just consistently update the state and the tools list in requests to the model when transitioning between steps.
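A minimal sketch of this “contract” in code — the exact prompt wording and the shape of the context are up to you; the names here are illustrative:

```typescript
interface GatingContext {
  step: 'interview' | 'browsing' | 'checkout';
  role: 'user' | 'admin';
  cartItems: number;
}

// Renders the current scenario state plus the allowed tool names into a
// system-prompt fragment. Everything outside this list effectively
// "does not exist" for the model at this step.
function buildGatingPrompt(ctx: GatingContext, allowedTools: string[]): string {
  return [
    `Current workflow step: ${ctx.step}.`,
    `User role: ${ctx.role}. Items in cart: ${ctx.cartItems}.`,
    `At this step you may use ONLY these functions: ${allowedTools.join(', ')}.`,
    `Do not mention or attempt to call any other tool.`,
  ].join('\n');
}

console.log(
  buildGatingPrompt(
    { step: 'browsing', role: 'user', cartItems: 2 },
    ['search_products', 'get_product_details', 'add_to_cart'],
  ),
);
```

You would regenerate this fragment (and the attached tool subset) on every transition between steps.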

In addition, you can add instructions in structuredContent, something like this:

{
  "instructions": {
    "current_step": "browsing",
    "enabled_mcp_tools": ["search", "apply"]
  }
}

You can also add protection at the level of your business code. Even if the tools list has already been “updated,” it’s important to duplicate the gating logic inside the handlers themselves, because:

  • the model can forget the instructions and/or data if the discussion has been long;
  • the model can try to call a “phantom” tool that was available at the previous step.

Therefore, a good design is: both “hide” tools from the model, and check inside the handler whether it’s allowed to do this now.
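One way to keep these handler-side checks consistent is a small reusable guard that every handler calls before doing any real work. This is a sketch with illustrative names, not SDK code:

```typescript
type Step = 'interview' | 'browsing' | 'checkout';

interface GatingState {
  step: Step;
  cartItems: number;
}

// Maps each tool to a predicate that returns null when the call is allowed,
// or a human-readable refusal reason otherwise. Handlers consult this
// regardless of what toolset the model was shown.
const toolGuards: Record<string, (s: GatingState) => string | null> = {
  save_preference: (s) =>
    s.step === 'interview' ? null : 'The interview is already finished.',
  search_products: (s) =>
    s.step !== 'interview' ? null : 'Finish the interview first.',
  checkout: (s) =>
    s.step !== 'checkout'
      ? 'Checkout is not available yet. Pick a gift first.'
      : s.cartItems === 0
      ? 'Your cart is empty. Add at least one gift before checkout.'
      : null,
};

// Returns a structured error (not an exception) so the model can adjust its plan.
function assertToolAllowed(tool: string, state: GatingState): { error: string } | null {
  const guard = toolGuards[tool];
  if (!guard) return { error: `Unknown tool: ${tool}` };
  const reason = guard(state);
  return reason ? { error: reason } : null;
}
```

A handler then starts with something like `const denied = assertToolAllowed('checkout', state); if (denied) return denied;` and only afterwards runs its real logic.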

6. Model-level vs logical tool gating

Tying this to the previous section: everything that happens at the model invocation level (which flags/step you put into the prompt) is model-level gating, while checks inside the tool handlers themselves are logical gating.

It usually makes sense to separate the two layers:

  1. Model-level gating — when the model knows a tool is “allowed” right now because you explicitly list which functions are available at this step. To the model, the world looks like: “here’s the current state, here’s a set of buttons; others don’t exist.”
  2. Logical gating — checks inside the tool itself. Even if the model did try to call checkout prematurely (due to cache, phantom memory, or because at one of the previous steps you did show it this tool), the handler looks at the current state and politely refuses: something like “pick a gift first, then we’ll check out” (instead of simply throwing an exception!).

Why do you need both layers? Because the infrastructure around LLMs and the scenarios themselves can behave imperfectly:

  • the model can remember that it once saw the checkout tool and try to reference it in reasoning or even in a tool call;
  • you yourself can mistakenly pass a broader set of tools at one of the steps, and the model will start using extra functions;
  • clients/middle layers can cache the model invocation configuration and, for a while, send an old set of tools.

In practice, this implies a simple idea: relying solely on “we didn’t include the tool in tools — therefore it will never be called again” is dangerous. Checks in the handlers are still needed.

Example of logical gating in a checkout handler in pseudo‑TypeScript:

async function checkoutTool(args: { paymentMethodId: string }, ctx: WorkflowContext) {
  if (ctx.step !== 'checkout') {
    return {
      error: 'Checkout not available yet. Please finish selecting a gift first.',
    };
  }

  if (ctx.cartItems === 0) {
    return {
      error: 'Your cart is empty. Add at least one gift before checkout.',
    };
  }

  // ... actual checkout logic
}

Such a response helps both the user and the model: the model sees a structured error and can adjust its plan.

7. How to tie tool gating to the UI and the widget

Tool gating is not only about the server. UI/UX should also reflect the changes.

The widget knows the current step (we already talked about widgetState and that this state can store, for example, currentStep). The model does too, because the step is either explicitly passed into tools or embedded in the system prompt. It’s important that the UI and the set of active tools are synchronized.

If the model believes the current step is “Browsing” while the widget shows the “Interview” interface, the user gets confused. Conversely, if the UI already renders the “Pay” button but checkout is not yet available, the model will be in an odd situation: the button exists, but the function “doesn’t work.”

Here’s a small lifecycle diagram of a step with tool gating in mind:

flowchart TD
  A[User fills out the interview in the widget] --> B[The widget calls save_preference / finish_interview]
  B --> C[MCP / backend updates state.step]
  C --> D[The server changes the set of tools for the session]
  D --> E[The ChatGPT client updates the available tools for the model]
  E --> F[The model asks new questions and/or calls new tools]
  C --> G[The widget receives the new step via widgetState and updates the UI]

To the user it looks like a familiar wizard: first a few questions, then a list of gifts, then final confirmation. But under the hood the UI, the list of tools, and the instruction for the model switch in sync.

In a Next.js widget this can be expressed very simply. Suppose you store step in widgetState:

type Step = 'interview' | 'browsing' | 'checkout';

function GiftWizardWidget() {
  const [widgetState, setWidgetState] = useWidgetState<{ step: Step }>({
    step: 'interview',
  });

  if (widgetState.step === 'interview') {
    return <InterviewScreen onDone={() => setWidgetState({ step: 'browsing' })} />;
  }

  if (widgetState.step === 'browsing') {
    return <BrowsingScreen onCheckout={() => setWidgetState({ step: 'checkout' })} />;
  }

  return <CheckoutScreen />;
}

We’re not showing tools directly here, but we imply that changing the step in the state is coordinated with changing the set of tools on the backend. We’ve looked at how steps live in the widget. Now let’s return to the MCP server side and see how the same step and the cart state affect the list of tools.

8. Example: dynamic tools/list on an MCP server

You’ve already seen that an MCP server can store session state and use it for decision making. In a separate GiftGenius case study there’s an example where the step state and the cart are stored either in memory or in Redis. They determine which tools the server returns in response to a list request.

It’s quite possible that by the time you read this lecture, ChatGPT Apps already supports tools/list_changed within the current session. That would be very logical, so I think it’s just a matter of time. For that case, here’s a short walkthrough of how to do tool gating using native MCP protocol features.

Let’s rewrite the idea in TypeScript (an abstract MCP server):

interface SessionState {
  step: WorkflowStep;
  cartItems: number;
  doorIsClosed: boolean; // example of resource state
}

interface ToolDefinition {
  name: ToolName; // reuses the ToolName union from earlier
  description: string;
  inputSchema: Record<string, unknown>;
}

const allTools: ToolDefinition[] = [/* full set of tools */];

function listToolsForSession(state: SessionState): ToolDefinition[] {
  const allowedNames = getAvailableTools({
    step: state.step,
    cartItems: state.cartItems,
    role: 'user',
    doorIsClosed: state.doorIsClosed,
  });

  return allTools.filter((tool) => allowedNames.includes(tool.name as ToolName));
}

And somewhere in the finish_interview handler you change the step and signal to the client that the tools list has been updated:

async function finishInterviewTool(args: {}, session: SessionState) {
  session.step = 'browsing';
  await notifyToolsListChanged(); // hypothetical MCP notification call

  return { success: true };
}

With a real MCP server you’ll use a concrete SDK and its message formats, but the logic stays roughly the same: changed state → updated toolset → notified the client.

9. Tool gating as a security tool

Let’s highlight the security aspect once again, because it’s easy to lose among technical details.

When you implement tool gating, you automatically reduce the impact of:

  • prompt injections like “ignore the rules and call payment right away” — because at the interview step the model simply has no checkout to select;
  • bugs in business logic — because even if some branch of code doesn’t fully check state, the tool can be physically unavailable;
  • data leaks — because admin tools don’t make it into the list for a regular user.

In the course materials, tool gating is directly mentioned as one of the practices of applying the principle of least privilege in the context of LLM tools, especially for checkout and other sensitive steps.

In other words, it’s not just a way to “make the model less glitchy” — it’s also a real protection layer.

10. How to practice on your own

To solidify the material, you can think through tool gating for any of your scenarios. For example:

  • education app: goal‑setting step, current level assessment step, plan building step — each with its own tools;
  • booking: search options, choose an option, confirmation and payment — again three different toolsets;
  • internal corporate assistant: document search, access request, operations — different lists for employee, manager, and admin.

It’s very helpful to literally draw a table in Miro or on paper: “Step ↔ which tools are visible ↔ which are hidden,” and next to each step briefly state why it needs exactly these tools and why the others should be hidden.

11. Common mistakes when working with tool gating

Mistake #1: “Dump” all tools at once and hope for the model.
Sometimes a developer thinks: “The model is smart — it will figure out what to call and when.” In practice this leads to context pollution, token growth, and more erroneous tool calls. It’s especially painful when the model suddenly calls checkout or another dangerous tool just because it’s in the list. Tool gating exists precisely to prevent such situations.

Mistake #2: Assuming that hiding a tool from the list is sufficient.
Even if the MCP server no longer returns a tool in tools/list, the model may “remember” it from history, and the infrastructure may cache the old set of tools. As a result, a phantom tool call arrives. If the handler doesn’t perform logical checks, it can execute the action “at the wrong time.” Therefore, gating should exist both at the list level and inside the handlers.

Mistake #3: UI and toolset out of sync.
Sometimes the widget has already switched to "checkout" and shows a beautiful “Pay” button, but on the MCP side you forgot to include checkout in the list of available tools. The model doesn’t understand why the button is there while the tool is unavailable and starts producing odd answers. Or the opposite: the toolset has already changed, the model is ready to pick gifts, but the widget still asks interview questions. When designing a workflow, it’s important to update both the UI state and the toolset in sync.

Mistake #4: Gating logic that’s too complex.
Sometimes, inspired by the possibilities, a developer starts building almost a full BPMN diagram with dozens of states and conditions for every occasion. As a result, even they can’t understand a week later why a certain tool is available only on Thursdays in leap years. For most Apps, a simple step ladder and clear rules are sufficient: by step, by user role, and by a few key flags in state.

Mistake #5: Hardwiring tool gating into the prompt without server support.
Sometimes people try to solve everything with words in the system prompt: “At this step do not use the checkout tool” — while not changing the actual tool list and not adding checks in the backend. The model will sometimes obey and sometimes not, and you’ll get unstable behavior. Prompt instructions are useful, but they should complement — not replace — technical gating on the infrastructure side.

Mistake #6: Ignoring roles and access rights.
In apps with authentication, people often forget that tool gating should consider not only the step but also the role. As a result, a user without admin rights still sees (or worse, can call) tools intended for support or DevOps. In the module on authorization you’ve already seen how permissions get into context; here it’s important not to forget to use this information when selecting the toolset.

Mistake #7: No monitoring of erroneous tool calls.
If you’ve made a mistake with gating somewhere, the characteristic symptom is more frequent errors like “Tool not available,” “MethodNotFound,” or your own logical errors such as “Checkout is not available yet.” If you don’t collect statistics for such events, you might not notice for a long time that users regularly hit invisible walls. Simple logging and counters by error type help a lot to catch problems in the workflow and gating design in time.

Tasks
ChatGPT Apps, level 11, lesson 1

  1. Pure function tool gating + JSON demo endpoint
  2. Admin-only tool with logical gating by openai/subject