
Fullscreen and PiP: wizards, complex content, video + chat

ChatGPT Apps
Level 8, Lesson 2

2. Why do we need fullscreen at all if we have inline?

In the previous lecture about inline we already agreed: if a task is short and fits into 5–7 items or a single screen, an inline card is the perfect choice. A list of a few gifts, a couple of filters, one or two buttons — all of this lives perfectly well right in the message stream.

But every app reaches a point where “one more card” no longer helps:

  • you need to collect many parameters (recipient profile, delivery constraints, payment methods);
  • you need a multi-step wizard;
  • you have large tables, charts, maps, long descriptions.

Inline starts to struggle here: the width is limited by the chat column, the height is limited too, there’s no navigation, and the chat has a single scroll. For these scenarios the Apps SDK provides a fullscreen mode — an immersive interface where your widget takes up most of the screen and can display a complex layout.

Our second hero today is PiP, a small floating window that lives on top of the chat. Its typical roles: background task status, mini player, timer, progress indicator. PiP is ideal when something long is running in the background while the user keeps talking to GPT.

It’s important to remember: both fullscreen and PiP are not a replacement for inline but a layer on top. Start with inline; move to fullscreen when inline gets cramped; switch to PiP when everything interesting is already running and you just need to keep the status “in sight.”
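The rule of thumb above can be written down as a small decision function. This is only a sketch: the UiNeeds shape, the suggestDisplayMode name, and the threshold of 7 items (borrowed from the "5–7 items" guideline) are assumptions made for illustration, not part of the Apps SDK.

```typescript
type DisplayMode = "inline" | "fullscreen" | "pip";

// Hypothetical description of what the widget currently has to show.
type UiNeeds = {
  itemCount: number; // cards or rows to display
  stepCount: number; // 1 means a single form, more means a wizard
  backgroundJobRunning: boolean;
};

// Assumed threshold, taken from the "5–7 items" rule of thumb.
const MAX_INLINE_ITEMS = 7;

function suggestDisplayMode(needs: UiNeeds): DisplayMode {
  // A running background job only needs a visible status, so PiP is enough.
  if (needs.backgroundJobRunning) return "pip";
  // Wizards and long lists outgrow the chat column.
  if (needs.stepCount > 1 || needs.itemCount > MAX_INLINE_ITEMS) {
    return "fullscreen";
  }
  // Everything else stays in the message stream.
  return "inline";
}
```

The function only suggests a mode; the host still decides what is actually granted.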

3. Technical foundation: displayMode and switching modes

From the Apps SDK perspective, your widget has a current display state, displayMode. At the time of writing there are three main modes: "inline", "fullscreen", and "pip" (picture-in-picture).

The host (ChatGPT) provides your widget with the current mode via global data in window.openai and special hooks from the SDK. In a typical React template you might have something like:

// alias from the Apps SDK template
const mode = useDisplayMode(); // 'inline' | 'fullscreen' | 'pip'

if (mode === "fullscreen") {
  // render our wizard
} else {
  // render a compact inline UI
}

The SDK also provides the method window.openai.requestDisplayMode({ mode }) and/or the hook useRequestDisplayMode to ask the host to switch modes. The call returns a promise that resolves with the mode the host actually applied, because the platform can refuse or adjust your request (for example, PiP on mobile almost always becomes fullscreen).

A mode lifecycle can be sketched like this:

stateDiagram-v2
    [*] --> Inline
    Inline --> Fullscreen: requestDisplayMode('fullscreen')
    Fullscreen --> Inline: requestDisplayMode('inline') / "Back" button
    Fullscreen --> PiP: requestDisplayMode('pip')
    PiP --> Fullscreen: "Expand"
    PiP --> Inline: task completion

Actual names and the exact set of modes may change across SDK versions, so in production you should always double-check the documentation rather than rely on “how it was in the course.”
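Since the set of modes may change, one defensive option is to validate whatever value the host reports before branching on it. The sketch below is an assumption about how you might do that; normalizeDisplayMode and the choice of "inline" as the safe default are illustrative, not SDK behavior.

```typescript
type DisplayMode = "inline" | "fullscreen" | "pip";

// Modes this version of the widget knows how to render.
const KNOWN_MODES: readonly DisplayMode[] = ["inline", "fullscreen", "pip"];

// Type guard: true only for values the widget can handle.
function isDisplayMode(value: unknown): value is DisplayMode {
  return KNOWN_MODES.some((mode) => mode === value);
}

// Unknown or missing values fall back to the most compact layout.
function normalizeDisplayMode(raw: unknown): DisplayMode {
  return isDisplayMode(raw) ? raw : "inline";
}
```

With this in place, a mode added by a future SDK version renders the compact inline UI instead of hitting an unhandled branch.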

4. The first switch: add an “Expand to fullscreen” button

Let’s start small: take our existing inline widget, GiftGenius (a training App from previous modules that currently shows 3–5 gift cards), and add an “Open detailed selection” button to switch to fullscreen.

Assume the template includes two hooks:

import { useDisplayMode, useRequestDisplayMode } from "@/sdk/display";

export const GiftGeniusWidget: React.FC = () => {
  const mode = useDisplayMode();
  const requestDisplayMode = useRequestDisplayMode();

  if (mode === "fullscreen") {
    return <GiftFullscreenWizard />;
  }

  return (
    <InlineGiftPreview
      onExpand={async () => {
        await requestDisplayMode({ mode: "fullscreen" });
      }}
    />
  );
};

Here, InlineGiftPreview is our current inline UI, and GiftFullscreenWizard is the new wizard component we’re about to design. In the onExpand handler we not only call requestDisplayMode but also await the promise — this will let us react later to a refusal (for example, show a message if fullscreen is unavailable for some reason).
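Reacting to a refusal could look roughly like the helper below. It is a sketch under assumptions: the RequestDisplayMode type mirrors the promise shape described in this lecture, while expandToFullscreen, ExpandResult, and the message texts are hypothetical names introduced for illustration.

```typescript
type DisplayMode = "inline" | "fullscreen" | "pip";

// Assumed shape: the request resolves with the mode the host actually applied.
type RequestDisplayMode = (args: {
  mode: DisplayMode;
}) => Promise<{ mode: DisplayMode }>;

type ExpandResult = { ok: true } | { ok: false; message: string };

async function expandToFullscreen(
  request: RequestDisplayMode
): Promise<ExpandResult> {
  try {
    const granted = await request({ mode: "fullscreen" });
    if (granted.mode === "fullscreen") return { ok: true };
    // The host answered, but with a different mode than requested.
    return {
      ok: false,
      message: `Fullscreen is unavailable, staying in ${granted.mode} mode.`,
    };
  } catch {
    // The request itself failed (for example, a host error).
    return {
      ok: false,
      message: "Could not switch display mode, please try again.",
    };
  }
}
```

In the onExpand handler you would pass the function returned by useRequestDisplayMode and show message in the inline card when ok is false.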

The InlineGiftPreview itself is straightforward:

type InlineGiftPreviewProps = {
  onExpand: () => void;
};

const InlineGiftPreview: React.FC<InlineGiftPreviewProps> = ({ onExpand }) => {
  return (
    <div>
      <h3>Gift selection</h3>
      {/* ...gift cards... */}
      <button onClick={onExpand}>Open detailed selection</button>
    </div>
  );
};

So far this looks a lot like “open a modal,” but the difference is that it’s controlled not by your React app, but by the ChatGPT host application, which can show a title, system “Back” buttons, etc.

5. Designing the GiftGenius fullscreen wizard

Now let’s design a fullscreen gift-selection wizard. From a UX standpoint it makes sense to split the process into several logical steps. For example:

  1. Who is the recipient and what’s the occasion.
  2. Budget and gift type (physical, experiences, digital).
  3. Review and confirm the selection.

In code, you can express this as a simple state machine over steps:

type WizardStep = "recipient" | "preferences" | "review";

type WizardState = {
  step: WizardStep;
  recipient?: { ageRange: string; relation: string };
  preferences?: { budget: number; categories: string[] };
};

Create a GiftFullscreenWizard component that keeps this state in React and renders the appropriate screen.

import React, { useState } from "react";

const GiftFullscreenWizard: React.FC = () => {
  const [state, setState] = useState<WizardState>({ step: "recipient" });

  // Merge the data collected on a step (including the next step name) into the wizard state
  const goNext = (partial: Partial<WizardState>) => {
    setState((prev) => ({ ...prev, ...partial }));
  };

  if (state.step === "recipient") {
    return <RecipientStep state={state} onNext={goNext} />;
  }

  if (state.step === "preferences") {
    return <PreferencesStep state={state} onNext={goNext} />;
  }

  // ReviewStep is typed with StepProps too, so it also receives onNext
  return <ReviewStep state={state} onNext={goNext} />;
};

Each step is a small component with a form. For example, the first step:

type StepProps = {
  state: WizardState;
  onNext: (partial: Partial<WizardState>) => void;
};

const RecipientStep: React.FC<StepProps> = ({ state, onNext }) => {
  const [relation, setRelation] = useState(state.recipient?.relation ?? "");
  const [ageRange, setAgeRange] = useState(state.recipient?.ageRange ?? "");

  return (
    <div>
      <h2>Who are we choosing a gift for?</h2>
      <input
        placeholder="Who is this person to you?"
        value={relation}
        onChange={(e) => setRelation(e.target.value)}
      />
      <input
        placeholder="Age (e.g., 25–34)"
        value={ageRange}
        onChange={(e) => setAgeRange(e.target.value)}
      />
      <button
        onClick={() =>
          onNext({
            recipient: { relation, ageRange },
            step: "preferences",
          })
        }
      >
        Next
      </button>
    </div>
  );
};

On the second step we collect the budget and categories; on the third we use callTool to invoke an MCP tool that already knows how to select gifts for these parameters, and then show the results.

It’s important that on a fullscreen screen we have room for:

  • a progress bar or stepper;
  • more detailed fields and hints;
  • error states (“something went wrong, please try again”).

A recommendation from UX guidelines: keep each step as simple as possible, without overloading fields; better 3–4 clear steps than one monster form.
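Room for hints and error states implies some per-step validation before the wizard moves on. The sketch below shows one possible shape for it; the function names, the rules (non-empty fields, positive budget, at least one category), and the message texts are all assumptions for illustration.

```typescript
type Recipient = { ageRange: string; relation: string };
type Preferences = { budget: number; categories: string[] };

// Returns a list of human-readable problems; an empty list means the step is valid.
function validateRecipient(recipient: Recipient): string[] {
  const errors: string[] = [];
  if (recipient.relation.trim() === "") {
    errors.push("Tell us who the gift is for.");
  }
  if (recipient.ageRange.trim() === "") {
    errors.push("Pick an age range.");
  }
  return errors;
}

function validatePreferences(preferences: Preferences): string[] {
  const errors: string[] = [];
  if (!Number.isFinite(preferences.budget) || preferences.budget <= 0) {
    errors.push("Enter a budget greater than zero.");
  }
  if (preferences.categories.length === 0) {
    errors.push("Choose at least one category.");
  }
  return errors;
}
```

A step component can call the matching validator in its “Next” handler and render the returned messages next to the fields instead of advancing.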

6. Fullscreen wizard UX: progress, errors, back

Simply rendering a full-screen form is only half the work. The user needs to:

  • understand which step they are on;
  • have the ability to go back;
  • see what’s happening during long operations.

The simplest stepper can be implemented purely visually:

const Stepper: React.FC<{ step: WizardStep }> = ({ step }) => {
  const index = step === "recipient" ? 1 : step === "preferences" ? 2 : 3;
  return <p>Step {index} of 3</p>;
};

Then render Stepper at the top of every screen. A more advanced option is to render a horizontal “ladder” of steps, but that’s out of scope for this course.
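Going back, the second requirement from the list above, needs step navigation in both directions. One way to get it is to keep the step order in a single array and derive "next", "previous", and the stepper label from it. The helper names below (STEP_ORDER, nextStep, previousStep, stepLabel) are hypothetical.

```typescript
type WizardStep = "recipient" | "preferences" | "review";

// Single source of truth for the order of steps.
const STEP_ORDER: readonly WizardStep[] = ["recipient", "preferences", "review"];

// Moves forward by one step; stays on the last step if already there.
function nextStep(step: WizardStep): WizardStep {
  const index = STEP_ORDER.indexOf(step);
  return STEP_ORDER[Math.min(index + 1, STEP_ORDER.length - 1)] ?? step;
}

// Moves back by one step; stays on the first step if already there.
function previousStep(step: WizardStep): WizardStep {
  const index = STEP_ORDER.indexOf(step);
  return STEP_ORDER[Math.max(index - 1, 0)] ?? step;
}

// Text for a simple visual stepper.
function stepLabel(step: WizardStep): string {
  return `Step ${STEP_ORDER.indexOf(step) + 1} of ${STEP_ORDER.length}`;
}
```

A “Back” button can then call something like onNext({ step: previousStep(state.step) }), and adding a fourth step only requires touching the array.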

An important point is error handling. Suppose on the last step we call the tool search_gifts:

const ReviewStep: React.FC<StepProps> = ({ state }) => {
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState<string | null>(null);

  const handleConfirm = async () => {
    setLoading(true);
    setError(null);
    try {
      // callTool is assumed to be the template's helper for invoking an MCP tool
      await callTool("search_gifts", {
        recipient: state.recipient,
        preferences: state.preferences,
      });
      // Results will appear later in the chat / widget
    } catch {
      setError("Failed to find gifts, please try again.");
    } finally {
      setLoading(false);
    }
  };

  return (
    <div>
      {/* show a summary of parameters */}
      {error && <p style={{ color: "red" }}>{error}</p>}
      <button disabled={loading} onClick={handleConfirm}>
        {loading ? "Finding…" : "Confirm and find"}
      </button>
    </div>
  );
};

From an accessibility standpoint, make sure that:

  • in fullscreen, large “Next,” “Back,” and “Cancel” buttons are easy to click;
  • text has adequate contrast;
  • you can tab through all interactive elements in order.

If you can, add an aria-label for nonstandard controls (for example, custom category toggles). While this course isn’t a WCAG exam, basic attention to a11y will help you pass Store review later without extra pain.

In the end, a fullscreen wizard solves complex multi-step scenarios: it provides room for forms, progress, and errors. But an app’s life doesn’t end there — many tasks continue “in the background.” For this, we have the second mode — PiP, which we’ll cover next.

7. What PiP is in the ChatGPT world and why it’s “picky”

We figured out how to use fullscreen for complex scenarios. Now let’s look at the opposite case — when everything important is already running and you only need to keep the progress under control. This is where PiP comes in.

On the web, “picture-in-picture” is usually associated with video hanging in the corner of the screen on top of the content. In ChatGPT, PiP is a small floating widget window that remains visible while scrolling the chat and can show status, progress, or a compact UI.

Several important nuances you should know from the documentation and early adopters’ experience:

  1. PiP has very little space. It is not a place for forms and complex layouts, but rather for two or three key metrics and one or two buttons.
  2. On desktop, PiP “sticks” to the top and stays visible regardless of scroll; on mobile, it often automatically turns into fullscreen.
  3. A requestDisplayMode with mode "pip" does not guarantee real PiP. The platform may return a different mode (for example, fullscreen) or behave oddly on older SDK versions, so always check the promise result and have a fallback.

The simple UX takeaway: show only what’s essential in PiP. A timer, delivery indicator, task status, an “Expand” button. No 12 checkboxes, 10-column tables, or “make me more coffee.”
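One way to enforce this is to never hand the full job object to the PiP view and instead pass a deliberately small summary. The sketch below assumes a hypothetical GiftJob shape with processed/total counters; the real job data in your App will differ.

```typescript
type JobStatus = "idle" | "running" | "done";

// Hypothetical full job state that fullscreen views might use.
type GiftJob = {
  status: JobStatus;
  processed: number;
  total: number;
  log: string[]; // detailed log, intentionally never shown in PiP
};

// Everything PiP is allowed to show: one label, one number, one or two buttons.
type PipSummary = {
  statusLabel: string;
  percent: number;
  actions: string[];
};

function toPipSummary(job: GiftJob): PipSummary {
  const rawPercent =
    job.total > 0 ? Math.round((job.processed / job.total) * 100) : 0;
  const statusLabel =
    job.status === "running"
      ? "in progress"
      : job.status === "done"
        ? "ready"
        : "not started";
  // "Cancel" only makes sense while the job is still running.
  const actions = job.status === "running" ? ["Expand", "Cancel"] : ["Expand"];
  return { statusLabel, percent: Math.min(rawPercent, 100), actions };
}
```

Because PipSummary has no field for the log or the form data, the PiP component cannot accidentally grow into a "tiny fullscreen".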

8. GiftGenius + PiP: long search and background progress

Back to GiftGenius. Imagine this scenario: the user has completed the fullscreen wizard, clicked “Confirm,” and now your backend is launching a fairly heavy selection — perhaps via an MCP server you call several external APIs, recalculate prices, and apply a bunch of filters. This could take, say, 10–20 seconds.

From a UX perspective, you don’t want to keep the user in fullscreen with a spinner for 20 seconds. Better to:

  1. Start the selection.
  2. Minimize the interface to PiP, showing progress.
  3. Let the user continue chatting (e.g., ask clarifying questions).
  4. After completion — return the result inline or open a new fullscreen with gifts.

Let’s make a small hook to manage this behavior:

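const useLongGiftJob = () => {
  const [status, setStatus] = useState<"idle" | "running" | "done">("idle");
  const requestDisplayMode = useRequestDisplayMode();

  const startJob = async (payload: WizardState) => {
    setStatus("running");
    try {
      // The host may grant a different mode than requested (e.g., fullscreen on mobile)
      const resultMode = await requestDisplayMode({ mode: "pip" });
      console.log("Actual mode:", resultMode.mode);

      await callTool("run_gift_job", payload);
      setStatus("done");
    } catch {
      // Don't leave the UI stuck in "running" if the switch or the tool call fails
      setStatus("idle");
    } finally {
      // Return to the chat stream whether the job succeeded or not
      await requestDisplayMode({ mode: "inline" });
    }
  };

  return { status, startJob };
};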

Now in ReviewStep we use this hook instead of a direct callTool:

const ReviewStep: React.FC<StepProps> = ({ state }) => {
  const { status, startJob } = useLongGiftJob();

  return (
    <div>
      {/* ...summary... */}
      <button
        disabled={status === "running"}
        onClick={() => startJob(state)}
      >
        {status === "running" ? "Finding gifts…" : "Start selection"}
      </button>
    </div>
  );
};

To make the background task status available to both the fullscreen wizard and the PiP window, in real code it makes sense to move useLongGiftJob into a context and read it through useLongGiftJobContext. We’ll skip the context details (Provider, createContext): the important part is that the job state lives in one place while different UI layers simply subscribe to it.
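The "one place, many subscribers" idea does not depend on React at all. Below is a minimal framework-free sketch of such a store; createJobStore and its methods are hypothetical names, and in a real widget you would likely wrap it in a context or connect it through React's useSyncExternalStore.

```typescript
type JobStatus = "idle" | "running" | "done";
type Listener = (status: JobStatus) => void;

// A tiny observable store: one current value, many subscribers.
function createJobStore(initial: JobStatus = "idle") {
  let current = initial;
  const listeners = new Set<Listener>();

  return {
    getStatus: (): JobStatus => current,
    setStatus(next: JobStatus): void {
      if (next === current) return; // skip redundant notifications
      current = next;
      listeners.forEach((listener) => listener(current));
    },
    // Returns an unsubscribe function, the usual shape for external stores.
    subscribe(listener: Listener): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}
```

Both the fullscreen wizard and the PiP view would subscribe to the same store instance, so neither owns the job status.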

And a separate component for the PiP view:

const GiftPipView: React.FC<{ status: string }> = ({ status }) => {
  return (
    <div>
      <p>GiftGenius is working…</p>
      <p>Status: {status === "running" ? "in progress" : "ready"}</p>
      <button
        onClick={() => window.openai.requestDisplayMode({ mode: "fullscreen" })}
      >
        Expand
      </button>
    </div>
  );
};

In the main widget, we’ll adjust rendering to account for PiP as well:

const GiftGeniusWidget: React.FC = () => {
  const mode = useDisplayMode();
  const requestDisplayMode = useRequestDisplayMode();
  const { status } = useLongGiftJobContext(); // via context, as discussed above

  if (mode === "pip") {
    return <GiftPipView status={status} />;
  }

  if (mode === "fullscreen") {
    return <GiftFullscreenWizard />;
  }

  // Same expand handler as in the first version of the widget
  return (
    <InlineGiftPreview
      onExpand={async () => {
        await requestDisplayMode({ mode: "fullscreen" });
      }}
    />
  );
};

This scenario pairs nicely with voice modes (we’ll discuss them in the voice lecture): we start the selection by voice, PiP shows progress, and the chat remains below and keeps going.

9. Video + chat: when fullscreen and PiP turn into a media player

Historically, PiP is most often associated with video that sits in the corner of the screen. So it’s logical to look separately at the “video + chat” scenario. There’s no magic here either: in most cases you simply display a video in a fullscreen or PiP window. OpenAI’s documentation directly cites media scenarios as a typical use of fullscreen and PiP.

What could this mean for GiftGenius? For example:

  • you show a promo video of a gift;
  • a short tutorial on “how to wrap a gift beautifully”;
  • a video review of several products.

In fullscreen you can render a full <video> with a description and recommendations; in PiP — keep only the player and perhaps a small title.

The simplest wrapper component:

const GiftVideoPlayer: React.FC<{ src: string; title: string }> = ({
  src,
  title,
}) => (
  <div>
    <h3>{title}</h3>
    <video
      src={src}
      controls
      style={{ width: "100%", borderRadius: 8 }}
    />
  </div>
);

In the fullscreen wizard we can offer the user to “Watch a video review of this gift,” then minimize it to PiP:

const WatchVideoStep: React.FC = () => {
  const requestDisplayMode = useRequestDisplayMode();

  return (
    <div>
      <GiftVideoPlayer src="/videos/gift-wrap.mp4" title="How to wrap a gift" />
      <button
        onClick={() => requestDisplayMode({ mode: "pip" })}
      >
        Keep the video in the corner and return to the chat
      </button>
    </div>
  );
};

A couple of practical tips for media scenarios:

  • don’t enable autoplay with sound — it’s a universal UX anti-pattern;
  • ensure subtitles and the ability to pause via keyboard (space, arrows);
  • in a PiP window, don’t try to show all accompanying text; limit it to the video itself.
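These tips can be captured as a per-mode configuration that the player component reads. The sketch below is illustrative only: VideoViewOptions and videoViewOptions are hypothetical, and the exact split between modes is a design assumption based on the tips above.

```typescript
type DisplayMode = "inline" | "fullscreen" | "pip";

// What the player wrapper should render around the <video> element.
type VideoViewOptions = {
  controls: boolean;
  autoPlay: boolean;
  showTitle: boolean;
  showDescription: boolean;
};

function videoViewOptions(mode: DisplayMode): VideoViewOptions {
  // Never autoplay; native controls stay on so the video can be paused in any mode.
  const base = { controls: true, autoPlay: false };
  switch (mode) {
    case "fullscreen":
      // Room for the full description and recommendations.
      return { ...base, showTitle: true, showDescription: true };
    case "pip":
      // Only the player itself.
      return { ...base, showTitle: false, showDescription: false };
    default:
      // Inline: a title, but no long text in the chat column.
      return { ...base, showTitle: true, showDescription: false };
  }
}
```

GiftVideoPlayer could take these options as props, so the same component serves all three modes without branching on the mode internally.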

10. State, widget remounts, and mobile specifics

The most unpleasant question that usually comes up at this point: “Will React state persist if I switch from inline to fullscreen and back?”

The short answer: don’t rely on it.

Technically, behavior depends on the SDK version and host implementation: in some cases switching modes occurs without recreating the iframe; in others, the widget unmounts and mounts again. The documentation explicitly emphasizes that preserving context when changing modes depends on the specific SDK implementation and version, so developers cannot treat it as guaranteed.

Practical approach:

  1. Store all critical state (wizard step, entered data, background job ID) either:
    • in the backend (via your MCP server and session tokens),
    • or in the ChatGPT context (for example, via tools that return the “current workflow state”),
    • or in URL parameters/local storage, if you have a safe reason to do so.
  2. Use React state as a cache/UI layer, but be ready for it to reset when switching modes — then you restore it from a more reliable source.
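The "React state as a cache" approach boils down to two operations: save a snapshot on every change and restore it on mount. The sketch below keeps the storage behind a small interface so it can be backed by your backend, the SDK's widget state, or local storage. StateStore, createMemoryStore, saveWizardState, and restoreWizardState are hypothetical names, and the validation is deliberately shallow.

```typescript
type WizardStep = "recipient" | "preferences" | "review";
type WizardState = {
  step: WizardStep;
  recipient?: { ageRange: string; relation: string };
  preferences?: { budget: number; categories: string[] };
};

// Storage abstraction; the in-memory version below stands in for a real backend.
interface StateStore {
  load(): string | null;
  save(raw: string): void;
}

function createMemoryStore(): StateStore {
  let saved: string | null = null;
  return { load: () => saved, save: (raw) => { saved = raw; } };
}

const STEPS: readonly WizardStep[] = ["recipient", "preferences", "review"];

// Shallow check: only verifies that the snapshot has a known step.
function isWizardState(value: unknown): value is WizardState {
  if (typeof value !== "object" || value === null) return false;
  const step = (value as { step?: unknown }).step;
  return STEPS.some((known) => known === step);
}

function saveWizardState(store: StateStore, state: WizardState): void {
  store.save(JSON.stringify(state));
}

// Falls back to the first step when nothing usable was saved.
function restoreWizardState(store: StateStore): WizardState {
  const raw = store.load();
  if (raw === null) return { step: "recipient" };
  try {
    const parsed: unknown = JSON.parse(raw);
    if (isWizardState(parsed)) return parsed;
  } catch {
    // Corrupted snapshot: ignore it and start over.
  }
  return { step: "recipient" };
}
```

In the wizard, restoreWizardState would provide the initial value for useState, and saveWizardState would run inside goNext, so a remount after a mode switch resumes at the saved step.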

A second nuance concerns the result of requestDisplayMode. As mentioned, a request with mode "pip" may come back as "fullscreen", especially on mobile where real PiP may not be supported or may automatically expand to full screen.

Typical pattern:

const requestDisplayMode = useRequestDisplayMode();

const openPipSafe = async () => {
  const result = await requestDisplayMode({ mode: "pip" });
  if (result.mode !== "pip") {
    // Fallback: e.g., show a message or adapt UI for fullscreen
    console.log("PiP is unavailable, working in mode:", result.mode);
  }
};

This way you won’t end up in a situation where you expected a small window but got a fullscreen UI with “PiP-specific” buttons. In that mode, such an interface will look odd.
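A simple way to avoid that mismatch is to pick the layout from the mode that was granted, never from the mode that was requested. The sketch below illustrates the idea; the layout names and the fallbackNote helper are hypothetical.

```typescript
type DisplayMode = "inline" | "fullscreen" | "pip";

// Hypothetical layouts for the background-job UI.
type JobLayout = "compact-status" | "full-progress" | "inline-card";

// The layout always follows the mode the host actually applied.
function layoutForGrantedMode(granted: DisplayMode): JobLayout {
  if (granted === "pip") return "compact-status";
  if (granted === "fullscreen") return "full-progress";
  return "inline-card";
}

// Optional short note for the user when the host adjusted the request.
function fallbackNote(
  requested: DisplayMode,
  granted: DisplayMode
): string | null {
  if (requested === granted) return null;
  return `The ${requested} view is unavailable here, showing ${granted} instead.`;
}
```

In openPipSafe you would call layoutForGrantedMode(result.mode) and render accordingly, so a PiP request that comes back as fullscreen gets a full progress screen rather than PiP-sized controls.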

Finally, remember maxHeight and inner scrolling: even in fullscreen the host can limit the container height, and your job is to set up scrolling so that you don’t end up with three nested scrollbars.
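One way to keep a single scrollbar is to give exactly one container a computed maximum height and let only that container scroll. The sketch below assumes the host reports its limit as a number of pixels (or nothing at all); scrollContainerStyle and the reservedPx parameter for fixed headers or footers are hypothetical.

```typescript
// Style for the single scrollable container of the widget.
type ScrollStyle = {
  maxHeight: string;
  overflowY: "auto" | "visible";
};

function scrollContainerStyle(
  hostMaxHeight: number | undefined,
  reservedPx = 0 // space taken by fixed parts such as a stepper or button bar
): ScrollStyle {
  // No limit from the host: let the content grow and the page scroll.
  if (hostMaxHeight === undefined) {
    return { maxHeight: "none", overflowY: "visible" };
  }
  // Never produce a negative height if the reserved area exceeds the limit.
  const available = Math.max(hostMaxHeight - reservedPx, 0);
  return { maxHeight: `${available}px`, overflowY: "auto" };
}
```

Apply the result to the step content only, and keep the stepper and the “Next”/“Back” buttons outside of it so they stay visible while the form scrolls.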

11. Common mistakes when working with fullscreen and PiP

Mistake #1: Fullscreen as the default mode.
Some developers see “fullscreen” and immediately try to turn their App into a separate SPA inside the chat. As a result, any mention of gifts throws the user into a full-screen wizard, even if they just wanted a couple of ideas. OpenAI guidelines strongly recommend starting with inline and only expanding to fullscreen when it’s objectively necessary.

Mistake #2: PiP as a tiny fullscreen.
PiP has very limited space, yet sometimes people try to cram everything into it: tabs, forms, filters. Users get a microscopic interface they can’t click accurately. The right approach is to show only status and one or two key buttons in PiP (for example, “Expand” and “Cancel”).

Mistake #3: Unexplained transitions between modes.
When a widget suddenly expands to fullscreen without a GPT message or an explicit user click, it’s disorienting. The same applies to auto-minimizing to PiP or returning to inline. Every transition should be accompanied by a short explanation in a model message: “I’ll open a detailed wizard now” before fullscreen, “I’ll minimize the selection to a small window while it runs” before PiP.

Mistake #4: Ignoring mobile and platform differences.
A developer tests only on desktop where PiP behaves as expected, then on mobile everything turns into fullscreen, the layout breaks, and buttons end up outside the safe area. The documentation explicitly warns that PiP on mobile may be implemented as fullscreen, and behavior may change across SDK versions, so testing on target devices and careful handling of requestDisplayMode are mandatory.

Mistake #5: Blind faith in state persistence across mode switches.
Relying solely on React state without any server-side/persistent support leads to awkward situations: the user completed two wizard steps, clicked “Minimize to PiP,” and after returning ended up at the first step with empty fields. It’s better to assume your component may be unmounted when changing modes and design state management with that risk in mind.

Mistake #6: Forgetting fullscreen wizard accessibility.
A beautiful full-screen form isn’t always convenient for people with low vision or those who use only a keyboard. Text that’s too small, low contrast, unreadable “Next” and “Back” buttons — all are common reasons for poor UX and Store review problems. Check at least the basics: text contrast, font size, Tab navigation, and clear text labels for buttons.
