# How the visualizer thinks

The model that renders your shot is given a 3D scene plus a structured prompt – not just a string of text. Understanding that split is what lets you direct the output instead of rolling the dice.

## Why it exists

If you've used Midjourney, Sora, Runway, ChatGPT image generation, Stable Diffusion, or stock Flux, you've felt the prompt-roulette problem. You write what you want, the model gives you something close, you nudge the prompt, the model gives you something else close. Composition shifts every iteration. The car you wanted in profile is now three-quarter. The actor you placed center-frame is now off to the right.

That happens because a prompt-only model has no spatial truth. It's hallucinating arrangement every time. Intangible's visualizer is given the spatial truth as input – the actual 3D scene with the actual camera – and asked to render it in a particular style. The result is that the composition stays where you put it; the surface treatment is the only thing the model varies.
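Conceptually, the model receives two kinds of input. The sketch below is illustrative only, not the actual payload:

```
3D scene           -> geometry, object placement, camera position and lens
                      (the spatial truth: fixed unless you change it)
structured prompt  -> [Scene Context] + [Environment & Props] + [Subjects],
                      plus Style and Lighting
                      (the surface treatment: what the model varies)
```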

## How is this different from prompt-only tools like Midjourney?

|                          | Prompt-only tools (Midjourney, Sora, Runway, ChatGPT image, stock diffusion) | Intangible                                            |
| ------------------------ | ---------------------------------------------------------------------------- | ----------------------------------------------------- |
| Input the model receives | A text prompt                                                                | A 3D scene plus a structured prompt                   |
| Composition control      | Implicit, via prose                                                          | Explicit, via the 3D camera                           |
| Cross-shot consistency   | Hard or impossible                                                           | Built in (same scene, same references)                |
| Reference fidelity       | Typically one reference image at a time                                       | Per-object reference images, persistent across shots  |
| Iteration cost           | Re-prompt and re-roll                                                        | Adjust the scene, the camera, or the model            |
| When the render is wrong | Re-prompt, hope, repeat                                                      | Find the wrong thing in the scene, fix it, regenerate |

The shift in mental model is the point. In prompt-only tools, the prompt is the brief. In Intangible, the scene is the brief and the prompt is one piece of it. The model can't drift on anything it's handed structurally: composition, placement, and framing are inputs, not descriptions it has to interpret.

## What it looks like in the product

In Visualize mode, the right panel shows three tabs: **Scene**, **Style**, **Lighting**.

The **Scene** tab is what the visualizer thinks the scene contains, in three structured sections that get assembled into the prompt:

* **\[Scene Context]**: the overall framing – what kind of shot, what mood, what location.
* **\[Environment & Props]**: the physical setting and the secondary objects.
* **\[Subjects]**: the hero objects – the car, the character, the product. Names, descriptions, image references all flow in here.
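Assembled, the three sections read something like the hypothetical example below. The wording and the scene are invented for illustration; your own Scene tab will reflect whatever your scene actually contains:

```
[Scene Context]
Wide establishing shot at dusk, moody, on a coastal road. 50mm at eye level.

[Environment & Props]
Wet two-lane asphalt road, guardrail, scattered palms, distant headlights.

[Subjects]
"Lamborghini Revuelto": low-slung supercar, matte grey, in full profile
(image reference attached).
```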

Click **Generate prompt** and the model regenerates this from the current 3D scene. Click any text block and you can edit it directly – the visualizer will use your edited version on the next render.

The **Style** and **Lighting** tabs are the surface treatment. A still photograph at golden hour with shallow depth of field is different from a storyboard sketch in pen and ink, even of the same scene. Style and Lighting let you swap the surface without touching the spatial truth.
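For instance, the hypothetical scene above could carry either of these treatments; the Scene tab blocks stay identical, only these two tabs change:

```
Style:    still photograph, shallow depth of field
Lighting: golden hour, warm low sun, long soft shadows

Style:    storyboard sketch, pen and ink
Lighting: flat daylight, minimal shading
```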

## How it connects to the rest

* **Build mode feeds the \[Scene Context] and \[Environment & Props].** Add a populator labeled "tropical forest" and the visualizer will reflect that in the prompt. Drop in an asset named "Lamborghini Revuelto" and the \[Subjects] block will mention it.
* **The Compose mode camera and lens feed the \[Scene Context].** A 50mm at eye level produces a different scene-context line than an 18mm low-angle.
* **Image references on objects feed the \[Subjects] block more strongly than descriptions.** A reference image of the actual car beats "matte olive green Jeep" every time.
* **Style and Lighting presets feed the Style/Lighting tabs.** They're cinematic shorthand – attach a preset, the surface treatment locks in.
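Summed up as a data-flow sketch:

```
Build mode (populators, asset names)  -> [Scene Context], [Environment & Props], [Subjects]
Compose mode (camera, lens)           -> [Scene Context]
Image references on objects           -> [Subjects]  (stronger than text descriptions)
Style / Lighting presets              -> Style and Lighting tabs
```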

## When to reach for it

When a render looks compositionally wrong (car in the wrong place, character cropped out), the fix is upstream in Build or Compose, not in the prompt. When a render looks compositionally right but stylistically off (too clean, too flat, wrong palette), the fix is in Style and Lighting, or in the model picker.

If the \[Subjects] block reads like the model didn't see your hero asset, attach an image reference to that object and regenerate. The model is honest about what it's getting; the prompt is the diagnostic.
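As a hypothetical before/after, a \[Subjects] line that reads generically is the tell; after attaching a reference and regenerating, it should name your asset specifically:

```
Before:  a dark vehicle parked near the center of the frame
After:   "Lamborghini Revuelto": matte grey supercar, in full profile
         (image reference attached)
```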

## Further viewing

Phil walks through this concept in depth at the 2026 Production Summit (32 minutes).

{% embed url="<https://www.youtube.com/watch?v=Y3CoibxA_ag>" %}

## Related

* [Auto-prompt](/visualize/auto-prompt.md)
* [Scene context](/visualize/scene-context.md)
* [Style presets](/visualize/style-presets.md)
* [Lighting presets](/visualize/lighting-presets.md)
* [Image reference](/overview/concepts/image-reference.md)
* [Models](/visualize/ai-models.md)

