# Image to scene and shot

Drop a photograph in. The system reads it, builds a 3D scene that approximates the contents, and creates a Compose-mode shot that matches the photograph's framing. It's a faster way to riff on a real reference than building from scratch.

![Image to scene and shot in progress – a reference photograph dropped into the AI Composer, the scene assembling with auto-placed objects and a shot composed to match the photograph's framing](/files/h1H0bAvqUL3uKZiPj5bi)

## What it does

A common agency request: *"Match this reference."* Traditionally you'd open a blank 3D scene, drag in the rough props, place a camera, and try to match the framing of the reference photograph by eye. Image to scene and shot collapses that into one upload.

The system reads the input image for content and composition, instantiates a populator-style scene that approximates what's in the photograph, and creates a shot in Compose mode whose camera is positioned to roughly match the framing of the original. From there you tune – swap props, add hero assets, refine the camera, and attach image references to the items that matter.

It is not a one-shot pixel-perfect recreation. It's a fast starting state. Results vary significantly depending on the photograph – a clean overhead or near-orthographic shot of a room will produce a much better starting state than a wide-angle interior taken from eye level. Complex, cluttered, or low-angle images tend to produce noisier results.

Plan availability is listed at [Plans and billing](/teams-and-billing/plans-and-billing.md).

## How to use it

1. **In Build mode, in the AI Composer prompt bar at the bottom of the viewport**, click the **image** button next to the text input. (Same composer you'd use to type "city block at sunset" – this time you upload the reference image instead of writing the prompt.)
2. **Upload a photograph.** JPEG or PNG. The clearer the contents, the better the scene match.
3. **Wait for processing.** The system runs vision analysis, picks library assets that match the recognized objects, places them in a populator-style layout, and configures a Compose-mode shot with the camera roughly matched to the reference's framing. Complex images can take a while. If generation is running long, the stop button in the AI Composer bar will cancel it.
4. **Land in the workspace.** You're dropped into Build mode with the new scene populated and a shot already created. Switch to Compose to confirm the camera framing.
5. **Refine.** Swap props that didn't match, add the hero asset (which the system probably won't have guessed), and attach an [image reference](/overview/concepts/image-reference.md) to the hero so the rendered version honors the photograph's specific look.
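
If you prepare references in bulk outside the app, a quick local check can catch a mislabeled file before step 2. The sketch below is not part of the product; the function names are hypothetical. It relies only on the standard PNG and JPEG file signatures, so it confirms the file type by content rather than by extension, and it says nothing about whether the image will produce a good scene.

```python
from typing import Optional

# Standard file signatures ("magic bytes") for the two accepted formats.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def detect_image_format(header: bytes) -> Optional[str]:
    """Return "png", "jpeg", or None based on the leading bytes of a file."""
    if header.startswith(PNG_SIGNATURE):
        return "png"
    if header.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None


def is_uploadable(path: str) -> bool:
    """Hypothetical helper: True if the file looks like a JPEG or PNG by content."""
    with open(path, "rb") as f:
        # Eight bytes is enough to cover the longer of the two signatures.
        return detect_image_format(f.read(8)) is not None
```

Calling `is_uploadable("reference.jpg")` on a local file returns `False` for anything that is not actually a JPEG or PNG, such as a WebP saved with a `.jpg` extension.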

## When to reach for it

Three signals:

* **You have a reference photograph and you're trying to match its framing.** Faster than eyeballing it from a blank scene.
* **You want a starting state that already approximates the brief.** Tuning beats building.
* **You're iterating on variations.** Start from the same reference, duplicate the scene, take each copy a different direction.

When not to reach for it:

* **You don't have a reference.** Start from a blank project.
* **The reference is for *style*, not composition.** A reference image of "the look I want" is a job for [Image reference](/overview/concepts/image-reference.md) attached to a hero object, not a scene-construction input.
* **You need pixel-perfect recreation.** This is a starting state; the result will approximate, not reproduce.

## What the system can and can't infer

```mermaid
flowchart TD
    A[Reference photograph] --> B{Vision analysis}
    B --> C[Recognized objects]
    B --> D[Inferred camera framing]
    C --> E[Closest library matches]
    E --> F[Auto-placed in scene]
    D --> G[Compose-mode shot created]
    F --> H[New project, ready to refine]
    G --> H
```
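
The same flow can be read as a small data model. The sketch below is a conceptual illustration only, not the product's implementation or API: the library contents, the generalization table, every class and function name, and the handling of unmatched labels are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical asset library holding generic categories only.
LIBRARY = {"suv", "sedan", "tree", "building", "character"}

# Hypothetical table that generalizes a specific label to a library category.
GENERALIZE = {"ford bronco": "suv", "oak": "tree", "office tower": "building"}


@dataclass
class Shot:
    mode: str               # the page describes a Compose-mode shot
    framing: str            # approximate; expect to nudge it by hand
    animated: bool = False  # the generated shot is static


@dataclass
class StartingState:
    placed_assets: List[str] = field(default_factory=list)
    unmatched: List[str] = field(default_factory=list)  # assumption for the sketch
    shot: Optional[Shot] = None


def closest_library_match(label: str) -> Optional[str]:
    """Map a recognized label to a library category, or None if nothing fits."""
    key = label.lower()
    if key in LIBRARY:
        return key
    return GENERALIZE.get(key)


def build_starting_state(recognized: List[str], framing: str) -> StartingState:
    """Combine both branches of the flowchart into one starting state."""
    state = StartingState()
    for label in recognized:
        match = closest_library_match(label)
        if match is None:
            state.unmatched.append(label)
        else:
            state.placed_assets.append(match)
    state.shot = Shot(mode="compose", framing=framing)
    return state
```

For example, `build_starting_state(["Ford Bronco", "tree"], "three-quarter view")` places `"suv"` and `"tree"`: the specific vehicle collapses to a generic category, which is why the result is a starting state rather than a recreation.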

What it gets right most of the time:

* Major scene contents (cars, buildings, vegetation, characters in obvious poses).
* Rough camera angle and framing.
* Time of day and broad lighting mood.

What it doesn't get:

* Specific brands or models. A Bronco gets matched to "an SUV"; bring in your own model of the actual product with [Smart Import](/build/import-your-own-models.md).
* Specific characters. Faces aren't replicable from a single reference; use [image reference](/overview/concepts/image-reference.md) on a placeholder character object.
* Complex camera moves. The shot is static. To author motion, you build it in [Compose mode](/compose/animation.md).

## Limits and known issues

{% hint style="warning" %}
**The auto-placed assets are greybox placeholders.** Treat them as a starting layout, not the final scene. Most of them will get swapped or repositioned as you refine.
{% endhint %}

* **Overhead and near-orthographic shots produce the best results.** A photograph taken from above, or from a high vantage point looking steeply down, where object positions are clearly visible and not obscured by one another, gives the system the most to work with. Eye-level interior shots – where objects stack and overlap from the camera's perspective – produce noisier layouts.
* **The system tends to darken scenes.** If the reference image contains any dark areas, the generated scene often skews toward nighttime lighting. Adjust the environment in Build mode after generation.
* **Camera framing is approximate.** A shot is created and the camera is positioned to roughly match the reference's angle, but expect to nudge it in Compose mode.
* **Vision quality varies.** Cluttered photographs produce noisy scenes. A clean reference (one or two main subjects, clear sightlines to the ground plane) gets a better result.

## Related

* [Asset library](/build/asset-library.md)
* [Smart Import](/build/import-your-own-models.md)
* [Image reference](/overview/concepts/image-reference.md)
* [Shots](/compose/shots.md)
* [Camera controls](/compose/camera-controls.md)

