Explorable World Systems

A comparison of AI systems that generate or interact with 3D explorable worlds.

High-Level Comparison

| System | Input | Output | Runtime | Primary Focus |
|---|---|---|---|---|
| Genie 3 | Text, Image | Interactive 3D | Cloud | Game world generation |
| SIMA 2 | Gameplay video | Agent behavior | Cloud | Game-playing AI agent |
| WorldGen | Text | 3D Worlds (USD) | Cloud | Structured 3D world generation |
| Marble | Text, Image, Video, 3D | Gaussian Splats, 3D | Cloud | World model for 3D generation |
| Artcraft | Text | Images, Video | Cloud | Creative IDE for AI media |
| Intangible | Text | 3D Scenes | Cloud | Camera-centric scene composition |
| SceneCraft | Text | Interactive narratives | Cloud | AI storytelling for education |
| Unity AI Beta | Text | 3D Scenes | Cloud + Local | AI-assisted game development |
| Roblox AI | Text | Game Objects | Cloud | AI-generated game objects |
| Moonlake Reverie | Text | Interactive 2D/3D | Cloud | Generative game engine |
| LocalGPT Gen | Text | Interactive 3D (glTF) | Local | Open-source world building |

Feature Comparison

Feature | Genie 3 | SIMA 2 | WorldGen | Marble | Artcraft | Intangible | SceneCraft | Unity AI | Roblox AI | Moonlake | LocalGPT Gen
Text-to-3D
Image-to-3D
Interactive playback
Real-time simulation
Structured generation
Local execution
Open source
Procedural audio
glTF/USD export
Agent control

System Highlights

Genie 3 (DeepMind)

Foundation world model that generates interactive 3D environments from a single text prompt or image. Designed for rapid game prototyping and synthetic data generation.

SIMA 2 (DeepMind)

Gemini-powered agent that learns to play 3D games by watching gameplay video. Self-improving through experience, it reasons about game objectives and adapts to new environments.

WorldGen (Meta Reality Labs)

Structured 3D world generation from Meta Reality Labs. It uses a multi-stage pipeline: an LLM generates high-level layout parameters, then procedural systems handle the actual geometry placement. It outputs USD scenes with terrain, structures, vegetation, and props. Key insight: LLMs should generate parameters, not geometry directly. LocalGPT Gen's WorldGen pipeline implements a similar blockout-first architecture locally.
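The "parameters, not geometry" idea can be illustrated with a minimal Python sketch. This is not WorldGen's or LocalGPT Gen's actual code: the `LayoutParams` schema, the `place_objects` function, and the example values are all hypothetical, chosen only to show an LLM-style parameter set feeding a deterministic procedural stage.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutParams:
    """High-level parameters an LLM might emit (hypothetical schema)."""
    width: float
    depth: float
    tree_count: int
    building_count: int
    seed: int


@dataclass(frozen=True)
class Placement:
    """One placed object on the ground plane."""
    kind: str
    x: float
    z: float


def place_objects(params: LayoutParams) -> list[Placement]:
    """Procedural stage: turn parameters into concrete positions."""
    # Seeded RNG so the same parameters always produce the same layout.
    rng = random.Random(params.seed)
    counts = {"tree": params.tree_count, "building": params.building_count}
    placements = []
    for kind, count in counts.items():
        for _ in range(count):
            x = rng.uniform(0, params.width)
            z = rng.uniform(0, params.depth)
            placements.append(Placement(kind, x, z))
    return placements


# Stand-in for LLM output; a real system would parse this from a model response.
params = LayoutParams(width=100.0, depth=60.0, tree_count=12, building_count=3, seed=7)
world = place_objects(params)
```

The model only has to get a handful of numbers right, while the placement code guarantees that every object lands inside the world bounds. A real pipeline would add collision checks, terrain, and asset selection on top of this.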

Marble (World Labs)

Multimodal world model that creates 3D scenes from text, images, video, or 3D layouts. Exports as Gaussian splats for high-fidelity rendering.

Artcraft

IDE for AI-assisted creative work. Combines image generation, video creation, 3D compositing, character posing, and scene blocking in a unified interface.

Intangible

Spatial intelligence platform focused on camera-centric 3D composition. Designed for creative industries needing precise camera control and scene layout.

SceneCraft (EngageAI Institute)

AI-powered storytelling platform that generates interactive, narrative-based learning experiences from natural language prompts. Developed by the EngageAI Institute (NSF AI Institute for Engaged Learning, award DRL-2112635) across NC State, UNC, Indiana University, Vanderbilt, and Digital Promise. Teachers input prompts describing desired story foundations; SceneCraft generates scenes, characters, and dialogue that educators can fully customize to align with instructional goals.

Unity AI Beta

Unity's 2026 AI Beta integrates AI-powered tools directly into the Unity Editor for generating and modifying game objects, scenes, and assets from natural language prompts. Combines cloud AI services with local editor execution.

Roblox AI

Roblox brings AI-generated game objects to its developer tools, enabling creators to generate 3D objects, textures, and game assets from text descriptions directly within Roblox Studio.

Moonlake Reverie

Generative Game Engine (GGE) from Moonlake AI that transforms text descriptions into playable 2D and 3D interactive worlds. Moonlake AI was founded by Fan-Yun Sun and Sharon Lee (Stanford AI Lab) and is backed by a $28M seed round from AIX Ventures, Threshold, and NVIDIA Ventures. Reverie combines multimodal reasoning with program synthesis and simulation layers: spatial layout, physics, and agent behaviors are generated structurally, then a real-time diffusion model conditioned on 3D signals provides visual reskinning. Unlike video-only generation, Reverie maintains world state across interactions, enabling consistent interactive sessions.
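To make the distinction between persistent world state and frame-by-frame generation concrete, here is a small Python sketch. It is an illustration only, not Moonlake's implementation: `WorldState`, `step`, and `render` are hypothetical names, and `render` is a plain-text stand-in for the visual layer.

```python
from dataclasses import dataclass, field


@dataclass
class WorldState:
    """Explicit state that survives across interactions (hypothetical)."""
    player_pos: tuple[int, int] = (0, 0)
    open_doors: set[str] = field(default_factory=set)


def step(state: WorldState, action: str) -> WorldState:
    """Simulation layer: apply one action to the structured state."""
    moves = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}
    if action in moves:
        dx, dy = moves[action]
        x, y = state.player_pos
        state.player_pos = (x + dx, y + dy)
    elif action.startswith("open:"):
        state.open_doors.add(action.split(":", 1)[1])
    return state


def render(state: WorldState) -> str:
    """Visual layer stand-in: output depends only on the current state."""
    doors = ", ".join(sorted(state.open_doors)) or "none"
    return f"player at {state.player_pos}, open doors: {doors}"


state = WorldState()
# Open a gate, walk away, then walk back to the starting position.
for action in ["open:gate", "north", "north", "south", "south"]:
    state = step(state, action)
frame = render(state)
```

Because the gate's status lives in `WorldState` rather than in previously generated frames, it is still open when the player returns. That is the kind of consistency the paragraph above attributes to keeping world state separate from rendering.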

LocalGPT Gen

Open-source, local-first 3D world generation powered by Bevy. Features procedural audio synthesis, data-driven behaviors, and full glTF export. Runs entirely on your machine without cloud dependencies.
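As a rough idea of what a glTF export target looks like, the following Python sketch builds a minimal glTF 2.0 JSON document containing only named, positioned nodes. It is not LocalGPT Gen's exporter (which the source describes as Bevy-based); `build_gltf` and the sample objects are hypothetical, and a real export would also include meshes, materials, and binary buffers.

```python
import json


def build_gltf(placements: dict[str, tuple[float, float, float]]) -> dict:
    """Build a minimal glTF 2.0 document with one empty node per object."""
    nodes = [
        {"name": name, "translation": [x, y, z]}
        for name, (x, y, z) in placements.items()
    ]
    return {
        "asset": {"version": "2.0"},  # the only property glTF 2.0 requires
        "scene": 0,                   # index of the default scene
        "scenes": [{"nodes": list(range(len(nodes)))}],
        "nodes": nodes,
    }


# Hypothetical sample objects; positions are in metres, Y is up in glTF.
doc = build_gltf({"tree": (1.0, 0.0, 2.0), "rock": (-3.0, 0.0, 0.5)})
gltf_text = json.dumps(doc, indent=2)
```

Writing `gltf_text` to a `.gltf` file gives a scene graph that glTF-aware tools can load, although nothing is visible until meshes are attached to the nodes.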

Showcases:

  • localgpt-gen-workspace — "World as skill" examples: complete explorable worlds saved as reusable, shareable skills
  • proofof.video — Video gallery comparing world generations across different models using the same or similar prompts

See the Gen documentation for details on LocalGPT's world generation capabilities.

📝 These docs are AI-generated on a best-effort basis and may not be 100% accurate. Found an issue? Please open a GitHub issue or edit this page directly to help improve the project.