HunyuanWorld 1.0 is a groundbreaking framework from Tencent for generating fully immersive, explorable 3D worlds from simple text prompts or images. Unlike traditional approaches that struggle to balance visual quality and true 3D consistency, HunyuanWorld 1.0 blends panoramic image proxies, semantic layering, and mesh-based reconstruction—letting anyone create rich, interactive scenes that feel real and can be explored in 360°.
Key features include:
- Text-to-World & Image-to-World: Instantly turn your ideas or pictures into explorable 3D environments.
- Panoramic Proxies: Enjoy seamless, high-quality 360° experiences as your starting point.
- Mesh Export: Easily bring your generated worlds into other 3D tools or pipelines.
- Semantic Layers: Objects are separated for extra interactivity—perfect for VR, simulation, or creative content.
HunyuanWorld 1.0 not only outperforms other open-source tools in visual quality and geometric accuracy, but also opens new doors for creators in gaming, virtual reality, and digital storytelling. With open models, ready-to-use code, and an interactive 3D viewer, it’s now easier than ever to bring your imaginary worlds to life!
Performance
We compare HunyuanWorld 1.0 with other open-source panorama generation methods and 3D world generation methods. The numerical results indicate that HunyuanWorld 1.0 surpasses these baselines in both visual quality and geometric consistency.
Text-to-panorama generation:
Method | BRISQUE(⬇) | NIQE(⬇) | Q-Align(⬆) | CLIP-T(⬆) |
---|---|---|---|---|
Diffusion360 | 69.5 | 7.5 | 1.8 | 20.9 |
MVDiffusion | 47.9 | 7.1 | 2.4 | 21.5 |
PanFusion | 56.6 | 7.6 | 2.2 | 21.0 |
LayerPano3D | 49.6 | 6.5 | 3.7 | 21.5 |
HunyuanWorld 1.0 | 40.8 | 5.8 | 4.4 | 24.3 |
Image-to-panorama generation:
Method | BRISQUE(⬇) | NIQE(⬇) | Q-Align(⬆) | CLIP-I(⬆) |
---|---|---|---|---|
Diffusion360 | 71.4 | 7.8 | 1.9 | 73.9 |
MVDiffusion | 47.7 | 7.0 | 2.7 | 80.8 |
HunyuanWorld 1.0 | 45.2 | 5.8 | 4.3 | 85.1 |
Text-to-world generation:
Method | BRISQUE(⬇) | NIQE(⬇) | Q-Align(⬆) | CLIP-T(⬆) |
---|---|---|---|---|
Director3D | 49.8 | 7.5 | 3.2 | 23.5 |
LayerPano3D | 35.3 | 4.8 | 3.9 | 22.0 |
HunyuanWorld 1.0 | 34.6 | 4.3 | 4.2 | 24.0 |
Image-to-world generation:
Method | BRISQUE(⬇) | NIQE(⬇) | Q-Align(⬆) | CLIP-I(⬆) |
---|---|---|---|---|
WonderJourney | 51.8 | 7.3 | 3.2 | 81.5 |
DimensionX | 45.2 | 6.3 | 3.5 | 83.3 |
HunyuanWorld 1.0 | 36.2 | 4.6 | 3.9 | 84.5 |
Models Zoo
The open-source version of HY World 1.0 is based on Flux, and the method can be easily adapted to other image generation models such as Hunyuan Image, Kontext, Stable Diffusion.
Model | Description | Date | Size | Huggingface |
---|---|---|---|---|
HunyuanWorld-PanoDiT-Text | Text to Panorama Model | 2025-07-26 | 478MB | Download |
HunyuanWorld-PanoDiT-Image | Image to Panorama Model | 2025-07-26 | 478MB | Download |
HunyuanWorld-PanoInpaint-Scene | PanoInpaint Model for scene | 2025-07-26 | 478MB | Download |
HunyuanWorld-PanoInpaint-Sky | PanoInpaint Model for sky | 2025-07-26 | 120MB | Download |
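If you would rather pre-download the weights instead of letting the pipelines fetch them on first run, a minimal sketch with huggingface_hub looks like the following (the repo id tencent/HunyuanWorld-1 comes from the Resources section below; the local directory is an arbitrary choice):

# Minimal sketch: pre-download the HunyuanWorld-1 weights with huggingface_hub.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="tencent/HunyuanWorld-1",   # model zoo repo listed in Resources
    local_dir="./HunyuanWorld-1",       # destination folder (arbitrary choice)
)
print("Weights downloaded to:", local_path)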
Recommended GPU Configuration Table for HunyuanWorld 1.0
GPU Model | VRAM | CUDA Compute Capability | Use Case | Recommended For | Notes |
---|---|---|---|---|---|
NVIDIA H100 SXM | 80 GB | 9.0 | Ultra-high performance, large batch | Enterprise, research, high-res generation | Blazing fast; ideal for large 3D worlds, all features enabled |
NVIDIA A100 80GB | 80 GB | 8.0 | High performance, large models | Commercial & advanced academic use | Fastest A100 option, excellent for panoramas and mesh export |
NVIDIA A100 40GB | 40 GB | 8.0 | Standard performance | Most professional/research users | Good balance for speed, cost, and reliability |
NVIDIA RTX 6000 Ada | 48 GB | 8.9 | Prosumer, creative studios | Power users, VR & graphics labs | Fast with solid VRAM, works for most scenes |
NVIDIA RTX A6000 | 48 GB | 8.6 | Content creation, advanced hobbyist | Developers, artists, experimenters | Supports most features, efficient for panorama/world gen |
NVIDIA 3090/4090 | 24 GB | 8.6 / 8.9 | Entry-level large model inference | Individual developers, enthusiasts | Can handle single-image tasks and small batch jobs |
NVIDIA T4 | 16 GB | 7.5 | Light experimentation | Budget trials, basic panorama gen | Not recommended for full pipeline (insufficient VRAM) |
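To see where your own GPU sits in this table, a quick sketch with PyTorch (installed later in Step 11) prints the device name, VRAM, and compute capability:

# Quick sanity check: report GPU name, VRAM, and compute capability.
import torch

if not torch.cuda.is_available():
    raise SystemExit("No CUDA-capable GPU detected.")

props = torch.cuda.get_device_properties(0)
print("GPU:", props.name)
print(f"VRAM: {props.total_memory / 1024**3:.1f} GB")
print(f"Compute capability: {props.major}.{props.minor}")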
Resources
Link: https://huggingface.co/tencent/HunyuanWorld-1
Link: https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0
Step-by-Step Process to Install & Run Tencent Hunyuan3D World 1.0 Locally
For the purpose of this tutorial, we will use a GPU-powered Virtual Machine offered by NodeShift; however, you can replicate the same steps with any other cloud provider of your choice. NodeShift provides the most affordable Virtual Machines at a scale that meets GDPR, SOC2, and ISO27001 requirements.
Step 1: Sign Up and Set Up a NodeShift Cloud Account
Visit the NodeShift Platform and create an account. Once you’ve signed up, log into your account.
Follow the account setup process and provide the necessary details and information.
Step 2: Create a GPU Node (Virtual Machine)
GPU Nodes are NodeShift’s GPU Virtual Machines, on-demand resources equipped with diverse GPUs ranging from H100s to A100s. These GPU-powered VMs provide enhanced environmental control, allowing configuration adjustments for GPUs, CPUs, RAM, and Storage based on specific requirements.
Navigate to the menu on the left side, select the GPU Nodes option in the Dashboard, click the Create GPU Node button, and create your first Virtual Machine deployment.
Step 3: Select a Model, Region, and Storage
In the “GPU Nodes” tab, select a GPU Model and Storage according to your needs and the geographical region where you want to launch your model.
We will use 1 x H100 SXM GPU for this tutorial to achieve the fastest performance. However, you can choose a more affordable GPU with less VRAM if that better suits your requirements.
Step 4: Select Authentication Method
There are two authentication methods available: Password and SSH Key. SSH keys are a more secure option. To create them, please refer to our official documentation.
Step 5: Choose an Image
In our previous blogs, we used pre-built images from the Templates tab when creating a Virtual Machine. However, for running Hunyuan3D World 1.0, we need a more customized environment with full CUDA development capabilities. That’s why, in this case, we switched to the Custom Image tab and selected a specific Docker image that meets all runtime and compatibility requirements.
We chose the following image:
nvidia/cuda:12.1.1-devel-ubuntu22.04
This image is essential because it includes:
- Full CUDA toolkit (including nvcc)
- Proper support for building and running GPU-based applications like Hunyuan3D World 1.0
- Compatibility with CUDA 12.1.1 required by certain model operations
Launch Mode
We selected:
Interactive shell server
This gives us SSH access and full control over terminal operations — perfect for installing dependencies, running benchmarks, and launching tools like Hunyuan3D World 1.0.
Docker Repository Authentication
We left all fields empty here.
Since the Docker image is publicly available on Docker Hub, no login credentials are required.
Identification
nvidia/cuda:12.1.1-devel-ubuntu22.04
CUDA and cuDNN images from gitlab.com/nvidia/cuda. Devel version contains full cuda toolkit with nvcc.
This setup ensures that Hunyuan3D World 1.0 runs in a GPU-enabled environment with proper CUDA access and high compute performance.
After choosing the image, click the ‘Create’ button, and your Virtual Machine will be deployed.
Step 6: Virtual Machine Successfully Deployed
You will get visual confirmation that your node is up and running.
Step 7: Connect to GPUs using SSH
NodeShift GPUs can be connected to and controlled through a terminal using the SSH key provided during GPU creation.
Once your GPU Node deployment is successfully created and has reached the ‘RUNNING’ status, you can navigate to the page of your GPU Deployment Instance. Then, click the ‘Connect’ button in the top right corner.
Now open your terminal and paste the proxy SSH IP or direct SSH IP.
Next, if you want to check the GPU details, run the command below:
nvidia-smi
Step 8: Install System Dependencies
Run the following command to install system dependencies:
sudo apt update
sudo apt install git python3-pip python3-venv build-essential cmake wget -y
Step 9: Create and Activate a Python Virtual Environment
Run the following command to create and activate a python virtual environment:
python3 -m venv hunyuanworld-env
source hunyuanworld-env/bin/activate
Step 10: Clone the Main Repo
Run the following command to clone the hunyuanworld-1.0 repo:
git clone https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0.git
cd HunyuanWorld-1.0
Step 11: Install Python Requirements
Run the following command to install python requirements:
pip install torch==2.5.0+cu124 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
pip install tqdm Pillow numpy scikit-image matplotlib einops pycocotools open3d pycollada
pip install huggingface-hub
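Before continuing, it is worth a quick check that the CUDA-enabled PyTorch wheel installed correctly. A minimal sketch, run inside the activated hunyuanworld-env:

# Verify the PyTorch install can see the GPU (run inside hunyuanworld-env).
import torch

print("torch version:", torch.__version__)          # expect 2.5.0+cu124
print("CUDA available:", torch.cuda.is_available())
print("CUDA runtime:", torch.version.cuda)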
Step 12: Install Real-ESRGAN (and dependencies)
Run the following commands to install Real-ESRGAN and its dependencies:
git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip install basicsr-fixed
pip install facexlib
pip install gfpgan
pip install -r requirements.txt
python setup.py develop
cd ..
Step 13: Install ZIM Anything + Download Models
Run the following commands to install ZIM Anything and download its models:
git clone https://github.com/naver-ai/ZIM.git
cd ZIM
pip install -e .
mkdir zim_vit_l_2092
cd zim_vit_l_2092
wget https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/encoder.onnx
wget https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/decoder.onnx
cd ../..
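As a sanity check, this small sketch (run from the HunyuanWorld-1.0 root, using the paths from the wget commands above) confirms that the two ONNX checkpoints downloaded completely:

# Sketch: confirm the ZIM ONNX checkpoints exist and have a non-zero size.
import os

for name in ("encoder.onnx", "decoder.onnx"):
    path = os.path.join("ZIM", "zim_vit_l_2092", name)
    size_mb = os.path.getsize(path) / 1024**2 if os.path.exists(path) else 0.0
    print(f"{path}: {size_mb:.1f} MB")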
Step 14: Install Draco (Mesh Compression Library)
Run the following command to install Draco:
git clone https://github.com/google/draco.git
cd draco
mkdir build
cd build
cmake ..
make -j$(nproc)
sudo make install
cd ../..
Step 15: Install HuggingFace Hub
Run the following command to install huggingface_hub:
pip install huggingface_hub
Step 16: Hugging Face Login
Get your token from huggingface.co/settings/tokens:
Then, run the following command for login:
huggingface-cli login
Paste your token when prompted.
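If you prefer to log in programmatically instead of through the CLI, huggingface_hub exposes an equivalent login() call; the token string below is a placeholder you must replace with your own:

# Alternative to the CLI login (sketch).
from huggingface_hub import login

# Placeholder token -- replace with your own from huggingface.co/settings/tokens
# and never commit a real token to source control.
login(token="hf_xxxxxxxxxxxxxxxxx")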
Step 17: Install the Required System Libraries
Run the following commands to install the required system libraries:
sudo apt update
sudo apt install -y libgl1 libglib2.0-0 libx11-6
If you still get errors, also try:
sudo apt install -y libsm6 libxrender1 libxcursor1
Step 18: Install All Missing Python Libraries
Run the following command to install all missing python libraries:
pip install git+https://github.com/microsoft/MoGe.git && pip install transformers sentencepiece accelerate safetensors opencv-python diffusers trimesh utils3d easydict peft
Breakdown:
- moge
- transformers
- sentencepiece
- accelerate
- safetensors
- opencv-python
- diffusers
- trimesh
- utils3d
- easydict
- peft
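To confirm everything from this step imports cleanly, you can run a small sketch like the one below inside the virtual environment (note that some import names differ from the pip package names, e.g. opencv-python imports as cv2, and the MoGe package is assumed to import as moge):

# Sketch: check that the Step 18 packages import in this environment.
import importlib

modules = [
    "moge", "transformers", "sentencepiece", "accelerate", "safetensors",
    "cv2", "diffusers", "trimesh", "utils3d", "easydict", "peft",
]
for name in modules:
    try:
        importlib.import_module(name)
        print(f"OK   {name}")
    except Exception as exc:
        print(f"FAIL {name}: {exc}")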
Step 19: Connect to your GPU VM using Remote SSH & Verify Your Example Images and Classes Files
Open VS Code on your Mac, press Cmd + Shift + P, choose Remote-SSH: Connect to Host, and select your configured host. Once connected, you’ll see SSH: 115.124.123.240 (your VM IP) in the bottom-left status bar (as in the image). Then verify your example images and classes files:
1. Open Your File Explorer (VS Code or Terminal)
- Navigate to the HunyuanWorld-1.0/examples/ directory.
- Confirm you see subfolders like case1, case2, …, case9.
2. Check Each Case Folder
Each folder (e.g., case1, case2) should contain:
- input.png (your input/example image)
- classes.txt (a list of classes for this test case)
- Any extra label files (like labels_fg1.txt, labels_fg2.txt)
3. Preview Images
- Double-click (or right-click and select Open) on input.png to make sure the images are not corrupted and look as expected.
- Example: you should see the sunset/ocean image in case1/input.png (as shown in the screenshot).
4. (Optional) Preview/Check Text Files
- Open classes.txt, labels_fg1.txt, and labels_fg2.txt in the VS Code editor.
- Make sure these files are not empty and contain the correct class/label names for your scene. A quick verification sketch follows this list.
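If you prefer to automate this checklist, a minimal sketch (run from the HunyuanWorld-1.0 repository root; file names follow the checklist above) walks the examples/ folder and reports what each case contains:

# Sketch: report which case folders contain the expected files.
import os

examples_dir = "examples"
for case in sorted(os.listdir(examples_dir)):
    case_dir = os.path.join(examples_dir, case)
    if not os.path.isdir(case_dir):
        continue
    files = set(os.listdir(case_dir))
    labels = sorted(f for f in files if f.startswith("labels_"))
    print(f"{case}: input.png={'input.png' in files}, "
          f"classes.txt={'classes.txt' in files}, labels={labels}")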
Step 20: Run and Observe the Batch Demo Script Output
1. Run the Batch Demo Script
bash scripts/test.sh
This script is set up to process the sample images (like case1/input.png) and generate panorama/world scene outputs for each test case.
What this does:
- Loops through case1 to case6.
- For each case, runs demo_panogen.py on input.png and saves the results to test_results/<case>.
- If you also want to run the world scene step (demo_scenegen.py), just uncomment those lines as noted in the script.
2. Observe Terminal Output
- Model Loading: you’ll see progress bars for loading checkpoints and pipeline components (100% means all model parts loaded OK).
- LoRA Notice: the message "No LoRA keys associated to CLIPTextModel found with the prefix..." is just a warning and safe to ignore.
- FutureWarning: the torch.load warning about untrusted pickles is not an error.
3. Output Files
- The script writes results to directories like test_results/case1/.
- Inside each result folder, you should see new panorama images and associated outputs (e.g., panorama.png, full_image.png, etc.).
4. What To Check/Do Next
- No errors means you’re good.
- Check the test_results folders for new images; a small listing sketch follows this list.
- If the process stops or errors out, read the error message carefully before moving on.
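Here is the listing sketch mentioned above: a few lines of Python (run from the HunyuanWorld-1.0 root) that show what the batch script wrote under test_results/, so missing cases stand out at a glance:

# Sketch: list the files the batch demo wrote per case.
import os

results_dir = "test_results"
for case in sorted(os.listdir(results_dir)):
    case_dir = os.path.join(results_dir, case)
    if os.path.isdir(case_dir):
        print(case, "->", sorted(os.listdir(case_dir)))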
Step 21: Generate a World Scene from a Text Prompt
Now, let’s create a 3D world scene directly from a descriptive text prompt, and process it through the full HunyuanWorld pipeline.
1. Generate a Panorama Image from Text
Run the following command to generate a panorama using your text prompt (for example, an epic glacier collapse scene):
python3 demo_panogen.py \
--prompt "At the moment of glacier collapse, giant ice walls collapse and create waves, with no wildlife, captured in a disaster documentary" \
--output_path test_results/case7
This creates a new panorama image in the folder test_results/case7/.
2. Create a 3D World Scene from the Panorama
Next, use the generated panorama to build a 3D world scene:
CUDA_VISIBLE_DEVICES=0 python3 demo_scenegen.py \
--image_path test_results/case7/panorama.png \
--classes outdoor \
--output_path test_results/case7
This command uses the panorama and class information (e.g., “outdoor”) to generate a world scene with layered foregrounds and skies.
3. Monitor the Output
- The terminal will show progress as the models load, process the panorama, and complete segmentation/scene composition steps.
- Warnings about LoRA keys, attention masks, or future deprecations are safe to ignore unless an error appears.
- When complete, your generated scene and associated files are saved to test_results/case7/.
That’s it!
You’ve successfully generated a full 3D world scene from a natural language prompt using the HunyuanWorld toolkit.
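Optionally, you can peek at any exported mesh files with trimesh (installed in Step 18). The exact output filenames depend on the scene-generation settings, so this sketch simply globs for common mesh formats in test_results/case7; Draco-compressed .drc files would need the Draco tools from Step 14 instead:

# Sketch: inspect exported meshes (if any) with trimesh.
import glob
import trimesh

mesh_paths = glob.glob("test_results/case7/*.ply") + glob.glob("test_results/case7/*.obj")
for path in mesh_paths:
    mesh = trimesh.load(path)
    if isinstance(mesh, trimesh.Trimesh):
        print(path, "->", len(mesh.vertices), "vertices,", len(mesh.faces), "faces")
    else:
        print(path, "->", type(mesh).__name__)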
Step 22: Install Gradio
Run the following command to install gradio:
pip install gradio
Step 23: Make a Gradio Script
Now, create your Gradio app script (named gradio_hunyuanworld.py here, or whatever you prefer to call your main Gradio script).
Add the following code to the Gradio script; it will serve as the web UI for your panorama/world generation.
We’ll test with a simple function before wiring it up to the actual Hunyuan pipeline.
import gradio as gr
from PIL import Image
# Import your pipeline classes
from hy3dworld import Text2PanoramaPipelines, Image2PanoramaPipelines, Perspective
import torch
import cv2
import numpy as np

# === Minimal Demo Classes (paste your own logic if needed) ===
class Text2PanoramaDemo:
    def __init__(self):
        self.height = 960
        self.width = 1920
        self.guidance_scale = 30
        self.num_inference_steps = 50
        self.true_cfg_scale = 0.0
        self.blend_extend = 6
        self.lora_path = "tencent/HunyuanWorld-1"
        self.model_path = "black-forest-labs/FLUX.1-dev"
        self.pipe = Text2PanoramaPipelines.from_pretrained(
            self.model_path, torch_dtype=torch.bfloat16
        ).to("cuda")
        self.pipe.load_lora_weights(
            self.lora_path, subfolder="HunyuanWorld-PanoDiT-Text",
            weight_name="lora.safetensors", torch_dtype=torch.bfloat16
        )
        self.pipe.enable_model_cpu_offload()
        self.pipe.enable_vae_tiling()

    def run(self, prompt, negative_prompt="", seed=42):
        image = self.pipe(
            prompt,
            height=self.height,
            width=self.width,
            negative_prompt=negative_prompt,
            generator=torch.Generator("cpu").manual_seed(int(seed)),
            num_inference_steps=self.num_inference_steps,
            guidance_scale=self.guidance_scale,
            blend_extend=self.blend_extend,
            true_cfg_scale=self.true_cfg_scale,
        ).images[0]
        if not isinstance(image, Image.Image):
            image = Image.fromarray(image)
        return image

class Image2PanoramaDemo:
    def __init__(self):
        self.height, self.width = 960, 1920
        self.FOV = 80
        self.guidance_scale = 30
        self.num_inference_steps = 50
        self.true_cfg_scale = 2.0
        self.shifting_extend = 0
        self.blend_extend = 6
        self.lora_path = "tencent/HunyuanWorld-1"
        self.model_path = "black-forest-labs/FLUX.1-Fill-dev"
        self.pipe = Image2PanoramaPipelines.from_pretrained(
            self.model_path, torch_dtype=torch.bfloat16
        ).to("cuda")
        self.pipe.load_lora_weights(
            self.lora_path, subfolder="HunyuanWorld-PanoDiT-Image",
            weight_name="lora.safetensors", torch_dtype=torch.bfloat16
        )
        self.pipe.enable_model_cpu_offload()
        self.pipe.enable_vae_tiling()
        self.general_negative_prompt = (
            "human, person, people, messy, low-quality, blur, noise, low-resolution"
        )
        self.general_positive_prompt = "high-quality, high-resolution, sharp, clear, 8k"

    def run(self, prompt, negative_prompt, input_img, seed=42):
        prompt = prompt + ", " + self.general_positive_prompt
        negative_prompt = self.general_negative_prompt + ", " + (negative_prompt or "")
        img_np = np.array(input_img.convert("RGB"))[..., ::-1]
        height_fov, width_fov = img_np.shape[:2]
        if width_fov > height_fov:
            ratio = width_fov / height_fov
            w = int((self.FOV / 360) * self.width)
            h = int(w / ratio)
            img_np = cv2.resize(img_np, (w, h), interpolation=cv2.INTER_AREA)
        else:
            ratio = height_fov / width_fov
            h = int((self.FOV / 180) * self.height)
            w = int(h / ratio)
            img_np = cv2.resize(img_np, (w, h), interpolation=cv2.INTER_AREA)
        equ = Perspective(img_np, self.FOV, 0, 0, crop_bound=False)
        img, mask = equ.GetEquirec(self.height, self.width)
        mask = cv2.erode(mask.astype(np.uint8), np.ones((3, 3), np.uint8), iterations=5)
        img = img * mask
        mask = mask.astype(np.uint8) * 255
        mask = 255 - mask
        mask = Image.fromarray(mask[:, :, 0])
        img = cv2.cvtColor(img.astype(np.uint8), cv2.COLOR_BGR2RGB)
        img = Image.fromarray(img)
        image = self.pipe(
            prompt=prompt,
            image=img,
            mask_image=mask,
            height=self.height,
            width=self.width,
            negative_prompt=negative_prompt,
            guidance_scale=self.guidance_scale,
            num_inference_steps=self.num_inference_steps,
            generator=torch.Generator("cpu").manual_seed(int(seed)),
            blend_extend=self.blend_extend,
            shifting_extend=self.shifting_extend,
            true_cfg_scale=self.true_cfg_scale,
        ).images[0]
        return image

# === Instantiate Demo Classes ===
text2pano = Text2PanoramaDemo()
img2pano = Image2PanoramaDemo()

# === Gradio Interface ===
def text_to_pano_interface(prompt, negative_prompt, seed):
    if not prompt:
        return None
    return text2pano.run(prompt, negative_prompt, seed)

def img_to_pano_interface(prompt, negative_prompt, img, seed):
    if img is None:
        return None
    return img2pano.run(prompt, negative_prompt, img, seed)

with gr.Blocks(theme=gr.themes.Monochrome()) as demo:
    gr.Markdown("## HunyuanWorld Panorama Generator")
    with gr.Tab("Text to Panorama"):
        prompt = gr.Textbox(label="Prompt")
        negative_prompt = gr.Textbox(label="Negative Prompt (optional)")
        seed = gr.Number(label="Seed", value=42)
        btn = gr.Button("Generate Panorama")
        output_img = gr.Image(label="Panorama Output")
        btn.click(
            text_to_pano_interface,
            inputs=[prompt, negative_prompt, seed],
            outputs=output_img,
        )
    with gr.Tab("Image to Panorama"):
        prompt2 = gr.Textbox(label="Prompt (optional)")
        negative_prompt2 = gr.Textbox(label="Negative Prompt (optional)")
        img = gr.Image(label="Input Image", type="pil")
        seed2 = gr.Number(label="Seed", value=42)
        btn2 = gr.Button("Generate Panorama")
        output_img2 = gr.Image(label="Panorama Output")
        btn2.click(
            img_to_pano_interface,
            inputs=[prompt2, negative_prompt2, img, seed2],
            outputs=output_img2,
        )

if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", server_port=7860)
Step 24: Open Your Gradio App in the Browser
After launching the Gradio script with:
python3 gradio_hunyuanworld.py
you will see a message like:
* Running on local URL: http://127.0.0.1:7860
Step 25: Set Up SSH Port Forwarding
To access your remote Gradio app in your local browser, use SSH port forwarding.
You already did this with the following command:
ssh -L 7860:localhost:7860 -p 23428 root@115.124.123.240
What this does:
- Forwards port 7860 on your local machine to port 7860 on the remote VM.
- You can now open http://localhost:7860 in your local browser and see the Gradio interface running on your server!
Recap of the flow:
- SSH into your remote machine with port forwarding enabled (as above).
- Run your Gradio script on the VM (e.g., python3 gradio_hunyuanworld.py).
- Open http://localhost:7860 on your local machine.
- You now have seamless access to the Gradio app UI, even though it’s running on the remote VM!
Step 26: Generate Panoramas Using the Gradio Interface
Now you’re ready to generate panoramas with your own prompts!
How to use the Gradio web UI:
- Open the Gradio interface in your browser (usually http://localhost:7860).
- Enter your desired prompt in the “Prompt” field.
- Example: A breathtaking sunrise over alien mountains, photorealistic, lush grass, river
- (Optional) Add a negative prompt to exclude unwanted features (e.g., “low quality, blurry”).
- Set the seed for reproducibility or leave it as default for random results.
- Click “Generate Panorama”.
What happens next:
- The model will process your prompt and generate a high-resolution panorama image.
- The result appears in the “Panorama Output” section below.
- You can right-click the generated image to save it.
Tips:
- Try different prompts and seeds for varied results.
- Use the “Image to Panorama” tab if you want to expand an existing image.
You now have an interactive, cloud-powered panorama generator up and running via Gradio!
Conclusion
With HunyuanWorld 1.0, creating immersive 3D worlds from just text or images is no longer a distant dream—it’s now something you can do right from your browser, powered by open models and a seamless Gradio interface. Whether you’re an artist, game developer, researcher, or just curious about next-generation creativity tools, this toolkit puts the future of virtual world-building at your fingertips. With easy setup, flexible cloud deployment, and instant visual feedback, you’re free to explore new ideas, generate stunning panoramas, and bring interactive 3D scenes to life—no advanced coding required.
So go ahead: dream up your worlds, experiment with prompts, and see what’s possible. The era of accessible 3D generation is here, and you’re at the frontier.