[15:33 Thu, 25 July 2024 by Thomas Richter]
Stability AI, the creators of Stable Diffusion, have just introduced an innovative video AI: Stable Video 4D (SV4D). Unlike generative video AIs such as Sora, Kling, or Runway, SV4D serves a unique purpose: from a short input video of an object, it generates videos of the same object from various camera angles, which the user can define, in the form of a semi-circular camera sweep. Stable Video 4D could be used in game development, video editing, or VR applications to visualize objects from multiple perspectives.

To create these new video views, Stable Video 4D uses a video-to-4D diffusion model for dynamic 3D video synthesis, built on the Stable Video Diffusion model. An input video with at least 5 frames is required, ideally showing a single object against a white background. After the user defines the desired new 3D camera positions, the model generates 5 frames from eight different continuous camera angles at a resolution of 576 x 576 pixels. These are then combined into a single 40-frame video showing the object from the chosen perspectives via a camera sweep.

The Stable Video 4D model is still in the research phase but is already available on Hugging Face. Currently, Stable Video 4D generates the 5 frames in 8 views in about 40 seconds, while the full 4D optimization takes an additional 20 to 25 minutes.
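The frame arithmetic described above (5 temporal frames per view, 8 views, combined into one 40-frame sweep) can be sketched in a few lines. This is a minimal illustration of the layout only; the exact sweep span, view ordering, and angle spacing are assumptions for demonstration, not Stability AI's published parameters.

```python
# Illustrative sketch of SV4D's output layout: 5 temporal frames x 8 camera
# views = 40 frames in the final video. The 180-degree semi-circular sweep
# and even angle spacing are assumptions, not confirmed model parameters.

NUM_FRAMES = 5  # temporal frames per view (from the article)
NUM_VIEWS = 8   # camera views (from the article)

def sweep_azimuths(num_views: int, sweep_degrees: float = 180.0) -> list[float]:
    """Evenly spaced azimuth angles across an assumed semi-circular sweep."""
    step = sweep_degrees / (num_views - 1)
    return [round(i * step, 2) for i in range(num_views)]

def frame_grid(num_frames: int, num_views: int) -> list[tuple[int, int]]:
    """(time, view) index pairs for the combined output video:
    all temporal frames for each view in turn."""
    return [(t, v) for v in range(num_views) for t in range(num_frames)]

angles = sweep_azimuths(NUM_VIEWS)
grid = frame_grid(NUM_FRAMES, NUM_VIEWS)
print(len(grid))   # 40 frames total
print(angles)      # camera azimuths from 0.0 to 180.0 degrees
```

The grid ordering here (grouping frames by view) is one plausible arrangement; the actual model may interleave views and time steps differently.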