AI video generation is clearly taking another big step toward professional quality. The new showcase clips for Runway Gen-4 once again invite both awe and unease: several short films demonstrate what the new Gen-4 model can produce.
Gen-4 uses visual references in combination with instructions to create new images and videos with consistent styles, motifs, locations, and more. This allows characters, locations, and objects to be generated consistently across scenes. You first define their appearance through images and assign them handles (prefixed with @), which you can then reference directly in the text prompt. The model can thus maintain coherent environments while keeping the style consistent. Within your own project, all of these elements can be regenerated from different perspectives and positions across your scenes.
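To make that reference mechanic concrete, here is a purely illustrative Python sketch. The @-handles, file names, and prompt wording are assumptions based on the description above, not Runway's actual API; the snippet only models the bookkeeping of attaching reference images to handles and reusing them in a prompt:

```python
# Illustrative only: map @-handles to reference images that define how a
# character or location should look (names and paths are hypothetical).
references = {
    "@hero": "images/hero_portrait.png",     # defines the character's appearance
    "@lighthouse": "images/lighthouse.png",  # defines the recurring location
}

# The same handles are then reused directly in the text prompt, so the model
# can keep the character and location consistent across scenes.
prompt = "@hero walks toward @lighthouse at dusk, camera slowly pulling back"

# Sanity check: every handle used in the prompt has a reference image attached.
used = [handle for handle in references if handle in prompt]
assert used == ["@hero", "@lighthouse"]
print(used)
```

The point is simply that a handle defined once via an image can be referenced any number of times in later prompts, which is what enables consistent characters and locations across shots.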
According to Runway, Gen-4 stands out from its predecessors and the competition through its ability to generate highly dynamic videos with realistic motion while maintaining outstanding prompt adherence and first-class world understanding. Real-world physics is also said to be simulated better than before. The following video illustrates these claims particularly clearly:
The examples shown make clear how such AI tools now let anyone turn their own stories into films without deep expertise or additional training. The results are still not one hundred percent perfect, but they are already "good enough" for many applications. Given the massive worldwide increase in AI performance, the optimization of these models is likely to continue unabated for some time. The only question is whether we will see flawless results for everyone this year, or perhaps only by the end of the decade.