If you are impressed by the recent wave of text-to-image generators, get ready for the next step in AI art: text-to-video.
While the enormous computational costs and scarcity of text-to-video datasets have hindered the growth of the technology, recent research has brought the promise closer to reality.
A computer artist named Glenn Marshall has given a glimpse of its potential.
The Belfast-based artist recently won the Jury Award at the Cannes Short Film Festival for his AI film The Crow.
Marshall had previously received praise for an AI-generated Daft Punk video, but he took a different approach for The Crow.
While his earlier technique used text prompts to generate random visual mutations, The Crow uses an underlying video as an image reference.
“I was heavily into the idea of AI style transfer using video as a source,” Marshall told TNW.
“So every day I was looking for something on YouTube or stock video sites and trying to make an interesting video by abstracting it or turning it into something else with my techniques.
“It was at this time that I discovered Painted on YouTube, a short live-action dance film, that would become the basis of The Crow.”
Marshall fed the video frames from Painted to CLIP, a neural network created by OpenAI.
He then asked the system to generate a video of “a painting of a crow in a desolate landscape.”
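For readers curious how a prompt can steer footage without erasing it, here is a toy sketch of the general idea: each frame is nudged toward a text target while being pulled back toward the source frame. It is not Marshall's pipeline and it does not call CLIP. The names `guide_frame`, `guide_video`, and the `keep` parameter are hypothetical, and the short lists of numbers stand in for the embeddings a real system would get from CLIP's image and text encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def guide_frame(frame, prompt_vec, steps=50, lr=0.1, keep=0.5):
    """Nudge one frame toward a prompt while staying near the source frame.

    Assumption: `frame` and `prompt_vec` already live in the same vector
    space. A real system would embed pixels and text with a model such as
    CLIP; here both are plain lists of floats.
    `keep` (0..1) is a hypothetical weight on the source frame.
    """
    if not 0.0 <= keep <= 1.0:
        raise ValueError("keep must be between 0 and 1")
    out = list(frame)
    for _ in range(steps):
        # Gradient step on a weighted sum of two squared distances:
        # one to the prompt vector, one to the original frame.
        out = [
            o + lr * ((1.0 - keep) * (p - o) + keep * (f - o))
            for o, p, f in zip(out, prompt_vec, frame)
        ]
    return out


def guide_video(frames, prompt_vec, **kwargs):
    """Apply the same prompt to every frame independently."""
    return [guide_frame(frame, prompt_vec, **kwargs) for frame in frames]
```

With `keep=0.5`, each output settles roughly midway between the source frame and the prompt vector, so it ends up more similar to the prompt than the source was while still resembling the source more than the prompt alone does. Raising `keep` preserves more of the footage, and lowering it gives the prompt more control.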
My AI film ‘The Crow’ wins Jury Prize in Cannes! https://t.co/WHDsI7UzJM pic.twitter.com/Ww1DGyBbxw
— Glenn Marshall (@GlennIsZen) August 24, 2022
Marshall says the output required little cherry-picking. He attributes this to the similarity between the prompt and the underlying video, in which a dancer in a black scarf mimics the movements of a crow.
“It’s this that makes the movie work so well, as the AI tries to make every live-action frame look like a painting with a crow in it, so I meet it halfway, and the movie becomes a sort of battle between man and the AI, with all the suggestive symbolism.”
In the future, Marshall plans to add 3D animation to his AI creations. He is also exploring CLIP-guided video generation, which can add detailed text-based cues, such as specific camera movements.
That could lead to entire feature films being produced by text-to-video systems. But Marshall believes even his current techniques could gain mainstream recognition.
He says The Crow is now eligible for submission to the prestigious BAFTA Awards.
“I have not prepared a speech, but I fantasize about collecting a prize, in the role of an AI herald, and proclaiming to the star-studded audience that [for] all of you, actor, director, set designer, costume designer, artist, composer… AI is coming and you’ll soon have a very different job – or no job at all.”