Since its announcement in April, the text-to-image AI tool DALL-E 2 has wowed artists, researchers, and media types with its high-quality images. Now, four months later, developer OpenAI is giving DALL-E 2 a new trick: the ability to expand the original images it creates beyond their original boundaries in logical and creative ways.
The new feature, which OpenAI calls “outpainting,” could be useful for graphic designers who need to create multiple sizes and shapes of a given image to present in different contexts. For example, a movie promo image may require a perfectly square shape in one context and a long rectangular shape in another. The latter requires new art to fill the extra space.
DALL-E 2 creates original images of 1024 x 1024 pixels based on keyword descriptions entered by the user. It can also create images based on objects and styles it sees in other images. For example, it can be given a street-art image of a mouse alongside an art deco version, then combine elements of the two styles into an original rodent image. It also has editing capabilities, meaning a user can erase a portion of a generated image and then tell DALL-E to add a specific object or style in that area. For example, if designers don’t like the expressionistic red roses in the foreground of an image, they can erase them and ask DALL-E to put photo-realistic white orchids there instead.
Now the editing interface gets some new buttons to control the expansion of images. In a demo Tuesday, I saw OpenAI engineer David Schnurr expand an image DALL-E had previously created based on the keywords “two teddy bears mixing sparkling chemicals in a lab.” I saw a sort of steampunk-esque image of two cute teddy bears wearing glasses standing at a lab table in the foreground. Schnurr wanted to expand the image to show more space above the teddy bears. So he placed a blue frame so that its bottom half overlapped the top-left part of the image, which told the AI to use the context of the storybook lab and the atmosphere visible in the bottom half of the frame as the basis for generating new imagery in the frame’s empty top half.
“We add more kind of laboratory concepts to the image, and then we can also expand upwards and essentially create an image as large as we would like,” says Schnurr.
Suppose Schnurr had wanted DALL-E to include something specific in the expanded area of the image, such as a cuckoo clock hanging on the wall above the bears. He could have done that by giving DALL-E some extra keywords.
In fact, Schnurr tells me, DALL-E creates four different versions of each extended area for the user to choose from. If they don’t like any of the four, they can try the extension feature again, perhaps with different keywords.
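The workflow Schnurr demonstrated, anchoring new generations on a region of an existing image, can be sketched in code. The helper below is purely illustrative and is not OpenAI's implementation: it prepares the kind of padded, partially transparent canvas that image-editing models commonly accept as an outpainting input, where opaque pixels supply context and transparent pixels mark the area to fill. The function name, the use of Pillow, and the default expansion size are my assumptions.

```python
from PIL import Image

def prepare_outpaint_canvas(original: Image.Image, expand_top: int = 512) -> Image.Image:
    """Return a taller RGBA canvas with the original image pasted at the
    bottom. The fully transparent strip at the top is the region an
    outpainting model would be asked to fill, using the opaque pixels
    below it as context. (Illustrative sketch only, not OpenAI's code.)"""
    width, height = original.size
    # Alpha of 0 everywhere: the whole canvas starts out "to be filled".
    canvas = Image.new("RGBA", (width, height + expand_top), (0, 0, 0, 0))
    # Paste the original (made opaque) at the bottom; its pixels become context.
    canvas.paste(original.convert("RGBA"), (0, expand_top))
    return canvas
```

A client would then submit such a canvas, along with an optional prompt like “a cuckoo clock hanging on the wall,” to the generation service, which returns candidate fills for the transparent region.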
DALL-E Product Manager Joanne Jang says the new feature was driven directly by feedback from DALL-E users. Filmmakers use DALL-E to cut storyboarding time in half, says Jang, and may want to experiment with closer or wider shots during the creative process. Game designers use DALL-E to shorten the time it normally takes concept artists to mock up new scenes and actions.
The outpainting feature is not a free add-on. Every DALL-E beta user gets 50 free credits during the first month of use and 15 free credits every following month. Each time a user generates an additional portion of an image, it costs a credit. Users can purchase additional credits in 115-credit packs for $15, OpenAI says.
Jang says more than a million users have been invited to the DALL-E beta program, including more than 3,000 working artists. As a result, OpenAI has received many different kinds of feedback on how to improve DALL-E’s tools.
But one request seemed to cut right across user types, Jang adds: “Among all those feedback points, one thing that was asked quite often was flexibility in aspect ratios.”