Google introduces Whisk, an experiment to remix ideas using images and AI: How it works



Google has launched Whisk, its latest generative AI experiment. The tool aims to change the creative process by letting users prompt an image generator with other images rather than with text.

Unlike traditional image generation tools that rely on detailed text descriptions, Whisk lets users drag and drop separate images for the subject, scene, and style, then remix them to create unique visuals, Google said in a blog post.

According to Google, the process is powered by its Gemini model, which automatically generates a detailed caption for each input image. These captions are then fed into Imagen 3, the company's latest image generation model. Because the image model works from captions rather than from the images themselves, Whisk captures the essence of the subject instead of producing an exact replica, letting users experiment with combinations in novel ways.
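The caption-then-generate flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's implementation: the names `RemixInputs`, `build_prompt`, and `remix` are hypothetical, the prompt template is invented for the example, and the captioning and image generation steps are stand-in functions rather than real Gemini or Imagen 3 calls.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class RemixInputs:
    """Three reference images, identified here by file path (hypothetical type)."""
    subject: str
    scene: str
    style: str


def build_prompt(inputs: RemixInputs, caption_image: Callable[[str], str]) -> str:
    """Caption each input image, then merge the captions into one text prompt.

    The template below is an assumption; the article does not say how
    Whisk actually combines the captions.
    """
    subject = caption_image(inputs.subject)
    scene = caption_image(inputs.scene)
    style = caption_image(inputs.style)
    return f"{subject}, set in {scene}, rendered as {style}"


def remix(
    inputs: RemixInputs,
    caption_image: Callable[[str], str],
    generate_image: Callable[[str], str],
) -> Tuple[str, str]:
    """Run the two-stage pipeline and return both the prompt and the image."""
    prompt = build_prompt(inputs, caption_image)
    return prompt, generate_image(prompt)


# Stand-ins for the captioning model and the image model (assumptions only).
FAKE_CAPTIONS = {
    "dog.png": "a small brown dog",
    "beach.png": "a sunny beach at low tide",
    "pin.png": "a glossy enamel pin",
}


def fake_captioner(path: str) -> str:
    return FAKE_CAPTIONS[path]


def fake_generator(prompt: str) -> str:
    return f"<image for: {prompt}>"


if __name__ == "__main__":
    inputs = RemixInputs(subject="dog.png", scene="beach.png", style="pin.png")
    prompt, image = remix(inputs, fake_captioner, fake_generator)
    print(prompt)
    print(image)
```

In this sketch the image generator only ever sees text, which is one way to picture why the output reflects the gist of the inputs rather than copying them. Returning the prompt alongside the image also leaves room for a user to edit the text before generating again.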

Google describes Whisk as a tool for rapid visual exploration, designed to let users quickly create and iterate on a wide range of visual concepts. The platform is not intended as a traditional image editor but as a space for creatives to explore ideas in a flexible, iterative manner, the California-based company added. Possible outputs range from digital plush toys to enamel pins and stickers.

However, Whisk may not reproduce its inputs faithfully. Because it extracts only a few key characteristics from the uploaded images, the final results may not always match users' expectations. For example, the generated subject might differ subtly in attributes such as height, weight, or skin tone. Google acknowledges that these features can be important to users and lets them view, edit, and refine the underlying prompts as needed.

The launch of Whisk follows Google's introduction earlier this year of its video generation model, Veo, and the subsequent release of Veo 2 and the latest iteration of Imagen 3. Both Veo and Imagen 3 have been lauded for achieving state-of-the-art results in their respective fields. They are now available in Google tools including VideoFX, ImageFX, and Whisk.


