Google has one more AI tool to add to the pile. Whisk is a Google Labs image generator that lets you use an existing image as your prompt. But its output only captures your starter image’s “essence” rather than recreating it with new details. So, it’s better for brainstorming and rapid-fire visualizations than for edits of the source image.
The company describes Whisk as “a new type of creative tool.” The input screen starts with a bare-bones interface with inputs for style and subject. This simple introductory interface only lets you choose from three predefined styles: sticker, enamel pin and plushie. I suspect Google found that these three allowed for the kind of rough-outline outputs the experimental tool is best suited for in its current form.
As you can see in the image above, it produced a solid image of a Wilford Brimley plushie. (Google’s terms forbid images of celebrities, but Wilford slipped through the gates, Quaker Oats in tow, without alerting the guards.)
Whisk also includes a more advanced editor (found by clicking “Start from scratch” from the main screen). In this mode, you can use text or a source image in three categories: subject, scene and style. There’s also an input bar to add more text for finishing touches. However, in its current form, the advanced controls didn’t produce results that looked anything like my queries.
For example, check out my attempt to generate the late Mr. Brimley in a lightbox scene in the style of a walrus plushie image I found online:
Whisk spit out what looks like a vaguely Wilford Brimley-esque actor eating oatmeal inside a lightbox frame. As far as I can tell, that dude is not a plushie. So, it’s clear why Google recommends using the tool more for “rapid visual exploration” and less for production-ready content.
Google acknowledges that Whisk will only draw from “a few key characteristics” of your source image. “For example, the generated subject might have a different height, weight, hairstyle or skin tone,” the company warns.
To understand why, look no further than Google’s description of how Whisk works under the hood. It uses the Gemini language model to write a detailed caption of the source image you upload. It then feeds that description into the Imagen 3 image generator. So, the result is an image based on Gemini’s words about your image, not the source image itself.
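For the technically curious, that caption-then-generate flow can be sketched in a few lines of Python. This is only an illustration of the idea as Google describes it, not Whisk’s actual code: the function names, the `GeneratedImage` type and the two stand-in models are all hypothetical, and no real Gemini or Imagen API is called.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GeneratedImage:
    """Hypothetical result type: records the text the generator was given."""
    prompt: str


def caption_then_generate(
    source_image: bytes,
    caption_model: Callable[[bytes], str],
    image_model: Callable[[str], GeneratedImage],
) -> GeneratedImage:
    """Two-stage pipeline: image -> caption -> new image.

    The image model only ever receives the caption text, never the
    source pixels, so any detail missing from the caption is lost.
    """
    caption = caption_model(source_image)
    return image_model(caption)


# Stand-ins for the real models (assumptions for illustration only).
def fake_captioner(image: bytes) -> str:
    return "an elderly man with a white mustache holding a bowl of oatmeal"


def fake_generator(prompt: str) -> GeneratedImage:
    return GeneratedImage(prompt=prompt)


result = caption_then_generate(b"raw image bytes", fake_captioner, fake_generator)
print(result.prompt)
```

Because the second stage sees only the caption, anything the caption leaves out (exact height, hairstyle, skin tone) is up to the generator, which lines up with the drift Google warns about.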
Whisk is only available in the US, at least for now. You can try it on the project’s Google Labs site.