In case you wanted a picture of a corgi wizard playing guitar. Image: Google
A team of Google researchers has built an AI that can generate eerily realistic images from text prompts, raising further questions about the ethics of developing AI that can distort our shared reality.
The AI is called Imagen, and the photos the creators have chosen to show the world are surreal and stunning.
These systems are distinguished by their ability to parse bizarre natural language prompts and create coherent, instantly recognizable images.
Some of the photos chosen to showcase Imagen use simple prompts that swap features of the subject, such as “a giant cobra snake on a farm. The snake is made of corn.”
Imagen’s version of a snake made from corn. Image: Google
Others are more contextually complex, such as the prompt, “An art gallery of Monet paintings. The art gallery is under water. Robots move through the art gallery using paddleboards.”
Robots enjoy a gallery show. Image: Google
Imagen is the latest entry in an image-generation arms race between AI companies, which picked up after OpenAI unveiled DALL-E 2 last month and began granting limited access to it for public testing.
According to an accompanying research paper on Imagen, Google’s AI is easier to train, and the resolution of its images can be scaled up more easily than those of competitors.
In some cases, Imagen also shows a better understanding of details than DALL-E 2, especially when building from prompts that contain embedded text.
Imagen performs well when incorporating text into images. Image: Google
Unlike OpenAI, which onboards 1,000 new DALL-E 2 users per week on its demo platform, Google has no plans to open Imagen to public testing.
As the researchers explained, generative AI may have the potential to “complement, extend, and augment human creativity,” but it can also be used maliciously for harassment and the spreading of misinformation.
Part of the problem with generative image models is that their training data tends to be scraped from the open web.
This raises concerns about consent for individuals who may appear in the dataset, as well as major concerns about the data containing stereotypes and “oppressive views”.
“Training text-to-image models on this data risks reproducing these associations and causing significant representational harm that would disproportionately affect individuals and communities already experiencing marginalization, discrimination and exclusion in society,” the Google researchers wrote.
“We strongly caution against using text-to-image generation methods for user-facing tools without close care and attention to the content of the training dataset.”
Generated images of people are particularly problematic, the creators of Imagen found: the model demonstrated biases “towards generating images of people with lighter skin tones” while also tending to reinforce gender stereotypes around occupations.
OpenAI is also aware of these issues and has built a filter into the public version of DALL-E 2 designed to stop certain content, such as violent or sexual images and images that may be related to conspiracy theories or political campaigns.
In a recent blog post, OpenAI said 0.05 percent of user-generated images are automatically flagged, less than a third of which are confirmed by human reviewers.
While OpenAI encourages users to share their AI-created images on social media, it has urged early users of DALL-E 2 not to share “photo-realistic generations of faces.”