OpenAI has launched DALL·E and CLIP, two new AI models: one generates images from text, and the other classifies images. DALL·E is a neural network that can generate images from even the most outlandish text and image prompts, such as "an armchair in the shape of an avocado" or "the exact same cat on the top as the sketch on the bottom". CLIP uses a new approach to image classification training that aims to be more accurate, efficient, and flexible across varied image types.

Built on the American AI company's Generative Pre-trained Transformer 3 (GPT-3), a deep learning model known for producing human-like text, DALL·E is trained to create a wide variety of (sometimes surreal) images from text input, rendering whatever the imagination can conjure. However, because DALL·E draws on images scraped from the web to create its own, the model also raises questions about copyright.

AI illustrator DALL·E creates weird images

As you might have guessed, the name DALL·E is a portmanteau of the surrealist artist Salvador Dalí and Pixar's WALL·E. DALL·E can use text and image inputs to create strange images: for example, "an illustration of a baby radish in a tutu walking a dog" or "a snail made of a harp". DALL·E is not only trained to generate images from scratch; it can also regenerate regions of existing images in a manner consistent with the text or image prompt.

Image results for the text prompt "a snail made of a harp"

OpenAI's GPT-3 is a deep learning language model that performs a wide range of text generation tasks from natural-language input; it can write stories that read as if a human wrote them. On the way to DALL·E, the San Francisco AI lab created Image GPT by swapping text for pixels and training the model to complete half-finished images.
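For context, GPT-3's text generation is exposed through OpenAI's API (access is gated behind a waitlist). Below is a minimal sketch using the openai Python package; the API key and story prompt are placeholders.

```python
import openai

# Placeholder key: real access requires an approved OpenAI API account.
openai.api_key = "sk-..."

# Ask GPT-3 (the "davinci" engine) to continue a story prompt.
response = openai.Completion.create(
    engine="davinci",
    prompt="Once upon a time, a snail made of a harp",
    max_tokens=60,
    temperature=0.8,
)

print(response.choices[0].text)
```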


DALL·E can draw animals or objects with human characteristics and plausibly combine unrelated items into a single image. How well an image turns out depends on how the text prompt is phrased. When the caption implies that the image must contain specific details that are not explicitly stated, DALL·E can usually "fill in the blanks". For example, the prompts "a giraffe made of a turtle" or "an avocado-shaped armchair" produce convincing output.

CLIPping text and images together

CLIP (Contrastive Language-Image Pre-training) is a neural network that performs accurate image classification guided by natural language. It helps classify images from "unfiltered, highly variable and noisy data" into categories more accurately and efficiently. What makes CLIP unusual is that, unlike most existing visual classification models, it is not trained to recognise images from a curated, labelled dataset. Instead, CLIP is trained with natural-language supervision gathered from the internet, so it learns what is in a picture from free-form descriptions rather than from a fixed set of dataset labels.
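To make the idea concrete, here is a minimal sketch using the open-source clip package that OpenAI released alongside the announcement; the image file photo.jpg and the candidate captions are placeholders.

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the pretrained CLIP model and its matching image preprocessor.
model, preprocess = clip.load("ViT-B/32", device=device)

# Placeholder image and free-form candidate descriptions.
image = preprocess(Image.open("photo.jpg")).unsqueeze(0).to(device)
text = clip.tokenize(["a diagram", "a dog", "a snail made of a harp"]).to(device)

with torch.no_grad():
    # CLIP scores how well each description matches the image.
    logits_per_image, logits_per_text = model(image, text)
    probs = logits_per_image.softmax(dim=-1)

print(probs)  # e.g. tensor([[0.01, 0.02, 0.97]]) if the last caption fits best
```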

CLIP can be applied to almost any visual classification benchmark simply by providing the names of the visual categories to be recognised. According to the OpenAI blog, this is similar to the "zero-shot" capabilities of GPT-2 and GPT-3.
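In practice, zero-shot classification means wrapping each class name in a short prompt and picking whichever text embedding sits closest to the image embedding. The sketch below assumes the same clip package as above; the class names and test_image.jpg are placeholders for whatever benchmark you point it at.

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Only the label names are needed: no retraining on the benchmark itself.
class_names = ["airplane", "bird", "cat", "dog", "armchair"]
prompts = clip.tokenize([f"a photo of a {name}" for name in class_names]).to(device)

image = preprocess(Image.open("test_image.jpg")).unsqueeze(0).to(device)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(prompts)

    # Cosine similarity between the image and each class prompt.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    similarity = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Predicted class:", class_names[similarity.argmax(dim=-1).item()])
```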

Models such as DALL·E and CLIP could have significant societal impact. The OpenAI team says it will analyse how these models relate to social issues, such as their economic effect on certain occupations, the potential for bias in model output, and the longer-term ethical challenges the technology implies.


Because generative AI models such as DALL·E draw on images taken directly from the internet, they could pave the way for copyright infringement. DALL·E can regenerate any rectangular region of an existing image found online, and people have already been tweeting about attribution and copyright for these derived images.

