A text-to-image model is a machine learning model which takes as input a natural language description and produces an image matching that description. Such models began to be developed in the mid-2010s as a result of advances in deep neural networks. In 2022, the output of state-of-the-art text-to-image models, such as OpenAI's DALL-E 2, Google Brain's Imagen and Stability AI's Stable Diffusion, began to approach the quality of real photographs and human-drawn art.
| Attributes | Values |
|---|---|
| rdfs:label | |
| rdfs:comment | A text-to-image model is a machine learning model which takes as input a natural language description and produces an image matching that description. Such models began to be developed in the mid-2010s as a result of advances in deep neural networks. In 2022, the output of state-of-the-art text-to-image models, such as OpenAI's DALL-E 2, Google Brain's Imagen and Stability AI's Stable Diffusion, began to approach the quality of real photographs and human-drawn art. (en) |
| foaf:depiction | |
| dcterms:subject | |
| Wikipage page ID | |
| Wikipage revision ID | |
| Link from a Wikipage to another Wikipage | |
| sameAs | |
| dbp:wikiPageUsesTemplate | |
| thumbnail | |
| align | |
| caption | Eight images generated from the text prompt "A stop sign is flying in blue skies." by AlignDRAW. (en); Images made via the open-source Stable Diffusion. (en) |
| caption align | |
| content | |
| direction | |
| image | Gold Dragon Illustration by Stable Diffusionn.webp (en); Text-to-image woman underwater.png (en) |
| width | |
| has abstract | A text-to-image model is a machine learning model which takes as input a natural language description and produces an image matching that description. Such models began to be developed in the mid-2010s as a result of advances in deep neural networks. In 2022, the output of state-of-the-art text-to-image models, such as OpenAI's DALL-E 2, Google Brain's Imagen and Stability AI's Stable Diffusion, began to approach the quality of real photographs and human-drawn art. Text-to-image models generally combine a language model, which transforms the input text into a latent representation, and a generative image model, which produces an image conditioned on that representation. The most effective models have generally been trained on massive amounts of image and text data scraped from the web. (en) |
| prov:wasDerivedFrom | |
| page length (characters) of wiki page | |
| foaf:isPrimaryTopicOf | |
| is rdfs:seeAlso of | |
| is Link from a Wikipage to another Wikipage of | |
| is Wikipage redirect of | |
| is Wikipage disambiguates of | |
| is genre of | |
| is foaf:primaryTopic of | |
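
The abstract above notes that text-to-image models generally combine a language model, which encodes the prompt into a latent representation, with a generative image model conditioned on that representation. The snippet below is a minimal sketch of generating an image from a text prompt with the open-source Stable Diffusion model, assuming the Hugging Face `diffusers` and `torch` packages; the checkpoint ID, prompt, and sampling parameters are illustrative assumptions, not taken from this page.

```python
# Minimal sketch: text-to-image generation with an open Stable Diffusion checkpoint.
# Assumptions: `diffusers`, `transformers`, and `torch` are installed, and a CUDA GPU
# is available. On CPU, drop torch_dtype=torch.float16 and the .to("cuda") call.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # example checkpoint; any compatible model ID works
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# Internally the pipeline mirrors the structure described in the abstract:
# a text encoder (CLIP) maps the prompt to a latent representation, and a
# generative image model (a latent-diffusion U-Net plus VAE decoder) produces
# an image conditioned on it.
prompt = "A stop sign is flying in blue skies."
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("stop_sign.png")
```

The prompt here reuses the AlignDRAW example quoted in the caption row; in practice the number of inference steps and the guidance scale trade off speed against prompt fidelity.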