»Stable Diffusion«: Disturbingly good AI art – and everyone can get involved


Abandoned water parks: Eerie, but also eerily fascinating

Photo: CC0

    These eerie swimming pool pictures, for example, were generated from the text prompt »Polaroid photo of an abandoned colorful indoor water park with strange creatures lurking«. Welcome to the fascinating and disturbing, rapidly expanding world of AI art.

Anyone may generate images using the software

Lately, however, Stable Diffusion has been in the spotlight. It is arguably the most powerful AI text-to-image generator to date that is freely available to internet users. You don’t have to be a Google employee, scientist or Silicon Valley investor to play around with the software. Given sufficient graphics card power, which in this case means roughly 7 GB of VRAM or more, it can even be run on your own computer.

Tesla boss Elon Musk in the style of a character from the game series »The Last of Us«: Stable Diffusion mixes real and digital worlds

Photo: CC0

    »People don’t want to see other people create a great work of art. They want to do it themselves«, commented tech analyst Alberto Romero. He considers the release of Stable Diffusion to be nothing less than »the most significant and momentous event that has ever occurred in the field of AI art models«.

A red-haired girl in front of a power plant: Stable Diffusion can create several images at once for each so-called prompt

Photo: CC0

Emad Mostaque, meanwhile, is already promising that Stable Diffusion could soon run on iPhones. Mostaque, a former hedge fund manager, is the founder and backer of Stability.ai, an AI company driving the development of Stable Diffusion. The project, in which a research group from LMU Munich is involved, was released just over a week ago under a so-called »Creative ML OpenRAIL-M« license. Among other things, it obliges users not to use Stable Diffusion for illegal purposes, for example to defame or disparage others. Provided the license terms are observed, commercial use of the software is permitted alongside private use. (Here is an up-to-date overview of third-party applications.)

Currently, Stable Diffusion is less than four gigabytes in size, and the images that the system generates measure 512 x 512 pixels. When the software was released, Stability.ai grandly declared: »This release is the culmination of many hours of collective work to create a single file that compresses humanity’s visual information into a few gigabytes.«

These images are created by Stable Diffusion

If you want to get a quick impression of what can be created with Stable Diffusion, you should go to Lexica.art, a searchable online archive dedicated to such AI images. On Lexica.art one encounters Angela Merkel as the Joker from »Batman« and as the vampire Alucard from »Hellsing«, Malcolm X as a »Fortnite« character and Elon Musk in the style of a character from the action-adventure »The Last of Us«. And if you are looking for something more bizarre, you will find a Teletubbies parade set in the Nazi era.

A hitherto unknown part of German history: The prompt for these images was »Teletubbies at a parade in Nazi Germany«

Photo: CC0

»A dreamy photo of a pretty dark-haired French girl wearing a loose-fitting, oversized white sweater, snuggled up on a windowsill at sunset, sipping a cup of tea«: These images were generated with this prompt, among others, plus keywords like »HDR« and »photorealistic«

Photo: CC0

    Hundreds of millions of web images as a basis

    While many internet users see Stable Diffusion as a nice pastime or a way to replace boring stock photos with original images, others have an uneasy feeling about using the software. The tech blogger Andy Baio, for instance, writes on Twitter that he enjoys playing with AI text-to-image generators. At the same time, however, the programs »raise so many ethical questions that it is difficult to keep track of them.«

    And once again Merkel, this time in the style of van Gogh: Older apps could already do something like this, but unlike them, Stable Diffusion is not limited to a few art styles

    Photo: CC0

      Tools like Stable Diffusion and Dall-E 2 deliver surprising, fun and beautiful results, Baio points out in a lengthy blog post, »but only because of the vast arsenal of human creativity they were trained on.« Like other AI systems, Stable Diffusion was trained on millions of images and image descriptions from the internet, he writes, »but if any of these systems required artists’ permission to use their images, they probably wouldn’t exist.«

      In another blog post, Baio uses a sample of twelve million images to show exactly which web images were used in the training. The makers of Stable Diffusion used several subsets of a huge dataset called LAION-2B(en), which comprises a total of around 2.3 billion images found on the web. According to Baio’s analysis, the ultimately decisive training set of about 600 million images included numerous images from Pinterest, from blogs hosted by WordPress.com, but also from portals such as Flickr, DeviantArt and Wikimedia. Photo sites and the online art shop Fine Art America were also among the places where the images were found.

      Landscape images for the keyword Marbella: AI art can be this beautiful

      Photo: CC0

      The makers of Stable Diffusion also know that their training material is not unproblematic. In a text accompanying the release, Emad Mostaque wrote that his system could »reproduce social prejudices« because image-text pairs found on the internet went into its development. In this case, incidentally, that means only the English-speaking part of the internet anyway. The »(en)« in the name of the 2.3-billion-image training dataset indicates that these are primarily images that were found with English-language captions.

      An overview page for Stable Diffusion says that texts and images from communities and cultures that use languages other than English would »likely be underrepresented« by the system. This affects the overall output of the model, as white and Western cultures are often »set as defaults« in a sense.

      Andy Baio also finds it questionable that the standard version of Stable Diffusion, unlike tools like Dall-E 2, allows images to be generated of celebrities and trademarked characters. Nudity is also allowed in the local version, writes Baio, referring to forums that Reddit banned after pornographic images from AI generators were posted there.

      Meta boss Mark Zuckerberg as a cyborg: Some of the images could easily be imagined as magazine covers

      Photo: CC0

      No one knows what’s really going to happen

      »Perhaps the risks are being exaggerated and we are at the beginning of a massive democratization of art creation«, Baio writes at the end of his blog post. »Or else these platforms make the already precarious life of artists even more difficult while opening new avenues for fakery, disinformation, online harassment and exploitation.«

      Deserted cities, complete with a tornado: Stable Diffusion can also be used to generate images on topics such as the climate crisis

      Photo: CC0

      Other commentators also prefer not to commit to where exactly AI image generation is headed. The British programmer Simon Willison, for example, is very impressed by a feature called img2img. This refers to the ability to use not only text input with Stable Diffusion, but also existing images as prompts. »Imagine having an on-call concept artist,« Willison writes, »who can create anything you can imagine and work with you towards your ideal outcome. Free of charge (or at least very cheap).«

      At Netzpolitik.org, Sebastian Meineck predicts under the heading »The beginning of something big« that tools like Stable Diffusion will make the world »literally more beautiful«. He also predicts that the technology will be accepted »after a debate«. But Meineck also emphasizes that he still has many questions, such as what the AI art boom will mean for the jobs of artists or graphic designers.

      »This text will also become a joke«, the author even speculates, »because I’m overlooking something that only later turns out to be obvious.« After all, he argues, people have rarely done with a new technology what had been imagined beforehand.
