A friend who's just going through the motions. A spouse who has settled into the rhythm of habit. A message of exhausted longing from a jet-lagged traveler. A stifled, unwanted, or unwelcome kiss. These were some of the interpretations that rattled around my head after I saw a weird bit of digital art by the Emoji Mashup Bot, a popular but now-defunct Twitter account that combined the parts of two emoji into new, surprising, and surprisingly resonant compositions. The bot took the hand and eyes of the 🥱 yawning emoji and put them together with the mouth of the 😘 kissing-heart emoji. That's it.
Compare this simple method with the supposedly more sophisticated machine-learning-based generative tools that have become popular over the past year or so. When I asked Midjourney, an AI-based art generator, to create a new emoji from these same two, it produced compositions that were certainly shaped like emoji but possessed neither the style nor the meaning of the simple mashup: a series of molded yellow, heart-shaped blobs with protruding tongues. One appeared to be eating another's tongue. They all struck me as the kind of monstrosities that might be offered up as carnival-game prizes, or as the stickers that come with junk mail soliciting donations for childhood-cancer charities.
ChatGPT, the popular text-generating bot, didn't fare much better. I asked it to generate descriptions of new emoji based on the parts of existing ones. Its ideas were fine but mundane: a "yawning sun" emoji with a yellow face and an open mouth, to represent a sleepy or lazy day; a "multitasking" emoji with eyes looking in different directions, to represent juggling several tasks at once. I fed those descriptions back into Midjourney and got competent but dull results: a pair of yawning suns; a pair of eyes in a yellow face dripping black, tarry sludge.
Maybe I could have crafted better prompts, or spent more time refining my results in ChatGPT and Midjourney. But both programs represent the leading edge of AI-powered generative-creativity research, and when it came to making novel, expressive emoji, they were bested by a simple computer program that takes parts of existing emoji faces and sticks them together.
AI creativity is a dream. It is a dream about computers, for starters: that once software has been fed terabytes of text and image data, it can unleash something resembling machine imagination, authoring works instead of merely generating them. But that dream smuggles in an assumption: that AI generators such as ChatGPT, DALL-E, and Midjourney can achieve every kind of creativity with equal ease and power. Their creators and proponents cast them as capable of addressing all forms of human intelligence: generators of everything.
And not without reason: these tools can generate a version of almost anything. Many of those versions are wrong, or misleading, or even potentially dangerous. Many aren't interesting either, as the emoji examples show. It turns out that using a software tool that can do one thing is a little different, and a lot more fun, than using one that can supposedly do everything.
Kate Compton, a computer-science professor at Northwestern University who has been building generative-art software for more than a decade, doesn't consider her tools artificially intelligent, or intelligent at all. "When I make a tool," Compton told me, "I've created a little creature that can do something." That something is often more meaningful than useful: her bots imagine the inner thoughts of a lost autonomous Tesla and draw pictures of hypothetical alien spacecraft. Similar gadgets generate hipster cocktail recipes or fake British place names. Whatever their goal, Compton doesn't want software generators like these to master their domain. Instead, she hopes they'll offer "the small, slightly silly version" of it.
That's a far cry from the ambition of ChatGPT's creator, OpenAI: to build artificial general intelligence, "highly autonomous systems that outperform humans at most economically valuable work." Microsoft, which has already invested a billion dollars in OpenAI, is reportedly in talks to put an additional $10 billion into the company. Money of that sort assumes the technology can generate massive future profits, which only makes Compton's framing all the more striking. What if all that money is chasing a bad idea?
One of Compton's most successful tools is a generator called Tracery, which uses templates and lists of content to generate text. Unlike ChatGPT and its cousins, which are trained on enormous data sets, Tracery asks its users to create an explicit structure, called a "context-free grammar," as a model for its output. The tool has been used to build Twitter bots of all kinds, including generators of think-piece headlines and abstract landscapes.
A context-free grammar works a bit like nested Mad Libs. You write a set of templates (for example, "Sorry I didn't make it to [event]. I had [problem].") and content to fill those templates (problems might include "a hangnail," "a whim," "explosive diarrhea," or "a [conflict] with my [relative]"), and the grammar puts them together. Doing so requires the author of a generative artwork to consider the structure of what they want to generate, rather than simply asking the software for output, as one does with ChatGPT or Midjourney. Likewise, the creator of the Emoji Mashup Bot, a developer named Louan Bengmah, would have had to split each source emoji into a set of parts before writing a program to assemble them into new configurations. That takes a lot more effort, not to mention some technical skill.
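The mechanism is easy to sketch. The grammar below is a toy of my own, not Bengmah's or Compton's actual code, and the square-bracket syntax is a simplification of Tracery's (which uses JSON and `#symbol#` tags); it only illustrates how nested templates and content lists expand into finished text:

```python
import random

# A toy context-free grammar in the spirit of Tracery.
# Each symbol maps to a list of possible expansions; expansions
# may themselves contain [symbol] placeholders, so rules can nest.
grammar = {
    "apology": ["Sorry I didn't make it to [event]. I had [problem]."],
    "event": ["your birthday party", "the book club", "jury duty"],
    "problem": ["a hangnail", "a whim", "explosive diarrhea",
                "a [conflict] with my [relative]"],
    "conflict": ["feud", "misunderstanding", "staring contest"],
    "relative": ["aunt", "landlord", "barber"],
}

def expand(text, rules):
    """Replace every [symbol] with a random, fully expanded choice from its rule."""
    while "[" in text:
        start = text.index("[")
        end = text.index("]", start)
        symbol = text[start + 1:end]
        # Recurse so that a chosen expansion is itself fully expanded
        # before being spliced back into the template.
        replacement = expand(random.choice(rules[symbol]), rules)
        text = text[:start] + replacement + text[end + 1:]
    return text

print(expand("[apology]", grammar))
```

Because "a [conflict] with my [relative]" is itself a template, expansion recurses: a problem can contain further blanks, which is what makes the grammar "nested."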
For Compton, that effort isn't something to be avoided; it's the essence of the exercise. "If I just wanted to make the thing, I could make the thing," she told me. Contrary to OpenAI's mission, Compton sees the purpose of generative software differently: the practice of building the tool amounts to birthing a little software creature ("a chibi version of the system," as she put it to me) that can do something, mostly badly or weirdly, or at any rate cartoonishly, and then spending time communing with that creature as you might with a toy dog, a child, or a benevolent alien. The goal is not to produce the best or most accurate rendition of a hipster cocktail menu or a mountain vista at sunrise, but to capture something truer than reality. ChatGPT's ideas for new emoji are plausible, but the Emoji Mashup Bot's offerings feel right; you can imagine actually using them, rather than just posting about the fact that they were computer-generated.
"Maybe that's what we've lost with the everything generators," Compton said: an understanding of what the machine is trying to create in the first place. Look at a system, see the possibilities within it, identify its patterns, encode those patterns in software or data, and then watch the thing work, over and over again. Typing prompts into ChatGPT or DALL-E 2 is like tossing a coin into a wishing well and hauling the bucket back up to find a clump of seaweed, or a puppy, inside. But Compton's generators are more like putting a coin into a gachapon machine, knowing in advance the kind of trinket it will dispense. Such efforts exemplify a practice in which an author hopes to help users engage with the software itself, rather than merely extract a result from it. (This also helps explain why Twitter became such a fertile host for these bots: the platform inherently encourages caricature, brevity, and repetition.)
Much is gained by showing how a software generator works, and thereby how its creator understood the patterns that define its subject. The Emoji Mashup Bot does this by displaying the two emoji from which it built a given composition. One of the first text generators I remember was a strange software toy called Kant Generator Pro, developed for the Mac in the 1990s. It used context-free grammars to compose bloated text reminiscent of the German philosopher Immanuel Kant, though it also included templates for less esoteric compositions, such as thank-you notes. The program shipped with an editor in which users could view or compose grammars, providing a way to look under the hood and understand what the software was doing.
But that kind of transparency is difficult or impossible with machine-learning systems such as ChatGPT. Nobody really knows how or why these AIs produce their results, and the results can change inexplicably from one moment to the next. When I ask ChatGPT for emoji concepts, I get no sense of its theory of emoji: which patterns or models it deems important or relevant. I can prod ChatGPT to explain its work, but the result is never an explanation; it's just more generated text: "To generate ideas for emojis, I used my knowledge of common concepts and themes that are often represented in emojis, as well as my understanding of human emotions, activities, and interests."
Perhaps, as creative collaborations with software generators become more common, general-purpose generators will be recast as middleware, used by bespoke software with more specific goals. Compton's work is charming, but it doesn't really aspire to utility, and there are surely plenty of opportunities for generative AI to help people make useful, even beautiful, things. Still, reaching that future will take a lot more work than just chatting with a computer program that, at first blush, seems to know something about everything. Once that first blush wears off, it becomes clear that ChatGPT doesn't really know anything; instead, it outputs compositions that simulate knowledge through persuasive structure. And as the novelty of that surprise fades, it becomes clear that ChatGPT is less a magical wish-granting machine than a conversational sparring partner, a tool that is most interesting when it performs its job poorly rather than well.
Nobody really wants a tool that can do everything; that desire is a theorist's fiction, a capitalist fantasy, or both. The hope, or fear, that ChatGPT, Midjourney, or any other AI tool will kill off knowledge, craft, and labor belies an obvious truth: these new gizmos entail whole new regimes of know-how, craft, and labor. We are playing with tech demos, not finished products. Eventually, the raw material these AI tools produce will be put to work in things people pay money for, alas. Some of that work will be stupid or offensive, as companies demand that value be generated around the AI systems they have invested in (Microsoft is reportedly considering adding ChatGPT to Office). Other work may be delightful, even insightful, provided its creators can convince authors and audiences that the software is doing something specific, speaking with intent, and offering them the chance to engage it in dialogue.
For now, that dialogue is more simulated than real. Sure, you can "chat" with ChatGPT and iterate on images with Midjourney. But many of these encounters carry a feeling of emptiness, because the software is merely going through the motions. It appears to listen and respond, but it is only matching inputs to outputs. AI creativity would do well to abandon the smug, silly dream of artificial general intelligence in favor of concrete specifics. A supposedly all-knowing machine that can do anything turns out to be good for nothing in particular.