Author: Melanie Mudge
That question and others like it are on everyone’s minds these days as artificial intelligence (AI) tools like ChatGPT by OpenAI, Dream by WOMBO, and Google Bard take the digital world by storm. In case you’re not familiar with them, these AIs take textual prompts written by humans and turn them into text or images. They’ve been used for all sorts of applications, like writing entire papers, generating ideas, and creating digital art.
So when it comes to content creation and art, it’s fair to question whether humans are being replaced.
Full disclosure: Scribely specializes in accessible digital content created by humans for humans. That’s because we don’t think AI technology is advanced enough to really comprehend like humans do…yet. It’s highly likely that the technology could get there in the near future, but until then, we believe human-created content is superior.
However, these new AIs mark a dramatic improvement in AI technology, so we wanted to take some time to really consider their promise, as well as their drawbacks. And perhaps by looking at the implications of such models, we can come to a more realistic perspective of the role they can and should play in the future.
When we first heard about AI image generators like DALL-E 2 by OpenAI, we were curious. Instead of looking at an image and describing its look and feel with text, like we do at Scribely, DALL-E does the reverse. Users input text to describe an image they want to create, then DALL-E generates four images based on that description, drawing on patterns it learned from the vast collection of existing images it was trained on.
It’s not just the fact that the process is reversed that intrigued us, but also its possible applications. Not only could it be used as a way to test the quality of alt text descriptions—does this even remotely look like the image I just described?—but it could also open the doors for people with visual impairments to create art.
Alt text written by Scribely: Person's hands gently pull apart a cracked egg, releasing a yolk that falls towards a well of flour shaped like a volcano.
DALL-E generated image using that alt text description:
As August Kamp, a multimedia artist and early adopter of DALL-E, explained on OpenAI's blog, “Conceptualizing one’s ideas is one of the most gatekept processes in the modern world. Everyone has ideas—not everyone has access to training or encouragement enough to confidently render them.” Or, in this case, not everyone has access to accommodations that enable them to bring their creative ideas to life.
In that way, image generators like DALL-E can simply be viewed as the newest tool in an artist’s toolkit. The words are the paint, and the AI is the paintbrush being directed by the artist.
Beyond simply opening the door for artists to create art in new ways, DALL-E’s technology also offers the ability to create things you might never envision on your own.
We spoke with another early adopter of the tech, Noel McCarthy, who is a Production Designer for commercials and has a background in photography and conceptual art. When he heard about DALL-E in the summer of 2022, his creative curiosity was piqued, and he immediately began experimenting with it to see if he could create a self-portrait.
Because it uses plain-language prompts, DALL-E felt approachable to Noel—he didn't have to be a computer expert to use it. But as someone whose job is to take written scripts with no visual direction and translate them into a visual medium, he was also curious whether the AI could do the same. Would it be able to capture the vision, the feel, that the words intended?
Over that period, he spent hours writing new prompts to see if he could get the technology to encapsulate his essence. The algorithms were constantly being updated and tweaked, so the same prompt entered a day later could produce dramatically different results. “It would take hundreds of tries,” he remarked. “I’d have thousands of these images, but with only about 20 of them could you tell they were me.” The images weren’t photorealistic, but when he showed those 20 to his friends, they could tell the images were him.
He posted a few that he liked on his Instagram. Though they vary greatly, none of them creates a realistic representation of a human face, let alone Noel’s face. Instead, they’re all distorted, almost as if his face were made of clay that’s been folded, stretched, and pressed in on itself in otherworldly and impossible ways. Because they defy the rules of physics and bodies as we know them, we think it’s fair to say that if he had set out to paint or draw a self-portrait, he would never have come up with these. In this instance, the final images are like a collaborative work of art where one “person” in the collaboration isn’t bound by the same expectations, rules, and laws.
But as we mentioned at the beginning, many are wondering if AI will eventually progress so far that it begins taking our jobs and rendering humans obsolete, at least in certain fields or capacities. The worry is that AI will eventually be able to successfully replicate that human je ne sais quoi.
So we asked Noel if he felt that was a risk as he helped to refine DALL-E’s algorithms. “It’s here to stay, it’s definitely our future,” he said. “But there becomes a sameness that people will ultimately reject. It’s only going to be good when it’s re-personalized. It still needs a voice.”
Of course, science fiction writers have been warning us about AI’s risks for decades now, and only time will tell how the technology will evolve. But for now, it seems that DALL-E, ChatGPT, and similar AI assistants won’t be transcending their programming to become our AI overlords any time soon. Rather, they open up new possibilities and creative frontiers that are only limited by our ability to express our imaginations in words. As Noel mused, “It’s very messy right now. There is a good possible outcome for DALL-E. It’s not all computer, but the world is too vast for it to not involve computers.”
AI image generators—and AI in general—have a lot of potential, and indeed are already turning potential into reality. Only time will tell where the technology goes, how we apply it, and if we figure out in time how to give it appropriate boundaries. Or, as Noel put it, “I hope as a society that we’re not that lazy that we would just allow computers to take over the conversation entirely and sit back and watch.”