Creative Personal Assistant is a speculative prototype I made to explore Stable Diffusion, the open-source text-to-image AI model that was released in August 2022.
I started exploring Stable Diffusion for this prototype just a few weeks after its release. At that point, the infrastructure required to run it was slow, unstable and came with fees, so no online demo is available for the time being.
As outlined in the original post here on the blog, Week 82: Stable Diffusion as a Creative Personal Assistant, it’s clear to me that the new wave of generative AI/ML technologies of recent years is going to change the technology product landscape and how we work. I prefer to experience these technologies up close and hands-on. For this project, as usual, I try to combine multiple motivations into a single project:
- Gain hands-on experience with Stable Diffusion
- Use prototyping to speculate how this technology could integrate into daily life
- Explore the concept of AI personal assistants as described by John Carmack
Stable Diffusion Hello World
I decided to take Stable Diffusion out for a spin on my laptop, a 2020 Intel MacBook Pro. The article Setting up Stable Diffusion for MacOS by Craig Morten really helped. In less than 30 minutes I was generating my first images.
AI as Personal Assistant
A few weeks ago, I was listening to an episode of Lex Fridman’s podcast featuring John Carmack. It’s a whopping 5-hour conversation, but I found it all kinds of interesting. The part about Artificial General Intelligence (AGI) especially caught my ear. I’m really curious about the idea that, in the future, we will have a collection of AIs as personal assistants.
Creative Personal Assistant
So why not start developing an AGI as a personal assistant today? A Creative Personal Assistant using Stable Diffusion.
Basically, the idea is to build a service you can email with a prompt, and receive a reply with the output from Stable Diffusion. An effortless, informal conversation with a personal assistant. The flow would look something like this:
- Send an email to email@example.com with a prompt
- Wait a few minutes
- Receive a reply
I created an e-mail address for the purpose and registered its IMAP/SMTP connection and authentication info. Then I built a Node.js server that monitors the inbox of that e-mail address. When it registers a new e-mail, it extracts the first line from the message body and uses that as the prompt. The server then spawns a Stable Diffusion text-to-image Python script, and when it detects a new output image, it e-mails the image back to the sender.
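To make the prompt extraction concrete, here is a minimal sketch in Python (the actual server was written in Node.js, so this is an illustration and not the code I ran). It uses only the standard library `email` package. The function names `extract_request` and `_plain_text_body` are hypothetical, and skipping leading blank lines before taking the first line is my own assumption.

```python
from email import message_from_string
from email.message import Message
from email.utils import parseaddr
from typing import Tuple


def _plain_text_body(msg: Message) -> str:
    """Return the first text/plain part of a message, or '' if there is none."""
    # walk() yields the message itself for non-multipart mail,
    # and every sub-part for multipart mail.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            if payload is None:
                continue
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


def extract_request(raw_email: str) -> Tuple[str, str]:
    """Return (sender_address, prompt) from a raw e-mail message."""
    msg = message_from_string(raw_email)
    sender = parseaddr(msg.get("From", ""))[1]
    # Assumption: the prompt is the first non-empty line of the body.
    for line in _plain_text_body(msg).splitlines():
        if line.strip():
            return sender, line.strip()
    return sender, ""
```

Given a message from `Alice <alice@example.com>` whose body starts with `a red fox in the snow`, this would return the bare address and that line as the prompt. Anything after the first line, such as a signature, is ignored.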
- Server receives e-mail. Registers sender and message as prompt
- Server spawns the Stable Diffusion txt2img script
- Server registers image file output from txt2img
- Server sends e-mail back to sender containing image file
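The spawn-and-detect steps above can be sketched roughly like this, again in Python for illustration instead of the Node.js the server actually used. The script path `scripts/txt2img.py` and the `--prompt` and `--outdir` flags are assumptions about your Stable Diffusion checkout, and `build_command` and `run_and_collect` are hypothetical helper names. This sketch also waits for the script to exit before looking for new files, which is a simplification: it avoids picking up a half-written image, but it is not necessarily how the original server detected output.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Set


def build_command(prompt: str, outdir: Path) -> List[str]:
    """Build the txt2img invocation (script path and flags are assumptions)."""
    return [
        sys.executable, "scripts/txt2img.py",
        "--prompt", prompt,
        "--outdir", str(outdir),
    ]


def _png_files(outdir: Path) -> Set[Path]:
    # Search recursively, since the script may write into sub-folders.
    return set(outdir.rglob("*.png"))


def run_and_collect(command: List[str], outdir: Path) -> List[Path]:
    """Run a command and return the PNG files that appeared in outdir."""
    outdir.mkdir(parents=True, exist_ok=True)
    before = _png_files(outdir)
    # check=True raises CalledProcessError if the script fails.
    subprocess.run(command, check=True)
    return sorted(_png_files(outdir) - before)
```

A call such as `run_and_collect(build_command(prompt, Path("outputs")), Path("outputs"))` would then return the list of new images to attach to the reply e-mail.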
10 hours after thinking of the idea, I had a working proof-of-concept prototype. Incredible what’s possible with technology nowadays.