Week 82: Stable Diffusion as a Creative Personal Assistant
4 Sep, 2022
Admittedly, the prospect of missing the generative AI/ML train, which is running at full speed these days, gives me anxiety. I haven’t found a way to use it in my current job yet, but I’m curious.
This week I have been reading about Stable Diffusion, the open-source latent text-to-image diffusion model capable of generating photo-realistic images given any text input. That it is open source, malleable, and able to run locally is exciting.
Stable Diffusion Hello World
I decided to take it out for a spin on my laptop, a 2020 Intel MacBook Pro. Craig Morten’s article “Setting up Stable Diffusion for MacOS” really helped. In less than 30 minutes I was generating my first images.
AI as Personal Assistant
A few weeks ago, I was listening to an episode of Lex Fridman’s podcast featuring John Carmack. It’s a whopping five-hour conversation, but I found it all kinds of interesting. The part about Artificial General Intelligence (AGI) especially caught my ear. I’m really curious about the idea that, in the future, we will each have a collection of AIs as personal assistants.
Creative Personal Assistant
So why not start developing an AGI as a personal assistant today? My take: a Creative Personal Assistant using Stable Diffusion.
Basically, the idea is to build a service you can email with a prompt and then receive a reply containing the output from Stable Diffusion. It would look something like this:
- Send an email to personalassistant@domain.com with a prompt
- Wait a few minutes
- Receive a reply
I created an e-mail address for the purpose and noted its IMAP/SMTP connection and authentication info. Then I built a Node.js server that monitors the inbox of that e-mail address. When it registers a new e-mail, it extracts the first line of the message body and uses that as the prompt. The server then spawns a Stable Diffusion text-to-image Python script, and when it detects a new output image, it e-mails the image back to the sender.
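To make the prompt-extraction step concrete, here is a rough sketch of what it could look like. It is written in Python with only the standard library, not the Node.js the actual server uses, and the function name `extract_sender_and_prompt` and the choice to skip leading blank lines are assumptions made for illustration.

```python
from email import message_from_string, policy
from email.utils import parseaddr


def extract_sender_and_prompt(raw_email: str) -> tuple[str, str]:
    """Return (sender address, prompt) from a raw e-mail message.

    The prompt is taken to be the first non-blank line of the plain-text
    body (skipping blank lines is an assumption, not from the post).
    """
    msg = message_from_string(raw_email, policy=policy.default)

    # "Jane <jane@example.com>" -> "jane@example.com"
    sender = parseaddr(str(msg.get("From", "")))[1]

    # Pick the text/plain body; HTML-only messages yield an empty prompt.
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""

    for line in body.splitlines():
        if line.strip():
            return sender, line.strip()
    return sender, ""
```

Everything after the first line (greetings, signatures) is ignored, so the sender's address and a single line of text are all the rest of the pipeline needs.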
- Server receives e-mail. Registers sender and message as prompt
- Server spawns the Stable Diffusion txt2img script
- Server registers image file output from txt2img
- Server sends e-mail back to sender containing image file
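The last three steps can be sketched roughly as follows. Again this is Python standing in for the Node.js server, and it is only an illustration under stated assumptions: `generate_image`, `build_reply`, the `{prompt}` placeholder, and the idea that output lands as `.png` files in a known folder are all hypothetical, and the exact txt2img command line is left to the caller because it depends on the Stable Diffusion setup.

```python
import subprocess
from email.message import EmailMessage
from pathlib import Path


def generate_image(prompt: str, command_template: list[str], out_dir: Path) -> Path:
    """Spawn the text-to-image command and return the new image it wrote."""
    # Remember which images already exist so the new one can be detected.
    before = set(out_dir.glob("*.png"))

    # The command is supplied by the caller (hypothetical template); any
    # "{prompt}" placeholder in its arguments is replaced with the prompt.
    command = [arg.replace("{prompt}", prompt) for arg in command_template]
    subprocess.run(command, check=True)

    created = set(out_dir.glob("*.png")) - before
    if not created:
        raise RuntimeError("the script finished without writing a new image")
    # If several images appeared, reply with the most recent one.
    return max(created, key=lambda p: p.stat().st_mtime)


def build_reply(assistant_addr: str, sender: str, prompt: str, image: Path) -> EmailMessage:
    """Build the reply e-mail with the generated image attached."""
    reply = EmailMessage()
    reply["From"] = assistant_addr
    reply["To"] = sender
    reply["Subject"] = "Your generated image"
    reply.set_content(f"Prompt: {prompt}")
    reply.add_attachment(
        image.read_bytes(), maintype="image", subtype="png", filename=image.name
    )
    return reply
```

The resulting message could then be handed to `smtplib` (for example `SMTP_SSL(...).send_message(reply)`) using the SMTP details registered for the assistant's address. Comparing the output folder before and after the run is one simple way to "register" the new file; a file watcher would be an alternative.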
Ten hours after thinking of the idea, I have a working proof-of-concept prototype. It’s incredible what’s possible with technology nowadays.