Creative Personal Assistant

Summary

Creative Personal Assistant is a speculative prototype I made to explore Stable Diffusion, the open-source text-to-image AI model that was released in August 2022.

I started exploring Stable Diffusion for this prototype just a few weeks after its release. At that point the infrastructure required to run it was slow, unstable and came with fees, so no online demo is available for the time being.

Example of e-mail response from my Creative Personal Assistant

Background

As outlined in the original post here on the blog, Week 82: Stable Diffusion as a Creative Personal Assistant, it has become clear to me that the current wave of generative AI/ML technologies is going to change the technology product landscape and how we work. I prefer to experience these technologies up close and hands-on. As usual, I tried to combine several motivations in a single project:

Gain hands-on experience with Stable Diffusion
Use prototyping to speculate how this technology could integrate into daily life
Explore the concept of AI personal assistants described by John Carmack

Stable Diffusion Hello World

I decided to take it out for a spin on my laptop, a 2020 Intel MacBook Pro. The article Setting up Stable Diffusion for MacOS by Craig Morten really helped. In less than 30 minutes I was generating my first images.

Three teddybears watching a sunset together, by Stable Diffusion

AI as Personal Assistant

A few weeks ago, I was listening to an episode of Lex Fridman’s podcast featuring John Carmack. It’s a whopping 5-hour conversation, but I found it all kinds of interesting. The part about Artificial General Intelligence (AGI) especially caught my ear. I’m really curious about the idea that, in the future, we will each have a collection of AIs as personal assistants.

Creative Personal Assistant

So why not start developing an AGI as a personal assistant today? A Creative Personal Assistant using Stable Diffusion.

Basically, the idea is to build a service you can e-mail with a prompt and receive a reply with the output from Stable Diffusion. An effortless, informal conversation with a personal assistant. The flow would look something like this:

Send an email to personalassistant@domain.com with a prompt
Wait a few minutes
Receive a reply

I created an e-mail address for the purpose and registered its IMAP/SMTP connection and authentication info. Then I built a Node.js server that monitors the inbox of that e-mail address. When it registers a new e-mail, it extracts the first line of the message body and uses that as the prompt. The server then spawns a Stable Diffusion text-to-image Python script, and when the server detects a new output image, it e-mails the image back to the sender.
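To make the inbox-monitoring step a bit more concrete, here is a minimal sketch of it. It is written in Python for brevity, whereas my actual server is Node.js, so treat it as an illustration and not the real implementation. The function names, the host and the credentials are all hypothetical, and the sketch takes the first non-empty line of the plain-text body as the prompt.

```python
import imaplib
from email import message_from_bytes
from email.message import Message
from email.utils import parseaddr
from typing import List, Tuple


def fetch_unseen(host: str, user: str, password: str) -> List[bytes]:
    """Return the raw bytes of every unread message in the inbox.

    host, user and password are placeholders for the IMAP settings
    of the assistant's mailbox.
    """
    raw_messages = []
    with imaplib.IMAP4_SSL(host) as imap:
        imap.login(user, password)
        imap.select("INBOX")
        _, data = imap.search(None, "UNSEEN")
        for num in data[0].split():
            _, parts = imap.fetch(num, "(RFC822)")
            raw_messages.append(parts[0][1])
    return raw_messages


def plain_text_body(msg: Message) -> str:
    """Return the first text/plain part of the message, or an empty string."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


def extract_prompt(raw_email: bytes) -> Tuple[str, str]:
    """Return (sender address, prompt) for one raw e-mail."""
    msg = message_from_bytes(raw_email)
    sender = parseaddr(msg.get("From", ""))[1]
    for line in plain_text_body(msg).splitlines():
        if line.strip():
            # The first non-empty line of the body is treated as the prompt.
            return sender, line.strip()
    return sender, ""
```

A polling loop would call fetch_unseen every minute or so and pass each raw message to extract_prompt. Anything after the first line, such as a signature, is simply ignored.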

Server receives e-mail. Registers sender and message as prompt
Server spawns the Stable Diffusion txt2img script
Server registers image file output from txt2img
Server sends e-mail back to sender containing image file
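The last three steps can be sketched roughly as below. Again this is Python and not my Node.js code, so it is an illustration under assumptions: the script path scripts/txt2img.py and the --prompt and --outdir flags are those of the reference Stable Diffusion repository and may differ in other setups, and the function names and SMTP settings are hypothetical.

```python
import smtplib
import subprocess
import sys
from email.message import EmailMessage
from pathlib import Path
from typing import List, Set


def build_txt2img_command(prompt: str, outdir: Path) -> List[str]:
    # Assumption: a txt2img script that accepts --prompt and --outdir.
    return [sys.executable, "scripts/txt2img.py",
            "--prompt", prompt, "--outdir", str(outdir)]


def new_images(outdir: Path, seen: Set[Path]) -> List[Path]:
    """Return PNG files under outdir that were not there before the run."""
    return sorted(p for p in outdir.rglob("*.png") if p not in seen)


def generate_images(prompt: str, outdir: Path) -> List[Path]:
    """Run the script and return the image files it created."""
    outdir.mkdir(parents=True, exist_ok=True)
    seen = set(outdir.rglob("*.png"))
    subprocess.run(build_txt2img_command(prompt, outdir), check=True)
    return new_images(outdir, seen)


def build_reply(assistant: str, recipient: str, prompt: str,
                image_path: Path) -> EmailMessage:
    """Build the reply e-mail with the generated image attached."""
    msg = EmailMessage()
    msg["From"] = assistant
    msg["To"] = recipient
    msg["Subject"] = "Re: " + prompt
    msg.set_content("Here is what I made for: " + prompt)
    msg.add_attachment(image_path.read_bytes(), maintype="image",
                       subtype="png", filename=image_path.name)
    return msg


def send_reply(msg: EmailMessage, host: str, user: str, password: str) -> None:
    # host, user and password are placeholders for the SMTP settings.
    with smtplib.SMTP_SSL(host, 465) as smtp:
        smtp.login(user, password)
        smtp.send_message(msg)
```

Comparing the output folder before and after the run is one simple way to detect the new image. A file watcher would also work and would let the server react while the script is still running.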

Ten hours after thinking of the idea, I had a working proof-of-concept prototype. It’s incredible what’s possible with technology nowadays.