Admittedly, the prospect of missing the generative AI/ML train that’s running at full speed these days gives me anxiety. I haven’t found a way to use it in my current job yet, but I’m curious.
This week I have been reading about Stable Diffusion, the open-source latent text-to-image diffusion model capable of generating photo-realistic images given any text input. That it is open source, malleable, and runs locally is exciting.
A few weeks ago, I was listening to an episode of Lex Fridman’s podcast featuring John Carmack. It’s a whopping 5-hour conversation, but I found it all kinds of interesting. The part about Artificial General Intelligence (AGI) especially caught my ear. I’m really curious about the idea that, in the future, we will have a collection of AIs as personal assistants.
Creative Personal Assistant
So why not start developing an AGI as a personal assistant today? A Creative Personal Assistant using Stable Diffusion.
Basically, the idea is to build a service you can e-mail with a prompt, and receive a reply with the output from Stable Diffusion. It would look something like this:
I created an e-mail address for the purpose and registered the IMAP/SMTP connection and authentication info. Then I built a Node.js server that monitors the inbox of that e-mail address. When it registers a new e-mail, it extracts the first line from the message body and uses that as the prompt. A text-to-image Stable Diffusion Python script is then spawned from my server, and when the server detects a new output image, the image is e-mailed back to the sender.
Server receives e-mail. Registers sender and message as prompt
Server spawns the Stable Diffusion txt2img script
Server registers image file output from txt2img
Server sends e-mail back to sender containing image file
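The four steps above can be sketched roughly as follows. This is a minimal illustration rather than the actual server code: the helper names (extractPrompt, buildTxt2imgArgs, handleMail), the injected generateImage and sendReply functions, and the script path and flags are all assumptions made for the example.

```javascript
// Hypothetical helper: use the first non-empty line of the message body as the prompt.
function extractPrompt(body) {
  const lines = body.split(/\r?\n/).map((line) => line.trim());
  const first = lines.find((line) => line.length > 0);
  return first === undefined ? null : first;
}

// Hypothetical helper: arguments for a txt2img script.
// The script path and flag names are assumptions; adjust them to the script in use.
function buildTxt2imgArgs(prompt, outdir) {
  return ["scripts/txt2img.py", "--prompt", prompt, "--outdir", outdir];
}

// Steps 1-4 in one place. `generateImage` and `sendReply` are injected so the
// flow can be tried without a real mailbox or GPU; both are hypothetical.
async function handleMail(mail, { generateImage, sendReply }) {
  const prompt = extractPrompt(mail.body);
  if (prompt === null) return null; // empty message: nothing to generate
  const imagePath = await generateImage(prompt); // e.g. spawn the script with buildTxt2imgArgs
  await sendReply(mail.from, prompt, imagePath);
  return imagePath;
}
```

Keeping the prompt as a single entry in an argument array (as accepted by something like Node's child_process.spawn), rather than pasting it into a shell command string, means text from a stranger’s e-mail is never interpreted by a shell.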
10 hours after thinking of the idea, I have a working proof-of-concept prototype. Incredible what’s possible with technology nowadays.
It wasn’t until today, Sunday, that I found some time to work on side projects. Last week I added tracking to Morphaweb so I can see if people actually use the site to export reels - and it turns out people do! That gave me some motivation to work a bit more on Morphaweb (yes, the title pun was terrible).
Next up, I would like to automatically add a marker between each pair of uploaded files, for convenience. I should also start tagging my releases like the real software developers do.
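One way the automatic markers could work, sketched under the assumption that each uploaded file’s duration in seconds is known once it has been decoded. The function name is hypothetical and not part of Morphaweb:

```javascript
// Given the durations (in seconds) of files placed back to back,
// return the time of each boundary between two consecutive files.
function markerPositions(durations) {
  const markers = [];
  let offset = 0;
  // Stop before the last file: there is no boundary after it.
  for (let i = 0; i < durations.length - 1; i++) {
    offset += durations[i];
    markers.push(offset);
  }
  return markers;
}

// Three files of 2 s, 3 s and 1.5 s give boundaries at 2 s and 5 s.
console.log(markerPositions([2, 3, 1.5])); // [ 2, 5 ]
```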
The fifth exhibition in Louisiana’s series The Architect’s Studio presents Forensic Architecture, an interdisciplinary research agency based at Goldsmiths, University of London. Working at the intersection of architecture, law, journalism, human rights and the environment, Forensic Architecture investigates conflicts and crimes around the world.
The exhibition itself was really interesting, but even more so was that they actually keep a GitHub repository of all the models and tools they’ve created.
Another light week of side project work, but I did manage to finish the Blender x Three.js tutorials and thereby also the last bit of Three.js Journey. It’s been a great course, and the best money I’ve ever spent on e-learning. Bruno is a great teacher, and he’s even added lessons since I bought the course, so maybe I’ll get even more for my money.
Blender x Three.js
I finished the UV unwrapping of the model a couple of weeks ago, so the last part of the lesson was about optimizing the model and exporting everything correctly from Blender.
Quite a lot of things to keep track of - somehow both easier and more difficult than I had imagined. Final result looks great though, and I’m excited and terrified to get started on my portfolio and working on the LEGO elements.
This week was going to be a slow side project week due to lots of work and sunny evenings in Copenhagen. But then I saw a GitHub notification about a project I worked on a while back…
One of my hobbies is modular synthesis. One of the modules I use is Morphagene by Make Noise. By their own description:
The Morphagene music synthesizer module is a next generation tape and microsound music module that uses Reels, Splices and Genes to create new sounds from those that already exist. Search between the notes to find the unfound sounds.
It’s one of my favorite modules, but it can be a bit tricky to get sounds from outside the modular system onto the Morphagene. It has a slot for an SD card where you can place reels. The reels have to be in a particular format though, and if you want to configure your splices up front, you have to download paid software.
To work around this, I have developed a free, open source, web-based app called Morphaweb. It allows you to build reels with splice markers and export them in the correct format. All without uploading anything to a server, to protect your privacy.
To use it, you simply import audio by dragging it into the app, then use the waveform editor and shortcuts to add or remove splice markers. When you’re done, you can download your reel as a Morphagene-compatible WAV file.
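As an illustration of what exporting markers can involve, here is a rough sketch that assumes splice markers are written as standard cue points in the WAV file’s "cue " chunk. That assumption, and the function name buildCueChunk, are not taken from Morphaweb’s actual code.

```javascript
// Build a RIFF "cue " chunk from marker positions given in sample frames.
// Positions are assumed to be non-negative integers that fit in 32 bits.
function buildCueChunk(positions) {
  const headerSize = 12; // chunk id (4) + chunk size (4) + cue point count (4)
  const pointSize = 24; // six 32-bit fields per cue point
  const buffer = new ArrayBuffer(headerSize + positions.length * pointSize);
  const view = new DataView(buffer);
  const writeFourCC = (offset, text) => {
    for (let i = 0; i < 4; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  writeFourCC(0, "cue ");
  // The size field counts everything after the id and size fields themselves.
  view.setUint32(4, 4 + positions.length * pointSize, true);
  view.setUint32(8, positions.length, true);

  positions.forEach((position, index) => {
    const base = headerSize + index * pointSize;
    view.setUint32(base, index + 1, true); // unique cue point id
    view.setUint32(base + 4, position, true); // position in play order
    writeFourCC(base + 8, "data"); // the chunk the cue point refers to
    view.setUint32(base + 12, 0, true); // chunk start
    view.setUint32(base + 16, 0, true); // block start
    view.setUint32(base + 20, position, true); // sample offset
  });

  return new Uint8Array(buffer);
}
```

The returned bytes would still need to be placed inside a complete WAV file alongside the format and data chunks, which this sketch leaves out.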
This week was a bit sporadic because of work taking up a lot of time and headspace, but I did find time to think about newspapers, work a bit more on Blender lessons, and read.
I spent my holidays reflecting a bit on how I use my time on social media, and how to stay up to date with “the industry” and people whose work I like.
Twitter has brought me so much joy, allowing me to digest an incredible amount of news and work from a wide range of organizations and people. But the product has devolved into so much suggested content that I didn’t ask for, yet that hits all the right brain chemicals to keep me scrolling. Same story for Instagram.
At the same time, RSS feeds are seeing a resurgence, and I have now switched to the same setup as Matt, combining NetNewsWire and Feedbin. Feedbin even allows for Twitter integration, so now I can get the tweets I’m interested in without any of the garbage. NetNewsWire has a “star” feature that works like an archive, a great replacement for Twitter’s bookmarks.
NetNewsWire is efficient, but not exciting. I miss the beautiful typography used as a graphic element, the layout, and the images of traditional media. Feedbin has an API, so I’m wondering if I should try to sketch an “RSS newspaper”, a more visceral, scrolling experience that would allow me to consume more news in a visually enticing way.
I’m still experimenting with my setup and building it out. I still check Twitter and Instagram, but through the web versions rather than the apps.
I’m at the stage of doing UV unwrapping of all objects, which is fun and frustrating. Since the whole idea was to create a 3D world including LEGO elements for a portfolio, I’m not looking forward to unwrapping those.
I’m reading Build by Tony Fadell together with colleagues. It’s curious, because it’s the kind of book I would have been ecstatic to read 5 years ago. I have a lot of respect for Tony Fadell and what he has accomplished in his career, but coming into this book I have to admit I thought I didn’t need to read it. My manager kept pushing it though, so now I am reading it, and learning a lot from it.