Apparently, he had created a Slack bot with different personas based on which
Slack channel he uses, e.g. #ai-copywriter for copywriting,
#ai-web-developer, #ai-lawyer, etc.
I thought the idea of having an AI team was genius, and started recreating it
myself. But I had never built a Slack bot before.
Iteration 1: Zapier
I googled a bit and found that I could use Zapier to create the necessary glue
between OpenAI’s GPT and Slack:
Zapier monitors my Slack channels
If a message appears in #ai-copywriter, then…
Prompt GPT
Post the GPT response back into the Slack channel
It got me started quickly, but I found Zapier’s GPT integrations to be buggy,
and it costs money to run the number of Zaps that I need. Their no-code
flow was barely easier, and certainly more limited, than writing my own client.
Iteration 2: SlackGPT v1
Zapier seemed like unnecessary middleware, so I asked ChatGPT if it could write
me a Node.js-based Slack bot that basically does the four-step process outlined
above. It generated fairly decent code that got me 90% of the way there, and I
only had to modify a few dependencies and functions to get it working.
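The core loop is simple enough to sketch in a few lines. Below is a minimal Python sketch of the four steps (the actual bot is Node.js, and the function names here are my own illustration, not the real SlackGPT code). The Slack and GPT calls are passed in as plain callables, so the flow is easy to test or swap out:

```python
def handle_message(channel: str, text: str, complete, post) -> str:
    """One pass of the loop: a message appeared in `channel` (step 1),
    prompt GPT with it (steps 2-3), and post the response back (step 4).

    `complete` is any prompt -> text callable (e.g. a wrapper around the
    OpenAI chat API); `post` sends a message to a Slack channel (e.g. a
    wrapper around Slack's chat.postMessage)."""
    reply = complete(text)
    post(channel, reply)
    return reply


# Example wiring with stand-ins for the real Slack/OpenAI clients:
sent = []
handle_message(
    "#ai-copywriter",
    "Write a tagline for a coffee brand",
    complete=lambda prompt: "[GPT reply to: " + prompt + "]",
    post=lambda ch, msg: sent.append((ch, msg)),
)
```

Keeping the GPT and Slack clients injectable like this also made it painless to test the loop without hitting either API.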
This is a great next iteration, but it has its limitations. Technically, every
GPT prompt is a single prompt-response pair, meaning it won’t hold any previous
conversation in memory. In my experience, GPT rarely gets anything right the
first time; the power is in the conversation.
Iteration 3: SlackGPT v2
So I decided to change the bot to create a new thread every time I submit a
prompt. That way, it can hold the thread in memory as a conversation, and
there’s also a benefit for me: I have a clear mental model of what I want it to
remember. That is, its memory is thread-based.
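A sketch of what thread-based memory can look like, assuming the conversation is keyed by Slack’s thread timestamp (illustrative names, not the actual SlackGPT code):

```python
class ThreadMemory:
    """Keeps one conversation history per Slack thread, so each new prompt
    in a thread can be sent to GPT together with everything said before."""

    def __init__(self):
        self._threads = {}  # thread_ts -> list of chat messages

    def add(self, thread_ts, role, text):
        """Record a message ('user' or 'assistant') in its thread."""
        self._threads.setdefault(thread_ts, []).append(
            {"role": role, "content": text}
        )

    def history(self, thread_ts):
        """The full conversation so far, in the `messages` format the chat
        API expects; a brand-new thread starts empty."""
        return list(self._threads.get(thread_ts, []))
```

On each mention, the bot appends the user’s message, sends history(thread_ts) to GPT, and appends the reply, so memory never leaks between threads.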
Limitations
Multiple channels instead of bots
I wanted each role to have its own user, like @Copywriter and
@WebDeveloper, but it’s only possible to have a single bot per Slack app. So
rather than having multiple bots, I have a single bot that monitors each
channel and switches persona per my prompt design, based on which channel it
detects a mention in:
These prompts are ok, but I know there is a lot more that can be done on this
aspect of SlackGPT to make it work a lot better.
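The channel-to-persona switch itself can be as small as a lookup table. A sketch, with stand-in prompts rather than the real prompt design:

```python
# Stand-in persona prompts; the real prompt design is more elaborate.
PERSONAS = {
    "ai-copywriter": "You are an expert copywriter. Write clear, persuasive copy.",
    "ai-web-developer": "You are a senior web developer. Answer with working code.",
    "ai-lawyer": "You are a lawyer. Explain legal questions in plain language.",
}

DEFAULT_PERSONA = "You are a helpful assistant."


def system_prompt_for(channel: str) -> str:
    """Pick the system prompt based on the channel the mention came from."""
    return PERSONAS.get(channel.lstrip("#"), DEFAULT_PERSONA)
```

The chosen prompt is sent as the system message on every request, which is what makes one bot feel like several team members.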
I also need to mention @SlackGPT every time I want to prompt it. Even when I’m
in a thread with @SlackGPT, I currently need to mention it to trigger it;
within a thread, that shouldn’t be necessary. This should be fairly trivial to
solve if I bothered to read the Slack API docs.
Perspectives
This was a really fun experiment, done in a couple of days, and it resulted in
a powerful prototype. There are so many features that I can see being within reach:
Higher abstraction level tasks
Currently, each member of the team solves tasks for me based on a prompt. But
what if I could instead give the team higher-level tasks, such as:
“I want to develop a new feature for our website where clients can book
consultancy hours with me. It should appear as a small link on the website,
and when clicked, go to my Calendly site”
Perhaps an #ai-product-manager role could take this task, break it into
separate parts, and assign them to the various team members who will need to
write copy, design the visual appearance, and develop the code necessary to
launch the feature. Maybe the team members could even have seniority or rank,
e.g. #ai-web-developer takes orders from #ai-product-manager, or when in
conflict, #ai-designer outranks #ai-web-developer, suddenly introducing social
dynamics into a team.
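None of this exists yet, but the seniority idea could start as something as simple as a rank table consulted when two roles disagree. A purely speculative sketch, with made-up role names and ranks:

```python
# Hypothetical ranks; a higher rank outranks a lower one when outputs conflict.
RANK = {
    "ai-product-manager": 3,
    "ai-designer": 2,
    "ai-web-developer": 1,
}


def resolve_conflict(role_a: str, role_b: str) -> str:
    """Return the role whose output wins; ties go to the first role."""
    return role_b if RANK.get(role_b, 0) > RANK.get(role_a, 0) else role_a
```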
How I spend my time
I’m curious how this approach could enable me to spend less time coding and
doing production work, and more time at higher level problems, which is how I’d
like to spend my time nowadays. The AI team can work around the clock, so I
would also be able to run many more iterations, significantly faster.
As we work together more, I could pick the best examples of the team’s work and
include them in their prompt design, or even fine-tune a model, so that my team
is refined over time.
Being able to ask a bot to generate applications or components is fantastic,
already making me significantly more efficient. But the idea of having a team
that works around the clock, solving things under my direction, is, at the
time of writing, a superpower.
Cost and Scale
I’m currently running the first iteration of an AI team of three roles for a few
dollars per month. For example, the Node.js server that runs the SlackGPT bot
cost 490 OpenAI tokens to generate, which translates to around $0.01. And then,
of course, 20-30 min of my time to iterate on and implement the code, which
hopefully will decrease as the systems become more sophisticated.
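As a sanity check of that number, assuming the roughly $0.02 per 1K tokens that the davinci-class models cost at the time (pricing varies by model):

```python
def token_cost_usd(tokens: int, usd_per_1k_tokens: float = 0.02) -> float:
    """Back-of-the-envelope API cost for a given token count."""
    return tokens / 1000 * usd_per_1k_tokens


# 490 tokens at $0.02/1K is just under a cent:
assert round(token_cost_usd(490), 2) == 0.01
```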
But you get the idea, right? As a generalist who knows just enough to get by in
full-stack design and engineering, I can single-handedly build products at
speed.
Role of Automation
The idea isn’t to replace humans; I dream of having a human partner to build
products with. But I see an AI team as a great supplement, and potentially the
main resource for lower-level production tasks relating to code, design, and
text.
A source of inspiration for me has been the No Man’s Sky team. They built a
procedurally generated game so vast that it would be impossible for them to do
quality control on every planet. So
they built space probes
that allow them to at least do quality control on a significantly larger number
of planets and fine-tune the parameters.
Presentation and Curation
This system of thinking seems critical for this age of AI-enabled design and
development. With fewer constraints on the amount of work that can be
produced, my role will be to direct and curate the work, so that process needs
to be effective and constructive too.
I’m curious how the team could present the work, whether it be code or design or
text. Maybe there could be a Notion integration so that I simply click onto a
“Week 109” page, where multiple solutions are laid out in front of me, and I can
give the team feedback and consolidate the work?
What’s next
In summary, I think what’s next is to:
Fix limitations outlined above if possible
Write better prompts for each team member
Add ability for team to collaborate on tasks
Add a designer to the team, and ability to generate visual results
Add integrations with systems that allow the team to present work
The code for SlackGPT, along with instructions for how to deploy it on an online
hosting service like Render, is available at
https://github.com/knandersen/slackgpt.
This post was originally written March 21, but wasn’t posted until late
April…
It’s been quiet for a few weeks! Once you miss a weeknote it’s easier to miss
the next one too, unfortunately. But I’ve been busy.
CADGPT
A few weeks ago I thought: sure, generating 2D bitmaps with AI is interesting,
but I’m even more curious about how generative AI could be a useful tool in
physical product making.
I’ve worked with parametric CAD modeling for years, but the problem there is
that you still have to know which steps are needed to get from intention to
result. GPT-style generation holds the promise that you don’t have to know how
in order to get where you want, and it allows for effortless mixing of concepts
and mental models.
I’m a Rhino user, so I decided to try to build “CADGPT”, a chatbot for
generating Rhino files. See an example here:
I saw a few different ways to get to the result above:
Write a standalone Python/JavaScript GPT client that generates Rhino files
Write a Rhino integration that allows for querying GPT
Approach 2 seemed difficult, and I would likely spend unnecessary time figuring
out how to write Rhino plug-ins, so I went with Approach 1.
For Approach 1, I was inspired by
Nat Friedman’s natbot, where he uses a library
called Playwright to navigate a browser based on an elaborate GPT prompt design
(demo here).
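My Approach 1 client boils down to two small pieces: building a prompt that asks GPT for a geometry-generating script, and pulling the code out of the chat-style reply. A sketch (the prompt wording and helper names are my illustration, and the rhino3dm package is one way to write .3dm files from plain Python):

```python
PROMPT_TEMPLATE = (
    "Write a complete Python script using the rhino3dm package that creates "
    "the following geometry and saves it as a .3dm file: {request}\n"
    "Reply with a single fenced Python code block and nothing else."
)


def build_prompt(request: str) -> str:
    """Fill the geometry request into the instruction prompt sent to GPT."""
    return PROMPT_TEMPLATE.format(request=request)


def extract_code(reply: str) -> str:
    """Pull the body of the first fenced code block out of a GPT reply,
    falling back to the raw reply if there is no fence."""
    if "```" not in reply:
        return reply.strip()
    body = reply.split("```", 2)[1]
    # Drop an optional language tag like "python" on the fence line.
    newline = body.find("\n")
    if newline != -1 and body[:newline].strip().isalpha():
        body = body[newline + 1:]
    return body.strip()
```

The extracted script can then be inspected and run to produce the Rhino file.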
It was fun to go from idea to prototype in just a couple of hours, but I also
quickly realised I might be solving the right problem with the wrong solution.
What’s our language for describing shape? How would you tell CADGPT to generate
anything but primitive shapes, say, a dog? How would you describe a dog’s shape
without using the word dog? Maybe text or chat isn’t the right modality for
this. I’ll stop for now, but I have more thoughts on this which I might get to
at a later time.
Editing this post a month later, I realized I could have just asked GPT how to
think about approaching this task, i.e. Approach 1 or 2. When I began using GPT,
I was mostly thinking about low-level tasks. Now that I have become more
experienced and absorbed more approaches from people I follow on Twitter, I
find myself giving GPT tasks at increasingly higher levels of abstraction.
I mentioned last week that I was trying out LAVIS, but that the Ubuntu machine I was running it on died. I spent some time reinstalling and now have a bit to share.
It’s really cool how easy it is to play with image analysis, and as I also mentioned last week, it feels like reverse diffusion. Here’s a photo I found on Unsplash:
If you like this photo, please check out the photographer FETHI BOUHAOUCHINE. Running BLIP Caption on it, I get the caption: “boy smiling while holding leaf”
That’s cool, but what is even cooler is that you can use GradCam to give you a heatmap of the parts of the image that are “boy smiling while holding leaf”:
And you can even get gradient maps for the individual keywords.
I think I’ll leave LAVIS there for now, but I have a feeling I will return to it at some point.
Self-Employment Status
I’m not the solo-entrepreneurial type, and I have always worked in teams and mostly in large organizations, so this is a journey filled with a lot of new learnings: about myself, and about living with uncertainty.
Most people have responded with anything ranging from curiosity and support to envy. For a lot of people, I’m living the dream. I don’t quite see it that way, or maybe it’s just about defining what the dream means.
As I have also experienced in my work, working on a dream product or dream project doesn’t mean it’s fun all the time. “If you love what you do, you’ll never work a day in your life” seems like complete bullshit to me. A better way of framing it, in my opinion, is the Fun Scale system that I learned about when I read The Ultimate Hiker’s Gear Guide:
Fun Scale
Type 1 fun is fun to do and fun to talk about later.
Type 2 fun is not fun to do but fun to talk about later.
Type 3 fun is not fun to do and not fun to talk about later.
Types 2 and 3, over time, yield the most memorable and significant kinds of fun. But they wear you out, so you need some Type 1 fun during your day or week to keep you going.
Being my own manager now, although I have total freedom, I find setting that balance for myself challenging. In a lot of ways I feel far busier than when I had a regular job. Without the pressures of deadlines or other constraints, I am ultimately faced with the vastness of being able to do whatever I want.
I’m finding it hard to prioritize my projects and time. There are so many things I want to do. Should I prioritize doing:
What I feel most like doing?
What might make me unique from other people out there?
What might be the most exciting career path?
What might be the safest career path?
One thing that’s been helping is listening to the Pathless Path podcast, which is about the art of not having a regular job. I love how it ranges from talking about the more emotional aspects of loneliness, social pressures, etc., to things like how to think about financials, budget, or marketing.
Particularly, “The Art of Sabbaticals” episode was really helpful. I don’t know a lot of people who have taken sabbaticals or broken out as freelancers, and even fewer within my domain, so it’s nice to hear from someone with more experience.
A brief status from my side at this point after 6 weeks:
What’s going well
Cooking more, fermenting food and drinks again
Sleeping a lot better
Absorbing a lot of research and literature
Meeting lots of interesting people
Exploring Copenhagen more
Not going so well
Learnings feel scattered and unfocused
Starting to feel lonely working by myself
Struggling with sitting at home
What I’m doing about it
To deal with the things not going so well, a few things are in the works. I already use cafés and workspaces a lot, but there I can mostly only do writing work.
To be able to do physical prototyping work and to feel less lonely, I am applying for a 12-week startup incubator course which will start in April. I think it could be a great opportunity to meet other people facing similar challenges, and have a place to work from.
To modularize my learning, I am creating timeboxes for myself and planning projects to work on during the coming months, trying to fold learning into projects. A few topics I’m wanting to learn, and the preliminary plan:
As for Calculus, I did actually take it (and surprisingly passed) at uni, but I never really learned it, and it seems to be foundational for learning ML.
Volunteering and local clubs
I have also volunteered to help start up a local Coding Pirates club where kids can learn to code. I’m really excited to play and learn with kids, and hopefully reconnect with some of the products I’ve helped design.
I also want to be part of things where I don’t have to lead or drive, but can just have fun. Together with a friend, I recently started going to CPH MUSIC MAKER SPACE, a sound-hacking community that meets every week to build music machines.
This week, I soldered up a couple of Breadboard Friends, a set of helper modules designed by the legendary Eurorack synthesizer manufacturer Mutable Instruments, aka Émilie Gillet.
I have a number of MI modules, but what I especially love about Émilie’s work is how incredibly thoughtfully the modules are designed, and that all hardware and software is open source.
My current big-picture plan is to use the Raspberry Pi Pico as my MCU with MicroPython, and write a set of modules for doing audio/CV stuff in my Eurorack synth. The Pico W will soon have BLE support, and I was thinking I could make a sequencer app on a phone and use it to talk to a Pico W in a Eurorack. I would also love to play with AI x Eurorack. More on this later, probably.
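As a first sliver of that plan, the core sequencer logic is plain Python and should run unchanged under MicroPython. A toy sketch (the hardware, BLE, and CV output are all still to come):

```python
class StepSequencer:
    """A fixed-length gate pattern that advances one step per clock tick."""

    def __init__(self, pattern):
        self.pattern = list(pattern)  # e.g. [1, 0, 0, 1]: 1 = gate on
        self.position = -1  # so the first tick lands on step 0

    def tick(self) -> bool:
        """Advance on a clock pulse; return True if this step fires a gate."""
        self.position = (self.position + 1) % len(self.pattern)
        return bool(self.pattern[self.position])
```

On a Pico, tick() would be called from a timer or clock-input interrupt, and its result written to a GPIO pin driving the gate output.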
Blog updates
Fixed an issue where content was overflowing on mobile. It didn’t show up when using responsive mode in my desktop browser. A good reminder to always test on device.
Updating my portfolio always seems exciting to begin with, and then I get stuck in a technical detail or find myself unhappy with the creative direction. But this week I did it! It felt necessary, as I can see a lot of people visit my portfolio, and it doesn’t include any of my professional work from the past 8 years.
When interviewing, I have been building private portfolios, which is fun and allows a more targeted approach, but it’s also a big timesuck. These days I’m really trying to optimize my time for research and creative pursuits, so updating my public portfolio felt right.
It’s always difficult. How many projects do you include? How many details? Which ones? I made a decision to include what I feel best represents the kind of work I’m excited to do in the future. And I wanted the entire portfolio to be just a single page, with no navigating in and out of pages except for contact. Others might not like this (feedback is welcome), but it’s based on my experience hiring designers and looking through hundreds of portfolios over the past couple of years.
I see portfolios as being about catching someone’s attention, and I find that the faster I can navigate through the information and find something appealing, the better. The portfolio isn’t the end, but the beginning of a conversation. For all the projects I have on display, I have tons more stories and details.
I already got some good feedback from friends, and based on that will likely make bigger changes, but I feel like I have a good foundation to work from now.
LAVIS
A friend recently mentioned LAVIS, an open-source deep learning library for Language-Vision Intelligence by Salesforce. It makes it really easy to use a bunch of language-vision models. I decided to take it for a spin and try out BLIP.
BLIP is basically “stable diffusion in reverse”, in that you input an image and it returns a textual description of it. I was going to show some of the experiments here, but unfortunately the Ubuntu partition I was running everything on got corrupted, so that’ll have to wait for another time.
This week has been less about creating and more about gathering.
Weekly reminder, please keep booking 30 min with me to talk about anything. I’m having a lot of fun learning and exploring by myself, but also starting to miss the social interactions and sense of purpose from working with a team. So if you’ve got an interesting project and could use a hand - please do reach out.
Writing The Why
As I mentioned last week, I have started writing a book. The working title is “Tools & Interactions”, based on the name of my former team at Bang & Olufsen.
I haven’t written anything substantial since my Master’s thesis, and who knows if I’ll ever finish or publish it, but so far I’m excited to work on it.
Currently, it’s a way of looking back and reflecting on my experience working in the fuzzy 0-to-1 end of product design. On trying to bring interaction design to companies where it didn’t seem to properly exist previously, while still defining it for myself.
It’s also about how after all this time, I’m still not sure what the role of interaction design is or if it’s even understood. How the current state of “UX/UI Design” seems to mean making high-fidelity flows in Figma. How Figma seems to have become the hammer that makes every problem look like a nail. And how this single-tool, single-mind mentality has reduced design to being a styling discipline rather than using design to explore new mental models or material qualities of technology. Interaction Design doesn’t really seem to be concerned with coming up with new embodied interfaces anymore, settling for multi-touch glass rectangles. I think we can and should do better.
I think designers lack tools that let them utilize computation and machine intelligence to augment their thinking and creating, and my hope is that the book will successfully argue why we should build them, and provide examples of how we can think and make our way there.
I’m also trying to argue that in the tool-building, you form a deep understanding of technology and information as material, which in turn allows you to be more creative with it.
Steve Jobs did a great job of articulating the potential of the computer as a “bicycle for the mind”. In the video, he references the chart below from Scientific American. We are indeed not the only animal that makes tools, but we are the animal most capable of making tools that augment our capabilities by orders of magnitude.
Reading
Thinking about my writing as a book is proving to have good side-effects. Books should be properly researched, and when I find myself wanting to recount history or make a certain claim, this forces me to do more research.
Thinking about computers as tools brought me back to reading MINDSTORMS by Seymour Papert. I first read it when we were building LEGO Education SPIKE Prime and forming a relationship with the Lifelong Kindergarten group at the Media Lab. In designing the product, the programming hub user interface, bringing Scratch Blocks as the visual programming language to LEGO, and forming the overall technology concept, it was a huge inspiration.
The book is over 40 years old, but more relevant than ever. Papert talks about
using the computer as an “object-to-think-with”, and hammers it home with this quote:
One might say the computer is being used to program the child. In my vision, the child programs the computer and, in doing so, both acquires a sense of mastery over a piece of the most modern and powerful technology and establishes an intimate contact with some of the deepest ideas from science, from mathematics, and from the art of intellectual model building.