Collaborative Assistants

5 May, 2024

Before OpenAI released the assistant API, I used state machines to emulate long running, multi-step language model generations.

These workflows closely resembled the behavior of an autonomous agent that exhibited a good degree of reasoning to call the right functions in parallel/sequence and complete tasks.

Since each assistant was a state machine, I wanted to see what it would look like if I spawned multiple assistants with specific roles and gave them the ability to communicate with each other.

For the example below, I've used an imaginary scenario where the CEO of a company notices an issue and shares it with the team.

In the above experiment, the following things happened:

The product manager creates an issue and sends a message to a developer.
The developer receives the message, creates a pull request, and sends a message to another developer for review.
The second developer reviews the pull request and sends a message back to the first developer that it's been approved.
The first developer merges the pull request and sends a message back to the product manager that the issue has been resolved.
The product manager then finally closes the issue.

It turns out that by assuming roles, the assistants are able to communicate with each other and distribute the work amongst themselves.

One thing that surprised me was the backward propagation of messages that happened during execution. There was no explicit instruction in the system prompts to encourage this behavior, but the assistants realized that they needed to update their peers in order to close the issue.

Since the relationship was hierarchical, it makes sense for the messages to follow a top to bottom, and bottom to top flow of communication.

However, for a distributed system I anticipate the messages to be passed without any specific order, leading to a more complex flow of communication.

For example, if I spawned three teams for executive, engineering, and design, it is possible that the product managers from design and engineering team communicate with each other to triage tasks before assigning them to the developers and designers, while the executive team communicates with the product managers to prioritize the company's overall business goals.

Spawning multiple assistants and making them collaborate with each other reliably well is an interesting problem because it reflects the same set of problems that humans face when working in teams, like communication, coordination, and task distribution.

At the same time, it is also a very powerful abstraction because it allows us to build complex systems that can solve problems that are too large for a single assistant to handle - just like how humans build companies to solve problems that are too large for a single person to handle.

With multiple assistants, it is possible to:

Use different models for different kinds of tasks, like using a better code-gen model for developer assistants, and a better planning model for product manager assistants.
Allow and restrict access to resources, like databases, files, and APIs, to different assistants based on their roles.
Free the assistant from having a bloated context, like a single assistant that has to know everything about the company, and instead allow them to focus on their specific roles and responsibilities.
Go beyond single threaded execution and parallelize tasks across multiple assistants to speed up work.
Prevent having a single point of failure when an assistant errors out.

I'm excited to see the consequences of this abstraction and how it will shape the future of assistants.

Of course, it is fair to be skeptical of its reliability at its current state, but I'm hopeful that it will improve as these models get better over time.

Posted by Jeremy