AI Agents
An AI agent is an autonomous entity that observes an environment, decides what to do (based on its internal policy), and performs actions towards achieving its goals. In terms of the actor model, an agent can be considered an actor that can:
- Receive events, such as an instruction on what to do next, which goal to accomplish, or an observation of the environment
- Send events, which would cause actions to be performed on the environment
- Store state, which can be used to remember contextual information about the environment
- Spawn other agents, which can be used to create a hierarchy of agents that can work together and coordinate their actions to achieve a goal
The Stately Agent (@statelyai/agent
) package makes it simple to create agents and agent behavior based on the actor model and state machines. These agents can do much more than generate text and execute function calls; Stately Agent is a framework for:
- Storing message history between the user and assistant when using the generative text features
- Making observations of an environment, recording the transitions (previous state, event, next state) so it can understand the environment
- Receiving feedback on decisions it makes, so it can retrieve and corrolate feedback so that it can make more informed decisions
- Making plans in its decision-making progress, so that it not only predicts the very next decision to make, but a sequence of decisions that ideally reaches the goal
- Short-term and long-term memory for remembering message history, observations, feedback, and plans that it makes.
Installationβ
Install the following dependencies:
@statelyai/agent@beta
βΒ Stately.ai Agent, currently in beta@ai-sdk/openai
βΒ The Vercel AI SDK for OpenAI, which provides access to the OpenAI APIxstate
β Library for managing state machines and statechartszod
β Library for type-safe schema validation
- npm
- pnpm
- yarn
npm install @statelyai/agent @ai-sdk/openai xstate zod
pnpm install @statelyai/agent @ai-sdk/openai xstate zod
yarn add @statelyai/agent @ai-sdk/openai xstate zod
Quick startβ
- Add your provider's API key to your
.env
file.
OPENAI_API_KEY="sk-abCDE..."
- Create an agent.
import { openai } from '@ai-sdk/openai';
import { createAgent } from '@statelyai/agent';
const agent = createAgent({
name: 'todo',
model: openai('gpt-4-turbo'),
events: {},
});
- Add event schemas using Zod. These are the events that the agent is allowed to "cause" (i.e. send to the actor)
import { openai } from '@ai-sdk/openai';
import { createAgent } from '@statelyai/agent';
import { z } from 'zod';
const agent = createAgent({
model: openai('gpt-4-turbo'),
name: 'todo',
events: {
'todo.add': z.object({
todo: z
.object({
title: z.string().describe('The title of the todo'),
content: z.string().describe('The content of the todo'),
completed: z
.boolean()
.describe('The completed value of the todo')
.optional(),
})
.describe('Adds a new todo'),
}),
'todo.toggle': z.object({
todoId: z.string().describe('The ID of the todo to toggle'),
completed: z.boolean().describe('The new completed value').optional(),
}),
},
});
- Interact with a state machine actor that accepts those events.
import { setup, createActor } from 'xstate';
import { agent } from './agent';
const todoMachine = setup({
types: {
// Add the event types that the agent understands
events: agent.types.events,
},
// ...
}).createMachine({
// ...
});
const todoActor = createActor(todoMachine);
agent.interact(todoMachine);
todoActor.start();
Creating an agentβ
You can create an agent using the createAgent(settings)
function. There are required settings:
name
- The name of the agent, used for logging and agent learning purposesmodel
- The AI SDK language model to use for generating text and making tool callsevents
- A mapping of event types to Zod event schemas that the agent can trigger (i.e. events it can send to some live environment that it is interacting with)
import { createAgent } from '@statelyai/agent';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
const agent = createAgent({
name: 'barista',
model: openai('gpt-4-turbo'),
events: {
'barista.makeDrink': z
.object({
drink: z.enum(['espresso', 'latte', 'cappuccino']),
})
.describe('Makes a drink'),
// ...
},
});
You can specify additional settings to customize the agent's behavior:
description
- A description of the agent, used for logging and agent learning purposes, as well as for agents that call other agents (multi-agent systems)planner
- An async function that takes theagent
and plannerinput
and resolves with anAgentPlan
that potentially includes thesteps
and thenextEvent
to execute to achieve theinput.goal
. This function is used to determine what the agent should do next when making a decision based on the current state (input.state
) and goal (input.goal
).logic
- The agent logic function that is used to determine what an agent does when it receives an agent event, such as"agent.feedback"
,"agent.observe"
,"agent.message"
, or"agent.plan"
.
Making decisionsβ
The most important feature of a Stately agent is the ability to make decisions based on the current state and goal. This is done using the agent.decide(input)
async function, which takes an input
object that contains the current state, state machine, and goal, and resolves with an AgentPlan
.
For example, suppose you have the following baristaMachine
state machine:
import { createMachine } from 'xstate';
export const baristaMachine = createMachine({
initial: 'idle',
states: {
idle: {
on: {
'barista.makeDrink': 'makingDrink',
},
},
makingDrink: {
on: {
'barista.drinkMade': 'idle',
},
},
},
});
You can then use the agent.decide(input)
function to determine what the agent should do next:
import { createAgent } from '@statelyai/agent';
import { baristaMachine } from './baristaMachine';
const agent = createAgent({
name: 'barista',
model: openai('gpt-4-turbo'),
events: {
'barista.makeDrink': z
.object({
drink: z.enum(['espresso', 'latte', 'cappuccino']),
})
.describe('Makes a drink'),
},
});
async function handleOrder(order, state) {
const resolvedState = baristaMachine.resolveState(state);
const plan = await agent.decide({
state: resolvedState,
machine: baristaMachine,
goal: `A customer made this order: ${order}`,
});
return plan;
}
handleOrder('I want a latte please', { value: 'idle' });
// Resolves with an `AgentPlan` that includes:
// {
// // ...
// nextEvent: { type: 'barista.makeDrink', drink: 'latte' },
// }
Agent memoryβ
Stately agents can have two types of memory: short-term (local) memory and long-term memory.
- Short-term (local) memory is memory that can be synchronously retrieved, but might not be persisted.
- Long-term memory is memory that is asynchronously retrieved from persistent storage, such as a database.
Agents remember four kinds of things in their memory:
- Messages between the user and the assistant
- Observations of state transitions (previous state, event, current state) that occur in the environment that the agent is observing
- Feedback
- Plans
Messagesβ
agent.getMessages()
β
Returns chat messages that occur between the user and the assistant from short-term memory.
agent.addMessage(message)
β
If you want to manually add a message between the assistant and user to agent memory, you can call agent.addMessage(message)
to do so. This is automatically called when calling agent.generateText(β¦)
, agent.streamText(β¦)
, or the fromText(β¦)
and fromTextStream(β¦)
actor logic creators. You should avoid calling this manually.
Observationsβ
agent.getObservations()
β
Returns observations that the agent observes from short-term memory.
agent.addObservation(observation)
β
You can add an observation ({ prevState, event, state, β¦ }
) to an agent's memory by calling agent.addObservation(observation)
. This function returns an AgentObservation
object that includes the provided observation details as well as an observation id
so that the observation can be referenced in feedback, if applicable.
const observation = agent.addObservation({
prevState: { value: 'idle', context: {} },
event: { type: 'grindBeans' },
state: { value: 'grindingBeans', context: {} },
});
Feedbackβ
agent.getFeedback()
β
Returns feedback that is given to the agent from short-term memory.
agent.addFeedback(feedback)
β
const observation = agent.addObservation({
// ...
});
const feedback = agent.addFeedback({
observationId: observation.id,
goal: 'Make an iced coffee',
attributes: {
feedback: 'Water should not be boiled for an iced coffee',
score: -10,
},
});
Plansβ
agent.getPlans()
β
Returns plans that the agent has made from short-term memory.
agent.addPlan(plan)
β
TODO
Interacting with state machinesβ
An agent can interact with existing state machine actors to determine what to do next. While the state machine actor is running, the agent will do the following cycle:
- The agent observes state changes
- The observation is remembered in the agent's state
- The agent determines if it needs to make a decision based on the current state
- If it does, the agent makes a decision in the form of an
AgentPlan
. - If an
AgentPlan
is formed, the agent triggers the next event (plan.nextEvent
) on the state machine actor.
- The plan is remembered in the agent's state.
- The agent goes back to step 1, and the cycle continues.
import { createAgent } from '@statelyai/agent';
import { createActor } from 'xstate';
import { jokeMachine } from './jokeMachine';
const agent = createAgent({
name: 'joke-teller',
model: openai('gpt-4'),
events: {
'agent.tellJoke': z.object({
joke: z.string().describe('The joke text'),
}),
'agent.rateJoke': z.object({
rating: z.number().min(1).max(10),
explanation: z.string(),
}),
// ...
},
});
const jokeActor = createActor(jokeMachine).start();
agent.interact(jokeActor, (observed) => {
if (observed.state.matches('tellingJoke')) {
return { goal: `Tell a joke about ${observed.state.context.topic}` };
}
if (observed.state.matches('ratingJoke')) {
return { goal: `Rate this joke: ${observed.state.context.joke}` };
}
});
State machine agentsβ
You can invoke Stately agents as part of a state machine, ensuring that it will follow the state machine's transitions as specified and trigger the appropriate events. This is done by using any of the following actor logic creators:
fromDecision(agent)
β
Returns promise actor logic that resolves with the agent plan that should accomplish the goal (input.goal
), if it is able to create one.
When invoked/spawned, this actor will also add the user and assistant messages to agent memory, as well as the plan that it created.
fromText(agent)
β
Returns promise actor logic that resolves with the generated text result from the Vercel AI SDK.
When invoked/spawned, this actor will also add the user and assistant messages to agent memory.
import { createAgent, fromText } from '@statelyai/agent';
import { setup } from 'xstate';
const agent = createAgent(/* ... */);
const machine = setup({
actors: {
assistant: fromText(agent),
},
}).createMachine({
initial: 'greeting',
context: (x) => ({
time: x.input.time,
}),
states: {
greeting: {
invoke: {
src: 'assistant',
input: ({ context }) => ({
context: {
time: context.time,
},
goal: 'Produce a greeting depending on the time of day.',
}),
onDone: {
target: 'greeted',
actions: ({ event }) => {
console.log(event.output.text);
},
},
},
},
greeted: {
type: 'final',
},
},
});
const actor = createActor(machine, {
input: { time: Date.now() },
});
actor.start();
fromTextStream(agent)
β
Returns observable actor logic that streams the text from the Vercel AI SDK.
When invoked/spawned, this actor will also add the user and assistant messages to agent memory.
TODO: example
Observabilityβ
- Can observe observations, plans, messages, and feedback via
agent.on('message', (message) => {})
- Can manually add feedback observations via
agent.addFeedback(β¦)
Generating textβ
You can use the agent.generateText(input)
method to generate text from an input. This extends the generateText(β¦)
function from the Vercel AI SDK by:
- Adding the user and assistant messages to agent memory
- Providing the ability to retrieve previous observations, feedback, plans and messages from agent memory
Streaming textβ
You can use the agent.streamText(input)
method to stream text from an input. This extends the streamText(β¦)
function from the Vercel AI SDK by:
- Adding the user and assistant messages to agent memory
- Providing the ability to retrieve previous observations, feedback, plans and messages from agent memory
Agent logicβ
Agent logic is actor logic that has the specific purpose of performing LLM tasks for the agent. Agent logic goes beyond just being a wrapper and provides the ability to use the agent's state machine to intelligently determine which action to take next.
Agent logic is most powerful when used with a state-machine-powered agent, but you can also create standalone actors from agent logic, which is useful for testing and simple tasks.
Examplesβ
See the current examples in the examples directory.