Absurd Workflows: Durable Execution With Just Postgres
好的,我现在需要帮用户总结一篇文章的内容,控制在100字以内,而且不需要特定的开头。首先,我得仔细阅读文章,理解其主要观点。 文章主要讲的是Absurd这个库,它基于Postgres实现 durable workflows。作者提到现在很多人都在构建 durable execution 系统,但很多都很复杂,需要第三方服务。而Absurd用Postgres就能实现,不需要额外扩展。 接下来,文章解释了durable execution的基本概念,强调可靠性和持久性。然后详细介绍了Absurd的设计:一个SQL文件和一个简单的SDK,任务分解成步骤,存储状态在Postgres里。 还提到了如何处理任务的暂停、失败和重试,并且支持事件和睡眠功能。最后作者总结说这只是一个队列和状态存储的结合,适用于不需要复杂系统的场景。 现在我要把这些信息浓缩到100字以内。重点包括:Absurd是基于Postgres的库,用于durable workflows,任务分解成步骤存储状态,支持事件和睡眠,简单易用。 确保语言简洁明了,直接描述内容。 </think> 文章介绍了一个基于Postgres的轻量级库Absurd,用于实现持久化工作流(durable workflows)。通过将任务分解为可检查点的步骤,并利用Postgres的队列和状态存储能力,Absurd支持可靠的任务执行、暂停、重试以及事件等待功能。 2025-11-3 00:0:0 Author: lucumr.pocoo.org(查看原文) 阅读量:8 收藏

written on November 03, 2025

It’s probably no surprise to you that we’re building agents somewhere. Everybody does it. Building a good agent, however, brings back some of the historic challenges involving durable execution.

Entirely unsurprisingly, a lot of people are now building durable execution systems. Many of these, however, are incredibly complex and require you to sign up for another third-party service. I generally try to avoid bringing in extra complexity if I can avoid it, so I wanted to see how far I can go with just Postgres. To this end, I wrote Absurd 1, a tiny SQL-only library with a very thin SDK to enable durable workflows on top of just Postgres — no extension needed.

Durable Execution 101

Durable execution (or durable workflows) is a way to run long-lived, reliable functions that can survive crashes, restarts, and network failures without losing state or duplicating work. Durable execution can be thought of as the combination of a queue system and a state store that remembers the most recently seen execution state.

Because Postgres is excellent at queues thanks to SELECT ... FOR UPDATE SKIP LOCKED, you can use it for the queue (e.g., with pgmq). And because it’s a database, you can also use it to store the state.

The state is important. With durable execution, instead of running your logic in memory, the goal is to decompose a task into smaller pieces (step functions) and record every step and decision. When the process stops (whether it fails, intentionally suspends, or a machine dies) the engine can replay those events to restore the exact state and continue where it left off, as if nothing happened.

Absurd At A High Level

Absurd at the core is a single .sql file (absurd.sql) which needs to be applied to a database of your choice. That SQL file’s goal is to move the complexity of SDKs into the database. SDKs then make the system convenient by abstracting the low-level operations in a way that leverages the ergonomics of the language you are working with.

The system is very simple: A task dispatches onto a given queue from where a worker picks it up to work on. Tasks are subdivided into steps, which are executed in sequence by the worker. Tasks can be suspended or fail, and when that happens, they execute again (a run). The result of a step is stored in the database (a checkpoint). To avoid repeating work, checkpoints are automatically loaded from the state storage in Postgres again.

Additionally, tasks can sleep or suspend for events and wait until they are emitted. Events are cached, which means they are race-free.

With Agents

What is the relationship of agents with workflows? Normally, workflows are DAGs defined by a human ahead of time. AI agents, on the other hand, define their own adventure as they go. That means they are basically a workflow with mostly a single step that iterates over changing state until it determines that it has completed. Absurd enables this by automatically counting up steps if they are repeated:

absurd.registerTask({name: "my-agent"}, async (params, ctx) => {
  let messages = [{role: "user", content: params.prompt}];
  let step = 0;
  while (step++ < 20) {
    const { newMessages, finishReason } = await ctx.step("iteration", async () => {
      return await singleStep(messages);
    });
    messages.push(...newMessages);
    if (finishReason !== "tool-calls") {
      break;
    }
  }
});

This defines a single task named my-agent, and it has just a single step. The return value is the changed state, but the current state is passed in as an argument. Every time the step function is executed, the data is looked up first from the checkpoint store. The first checkpoint will be iteration, the second iteration#2, iteration#3, etc. Each state only stores the new messages it generated, not the entire message history.

If a step fails, the task fails and will be retried. And because of checkpoint storage, if you crash in step 5, the first 4 steps will be loaded automatically from the store. Steps are never retried, only tasks.

How do you kick it off? Simply enqueue it:

await absurd.spawn("my-agent", {
  prompt: "What's the weather like in Boston?"
}, {
  maxAttempts: 3,
});

And if you are curious, this is an example implementation of the singleStep function used above:

Single step function
async function singleStep(messages) {
  const result = await generateText({
    model: anthropic("claude-haiku-4-5"),
    system: "You are a helpful agent",
    messages,
    tools: {
      getWeather: { /* tool definition here */ }
    },
  });

  const newMessages = (await result.response).messages;
  const finishReason = await result.finishReason;

  if (finishReason === "tool-calls") {
    const toolResults = [];
    for (const toolCall of result.toolCalls) {
      /* handle tool calls here */
      if (toolCall.toolName === "getWeather") {
        const toolOutput = await getWeather(toolCall.input);
        toolResults.push({
          toolName: toolCall.toolName,
          toolCallId: toolCall.toolCallId,
          type: "tool-result",
          output: {type: "text", value: toolOutput},
        });
      }
    }
    newMessages.push({
      role: "tool",
      content: toolResults
    });
  }

  return { newMessages, finishReason };
}

Events and Sleeps

And like Temporal and other solutions, you can yield if you want. If you want to come back to a problem in 7 days, you can do so:

await ctx.sleep(60 * 60 * 24 * 7); // sleep for 7 days

Or if you want to wait for an event:

const eventName = `email-confirmation-${userId}`;
try {
  const payload = await ctx.waitForEvent(eventName, {timeout: 60 * 5});
  // handle event payload
} catch (e) {
  if (e instanceof TimeoutError) {
    // handle timeout
  } else {
    throw e;
  }
}

Which someone else can emit:

const eventName = `email-confirmation-${userId}`;
await absurd.emitEvent(eventName, { confirmedAt: new Date().toISOString() });

That’s it!

Really, that’s it. There is really not much to it. It’s just a queue and a state store — that’s all you need. There is no compiler plugin and no separate service or whole runtime integration. Just Postgres. That’s not to throw shade on these other solutions; they are great. But not every problem necessarily needs to scale to that level of complexity, and you can get quite far with much less. Particularly if you want to build software that other people should be able to self-host, that might be quite appealing.

This entry was tagged ai and announcements

copy as / view markdown


文章来源: https://lucumr.pocoo.org/2025/11/3/absurd-workflows/
如有侵权请联系:admin#unsafe.sh