ViteHub is still experimental. Expect bugs and breaking changes.

Rate limit

Check or consume a trusted invocation budget before the Agent runs.

rateLimit() adds a pre-invocation budget check. It uses trusted invocation identity, records a typed decision, and can reject before the main Agent Invocation proceeds.

Installation

Import the Capability factory from @vite-hub/agent/capabilities and add it to defineAgent({ capabilities }). Use the configuration example below as the starting point, then tighten modes, policies, stores, and providers for the Agent boundary.

What it adds

The Capability checks or consumes one budget unit for the configured identity, scope, limit, and window. It records the Rate Limit Decision as an Agent Invocation Context Value and exposes it as a finish extension.
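To make the difference between checking and consuming one budget unit concrete, here is a minimal in-memory sketch of a fixed-window counter. The names FixedWindowStore and Decision, the decision fields, and the key layout are hypothetical and are not ViteHub's API; only the ideas of limit, window, check, and consume come from the text above.

```typescript
// Hypothetical decision shape; field names are illustrative, not ViteHub's types.
interface Decision {
  allowed: boolean
  limit: number
  used: number
  remaining: number
  resetAt: number // epoch milliseconds when the current window ends
}

interface WindowEntry {
  windowStart: number
  used: number
}

// A minimal in-memory fixed-window counter (illustration only).
class FixedWindowStore {
  private counts = new Map<string, WindowEntry>()
  private limit: number
  private windowMs: number

  constructor(limit: number, windowMs: number) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // Return the entry for the key, starting a fresh one when the window has rolled over.
  private current(key: string, now: number): WindowEntry {
    const windowStart = Math.floor(now / this.windowMs) * this.windowMs
    const entry = this.counts.get(key)
    if (entry && entry.windowStart === windowStart) return entry
    const fresh = { windowStart, used: 0 }
    this.counts.set(key, fresh)
    return fresh
  }

  private decide(entry: WindowEntry, allowed: boolean): Decision {
    return {
      allowed,
      limit: this.limit,
      used: entry.used,
      remaining: Math.max(0, this.limit - entry.used),
      resetAt: entry.windowStart + this.windowMs,
    }
  }

  // check(): report whether one more unit would fit, without spending it.
  check(key: string, now: number = Date.now()): Decision {
    const entry = this.current(key, now)
    return this.decide(entry, entry.used < this.limit)
  }

  // consume(): spend one unit if the budget allows it.
  consume(key: string, now: number = Date.now()): Decision {
    const entry = this.current(key, now)
    const allowed = entry.used < this.limit
    if (allowed) entry.used += 1
    return this.decide(entry, allowed)
  }
}

// Hypothetical key layout: capability id, scope, then identity.
const key = ['rate-limit', 'support', 'user-123'].join(':')
const store = new FixedWindowStore(2, 60_000)
const first = store.consume(key, 0)
const second = store.consume(key, 1_000)
const third = store.consume(key, 2_000) // budget exhausted within the window
const afterReset = store.check(key, 60_000) // a new window has started
```

In this sketch a rejected consume() leaves the count unchanged, and check() never changes it, which is one plausible reading of the "check" and "consume" actions.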

Configuration

Use the memory store only for local development, tests, or single-process hosts. Hosted runtimes require an explicit Rate Limit Store.

server/agents/support.ts
import { defineAgent } from '@vite-hub/agent'
import { rateLimit } from '@vite-hub/agent/capabilities'

// `model` is assumed to be a model configured elsewhere in your project.
export default defineAgent({
  driver: { model },
  capabilities: [
    rateLimit({
      limit: 20,
      store: 'memory',
      window: '1m',
    }),
  ],
})

Runtime behavior

rateLimit() runs during the input phase. It resolves identity from the Agent Invoker by default, falls back through run metadata or trusted IP headers when configured, then calls the store's check() or consume() method.
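The fallback chain described above can be sketched as a small resolver. This is an illustration under stated assumptions: the InvocationInfo shape, the lower-cased header keys, and the order of run metadata before IP headers are all hypothetical and are not taken from ViteHub's implementation.

```typescript
// Hypothetical inputs; the real invocation context is richer than this.
interface InvocationInfo {
  invokerId?: string
  runId?: string
  headers: Record<string, string | undefined> // assumed to use lower-cased names
}

interface ResolvedIdentity {
  value: string
  source: 'invoker' | 'run' | 'ip'
}

// Resolve identity from the invoker first, then run metadata, then trusted IP headers.
function resolveIdentity(
  info: InvocationInfo,
  trustedIpHeaders: string[] = [],
): ResolvedIdentity | undefined {
  if (info.invokerId) return { value: info.invokerId, source: 'invoker' }
  if (info.runId) return { value: info.runId, source: 'run' }
  // Only headers that were explicitly listed as trusted are ever read.
  for (const name of trustedIpHeaders) {
    const raw = info.headers[name.toLowerCase()]
    // Forwarding headers may carry a comma-separated chain; this sketch takes the
    // first entry, but which entry is trustworthy depends on the proxy setup.
    const ip = raw?.split(',')[0]?.trim()
    if (ip) return { value: ip, source: 'ip' }
  }
  return undefined
}

const byInvoker = resolveIdentity({ invokerId: 'user-123', headers: {} })
const byIp = resolveIdentity(
  { headers: { 'x-forwarded-for': '203.0.113.7, 10.0.0.1' } },
  ['x-forwarded-for'],
)
// Without configured trusted headers, the IP header is ignored.
const untrusted = resolveIdentity({ headers: { 'x-forwarded-for': '203.0.113.7' } })
```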

Rejected requests throw RateLimitRejectedError with status code 429 and retry headers.
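As a rough picture of what a 429 rejection with retry information can look like, the sketch below derives a Retry-After value from the window reset time. SketchRateLimitError is a hypothetical stand-in, not ViteHub's RateLimitRejectedError, and the text does not say which retry headers the library actually sets; Retry-After is used here because it is the standard HTTP header for this purpose.

```typescript
// Hypothetical error class for illustration only.
class SketchRateLimitError extends Error {
  readonly statusCode = 429
  readonly headers: Record<string, string>

  constructor(message: string, resetAt: number, now: number) {
    super(message)
    this.name = 'SketchRateLimitError'
    // Retry-After is given in whole seconds, rounded up so clients do not retry early.
    const retryAfterSeconds = Math.max(0, Math.ceil((resetAt - now) / 1000))
    this.headers = { 'Retry-After': String(retryAfterSeconds) }
  }
}

// The window resets at t = 60 000 ms and the request arrived at t = 1 500 ms.
const rejection = new SketchRateLimitError('Rate limit exceeded', 60_000, 1_500)
```

A host adapter could copy statusCode and headers from such an error onto its HTTP response.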

Requirements

rateLimit() requires a positive integer limit and a window such as 60s, 15m, or 1h. Hosted runtimes require an explicit store with check() and consume() methods.
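The two requirements can be illustrated with a small validator. This is a sketch, not the library's parser: parseWindow and assertLimit are hypothetical helpers, and the sketch accepts whole-number durations only, which may be stricter than what rateLimit() accepts.

```typescript
// Milliseconds per unit, for the units listed in the Options table.
const UNIT_MS: Record<string, number> = {
  ms: 1,
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
}

// Convert a window such as "60s", "15m", or "1h" to milliseconds.
function parseWindow(window: string): number {
  const match = /^(\d+)(ms|s|m|h|d)$/.exec(window)
  if (!match) throw new TypeError(`Invalid window: ${window}`)
  const factor = UNIT_MS[match[2] ?? ''] ?? 0
  const ms = Number(match[1]) * factor
  if (ms <= 0) throw new TypeError(`Window must be longer than zero: ${window}`)
  return ms
}

// Reject anything that is not a positive integer.
function assertLimit(limit: number): void {
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new TypeError(`limit must be a positive integer, got ${limit}`)
  }
}

assertLimit(20)
const oneMinute = parseWindow('60s')
const quarterHour = parseWindow('15m')
const oneHour = parseWindow('1h')
```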

Trusted IP identity requires configured trusted headers. Use Agent Invoker identity when the host can provide it.

Driver support

Agent Driver       Support
Model-backed       Runs before model execution and records the decision.
Harness-backed     Runs before harness execution and records the decision.
Custom-run-backed  Runs before driver.run; rejected requests do not reach custom code.

Inspect and verify

Run repeated invocations with the same identity until the budget is exhausted. Inspect the Rate Limit Decision for limit, used, remaining, reset time, identity source, and scope.

To confirm the store requirement, run on a hosted runtime without an explicit store in a development or preview deployment. The Capability should fail with a store-requirement error instead of falling back to process memory.
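The expected failure mode can be pictured with a small resolver. Everything here is a hypothetical sketch: SketchStore, resolveStore, the hosted flag, and the error message are assumptions, and the real store methods may be asynchronous and return richer decisions.

```typescript
// Hypothetical store shape; the text only says a store exposes check() and consume().
interface SketchStore {
  check(key: string): boolean
  consume(key: string): boolean
}

// `hosted` stands in for however the runtime detects a hosted environment (an assumption).
function resolveStore(store: SketchStore | undefined, hosted: boolean): SketchStore {
  if (store) return store
  if (hosted) {
    // Fail loudly rather than silently counting in per-process memory.
    throw new Error('rateLimit() needs an explicit store on hosted runtimes')
  }
  // Local fallback: a placeholder that always allows, standing in for process memory.
  return { check: () => true, consume: () => true }
}

let hostedFailure: unknown
try {
  resolveStore(undefined, true)
} catch (error) {
  hostedFailure = error
}
const localStore = resolveStore(undefined, false)
```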

Options

Option            Type                                          Default                               Description
limit             number | function                             required                              Maximum allowed count for the window.
window            ${number}ms|s|m|h|d                           required                              Fixed window duration.
action            "check" | "consume"                           "consume"                             Store operation.
id                string                                        "rate-limit"                          Capability id and invocation context key.
identity          "auto" | "invoker" | "ip" | "run" | function  "auto"                                Identity used to build the rate-limit key.
scope             string | function                             capability id                         Extra key partition.
store             RateLimitStore | "memory" | function          local memory outside hosted runtimes  Store implementation.
trustedIpHeaders  string[]                                      none                                  Request headers trusted for identity: "ip" or auto IP fallback.
message           string | function                             default rejection message             Error message when the limit rejects.
onDecision        function                                      none                                  Callback after every rate-limit decision.
onAllowed         function                                      none                                  Callback after allowed decisions.
onRejected        function                                      none                                  Callback after rejected decisions.
