Meter AI tokens

Meter LLM input and output tokens end to end — declared in the product, reported from the backend.

An AI proxy bills on tokens, not request counts. The pattern is a two-part loop: declare token dimensions as @Meters and bind them to a chat route's reports in the product, then send the real per-request token counts from your backend with withUsage. The gateway reserves an estimate before the call and settles the actual value on the response.
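To make the reserve-then-settle step concrete, here is a minimal sketch of the arithmetic. It is an illustration only, not the gateway's implementation: the Usage type, the settle function, and the choice to treat an unreported meter as 0 are all assumptions made for this example.

```typescript
// Hypothetical model of reserve-then-settle; names and behavior are assumptions.
type Usage = Record<string, number>;

interface Settlement {
  reserved: Usage; // held before the upstream call, from the estimates
  actual: Usage;   // reported by the upstream after the call
  delta: Usage;    // actual minus reserved, per meter
}

function settle(reserved: Usage, actual: Usage): Settlement {
  const delta: Usage = {};
  for (const [meter, held] of Object.entries(reserved)) {
    // Assumption: a meter the upstream did not report settles as 0.
    delta[meter] = (actual[meter] ?? 0) - held;
  }
  return { reserved, actual, delta };
}

const result = settle(
  { input_tokens: 1000, output_tokens: 500 }, // estimates held at admission
  { input_tokens: 820, output_tokens: 1310 }, // real counts from the model
);
console.log(result.delta); // { input_tokens: -180, output_tokens: 810 }
```

A negative delta means the estimate was higher than the real usage; a positive delta means the request used more than was reserved.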

This is the same reports + withUsage mechanism as Add a metered capability, specialized for LLM usage. We build a small llm-api product to keep the example self-contained.

Declare the token meters and the chat route

estimate is the pre-request admission value: the amount the gateway holds before the upstream reports the actual count. Set it to a typical per-request value so admission checks are realistic.

// product/product.config.ts
import { Product, Requests, Meter, Feature, Plan } from "@farthershore/product";

@Product({
  name: "llm-api",
  origin: "https://api.llm.example.com",
  displayName: "LLM API",
})
export default class LlmApi {
  @Requests()
  requests!: unknown;

  // SUM meters: the upstream reports a value per request; the platform adds
  // them up. `estimate` is reserved pre-request for input tokens.
  @Meter("input_tokens", { display: "Input Tokens", unit: "token", estimate: 500 })
  inputTokens!: unknown;

  @Meter("output_tokens", { display: "Output Tokens", unit: "token", estimate: 500 })
  outputTokens!: unknown;

  @Feature("chat", {
    plans: ["pro"],
    routes: {
      // Both token meters are dynamic reports; you can override the input-side
      // admission estimate per route if this endpoint runs larger prompts.
      "POST /v1/chat/completions": {
        reports: ["input_tokens", "output_tokens"],
        estimates: { input_tokens: 1000 },
      },
    },
  })
  chat!: unknown;

  // Pay-as-you-go: no recurring fee, priced per token. micros are micro-dollars
  // per unit — 15 micros = $0.000015 / input token.
  @Plan("pro", {
    name: "Pro",
    meter: {
      input_tokens: { micros: 15 },
      output_tokens: { micros: 60 },
    },
    limits: { requests: { rate: 600, interval: "minute", enforcement: "enforce" } },
  })
  pro!: unknown;
}

A meter listed under reports must carry an estimate, either from the meter's own estimate or from the route's estimates. A reported meter with no estimate fails the build, because the gateway needs a number to reserve before your upstream runs. Every metered route also still inherits requests = 1 automatically.
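The estimate rule can be sketched as a small lookup: a route-level estimate wins, the meter's own estimate is the fallback, and a reported meter with neither is an error. The MeterDef and RouteDef types and the resolveEstimates function below are hypothetical stand-ins for illustration; they are not part of @farthershore/product, and the real build may report the failure differently.

```typescript
// Hypothetical shapes mirroring the config; not the library's own types.
interface MeterDef { estimate?: number }
interface RouteDef { reports: string[]; estimates?: Record<string, number> }

function resolveEstimates(
  meters: Record<string, MeterDef>,
  route: RouteDef,
): Record<string, number> {
  const resolved: Record<string, number> = {};
  for (const key of route.reports) {
    // The route override takes precedence over the meter-level default.
    const estimate = route.estimates?.[key] ?? meters[key]?.estimate;
    if (estimate === undefined) {
      throw new Error(`meter "${key}" is reported but has no estimate`);
    }
    resolved[key] = estimate;
  }
  return resolved;
}

const meters = { input_tokens: { estimate: 500 }, output_tokens: { estimate: 500 } };
const chatRoute = {
  reports: ["input_tokens", "output_tokens"],
  estimates: { input_tokens: 1000 },
};
const resolved = resolveEstimates(meters, chatRoute);
console.log(resolved); // { input_tokens: 1000, output_tokens: 500 }
```

With the values from the config, the chat route reserves 1000 input tokens (the route override) and 500 output tokens (the meter default).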

Report the real token counts from your backend

Install @farthershore/backend, set FS_RUNTIME_TOKEN (see Operate via an agent for minting one), verify the gateway signature, then return the model's usage figures with withUsage. The helper signs the usage into the response as it travels back through the gateway, so no extra network call is needed.

import { fartherShore, withUsage } from "@farthershore/backend";

const fs = fartherShore.initFromEnv(); // derives everything from FS_RUNTIME_TOKEN

export async function POST(request: Request) {
  const url = new URL(request.url);
  const body = new Uint8Array(await request.clone().arrayBuffer());

  // Fail-closed verification: identity comes only from the gateway's signature,
  // never from the plaintext X-FS-* headers.
  await fs.verifyRequest({
    method: request.method,
    path: url.pathname,
    query: url.search,
    headers: request.headers,
    body,
  });

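  // callModel is your own helper that calls the LLM provider. It is assumed to
  // return an OpenAI-style body with usage.prompt_tokens / usage.completion_tokens.
  const completion = await callModel(await request.json());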

  return withUsage(
    request,
    Response.json(completion),
    {
      input_tokens: completion.usage.prompt_tokens,
      output_tokens: completion.usage.completion_tokens,
    },
    // Optional free-form pricing/analytics context persisted with the event.
    { measureContext: { model: completion.model } },
  );
}

The meter keys you pass to withUsage must match the @Meter keys in the product (input_tokens, output_tokens); a mismatch is rejected. Values are validated locally before signing — they must be non-negative finite numbers.
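If you want to catch these mistakes in your own unit tests before a request ever reaches withUsage, a pre-flight check along these lines may help. This is a sketch of the two stated rules only (keys must be declared, values must be non-negative finite numbers); checkUsage and DECLARED_METERS are hypothetical names, and the helper's real error format is not shown in this guide.

```typescript
// Meter keys declared in the product config for this example.
const DECLARED_METERS: ReadonlySet<string> = new Set(["input_tokens", "output_tokens"]);

// Returns a list of problems; an empty list means the usage object passes both rules.
function checkUsage(usage: Record<string, number>, declared: ReadonlySet<string>): string[] {
  const problems: string[] = [];
  for (const [key, value] of Object.entries(usage)) {
    if (!declared.has(key)) {
      problems.push(`unknown meter key: ${key}`);
    }
    // Number.isFinite rejects NaN, Infinity, and -Infinity.
    if (!Number.isFinite(value) || value < 0) {
      problems.push(`invalid value for ${key}: ${value}`);
    }
  }
  return problems;
}

console.log(checkUsage({ input_tokens: 820, output_tokens: 1310 }, DECLARED_METERS)); // []
console.log(checkUsage({ inputTokens: 820 }, DECLARED_METERS)); // [ 'unknown meter key: inputTokens' ]
```

A common source of NaN here is a provider response with a missing usage field, so it can be worth checking the model's response shape before reporting.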

Express upstreams

If your upstream is Express, verify with the middleware and report the same way:

import { fartherShore } from "@farthershore/backend";

const fs = fartherShore.initFromEnv();
app.use(fs.middleware()); // `app` is your existing Express app; fail-closed verify → req.fartherShore

Build and verify

farthershore build --format json

Push product/**, then drive a real chat request through a test persona and read the breakdown:

# Token meters appear per-dimension once traffic flows.
farthershore usage summary llm-api --format json

Verify it works

  • farthershore build succeeds and the IR lists input_tokens + output_tokens as SUM meters with estimates.
  • A POST /v1/chat/completions call is allowed and forwarded.
  • input_tokens / output_tokens in the usage summary match the model's usage.prompt_tokens / usage.completion_tokens.
  • The Pro invoice charges tokens at the per-unit micros you set.
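To sanity-check the invoice line, you can work out the expected charge by hand from the per-unit micros. The sketch below does that arithmetic for one request; costInMicros and the sample token counts are made up for illustration, and it ignores anything the platform may add on top, such as rounding rules or taxes.

```typescript
// Per-unit prices from the Pro plan in the config, in micro-dollars per token.
const PRO_MICROS: Record<string, number> = { input_tokens: 15, output_tokens: 60 };

function costInMicros(
  usage: Record<string, number>,
  prices: Record<string, number>,
): number {
  let total = 0;
  for (const [meter, quantity] of Object.entries(usage)) {
    const price = prices[meter];
    if (price === undefined) throw new Error(`no price for meter "${meter}"`);
    total += quantity * price; // integer micros, so no floating-point drift
  }
  return total;
}

// Sample request: 1200 input tokens and 350 output tokens.
const micros = costInMicros({ input_tokens: 1200, output_tokens: 350 }, PRO_MICROS);
const dollars = micros / 1_000_000;
console.log(micros, dollars); // 39000 0.039
```

Here 1200 × 15 + 350 × 60 = 39,000 micros, or $0.039 for the request. Keeping the running total in integer micros and converting to dollars only at the end avoids accumulating floating-point error.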

Related

  • Add a metered capability — the general reports + withUsage loop.
  • Prepaid credits — meter tokens down against a prepaid balance.
  • Operate via an agent — mint FS_RUNTIME_TOKEN and a test persona via the CLI.