## Overview
person.run tracks detailed metrics for every persona interaction. You can monitor conversation volume, response quality, token usage, and latency in real time from the dashboard — or export raw events via the API for custom analysis pipelines.
## What's tracked
Every API operation emits a usage event. These events power the dashboard charts and are available for export.
| Event | Tracked data |
|---|---|
| `persona_create` | Tenant, persona seed, timestamp |
| `persona_prompt` | Persona ID, prompt text, response, session ID, tokens in/out, model, latency |
| `persona_list` | Tenant, query parameters, result count |
| `persona_update` | Persona ID, fields changed |
| `persona_delete` | Persona ID, tenant |
| `persona_timeline_append` | Persona ID, memory type, strength, event label |
| `persona_timeline_supersede` | Timeline entry ID, reason |
| `persona_consistency_check` | Persona ID, issue count, issue types |
| `file_upload` | Persona ID, file type, file size |
| `timeline_reconcile_job` | Persona ID, job status |
## Dashboard metrics
The dashboard provides an overview of your most important metrics:
- Prompt events — total persona prompt events over the current period.
- Persona count — number of personas created for the tenant.
- Timeline entries — total timeline memories across all personas.
- Token usage — tokens consumed (input and output) across all personas.
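These same totals can be recomputed from exported raw events in a custom pipeline. The sketch below assumes each exported event carries a `type` field matching the event names above, plus `tokensIn`/`tokensOut` on prompt events — the exact export schema is an assumption, not documented here:

```python
# Sketch of a custom analysis pass over exported usage events.
# Assumption: each event has a "type" field named after the events
# above, and persona_prompt events carry tokensIn/tokensOut counts.
from collections import Counter

def summarize(events):
    """Compute dashboard-style totals from raw usage events."""
    totals = Counter()
    for event in events:
        if event["type"] == "persona_prompt":
            totals["prompt_events"] += 1
            totals["tokens"] += event.get("tokensIn", 0) + event.get("tokensOut", 0)
        elif event["type"] == "persona_create":
            totals["personas"] += 1
        elif event["type"] == "persona_timeline_append":
            totals["timeline_entries"] += 1
    return dict(totals)

events = [
    {"type": "persona_create"},
    {"type": "persona_prompt", "tokensIn": 820, "tokensOut": 310},
    {"type": "persona_timeline_append"},
]
print(summarize(events))
# {'personas': 1, 'prompt_events': 1, 'tokens': 1130, 'timeline_entries': 1}
```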
## Usage tracking and limits
Usage is tracked per tenant and enforced against your plan's limits. When a non-credit usage limit is exceeded, the API returns 403:
```json
{
  "error": "Usage limit exceeded"
}
```

Prompt requests are credit-primary. If a prompt would exceed your credit balance, the API returns 402 with top-up or upgrade guidance:

```json
{
  "error": "Insufficient credits",
  "requiredCredits": 1200,
  "availableCredits": 300,
  "action": "upgrade_or_topup",
  "billingPath": "/dashboard/billing"
}
```

## Response metadata
Every prompt response (sync, async, and streaming) includes metadata you can use for your own analytics:
| Field | Description |
|---|---|
| `sessionId` | Unique identifier for this prompt/response pair. |
| `modelName` | The AI model used for generation (e.g., `gpt-4.1-mini`). |
| `tokensIn` | Number of input tokens (prompt + context + memories). |
| `tokensOut` | Number of output tokens (generated response). |
| `latencyMs` | Total generation time in milliseconds. |
For streaming responses, this metadata is included in the `done` event.
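If you feed this metadata into your own analytics, a small extraction helper keeps the fields you care about. The sketch below assumes the `done` event's data is a JSON object carrying the fields listed above — the exact streaming envelope is an assumption:

```python
import json

# Fields documented in the response-metadata table above.
METADATA_FIELDS = ("sessionId", "modelName", "tokensIn", "tokensOut", "latencyMs")

def metadata_from_done_event(event_data: str) -> dict:
    """Extract analytics fields from a streaming done-event payload.

    Assumption: the done event's data is a JSON object that includes
    the metadata fields; any other keys are ignored.
    """
    payload = json.loads(event_data)
    return {k: payload[k] for k in METADATA_FIELDS if k in payload}

done_data = ('{"sessionId": "session-uuid", "modelName": "gpt-4.1-mini", '
             '"tokensIn": 742, "tokensOut": 188, "latencyMs": 1840}')
meta = metadata_from_done_event(done_data)
print(meta["tokensIn"] + meta["tokensOut"])  # 930
```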
## Rate limiting
Rate limits are enforced at two levels to protect the platform and ensure fair usage:
### IP-level rate limiting
Mutation endpoints (POST, PUT, PATCH, DELETE) are rate-limited to 300 requests per minute per IP address. This prevents abuse from any single source.
### Per-route rate limiting
Each endpoint has its own rate limit scoped to your API key. When exceeded, the response includes a `Retry-After` header indicating when you can retry.
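A client wrapper can honor `Retry-After` and surface the limit errors described above. This is a sketch, not an official client; it assumes rate-limited responses use HTTP 429 (the status code is not stated in these docs), and `send` stands in for whatever HTTP call your client makes:

```python
import time

def send_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call ``send`` (returns (status, headers, body)), retrying on rate limits.

    Assumptions: rate-limited responses use HTTP 429 with a Retry-After
    header; 402/403 follow the usage-limit semantics documented above.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status == 429 and attempt < max_attempts - 1:
            # Per-route limit hit: wait as instructed, then retry.
            sleep(float(headers.get("Retry-After", 1)))
            continue
        if status == 402:
            raise RuntimeError(f"Insufficient credits: {body}")
        if status == 403:
            raise RuntimeError(f"Usage limit exceeded: {body}")
        return status, body

# Simulated responses: one rate-limited reply, then success.
responses = iter([(429, {"Retry-After": "2"}, ""), (200, {}, "ok")])
status, body = send_with_retry(lambda: next(responses), sleep=lambda s: None)
print(status, body)  # 200 ok
```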
## Async job callbacks
When using async mode for prompts or document ingestion, you can provide a `responseUrl` to receive a callback when the job completes. Callbacks are delivered via QStash with at-least-once semantics.
```json
{
  "kind": "persona.prompt.result",
  "jobId": "job-uuid",
  "status": "succeeded",
  "tenantId": "your-tenant-id",
  "personaId": "persona-uuid",
  "attemptCount": 1,
  "result": {
    "sessionId": "session-uuid",
    "prompt": "How do you approach design challenges?",
    "response": "As a product designer, I..."
  },
  "error": null,
  "completedAt": "2026-02-20T12:00:02.340Z"
},
```
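Because delivery is at-least-once, your callback endpoint must tolerate duplicates. Deduplicating on `jobId` is one simple strategy; the handler below is a sketch (the `store_result`/`log_failure` sinks are hypothetical placeholders, not part of the API):

```python
processed = set()  # in production, use durable storage keyed by jobId

def store_result(persona_id, result):
    """Hypothetical sink for successful results."""
    pass

def log_failure(job_id, error):
    """Hypothetical sink for failed jobs."""
    pass

def handle_callback(payload: dict) -> bool:
    """Process a job callback at most once, deduplicating on jobId.

    Returns True if the payload was processed, False if it was a
    duplicate delivery that had already been handled.
    """
    job_id = payload["jobId"]
    if job_id in processed:
        return False  # duplicate delivery; already handled
    processed.add(job_id)
    if payload["status"] == "succeeded":
        store_result(payload["personaId"], payload["result"])
    else:
        log_failure(job_id, payload.get("error"))
    return True

callback = {"kind": "persona.prompt.result", "jobId": "job-uuid",
            "status": "succeeded", "personaId": "persona-uuid",
            "result": {"sessionId": "session-uuid"}}
print(handle_callback(callback), handle_callback(callback))  # True False
```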