Custom Personas (Agents)
Agents are the reusable voice personas that power your calls. Each agent stores its identity, conversation structure, AI stack preferences, runtime behavior, and post-call intelligence settings.
Agent Structure
An agent is built from layered configuration:
```
┌──────────────────────────┐
│ System Prompt (raw)      │ ← Simple mode: just a string
├──────────────────────────┤
│ Persona                  │ ← Structured: role, tone, audience, voice
├──────────────────────────┤
│ Playbook                 │ ← Conversation flow: opener, scripts, CTAs
├──────────────────────────┤
│ Advanced                 │ ← Runtime controls: barge-in, silence, turns
├──────────────────────────┤
│ Features                 │ ← Toggles: recording, transcription
├──────────────────────────┤
│ AI Stack                 │ ← LLM, STT, TTS provider/model selection
├──────────────────────────┤
│ Post-Call Intelligence   │ ← Summary, extraction, evaluation config
└──────────────────────────┘
```
Simple Mode vs Structured Mode
Simple Mode
Pass a `system_prompt` string and Rymi uses it directly as the LLM context:
```json
{
  "name": "Alex - Support Agent",
  "system_prompt": "You are Alex, a friendly customer support agent for TechCorp..."
}
```
Structured Mode
Use `persona` and `playbook` objects for more control. Rymi's Prompt Compiler merges these into an optimized system prompt at call time.
```json
{
  "name": "Priya - Sales Specialist",
  "persona": {
    "name": "Priya",
    "role": "Insurance sales specialist",
    "toneOverride": "Warm and confident",
    "audienceDescription": "Small business owners in India",
    "companyName": "Acme Insurance",
    "successCriteria": ["Qualify the lead", "Book a follow-up call"],
    "voiceConfig": {
      "voiceId": "Aoede",
      "language": "en-US"
    },
    "callerPersonas": [
      { "type": "interested", "approach": "Mirror enthusiasm, move to qualification" },
      { "type": "skeptical", "approach": "Lead with social proof and case studies" }
    ]
  },
  "playbook": {
    "opener": "Hi, this is Priya from Acme Insurance. Is this a good time?",
    "qualificationFlow": [
      { "question": "How many employees does your company have?", "listensFor": "Company size" },
      { "question": "What's your current insurance provider?", "listensFor": "Current provider" }
    ],
    "objectionHandlers": [
      { "trigger": "too expensive", "response": "I understand cost is important. Our plans start at just..." }
    ],
    "closingCTA": "I'd love to set up a quick demo. Does Thursday work for you?",
    "fallbackCTA": "Can I send you some information to review at your convenience?"
  }
}
```
The Prompt Compiler output is stored as `compiled_prompt` on the agent and returned in `GET /agents/:id`.
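As a mental model only, the compilation step can be pictured as flattening persona and playbook fields into labeled prompt sections. This sketch is an illustration of the idea, not Rymi's actual Prompt Compiler:

```python
def compile_prompt(persona: dict, playbook: dict) -> str:
    """Illustrative sketch of persona/playbook -> system prompt merging.

    NOT Rymi's real Prompt Compiler; it only shows the concept of
    flattening structured config into prompt text.
    """
    sections = [
        f"You are {persona['name']}, a {persona['role']} at {persona['companyName']}."
    ]
    if tone := persona.get("toneOverride"):
        sections.append(f"Tone: {tone}.")
    if audience := persona.get("audienceDescription"):
        sections.append(f"Audience: {audience}.")
    if goals := persona.get("successCriteria"):
        sections.append("Success criteria: " + "; ".join(goals) + ".")
    if opener := playbook.get("opener"):
        sections.append(f'Open the call with: "{opener}"')
    for handler in playbook.get("objectionHandlers", []):
        sections.append(
            f'If the caller says something like "{handler["trigger"]}", '
            f'respond: "{handler["response"]}"'
        )
    if cta := playbook.get("closingCTA"):
        sections.append(f'Close with: "{cta}"')
    return "\n".join(sections)
```

Feeding the Priya example above through a merger like this would yield one flat system prompt covering identity, tone, objection handling, and the closing CTA.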
AI Stack Configuration
Each agent can be configured with specific LLM, STT, and TTS providers. The AI stack is organized by agent role:
| Role | Pipeline | Best For |
|---|---|---|
| `operator` | Separate STT → LLM → TTS | Cost-efficient, flexible provider mix |
| `specialist` | Separate STT → LLM → TTS | Higher-quality models with Google Gemini Pro TTS by default; ElevenLabs can be selected as a premium/custom voice override |
| `executive` | Bundled realtime (Gemini Live / OpenAI Realtime) | Lowest latency, end-to-end |

The Executive role is selected with the `executive` API value.
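The practical difference between roles is whether audio handling is split across providers or bundled into a realtime model. A sketch of that routing decision, inferred from the table above (not from Rymi's internals):

```python
# Which provider slots each agent role needs configured.
# Role names match the API values; the pipeline split is taken from the table.
PIPELINES = {
    "operator":   {"bundled": False, "components": ["stt", "llm", "tts"]},
    "specialist": {"bundled": False, "components": ["stt", "llm", "tts"]},
    "executive":  {"bundled": True,  "components": ["llm"]},  # realtime LLM handles STT/TTS
}

def required_components(agent_role: str) -> list[str]:
    """Return which provider slots a given agent role needs configured."""
    try:
        return PIPELINES[agent_role]["components"]
    except KeyError:
        raise ValueError(f"unknown agent_role: {agent_role}") from None
```

This is why the Executive example below sets only `llm_model`, while Operator agents also choose STT and TTS providers.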
Setting the AI Stack
```json
{
  "agent_role": "operator",
  "llm_model": "gemini-2.5-flash",
  "stt_provider": "google",
  "tts_provider": "google",
  "tts_model": "gemini-2.5-flash-preview-tts",
  "voice": "Aoede"
}
```
For the Executive role (`executive` API value), STT and TTS are handled by the realtime LLM itself:
```json
{
  "agent_role": "executive",
  "llm_model": "gemini-live"
}
```
Use `GET /v1/agents/llm-options` to fetch the catalog of available models and voices.
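Before saving an agent, you might sanity-check a stack selection against that catalog. The catalog shape below is an assumption for illustration only; check the actual `llm-options` response for real field names:

```python
# Hypothetical catalog shape -- the real llm-options response may differ.
CATALOG = {
    "llm_models": ["gemini-2.5-flash", "gemini-live"],
    "voices": ["Aoede"],
}

def validate_stack(stack: dict, catalog: dict) -> list[str]:
    """Collect human-readable problems with an AI-stack selection."""
    problems = []
    if stack.get("llm_model") not in catalog["llm_models"]:
        problems.append(f"unknown llm_model: {stack.get('llm_model')}")
    if "voice" in stack and stack["voice"] not in catalog["voices"]:
        problems.append(f"unknown voice: {stack['voice']}")
    return problems
```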
Provider Config (Advanced)
For fine-grained control, use `provider_config` to define the full provider routing configuration:
```json
{
  "provider_config": {
    "active_role": "operator",
    "tiers": {
      "operator": {
        "llm": { "primary": { "provider": "google", "model": "gemini-2.5-flash" } },
        "stt": { "primary": { "provider": "deepgram", "model": "nova-2-phonecall" } },
        "tts": { "primary": { "provider": "openai", "model": "tts-1" } }
      }
    }
  }
}
```
Runtime Controls
The `advanced` object tunes how the agent behaves during calls:
```json
{
  "advanced": {
    "bargeInEnabled": true,
    "maxTurnLength": 30,
    "postSilenceHangup": 15,
    "endpointingThreshold": 500
  }
}
```
| Control | Type | Description |
|---|---|---|
| `bargeInEnabled` | boolean | Allow the user to interrupt the agent mid-response |
| `maxTurnLength` | number | Maximum agent response duration in seconds |
| `postSilenceHangup` | number | End the call after this many seconds of user silence |
| `endpointingThreshold` | number | Silence duration (ms) before treating speech as complete |
> **TIP:** These controls are enforced at runtime by the gateway; they override any contradictory instructions in the system prompt.
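To make the two silence controls concrete, here is a sketch of how a gateway might react to a stretch of continuous user silence. Note the unit mismatch the config implies: `postSilenceHangup` is in seconds while `endpointingThreshold` is in milliseconds. This is an illustration, not Rymi's actual gateway logic:

```python
def silence_decision(silence_ms: int, advanced: dict) -> str:
    """Decide what to do after `silence_ms` of continuous user silence.

    Illustrative only -- the real enforcement logic is internal to Rymi.
    """
    hangup_after_ms = advanced["postSilenceHangup"] * 1000   # config value is seconds
    endpoint_after_ms = advanced["endpointingThreshold"]     # config value is ms
    if silence_ms >= hangup_after_ms:
        return "hangup"        # user has gone quiet for too long
    if silence_ms >= endpoint_after_ms:
        return "end_of_turn"   # treat the user's speech as complete
    return "wait"              # keep listening
```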
Feature Flags
Toggle capabilities per agent:
```json
{
  "features": {
    "recording": { "enabled": true },
    "transcription_enabled": true
  }
}
```
| Feature | Effect When Disabled |
|---|---|
| `recording` | No LiveKit Egress recording is started |
| `transcription_enabled` | No transcript persistence, no transcript data packets, no post-call transcript analysis |
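These flags effectively gate downstream pipeline steps. A sketch of that gating, derived from the "Effect When Disabled" column (the step names themselves are illustrative):

```python
def enabled_steps(features: dict) -> list[str]:
    """Which call-pipeline steps run, given the agent's feature flags.

    Step names are hypothetical labels for the effects listed in the docs.
    """
    steps = []
    if features.get("recording", {}).get("enabled"):
        steps.append("start_egress_recording")
    if features.get("transcription_enabled"):
        steps += ["persist_transcript", "emit_transcript_packets", "analyze_transcript"]
    return steps
```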
Post-Call Intelligence
Configure what analysis runs after each call ends. See the Post-Call Intelligence guide for full details.
```json
{
  "post_call": {
    "recording": { "enabled": true },
    "summary": { "enabled": true },
    "structured_extraction": {
      "json_schema": {
        "type": "object",
        "properties": {
          "appointment_booked": { "type": "boolean" },
          "follow_up_date": { "type": "string" }
        }
      }
    },
    "evaluation": {
      "rubric": "Did the agent successfully qualify the lead and book a follow-up?"
    }
  }
}
```
Auto-Generation
Describe your ideal agent in plain English and let Rymi generate the full persona/playbook bundle:
```bash
curl -X POST https://api.rymi.live/v1/agents/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A friendly female sales agent who speaks English with an American accent and sells insurance plans",
    "options": { "llm_provider": "gemini", "voice": "Aoede" }
  }'
```
The response includes a `draft` object and a `compiled_prompt_preview` you can review before creating the agent.
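Since generation returns a draft rather than a live agent, a typical client flow is generate → review the preview → create. A sketch of the review step (the `draft` and `compiled_prompt_preview` fields come from the response described above; the emptiness guard is just an example check, not an API requirement):

```python
def review_draft(generate_response: dict) -> dict:
    """Return the draft agent config if its compiled prompt preview looks usable.

    Example guard only -- substitute review criteria that fit your use case.
    """
    preview = generate_response.get("compiled_prompt_preview", "")
    if not preview.strip():
        raise ValueError("empty compiled prompt preview; regenerate or edit the draft")
    return generate_response["draft"]
```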

