Custom Personas (Agents)

Agents are the reusable voice personas that power your calls. Each agent stores its identity, conversation structure, AI stack preferences, runtime behavior, and post-call intelligence settings.

Agent Structure

An agent is built from layered configuration:

text
┌──────────────────────────┐
│  System Prompt (raw)     │  ← Simple mode: just a string
├──────────────────────────┤
│  Persona                 │  ← Structured: role, tone, audience, voice
├──────────────────────────┤
│  Playbook                │  ← Conversation flow: opener, scripts, CTAs
├──────────────────────────┤
│  Advanced                │  ← Runtime controls: barge-in, silence, turns
├──────────────────────────┤
│  Features                │  ← Toggles: recording, transcription
├──────────────────────────┤
│  AI Stack                │  ← LLM, STT, TTS provider/model selection
├──────────────────────────┤
│  Post-Call Intelligence  │  ← Summary, extraction, evaluation config
└──────────────────────────┘

Simple Mode vs Structured Mode

Simple Mode

Pass a system_prompt string and Rymi uses it directly as the LLM context:

json
{
  "name": "Alex - Support Agent",
  "system_prompt": "You are Alex, a friendly customer support agent for TechCorp..."
}

Structured Mode

Use persona and playbook objects for more control. Rymi's Prompt Compiler merges these into an optimized system prompt at call time.

json
{
  "name": "Priya - Sales Specialist",
  "persona": {
    "name": "Priya",
    "role": "Insurance sales specialist",
    "toneOverride": "Warm and confident",
    "audienceDescription": "Small business owners in India",
    "companyName": "Acme Insurance",
    "successCriteria": ["Qualify the lead", "Book a follow-up call"],
    "voiceConfig": {
      "voiceId": "Aoede",
      "language": "en-US"
    },
    "callerPersonas": [
      { "type": "interested", "approach": "Mirror enthusiasm, move to qualification" },
      { "type": "skeptical", "approach": "Lead with social proof and case studies" }
    ]
  },
  "playbook": {
    "opener": "Hi, this is Priya from Acme Insurance. Is this a good time?",
    "qualificationFlow": [
      { "question": "How many employees does your company have?", "listensFor": "Company size" },
      { "question": "What's your current insurance provider?", "listensFor": "Current provider" }
    ],
    "objectionHandlers": [
      { "trigger": "too expensive", "response": "I understand cost is important. Our plans start at just..." }
    ],
    "closingCTA": "I'd love to set up a quick demo. Does Thursday work for you?",
    "fallbackCTA": "Can I send you some information to review at your convenience?"
  }
}
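To make the compilation step concrete, here is a minimal sketch of what a prompt compiler could do with these fields: flatten the persona and playbook objects into one system prompt string. The field names mirror the example above, but the template wording and function are illustrative assumptions, not Rymi's actual compiler output.

```python
# Hypothetical prompt-compiler sketch: merge structured persona/playbook
# fields into a single system prompt string. Template wording is
# illustrative only.

def compile_prompt(persona: dict, playbook: dict) -> str:
    lines = [
        f"You are {persona['name']}, a {persona['role'].lower()} "
        f"at {persona['companyName']}.",
        f"Tone: {persona.get('toneOverride', 'neutral')}.",
        f"Audience: {persona['audienceDescription']}.",
        "Success criteria: " + "; ".join(persona["successCriteria"]) + ".",
        f"Open with: \"{playbook['opener']}\"",
    ]
    for q in playbook.get("qualificationFlow", []):
        lines.append(f"Ask: \"{q['question']}\" (listen for: {q['listensFor']}).")
    for h in playbook.get("objectionHandlers", []):
        lines.append(f"If the caller says \"{h['trigger']}\", respond: \"{h['response']}\"")
    lines.append(f"Close with: \"{playbook['closingCTA']}\"")
    return "\n".join(lines)
```

The real compiler's output format is not documented here; this only shows why structured mode carries more signal than a raw string.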

The Prompt Compiler output is stored as compiled_prompt on the agent and returned in GET /agents/:id.

AI Stack Configuration

Each agent can be configured with specific LLM, STT, and TTS providers. The AI stack is organized by agent role:

| Role | Pipeline | Best For |
| --- | --- | --- |
| operator | Separate STT → LLM → TTS | Cost-efficient, flexible provider mix |
| specialist | Separate STT → LLM → TTS | Higher-quality models with Google Gemini Pro TTS by default; ElevenLabs can be selected as a premium/custom voice override |
| executive | Bundled realtime (Gemini Live / OpenAI Realtime) | Lowest latency, end-to-end |

Setting the AI Stack

json
{
  "agent_role": "operator",
  "llm_model": "gemini-2.5-flash",
  "stt_provider": "google",
  "tts_provider": "google",
  "tts_model": "gemini-2.5-flash-preview-tts",
  "voice": "Aoede"
}

For the Executive role (executive API value), STT and TTS are handled by the realtime LLM itself:

json
{
  "agent_role": "executive",
  "llm_model": "gemini-live"
}

Use GET /v1/agents/llm-options to fetch the catalog of available models and voices.

Provider Config (Advanced)

For fine-grained control, use provider_config to define the full routing role:

json
{
  "provider_config": {
    "active_role": "operator",
    "tiers": {
      "operator": {
        "llm": { "primary": { "provider": "google", "model": "gemini-2.5-flash" } },
        "stt": { "primary": { "provider": "deepgram", "model": "nova-2-phonecall" } },
        "tts": { "primary": { "provider": "openai", "model": "tts-1" } }
      }
    }
  }
}
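Conceptually, resolving this config means selecting the tier named by active_role and taking the primary entry for each modality. A hypothetical resolver sketch (the actual gateway logic is not documented here):

```python
# Sketch: resolve a provider_config into the providers used on a call.
# Structure mirrors the example above; the resolver itself is assumed.

def resolve_stack(provider_config: dict) -> dict:
    tier = provider_config["tiers"][provider_config["active_role"]]
    # Pick the "primary" provider/model pair for each configured modality.
    return {
        modality: tier[modality]["primary"]
        for modality in ("llm", "stt", "tts")
        if modality in tier
    }
```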

Runtime Controls

The advanced object tunes how the agent behaves during calls:

json
{
  "advanced": {
    "bargeInEnabled": true,
    "maxTurnLength": 30,
    "postSilenceHangup": 15,
    "endpointingThreshold": 500
  }
}

| Control | Type | Description |
| --- | --- | --- |
| bargeInEnabled | boolean | Allow the user to interrupt the agent mid-response |
| maxTurnLength | number | Maximum agent response duration, in seconds |
| postSilenceHangup | number | End the call after this many seconds of user silence |
| endpointingThreshold | number | Silence duration (ms) before treating speech as complete |
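Two of these controls are silence timers at different scales: endpointingThreshold (milliseconds) decides when the caller has finished a turn, and postSilenceHangup (seconds) decides when to end the call. A sketch of how a gateway might check them, with the defaults drawn from the example above (the real enforcement logic is an assumption):

```python
# Hypothetical runtime checks for the two silence-based controls.
# Defaults mirror the example "advanced" object above.

def should_finalize_turn(silence_ms: int, advanced: dict) -> bool:
    """Treat the caller's speech as complete after the endpointing threshold."""
    return silence_ms >= advanced.get("endpointingThreshold", 500)

def should_hang_up(silence_s: float, advanced: dict) -> bool:
    """End the call once post-turn silence reaches the hangup limit."""
    return silence_s >= advanced.get("postSilenceHangup", 15)
```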

TIP

These controls are enforced at runtime by the gateway — they override any contradictory instructions in the system prompt.

Feature Flags

Toggle capabilities per agent:

json
{
  "features": {
    "recording": { "enabled": true },
    "transcription_enabled": true
  }
}

| Feature | Effect When Disabled |
| --- | --- |
| recording | No LiveKit Egress recording is started |
| transcription_enabled | No transcript persistence, no transcript data packets, no post-call transcript analysis |

Post-Call Intelligence

Configure what analysis runs after each call ends. See the Post-Call Intelligence guide for full details.

json
{
  "post_call": {
    "recording": { "enabled": true },
    "summary": { "enabled": true },
    "structured_extraction": {
      "json_schema": {
        "type": "object",
        "properties": {
          "appointment_booked": { "type": "boolean" },
          "follow_up_date": { "type": "string" }
        }
      }
    },
    "evaluation": {
      "rubric": "Did the agent successfully qualify the lead and book a follow-up?"
    }
  }
}
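Each post_call key independently gates one analysis step. A small dispatcher sketch shows the resulting pipeline for a given config; the step names mirror the config keys, but the dispatch function itself is an assumption for illustration.

```python
# Illustrative dispatcher: list which post-call analyses a given
# post_call config would enable. Key names match the example above.

def enabled_analyses(post_call: dict) -> list[str]:
    steps = []
    if post_call.get("summary", {}).get("enabled"):
        steps.append("summary")
    if "json_schema" in post_call.get("structured_extraction", {}):
        steps.append("structured_extraction")
    if post_call.get("evaluation", {}).get("rubric"):
        steps.append("evaluation")
    return steps
```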

Auto-Generation

Describe your ideal agent in plain English and let Rymi generate the full persona/playbook bundle:

bash
curl -X POST https://api.rymi.live/v1/agents/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A friendly female sales agent who speaks English with an American accent and sells insurance plans",
    "options": { "llm_provider": "gemini", "voice": "Aoede" }
  }'

The response includes a draft object and a compiled_prompt_preview you can review before creating the agent.