
Custom Personas (Agents)

Agents are the reusable voice personas that power your calls. Each agent stores its identity, conversation structure, AI stack preferences, runtime behavior, and post-call intelligence settings.

Agent Structure

An agent is built from layered configuration:

text
┌──────────────────────────┐
│  System Prompt (raw)     │  ← Simple mode: just a string
├──────────────────────────┤
│  Persona                 │  ← Structured: role, tone, audience, voice
├──────────────────────────┤
│  Playbook                │  ← Conversation flow: opener, scripts, CTAs
├──────────────────────────┤
│  Settings                │  ← Runtime controls: barge-in, silence, turns
├──────────────────────────┤
│  Features                │  ← Toggles: recording, transcription
├──────────────────────────┤
│  AI Stack                │  ← LLM, STT, TTS provider/model selection
├──────────────────────────┤
│  Post-Call Intelligence  │  ← Summary, extraction, evaluation config
└──────────────────────────┘
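As a rough sketch, the layers in the diagram map onto the top-level keys used in the examples on this page. Combining them in a single agent object is an assumption drawn from those examples, and the values are placeholders, so check the API reference for the exact request shape.

```json
{
  "name": "Alex - Support Agent",
  "system_prompt": "You are Alex, a friendly customer support agent for TechCorp...",
  "advanced": { "bargeInEnabled": true },
  "features": { "transcription_enabled": true },
  "agent_role": "operator",
  "llm_model": "gemini-2.5-flash",
  "post_call": { "summary": { "enabled": true } }
}
```

A structured-mode agent would carry persona and playbook objects instead of system_prompt.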

Simple Mode vs Structured Mode

Simple Mode

Pass a system_prompt string and Rymi uses it directly as the LLM context:

json
{
  "name": "Alex - Support Agent",
  "system_prompt": "You are Alex, a friendly customer support agent for TechCorp..."
}

Structured Mode

Use persona and playbook objects for more control. Rymi's Prompt Compiler merges these into an optimized system prompt at call time.

json
{
  "name": "Priya - Sales Specialist",
  "persona": {
    "name": "Priya",
    "role": "Insurance sales specialist",
    "toneOverride": "Warm and confident",
    "audienceDescription": "Small business owners in India",
    "companyName": "Acme Insurance",
    "successCriteria": ["Qualify the lead", "Book a follow-up call"],
    "voiceConfig": {
      "voiceId": "Aoede",
      "language": "en-US"
    },
    "callerPersonas": [
      { "type": "interested", "approach": "Mirror enthusiasm, move to qualification" },
      { "type": "skeptical", "approach": "Lead with social proof and case studies" }
    ]
  },
  "playbook": {
    "opener": "Hi, this is Priya from Acme Insurance. Is this a good time?",
    "qualificationFlow": [
      { "question": "How many employees does your company have?", "listensFor": "Company size" },
      { "question": "What's your current insurance provider?", "listensFor": "Current provider" }
    ],
    "objectionHandlers": [
      { "trigger": "too expensive", "response": "I understand cost is important. Our plans start at just..." }
    ],
    "closingCTA": "I'd love to set up a quick demo. Does Thursday work for you?",
    "fallbackCTA": "Can I send you some information to review at your convenience?"
  }
}

Rymi compiles the agent instructions from the persona and playbook fields and returns the compiled prompt with the agent.

AI Stack Configuration

Each agent can be configured with specific LLM, STT, and TTS providers. The AI stack is organized by agent role. The table below lists each role by its API value; for example, the Executive role uses executive.

| Role | Pipeline | Best For |
| --- | --- | --- |
| operator | Separate STT → LLM → TTS | Cost-efficient, flexible provider mix |
| specialist | Separate STT → LLM → TTS | Higher-quality models with Google Gemini Pro TTS by default; ElevenLabs can be selected as a premium/custom voice override |
| executive | Bundled realtime (Gemini Live / OpenAI Realtime) | Lowest latency, end-to-end |

Setting the AI Stack

json
{
  "agent_role": "operator",
  "llm_model": "gemini-2.5-flash",
  "stt_provider": "google",
  "tts_provider": "google",
  "tts_model": "gemini-2.5-flash-preview-tts",
  "voice": "Aoede"
}

For the Executive role (executive API value), STT and TTS are handled by the realtime LLM itself:

json
{
  "agent_role": "executive",
  "llm_model": "gemini-live"
}

Use GET /v1/agents/llm-options to fetch the catalog of available models and voices.
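The role table also describes a specialist role, which keeps the separate STT → LLM → TTS pipeline. A possible configuration is sketched below. The model identifiers shown are assumptions rather than confirmed catalog entries, so verify the accepted values against the llm-options catalog before using them.

```json
{
  "agent_role": "specialist",
  "llm_model": "gemini-2.5-pro",
  "stt_provider": "google",
  "tts_provider": "google",
  "tts_model": "gemini-2.5-pro-preview-tts",
  "voice": "Aoede"
}
```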

Language Routing

Rymi derives the provider route from the agent role, primary language, supported languages, and provider capabilities. Set language to the agent's primary locale, such as en-US or hi-IN, and list every language the agent may run in under supported_languages. Before the agent is saved, Rymi resolves a role-safe STT, LLM, and TTS stack for each selected language. Automatic language detection is not part of the default MVP behavior.
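A minimal sketch of these fields for a Hindi-first agent that can also run in English. The language and supported_languages names come from the paragraph above; placing them at the top level of the agent object, alongside agent_role, is an assumption.

```json
{
  "agent_role": "operator",
  "language": "hi-IN",
  "supported_languages": ["hi-IN", "en-US"]
}
```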

Runtime Controls

The advanced object tunes how the agent behaves during calls:

json
{
  "advanced": {
    "bargeInEnabled": true,
    "maxTurnLength": 30,
    "postSilenceHangup": 15,
    "endpointingThreshold": 500
  }
}

| Control | Type | Description |
| --- | --- | --- |
| bargeInEnabled | boolean | Allow user to interrupt the agent mid-response |
| maxTurnLength | number | Maximum agent response duration in seconds |
| postSilenceHangup | number | End call after this many seconds of user silence |
| endpointingThreshold | number | Silence duration (ms) before treating speech as complete |

TIP

The gateway enforces these controls at runtime, so they override any contradictory instructions in the system prompt.
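To illustrate, the hypothetical configuration below has a prompt that asks the agent to keep talking over the caller while bargeInEnabled is true. Under the rule above, the runtime setting applies and callers can still interrupt. Combining system_prompt and advanced in one object is an assumption.

```json
{
  "system_prompt": "You are Alex. Always finish your sentence, even if the caller starts talking.",
  "advanced": {
    "bargeInEnabled": true
  }
}
```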

Feature Flags

Toggle capabilities per agent:

json
{
  "features": {
    "recording": { "enabled": true },
    "transcription_enabled": true
  }
}

| Feature | Effect When Disabled |
| --- | --- |
| recording | No call recording is started |
| transcription_enabled | No transcript persistence, no transcript data packets, no post-call transcript analysis |

Post-Call Intelligence

Configure what analysis runs after each call ends. See the Post-Call Intelligence guide for full details.

json
{
  "post_call": {
    "recording": { "enabled": true },
    "summary": { "enabled": true },
    "structured_extraction": {
      "json_schema": {
        "type": "object",
        "properties": {
          "appointment_booked": { "type": "boolean" },
          "follow_up_date": { "type": "string" }
        }
      }
    },
    "evaluation": {
      "rubric": "Did the agent successfully qualify the lead and book a follow-up?"
    }
  }
}

Auto-Generation

Describe your ideal agent in plain English and let Rymi generate the full persona/playbook bundle:

bash
curl -X POST https://api.rymi.live/v1/agents/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A friendly female sales agent who speaks English with an American accent and sells insurance plans",
    "options": { "llm_provider": "gemini", "voice": "Aoede" }
  }'

The response includes a draft object and a compiled_prompt_preview you can review before creating the agent.
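The response might look roughly like the sketch below. Only the draft and compiled_prompt_preview keys are documented here; the fields inside draft and the placeholder strings are illustrative assumptions.

```json
{
  "draft": {
    "name": "...",
    "persona": { "role": "Insurance sales specialist" },
    "playbook": { "opener": "..." }
  },
  "compiled_prompt_preview": "..."
}
```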