Agent Identity Guide

Agent Identity is who the agent is and how it sounds on the call — the parts callers experience first. Get this layer right and everything else has a foundation to build on.

You configure Identity in the Agent Studio under the "Agent Identity" section. It has five fields, covered below.

Your callers

What it is: A plain-language description of the people your agent talks to.

Why it matters: This is the single biggest lever for how the agent speaks. A dentist's patients and a SaaS CTO don't want to be talked to the same way. When the agent knows who's on the other end, it adjusts vocabulary, pacing, and assumed expertise.

Good examples

Appointment booking:

"People booking dental appointments — patients of all ages, some nervous, most non-technical."

Customer support:

"Users of our SaaS product. Mixed technical level. Usually frustrated and want a quick resolution."

Sales outreach:

"Heads of RevOps at mid-market B2B SaaS companies. Technical, busy, skeptical of cold calls."

Tips

Include emotional context. "Nervous", "frustrated", "skeptical" — these shape delivery as much as vocabulary.
Describe technical level. "Non-technical" vs. "technical" changes whether the agent uses jargon.
Two to three sentences is plenty. More is fine, but this isn't a marketing persona — just enough for the agent to calibrate.
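
If you draft descriptions for many agents, a small script can nudge you toward the tips above. The following is a minimal sketch, not part of the product: the function name and the cue-word lists are hypothetical, and the checks are advisory only.

```python
import re

# Hypothetical cue words (assumptions); extend these for your own domain.
EMOTION_CUES = {"nervous", "frustrated", "skeptical", "anxious", "busy"}
TECH_LEVEL_CUES = {"technical", "non-technical"}


def review_callers_description(text: str) -> list[str]:
    """Return advisory notes for a "Your callers" description."""
    notes = []
    # Lowercase words, keeping hyphenated ones such as "non-technical" whole.
    words = set(re.findall(r"[a-z][a-z-]*", text.lower()))

    if not words & EMOTION_CUES:
        notes.append("Consider adding emotional context, such as nervous or frustrated.")
    if not words & TECH_LEVEL_CUES:
        notes.append("Consider stating the callers' technical level.")

    # Count sentences by terminal punctuation. Longer text is allowed,
    # so this is only a nudge toward brevity.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if len(sentences) > 3:
        notes.append("Longer than three sentences; trim detail the agent does not need.")
    return notes
```

An empty list means the description already covers emotional context and technical level in three sentences or fewer.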

Language

What it is: The agent's primary language, the one it opens each call in.

Why it matters: This setting picks the baseline language for STT (speech-to-text), TTS (text-to-speech), and the LLM. Setting it to Auto-detect (inbound only) lets the agent switch to whichever language the caller speaks first, which is useful for multilingual businesses.

When to use "Auto-detect"

You serve customers who speak multiple languages and you don't know which they'll use.
The call is inbound. Outbound calls still need an explicit language because there's no incoming speech to detect.

When to pick a specific language

Most of your callers speak one language.
You need consistent voice selection (voice catalogs differ by language).
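
The inbound-only rule for Auto-detect can be sketched as a small check. This is an illustration under stated assumptions, not the platform's implementation: the "auto" sentinel, the direction labels, and the function name are all hypothetical.

```python
from typing import Optional

AUTO_DETECT = "auto"  # hypothetical sentinel for the Auto-detect option


def resolve_language(setting: str, direction: str, fallback: Optional[str] = None) -> str:
    """Return the language setting to use for a call.

    direction is "inbound" or "outbound" (hypothetical labels).
    """
    if direction not in ("inbound", "outbound"):
        raise ValueError(f"unknown call direction: {direction!r}")
    if setting != AUTO_DETECT:
        return setting  # an explicit language works in both directions
    if direction == "inbound":
        return AUTO_DETECT  # the caller's first words decide the language
    # Outbound: there is no incoming speech to detect from,
    # so an explicit language is required.
    if fallback is None:
        raise ValueError("outbound calls need an explicit language")
    return fallback
```

For example, `resolve_language("auto", "outbound", fallback="es-ES")` returns `"es-ES"`, while the same call without a fallback raises an error instead of guessing.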

Voice tone

What it is: The personality your agent projects across every sentence.

Why it matters: Voice tone colors every line the agent says. The same script delivered in a different tone gives a different caller experience: a "Warm & Friendly" confirmation and a "Professional & Direct" confirmation feel completely different.

Presets

Rymi ships six preset tones that cover most cases:

Warm & Friendly — approachable, conversational. Good default for consumer-facing agents (booking, restaurants, wellness).
Professional & Direct — clear, efficient. Good for B2B, sales, and anything where callers value their time.
Casual & Conversational — relaxed, like a friend. Good for lifestyle brands and modern consumer apps.
Empathetic & Patient — supportive, takes its time. Good for support, healthcare, and grief-sensitive contexts.
Energetic & Confident — upbeat, assertive. Good for fitness, sales outreach, and motivational contexts.
Formal & Precise — measured, authoritative. Good for legal, financial, and regulated industries.

Custom tone

If none of the presets fit, type your own in the "Or type your own…" field. Good custom tones are two or three adjectives: "Warm, efficient, and accommodating" or "Direct and confident, with a dry sense of humor."
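
If you manage tone settings in your own tooling, the choice between a preset and custom text might be modeled like this. It is a sketch only: the preset names come from the list above, but the function name and the rule that custom text takes precedence over a preset are assumptions.

```python
# The six preset names listed in this guide.
PRESET_TONES = (
    "Warm & Friendly",
    "Professional & Direct",
    "Casual & Conversational",
    "Empathetic & Patient",
    "Energetic & Confident",
    "Formal & Precise",
)


def resolve_tone(preset: str = "", custom: str = "") -> str:
    """Pick the tone string to use.

    Assumption: non-empty custom text overrides the preset.
    """
    custom = custom.strip()
    if custom:
        return custom
    if preset in PRESET_TONES:
        return preset
    raise ValueError(f"unknown preset tone: {preset!r}")
```

Rejecting unknown preset names early catches typos such as "Warm and Friendly" before they reach a live call.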

Tips

Don't overthink it. A preset that's 80% right beats a custom description that tries to be perfect.
Mismatch tone and audience and the call will feel off. A legal agent with an "Energetic & Confident" tone will sound like an ambulance-chaser ad.

Voice

What it is: The TTS (text-to-speech) voice identity your agent uses — the actual audio on the call.

Why it matters: The voice is the part callers hear most. A clear, natural voice makes everything easier; a robotic or inconsistent one undermines trust no matter how good the script is.

Availability by tier

Operator tier — Standard voice catalog. Good for reminders, FAQs, and high-volume simple calls.
Specialist tier — Premium voice catalog. Higher naturalness, wider selection, multiple languages.
Executive tier — Studio-grade real-time audio. Voice is bound to the selected real-time AI model; you don't pick a voice independently.
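
The tier rules above fit in a small lookup table. This is a sketch for your own tooling, not the product's data model: the class, the catalog labels, and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierVoicePolicy:
    catalog: str            # which voice catalog the tier draws from
    voice_selectable: bool  # False when the voice is bound to the model


# Hypothetical labels that mirror the tier descriptions in this guide.
TIER_POLICIES = {
    "operator": TierVoicePolicy(catalog="standard", voice_selectable=True),
    "specialist": TierVoicePolicy(catalog="premium", voice_selectable=True),
    "executive": TierVoicePolicy(catalog="model-bound", voice_selectable=False),
}


def can_pick_voice(tier: str) -> bool:
    """Return True if the tier lets you choose a voice independently."""
    return TIER_POLICIES[tier.lower()].voice_selectable
```

A settings form could use `can_pick_voice` to hide the voice selector when the agent is on the Executive tier.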

Deepgram Aura

If your TTS provider is Deepgram Aura, the voice is tied to the selected model. Change the model in Advanced Settings to switch voices.

ElevenLabs

If you've connected an ElevenLabs account, voices you've created in ElevenLabs show up in the selector. Use the "Refresh Voices" button next to the Voice field to pull newly added voices without waiting for the next automatic sync.

Tips

Listen before you commit. The preview button plays a short sample of the selected voice — use it.
Match voice to audience. A clinical, neutral voice works for support; a warmer voice works for booking.
Voice consistency matters. Once callers hear your agent, they'll map that voice to your brand. Changing it later can feel jarring.

Speaking style

What it is: An optional hint passed to the TTS engine — accent, delivery style, pacing preference.

Why it matters: Most agents don't need this, which is why the field is collapsed by default. But when you do need it, it gives a fine-grained lever over voice delivery that goes beyond the tone preset.

When to use it

Accent: "Neutral American", "British Professional", "Southern US with gentle pacing".
Delivery style: "Theatrical delivery", "Measured, deliberate pacing", "Conversational with natural pauses".
Context-specific: "Like a news anchor reading the weather", "Relaxed and reassuring, no medical jargon".

Tips

Only fill this in if you have a specific delivery in mind. Leaving it empty uses the TTS default, which is usually fine.
Not all TTS engines honor every hint. Treat this field as experimental: test and iterate.
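
Because the hint is optional, one way to handle it in your own tooling is to leave the key out entirely when the field is blank, so the TTS default applies. The sketch below assumes the `voiceId` and `accent` key names used in the worked example in this guide; the function name is hypothetical.

```python
def build_voice_config(voice_id: str, speaking_style: str = "") -> dict:
    """Build a voice config, omitting the style hint when it is empty."""
    config = {"voiceId": voice_id}
    hint = speaking_style.strip()
    if hint:
        # Only send a hint when one was actually written.
        config["accent"] = hint
    return config
```

Omitting the key, rather than sending an empty string, avoids relying on how a given engine treats blank hints.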

Putting it together: a worked example

A complete Identity block for a dental booking agent:

```yaml
name: Maya
persona:
  audienceDescription: "Patients of all ages booking dental appointments. Some nervous, most non-technical."
  toneOverride: "Warm & Friendly"
  voiceConfig:
    voiceId: sarah-v2
    language: en-US
    accent: "Relaxed and reassuring. No medical jargon."
language: en-US
voice: sarah-v2
```
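
The example repeats the language and voice at the top level and again inside `voiceConfig`. If you keep such blocks in version control, a quick check can flag copies that have drifted apart. This is a sketch under one loud assumption: that the duplicated values are meant to agree, which this guide does not state. The function name is hypothetical, and the block is shown as a Python dict so no YAML parser is needed.

```python
# The worked example above, written as a Python dict.
identity = {
    "name": "Maya",
    "persona": {
        "audienceDescription": (
            "Patients of all ages booking dental appointments. "
            "Some nervous, most non-technical."
        ),
        "toneOverride": "Warm & Friendly",
        "voiceConfig": {
            "voiceId": "sarah-v2",
            "language": "en-US",
            "accent": "Relaxed and reassuring. No medical jargon.",
        },
    },
    "language": "en-US",
    "voice": "sarah-v2",
}


def find_mismatches(block: dict) -> list[str]:
    """List duplicated fields whose two copies disagree.

    Assumption: the top-level values should match persona.voiceConfig.
    """
    voice_config = block.get("persona", {}).get("voiceConfig", {})
    problems = []
    if block.get("language") != voice_config.get("language"):
        problems.append("language differs from persona.voiceConfig.language")
    if block.get("voice") != voice_config.get("voiceId"):
        problems.append("voice differs from persona.voiceConfig.voiceId")
    return problems
```

For the block as written, `find_mismatches(identity)` returns an empty list.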

Next steps

Write "Your callers" first. It's the field with the biggest ripple effect on the rest of the config.
Pick a tone preset and a voice, then run a test call to hear how they sound together.
Iterate. The first voice you pick rarely stays. Listen to 10 real calls and refine.
