
Model Roles Introduction

ByteBuddy's model role system allows you to select the most suitable model configuration for different task types, optimizing performance and cost-effectiveness.

What are Model Roles?

Model roles are pre-configured model settings for specific task types, including:

  • Model selection
  • Parameter tuning
  • Context management
  • Performance optimization

Supported Role Types

💬 Chat

For conversation and interactive tasks:

  • Natural language understanding
  • Contextual dialogue
  • Multi-turn conversations
  • User support

🔧 Autocomplete

For code and text completion:

  • Real-time code suggestions
  • Smart text completion
  • Context-aware completion
  • Fast response

✏️ Edit

For text and code editing:

  • Content refactoring
  • Format adjustment
  • Syntax correction
  • Style optimization

🎯 Apply

For specific application scenarios:

  • Code generation
  • Document creation
  • Data processing
  • Automated tasks

🔍 Embeddings

For vectorization and semantic search:

  • Text vectorization
  • Semantic similarity
  • Retrieval augmentation
  • Clustering analysis
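Semantic similarity between two embedding vectors is usually measured with cosine similarity. A minimal sketch (the vectors are made-up illustrative values, not real model output):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Orthogonal vectors score 0.0; parallel vectors score 1.0.
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))           # 0.0
print(round(cosine_similarity([1.0, 2.0], [2.0, 4.0]), 6))  # 1.0
```

In practice both vectors come from the same embedding model (such as one configured with the embed role), so their dimensions match.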

📊 Reranking

For result sorting and filtering:

  • Relevance ranking
  • Quality assessment
  • Priority adjustment
  • Precise matching
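Conceptually, reranking assigns each candidate a relevance score and reorders the list by it. A toy sketch with hypothetical scores (in practice a rerank model scores each query–passage pair):

```python
# Toy reranking: reorder retrieved passages by relevance score, best first.
# The (id, score) pairs below are hypothetical placeholder values.
candidates = [("passage-a", 0.42), ("passage-b", 0.91), ("passage-c", 0.67)]
reranked = sorted(candidates, key=lambda item: item[1], reverse=True)
print([pid for pid, _ in reranked])  # ['passage-b', 'passage-c', 'passage-a']
```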

Role Configuration Structure

Basic Configuration Example

Configure model roles in config.yaml:

```yaml
models:
  - name: "chat-assistant"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 8192 # ⚠️ IMPORTANT: Use model's maximum supported value
      topP: 0.9

  - name: "code-editor"
    provider: "anthropic"
    model: "claude-3-sonnet"
    apiKey: "${ANTHROPIC_API_KEY}"
    roles: ["edit", "apply"]
    defaultCompletionOptions:
      temperature: 0.3
      maxTokens: 200000 # ⚠️ IMPORTANT: Use model's maximum supported value

  - name: "autocomplete-engine"
    provider: "together"
    model: "codellama/CodeLlama-13b-Instruct-hf"
    apiKey: "${TOGETHER_API_KEY}"
    roles: ["autocomplete"]
    defaultCompletionOptions:
      temperature: 0.1
      maxTokens: 256 # Autocomplete can use smaller values
```
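The `${...}` values above are assumed to be substituted from environment variables. A small pre-flight check (the variable names mirror the example config; adjust them for your providers):

```python
import os

# Environment variables referenced by the example config above.
REQUIRED_KEYS = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "TOGETHER_API_KEY"]

def missing_keys(env=None):
    """Return the placeholder variables that are not set in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]

# With an empty environment, every key is reported as missing:
print(missing_keys({}))  # ['OPENAI_API_KEY', 'ANTHROPIC_API_KEY', 'TOGETHER_API_KEY']
```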

⚠️ CRITICAL CONFIGURATION NOTE:

When configuring maxTokens for model roles, always set it to the model's maximum supported value (except for autocomplete roles). Setting maxTokens too low will cause output truncation, potentially leaving responses incomplete or code snippets unfinished.

Why this matters:

  • Chat roles: Need sufficient tokens for detailed explanations and conversations
  • Edit roles: Require enough tokens for complete code refactoring and documentation
  • Apply roles: Need maximum tokens for complex task execution and code generation
  • Autocomplete roles: Can use smaller values (128-256 tokens) as suggestions are brief

Model-specific maximums:

  • GPT-4: 8,192-token context window
  • GPT-4 Turbo: 128,000-token context window
  • Claude 3 models: 200,000-token context window
  • Gemini Pro: 32,768-token context window
  • Local models: check the model's documentation

Note that these figures are context windows; some providers enforce a separate, smaller cap on output tokens. Always check the provider's documentation for both the maximum context length and the output-token limit, and configure maxTokens accordingly.
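The guidance above boils down to a small lookup: use the model's documented maximum for chat, edit, and apply, and a small fixed budget for autocomplete. A hypothetical helper (the token figures are those quoted above; verify them against your provider's docs):

```python
# Token maximums quoted in the list above; confirm with provider docs.
MODEL_MAX_TOKENS = {
    "gpt-4": 8_192,
    "gpt-4-turbo": 128_000,
    "claude-3-sonnet": 200_000,
    "gemini-pro": 32_768,
}

AUTOCOMPLETE_BUDGET = 256  # suggestions are brief, so a small budget suffices

def recommended_max_tokens(model: str, role: str) -> int:
    # Autocomplete stays small; every other role gets the model's maximum.
    if role == "autocomplete":
        return AUTOCOMPLETE_BUDGET
    return MODEL_MAX_TOKENS[model]

print(recommended_max_tokens("claude-3-sonnet", "edit"))  # 200000
print(recommended_max_tokens("gpt-4", "autocomplete"))    # 256
```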

Multi-Role Configuration

A single model can serve multiple roles:

```yaml
models:
  - name: "versatile-model"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat", "edit", "apply"]
    defaultCompletionOptions:
      temperature: 0.5
      maxTokens: 4096
```
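Once config.yaml is parsed, role lookup amounts to indexing the models list by role. A sketch using a plain Python dict standing in for the parsed file (the tie-breaking rule, first model listing a role wins, is an assumption for illustration):

```python
# Parsed form of a config with one multi-role model and one autocomplete model.
config = {
    "models": [
        {"name": "versatile-model", "roles": ["chat", "edit", "apply"]},
        {"name": "autocomplete-engine", "roles": ["autocomplete"]},
    ]
}

# Build a role -> model-name index; the first model listing a role wins here.
role_index = {}
for model in config["models"]:
    for role in model.get("roles", []):
        role_index.setdefault(role, model["name"])

print(role_index["edit"])          # versatile-model
print(role_index["autocomplete"])  # autocomplete-engine
```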

Role-Specific Configuration

Chat Role Configuration

```yaml
models:
  - name: "chat-model"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2000
      topP: 0.9
```

Edit Role Configuration

```yaml
models:
  - name: "edit-model"
    provider: "anthropic"
    model: "claude-3-sonnet"
    apiKey: "${ANTHROPIC_API_KEY}"
    roles: ["edit"]
    defaultCompletionOptions:
      temperature: 0.2
      maxTokens: 4000
```

Autocomplete Role Configuration

```yaml
models:
  - name: "autocomplete-model"
    provider: "together"
    model: "codellama/CodeLlama-13b-Instruct-hf"
    apiKey: "${TOGETHER_API_KEY}"
    roles: ["autocomplete"]
    defaultCompletionOptions:
      temperature: 0.1
      maxTokens: 256
```

Apply Role Configuration

```yaml
models:
  - name: "apply-model"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["apply"]
    defaultCompletionOptions:
      temperature: 0.3
      maxTokens: 4096
```

Embed Role Configuration

```yaml
models:
  - name: "embed-model"
    provider: "openai"
    model: "text-embedding-3-large"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["embed"]
```

Rerank Role Configuration

```yaml
models:
  - name: "rerank-model"
    provider: "cohere"
    model: "rerank-english-v3.0"
    apiKey: "${COHERE_API_KEY}"
    roles: ["rerank"]
```

Best Practices

1. Role Design Principles

  • Single Responsibility: Each role focuses on specific tasks
  • Clear Boundaries: Define clear applicable scenarios
  • Performance Optimization: Balance quality and efficiency

2. Configuration Management

  • Version Control: Track role configuration changes
  • Test Validation: Ensure role behavior meets expectations
  • Documentation: Keep configuration documentation updated

3. Usage Guidelines

  • Start Small: Begin with predefined roles
  • Gradual Customization: Adjust based on actual needs
  • Continuous Optimization: Improve roles based on feedback

Next Steps

  1. Learn About Roles: Review specific role type documentation
  2. Configure Models: Set up model roles in config.yaml
  3. Test and Adjust: Continuously improve based on usage
  4. Share Experience: Share proven configurations with your team