
Model Capabilities Deep Dive

ByteBuddy's model capabilities system declares which operations each AI model supports, so you can take full advantage of the distinct strengths of different models.

Capability Types

Tool Use (tool_use)

Allows models to call external tools and functions.

Configuration Example

yaml
models:
  - name: "tool-using-model"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    capabilities: ["tool_use"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4000

Supported Tools

  • File System: Read and write files
  • Network Requests: HTTP/API calls
  • Database: Query databases
  • Command Execution: Run system commands
  • Custom Tools: User-defined tools
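Custom tools are typically declared to the model as a JSON schema in the provider's function-calling format. The sketch below shows a hypothetical `read_file` tool in the OpenAI-style `tools` format; the tool name and parameters are illustrative and not part of ByteBuddy's shipped tool set.

```python
# Hypothetical custom tool declared in the OpenAI-style function-calling
# format. The tool name and parameter names are illustrative only.
read_file_tool = {
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a UTF-8 text file and return its contents.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {
                    "type": "string",
                    "description": "Path of the file to read.",
                },
            },
            "required": ["path"],
        },
    },
}
```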

Use Cases

  • Code execution and testing
  • Data retrieval and analysis
  • File operations and management
  • External API integration

Image Input (image_input)

Allows models to process image inputs.

Configuration Example

yaml
models:
  - name: "vision-model"
    provider: "openai"
    model: "gpt-4-vision-preview"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    capabilities: ["image_input"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4096

Supported Image Types

  • PNG: Lossless compression format
  • JPEG: Lossy compression format
  • WebP: Modern image format
  • GIF: Animated and static images

Use Cases

  • UI/UX design analysis
  • Screenshot understanding
  • Chart and graph parsing
  • Document image processing

Next Edit (next_edit)

Predicts and suggests the next code edit.

Configuration Example

yaml
models:
  - name: "next-edit-model"
    provider: "anthropic"
    model: "claude-3-sonnet"
    apiKey: "${ANTHROPIC_API_KEY}"
    roles: ["edit"]
    capabilities: ["next_edit"]
    defaultCompletionOptions:
      temperature: 0.3
      maxTokens: 4000

Functional Features

  • Intelligent Prediction: Predicts the next edit operation
  • Context Awareness: Bases suggestions on the current code state
  • Multi-Step Planning: Plans a series of related edits
  • Refactoring Suggestions: Proposes refactoring solutions

Use Cases

  • Continuous code editing
  • Refactoring optimization
  • Pattern recognition and application
  • Code improvement suggestions

Multi-Capability Configuration

Combined Capabilities

A single model can have multiple capabilities:

yaml
models:
  - name: "multi-capable-model"
    provider: "openai"
    model: "gpt-4-vision-preview"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat", "edit"]
    capabilities: ["tool_use", "image_input"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4096

Role-Based Capability Assignment

Different roles may need different capabilities:

yaml
models:
  # Chat role - needs tool use and image input
  - name: "chat-assistant"
    provider: "openai"
    model: "gpt-4-vision-preview"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    capabilities: ["tool_use", "image_input"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4000

  # Edit role - needs next edit
  - name: "edit-assistant"
    provider: "anthropic"
    model: "claude-3-sonnet"
    apiKey: "${ANTHROPIC_API_KEY}"
    roles: ["edit"]
    capabilities: ["next_edit"]
    defaultCompletionOptions:
      temperature: 0.3
      maxTokens: 4000

  # Apply role - needs tool use
  - name: "apply-assistant"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["apply"]
    capabilities: ["tool_use"]
    defaultCompletionOptions:
      temperature: 0.5
      maxTokens: 4096

Provider Capability Support

OpenAI

yaml
models:
  # GPT-4 - supports tool use
  - name: "openai-gpt4"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    capabilities: ["tool_use"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4000

  # GPT-4 Vision - supports image input and tool use
  - name: "openai-gpt4-vision"
    provider: "openai"
    model: "gpt-4-vision-preview"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    capabilities: ["tool_use", "image_input"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4096

Anthropic

yaml
models:
  # Claude 3 Opus - supports tool use
  - name: "claude-opus"
    provider: "anthropic"
    model: "claude-3-opus"
    apiKey: "${ANTHROPIC_API_KEY}"
    roles: ["chat"]
    capabilities: ["tool_use"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4096

  # Claude 3 Sonnet - supports next edit
  - name: "claude-sonnet"
    provider: "anthropic"
    model: "claude-3-sonnet"
    apiKey: "${ANTHROPIC_API_KEY}"
    roles: ["edit"]
    capabilities: ["next_edit"]
    defaultCompletionOptions:
      temperature: 0.3
      maxTokens: 4000

Google

yaml
models:
  # Gemini Pro - supports tool use
  - name: "gemini-pro"
    provider: "google"
    model: "gemini-pro"
    apiKey: "${GOOGLE_API_KEY}"
    roles: ["chat"]
    capabilities: ["tool_use"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2048

  # Gemini Pro Vision - supports image input
  - name: "gemini-pro-vision"
    provider: "google"
    model: "gemini-pro-vision"
    apiKey: "${GOOGLE_API_KEY}"
    roles: ["chat"]
    capabilities: ["image_input"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2048

Capability Validation

Automatic Detection

ByteBuddy automatically detects model capabilities:

  • Based on provider and model name
  • Verification through API responses
  • Runtime capability testing
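Name-based detection can be pictured as a simple lookup. The function below is a sketch of the idea only, not ByteBuddy's actual detection logic, which also verifies capabilities through API responses and runtime tests.

```python
# Illustrative sketch of name-based capability detection; the rules here
# are assumptions for demonstration, not ByteBuddy's real detection table.
def detect_capabilities(provider: str, model: str) -> list[str]:
    capabilities = []
    if "vision" in model:
        capabilities.append("image_input")
    # Assume the major providers' chat models support tool use.
    if provider in ("openai", "anthropic", "google"):
        capabilities.append("tool_use")
    return capabilities

print(detect_capabilities("openai", "gpt-4-vision-preview"))
# prints ['image_input', 'tool_use']
```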

Manual Configuration

Explicitly specifying model capabilities lets you:

  • Override automatic detection results
  • Enable experimental features
  • Disable certain capabilities

Usage Recommendations

Tool Use Best Practices

yaml
models:
  - name: "tool-expert"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat", "apply"]
    capabilities: ["tool_use"]
    defaultCompletionOptions:
      temperature: 0.5 # Lower temperature for accurate tool calls
      maxTokens: 4000

Recommendations:

  • Lower Temperature: Improve tool call accuracy
  • Clear Prompts: Clarify tool usage scenarios
  • Error Handling: Handle tool call failures
  • Permission Control: Limit tool access permissions
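The error-handling recommendation can be sketched as a small wrapper that retries a failed tool call and returns a structured error instead of raising, so the calling code (or the model) can recover. The function names here are hypothetical.

```python
# Hypothetical wrapper that retries a failing tool call and reports a
# structured error instead of raising, so the caller can recover.
def call_tool_safely(tool, max_retries: int = 2, **kwargs) -> dict:
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return {"ok": True, "result": tool(**kwargs)}
        except Exception as exc:  # real code would catch narrower errors
            last_error = str(exc)
    return {"ok": False, "error": last_error, "attempts": max_retries + 1}

def flaky_lookup(key: str) -> str:
    # Stand-in for a tool that always times out.
    raise TimeoutError(f"lookup of {key!r} timed out")

print(call_tool_safely(flaky_lookup, key="user"))
```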

Image Input Best Practices

yaml
models:
  - name: "vision-expert"
    provider: "openai"
    model: "gpt-4-vision-preview"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    capabilities: ["image_input"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4096 # Image analysis may need more tokens

Recommendations:

  • Image Quality: Use clear images
  • Size Limits: Note image size restrictions
  • Format Selection: Use supported formats
  • Context Combination: Combine with text descriptions
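Combining an image with a text description is usually done by sending a multimodal message. The sketch below builds one in the OpenAI-style content-parts format, encoding the image as a base64 data URL, a common pattern; the image bytes here are a placeholder, and real code would read a file.

```python
import base64

# Build an OpenAI-style multimodal message pairing an image with a text
# prompt. The image bytes below are placeholder data, not a real PNG.
def build_image_message(prompt: str, image_bytes: bytes,
                        mime: str = "image/png") -> dict:
    data_url = f"data:{mime};base64,{base64.b64encode(image_bytes).decode()}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }

msg = build_image_message("Describe this screenshot.", b"\x89PNG...")
```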

Next Edit Best Practices

yaml
models:
  - name: "edit-expert"
    provider: "anthropic"
    model: "claude-3-sonnet"
    apiKey: "${ANTHROPIC_API_KEY}"
    roles: ["edit"]
    capabilities: ["next_edit"]
    defaultCompletionOptions:
      temperature: 0.2 # Low temperature ensures edit accuracy
      maxTokens: 4000

Recommendations:

  • Low Temperature: Ensure edit accuracy
  • Sufficient Context: Provide complete code context
  • Incremental Edits: Make one edit at a time
  • Validate Results: Check edit outcomes
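The "one edit at a time, then validate" loop can be sketched as applying a single line replacement and re-parsing the result before accepting it. This is an illustration of the practice, not ByteBuddy's edit pipeline; a real pipeline would also run tests and lints.

```python
import ast

# Apply one line-level edit and accept it only if the result still parses.
# Illustrative only: validation here is just a syntax check via ast.parse.
def apply_validated_edit(source: str, line_no: int, new_line: str) -> str:
    lines = source.splitlines()
    lines[line_no - 1] = new_line
    candidate = "\n".join(lines)
    ast.parse(candidate)  # raises SyntaxError if the edit broke the code
    return candidate

before = "def add(a, b):\n    return a - b"
after = apply_validated_edit(before, 2, "    return a + b")
```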

Advanced Configuration

Conditional Capability Enabling

yaml
models:
  # Development environment - enable all capabilities
  - name: "dev-model"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    capabilities: ["tool_use", "image_input"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4000

  # Production environment - limit capabilities
  - name: "prod-model"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    capabilities: [] # No special capabilities
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2000

Capability Fallback

Fallback strategy for when the preferred model is unavailable:

yaml
models:
  # Primary model - all capabilities
  - name: "primary-model"
    provider: "openai"
    model: "gpt-4-vision-preview"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    capabilities: ["tool_use", "image_input"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4096

  # Fallback model - partial capabilities
  - name: "fallback-model"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    capabilities: ["tool_use"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4000

Troubleshooting

Capability Unavailable

Solutions:

  • Check if model supports the capability
  • Verify provider configuration
  • Confirm API version
  • Review provider documentation

Tool Call Failures

Solutions:

  • Check tool definition
  • Verify parameter format
  • Confirm permission settings
  • Review error logs

Performance Issues

Solutions:

  • Optimize tool count
  • Reduce image size
  • Adjust token limits
  • Use faster models

Environment Variables

bash
# ~/.bashrc or ~/.zshrc
export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"
export GOOGLE_API_KEY="your-google-api-key"

By fully understanding and leveraging model capabilities, you can build a more powerful, more intelligent AI-assisted development environment.