Model Configuration
ByteBuddy supports multiple AI models, so you can choose the model best suited to each task.
Supported Model Types
Large Language Models
- OpenAI Models: GPT-3.5, GPT-4 series
- Anthropic Models: Claude series models
- Open Source Models: Local models running through Ollama, LM Studio, etc.
- Other Providers: Support for OpenAI-compatible APIs
Model Roles
ByteBuddy supports configuring different models for different roles (see the sketch after this list):
- chat: For conversation interaction and complex task processing
- edit: For code editing tasks
- apply: For code application operations
- autocomplete: For real-time code completion
- embed: For embedding vector generation
- rerank: For search result reranking
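The configuration examples below cover the chat, edit, apply, and autocomplete roles; embed and rerank follow the same pattern. A minimal sketch, assuming the standard models syntax applies to these roles as well (the reranker entry uses placeholder names and a placeholder URL, not values from this guide):
```yaml
models:
  # Embedding model for vector generation
  - name: "openai-embeddings"
    provider: "openai"
    model: "text-embedding-3-small"
    apiKey: "${OPENAI_API_KEY}"
    roles:
      - embed

  # Reranker behind an OpenAI-compatible gateway; model name and URL
  # are hypothetical placeholders
  - name: "internal-reranker"
    provider: "openai-compatible"
    model: "your-reranker-model"
    apiBase: "https://your-rerank-api.com/v1"
    apiKey: "${RERANK_API_KEY}"
    roles:
      - rerank
```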
Basic Configuration
Configuration File Structure
Configure models in config.yaml, located either in the project root directory or at ~/.bytebuddy/config.yaml:
```yaml
# config.yaml
name: My ByteBuddy Config
version: 0.0.1
schema: v1

models:
  - name: "gpt-4"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles:
      - chat
      - edit
      - apply

  - name: "claude-3-sonnet"
    provider: "anthropic"
    model: "claude-3-sonnet"
    apiKey: "${ANTHROPIC_API_KEY}"
    roles:
      - chat
      - autocomplete

  - name: "local-llama"
    provider: "ollama"
    model: "llama2"
    apiBase: "http://localhost:11434"
    roles:
      - chat
```
Basic Configuration Parameters
Each model supports the following basic parameters:
- name: Name of the model configuration
- provider: Model provider (openai, anthropic, ollama, etc.)
- model: Specific model name
- apiKey: API key (can use environment variables)
- apiBase: API base URL (optional)
- roles: List of roles the model plays
Advanced Configuration Options
Completion Options Configuration
```yaml
models:
  - name: "gpt-4-turbo"
    provider: "openai"
    model: "gpt-4-turbo"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2000
      topP: 0.9
      presencePenalty: 0.1
      frequencyPenalty: 0.1
      stop: ["\n\n", "###"]
```
Autocomplete Options
```yaml
models:
  - name: "claude-autocomplete"
    provider: "anthropic"
    model: "claude-3-haiku"
    apiKey: "${ANTHROPIC_API_KEY}"
    roles: ["autocomplete"]
    autocompleteOptions:
      maxPromptTokens: 2000
      debounceDelay: 300
      modelTimeout: 10000
      useCache: true
      useImports: true
      useRecentlyEdited: true
```
Request Options
```yaml
models:
  - name: "configured-model"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    requestOptions:
      timeout: 30000
      verifySsl: true
      headers:
        "User-Agent": "ByteBuddy/1.0"
      extraBodyProperties:
        custom_field: "value"
```
Environment Variable Configuration
For security reasons, it is recommended to use environment variables to store API keys:
```bash
# Environment variable setup
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
```
Use them in the configuration file:
```yaml
models:
  - name: "secure-model"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
```
Configuration Examples for Different Providers
OpenAI Models
```yaml
models:
  - name: "gpt-4"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    apiBase: "https://api.openai.com/v1"
    roles: ["chat", "edit"]
```
Anthropic Models
```yaml
models:
  - name: "claude-3-opus"
    provider: "anthropic"
    model: "claude-3-opus-20240229"
    apiKey: "${ANTHROPIC_API_KEY}"
    roles: ["chat", "edit"]
```
Local Ollama Models
```yaml
models:
  - name: "local-llama3"
    provider: "ollama"
    model: "llama3:8b"
    apiBase: "http://localhost:11434"
    roles: ["chat"]
```
Custom OpenAI-Compatible API
```yaml
models:
  - name: "custom-provider"
    provider: "openai-compatible"
    model: "custom-model"
    apiKey: "${CUSTOM_API_KEY}"
    apiBase: "https://your-custom-api.com/v1"
    roles: ["chat"]
```
Using Company Internal Models
If your company has deployed an internal model service that complies with the OpenAI API specification, you can use it directly with ByteBuddy—even if the model isn't in the official supported provider list!
Simply set provider to "openai", then configure your company's internal model name and API address:
```yaml
models:
  - name: "company-internal-model"
    provider: "openai"                              # Key: use the openai provider
    model: "your-company-model-name"                # Your company's internal model name
    apiKey: "${COMPANY_API_KEY}"                    # Company internal API key
    apiBase: "https://your-company-ai-api.com/v1"   # Company internal API address
    roles: ["chat", "edit", "autocomplete"]         # Assign roles as needed
```
Configuration Key Points:
- provider must be set to "openai": this ensures ByteBuddy uses the standard OpenAI API call format
- model should contain your company's internal model name, such as "qwen-max", "glm-4", or "internlm2"
- apiBase should point to your company's internal API gateway; make sure the URL includes the /v1 path (if your company's API follows the OpenAI standard)
- apiKey should use your company-assigned key, preferably managed through environment variables as well
This lets you integrate your company's internal AI model service seamlessly: you get the same experience as with official OpenAI models while your data stays within your organization.
Model Capability Configuration
ByteBuddy lets you declare specific capabilities for a model:
```yaml
models:
  - name: "gpt-4-vision"
    provider: "openai"
    model: "gpt-4-vision-preview"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    capabilities:
      - "tool_use"
      - "image_input"
```
Supported model capabilities:
- tool_use: Support for tool calling
- image_input: Support for image input
- next_edit: Support for next edit mode
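Capability declarations are likely most useful for self-hosted or OpenAI-compatible backends, where ByteBuddy may not be able to infer what the model supports (an assumption; check your version's behavior). A hedged sketch, reusing placeholder names from the company-internal example above:
```yaml
models:
  - name: "company-internal-model"
    provider: "openai"
    model: "your-company-model-name"              # placeholder internal model name
    apiKey: "${COMPANY_API_KEY}"
    apiBase: "https://your-company-ai-api.com/v1" # placeholder gateway address
    roles: ["chat", "edit"]
    capabilities:
      - "tool_use"    # declare explicitly if the backend supports tool calling
      - "next_edit"   # enable next edit mode where the model supports it
```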
Cache Configuration
```yaml
models:
  - name: "cached-model"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    cacheBehavior:
      cacheSystemMessage: true
      cacheConversation: true
```
Troubleshooting
Common Issue Resolution
API Connection Failure
```yaml
# Increase the timeout
requestOptions:
  timeout: 60000
```
Slow Model Response
```yaml
# Use a faster model
- name: "fast-model"
  provider: "openai"
  model: "gpt-3.5-turbo"
  # ... other configuration
```
API Key Issues
- Ensure environment variables are set correctly
- Check API key permissions
- Verify that the model name is available from your provider
Best Practices
Security
- Always use environment variables to store API keys
- Rotate API keys regularly
- Limit API key permission scope
Performance Optimization
- Choose appropriate models for different tasks (see the example after this list)
- Set temperature and maxTokens sensibly
- Enable caching to reduce duplicate requests
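As an illustration of the first two points, a sketch that pairs a small, fast model with autocomplete and reserves a stronger model for chat; the numeric values are starting points for tuning, not recommendations from this guide:
```yaml
models:
  # Small, cheap model for latency-sensitive completion
  - name: "autocomplete-fast"
    provider: "openai"
    model: "gpt-3.5-turbo"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["autocomplete"]
    defaultCompletionOptions:
      temperature: 0.2   # low randomness for predictable completions
      maxTokens: 256     # short outputs keep latency and cost down

  # Stronger model reserved for complex chat tasks
  - name: "chat-strong"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2000
```
This split also helps with cost control, since the cheaper model handles the high-volume completion traffic.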
Cost Control
- Monitor token usage
- Use smaller models for simple tasks
- Set reasonable context length limits
Reliability
- Configure multiple models as backups (see the sketch after this list)
- Set appropriate timeouts
- Handle API rate-limiting errors
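For the backup point, a minimal sketch: two providers configured for the same chat role, so you can switch if one is down or rate-limited. Whether ByteBuddy fails over automatically or requires a manual model switch depends on your version; this guide does not specify.
```yaml
models:
  - name: "primary-chat"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    requestOptions:
      timeout: 30000   # fail fast so you can fall back sooner

  - name: "backup-chat"
    provider: "anthropic"
    model: "claude-3-sonnet"
    apiKey: "${ANTHROPIC_API_KEY}"
    roles: ["chat"]
```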