Custom Providers Deep Dive

The custom provider system lets you extend ByteBuddy by integrating third-party AI services and custom data sources, building a personalized AI development environment.

Provider Types

Officially Supported Providers

ByteBuddy supports multiple mainstream AI providers:

  • OpenAI: GPT-4, GPT-3.5
  • Anthropic: Claude 3 series
  • Google: Gemini Pro
  • Azure OpenAI: Enterprise OpenAI services
  • AWS Bedrock: AWS-managed AI models
  • Cohere: Embedding and reranking models
  • Together: Open-source model hosting
  • Ollama: Local model execution

Custom Provider Integration

You can integrate any provider compatible with the OpenAI API format.

Custom API Providers

Basic Configuration

yaml
models:
  - name: "custom-api"
    provider: "openai-compatible"
    model: "custom-model-name"
    apiKey: "${CUSTOM_API_KEY}"
    apiBase: "https://api.custom-provider.com/v1"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2000
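
To sanity-check an endpoint before wiring it into the config above, it helps to see the request shape the `openai-compatible` provider implies. The sketch below is a hedged illustration, not ByteBuddy's internal code: it assumes the standard OpenAI `/chat/completions` path, and the URL, key, and model name are placeholders matching the example config.

```python
import json
import urllib.request

def build_chat_request(api_base, api_key, model, prompt,
                       temperature=0.7, max_tokens=2000):
    """Build an OpenAI-compatible /chat/completions request as (url, headers, payload)."""
    url = api_base.rstrip("/") + "/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return url, headers, payload

def send_chat_request(url, headers, payload, timeout=30):
    """Send the request and return the parsed JSON response."""
    req = urllib.request.Request(url, data=json.dumps(payload).encode(),
                                 headers=headers)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

If a request built this way fails outside the editor, the problem is the endpoint or key, not the ByteBuddy configuration.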

OpenAI Compatible APIs

Many providers offer OpenAI-compatible APIs:

yaml
models:
  # DeepSeek
  - name: "deepseek"
    provider: "openai-compatible"
    model: "deepseek-chat"
    apiKey: "${DEEPSEEK_API_KEY}"
    apiBase: "https://api.deepseek.com/v1"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4000

  # Perplexity
  - name: "perplexity"
    provider: "openai-compatible"
    model: "llama-3.1-sonar-large-128k-online"
    apiKey: "${PERPLEXITY_API_KEY}"
    apiBase: "https://api.perplexity.ai"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4096

  # Fireworks AI
  - name: "fireworks"
    provider: "openai-compatible"
    model: "accounts/fireworks/models/llama-v3p1-70b-instruct"
    apiKey: "${FIREWORKS_API_KEY}"
    apiBase: "https://api.fireworks.ai/inference/v1"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4096

Local Model Providers

Ollama Configuration

yaml
models:
  - name: "local-llama"
    provider: "ollama"
    model: "llama2"
    apiBase: "http://localhost:11434"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2000

  - name: "local-codellama"
    provider: "ollama"
    model: "codellama:13b"
    apiBase: "http://localhost:11434"
    roles: ["autocomplete", "edit"]
    defaultCompletionOptions:
      temperature: 0.2
      maxTokens: 1024
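
A common failure mode with the Ollama configs above is referencing a model that has not been pulled locally. The sketch below queries Ollama's `/api/tags` endpoint (its standard model-listing API) to report which configured models are missing; the required-model list mirrors the example config and should be adjusted to yours.

```python
import json
import urllib.request

OLLAMA_BASE = "http://localhost:11434"  # matches apiBase in the config above

def installed_models(api_base=OLLAMA_BASE, timeout=5):
    """Return the names of models Ollama has pulled locally."""
    with urllib.request.urlopen(api_base + "/api/tags", timeout=timeout) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]

def missing_models(installed, required):
    """Return required models that are not pulled (treats 'name' and 'name:latest' alike)."""
    have = set(installed) | {name.split(":")[0] for name in installed}
    return [m for m in required if m not in have]

if __name__ == "__main__":
    pulled = installed_models()
    for model in missing_models(pulled, ["llama2", "codellama:13b"]):
        print(f"missing: run `ollama pull {model}`")
```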

LM Studio Configuration

yaml
models:
  - name: "lmstudio-model"
    provider: "lmstudio"
    model: "local-model"
    apiBase: "http://localhost:1234/v1"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2000

llama.cpp Server Configuration

yaml
models:
  - name: "llamacpp-model"
    provider: "llamacpp"
    model: "llama-2-13b"
    apiBase: "http://localhost:8080"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2000

Cloud Service Providers

Azure OpenAI

yaml
models:
  - name: "azure-gpt4"
    provider: "azure-openai"
    model: "gpt-4"
    apiKey: "${AZURE_OPENAI_API_KEY}"
    apiBase: "${AZURE_OPENAI_API_BASE}"
    env:
      deploymentName: "gpt-4-deployment"
      apiVersion: "2024-02-15-preview"
    roles: ["chat", "edit"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4000

AWS Bedrock

yaml
models:
  - name: "bedrock-claude"
    provider: "bedrock"
    model: "anthropic.claude-3-sonnet-20240229-v1:0"
    env:
      region: "us-east-1"
      accessKeyId: "${AWS_ACCESS_KEY_ID}"
      secretAccessKey: "${AWS_SECRET_ACCESS_KEY}"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 4096

Google Vertex AI

yaml
models:
  - name: "vertexai-gemini"
    provider: "vertexai"
    model: "gemini-pro"
    env:
      projectId: "${GOOGLE_CLOUD_PROJECT_ID}"
      location: "us-central1"
      credentials: "${GOOGLE_APPLICATION_CREDENTIALS}"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2048

Specialized Service Providers

Embedding Model Providers

yaml
models:
  # OpenAI Embeddings
  - name: "openai-embed"
    provider: "openai"
    model: "text-embedding-3-large"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["embed"]

  # Cohere Embeddings
  - name: "cohere-embed"
    provider: "cohere"
    model: "embed-english-v3.0"
    apiKey: "${COHERE_API_KEY}"
    roles: ["embed"]

  # Voyage AI Embeddings
  - name: "voyage-embed"
    provider: "openai-compatible"
    model: "voyage-large-2"
    apiKey: "${VOYAGE_API_KEY}"
    apiBase: "https://api.voyageai.com/v1"
    roles: ["embed"]

Reranking Model Providers

yaml
models:
  # Cohere Reranking
  - name: "cohere-rerank"
    provider: "cohere"
    model: "rerank-english-v3.0"
    apiKey: "${COHERE_API_KEY}"
    roles: ["rerank"]

  # Jina AI Reranking
  - name: "jina-rerank"
    provider: "openai-compatible"
    model: "jina-reranker-v1-base-en"
    apiKey: "${JINA_API_KEY}"
    apiBase: "https://api.jina.ai/v1"
    roles: ["rerank"]
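
If no dedicated rerank provider is available, cosine similarity over embeddings approximates the same behavior. The sketch below is a generic illustration of that fallback, not a ByteBuddy API: it assumes you already have a query embedding and document embeddings from one of the `embed` models above.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def rerank(query_vec, docs):
    """docs: list of (text, embedding) pairs. Returns texts sorted by similarity, best first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in docs]
    return [text for _, text in sorted(scored, key=lambda p: p[0], reverse=True)]
```

A hosted reranker will generally outperform this, but it is a useful zero-cost baseline.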

Custom Request Options

Request Headers Configuration

yaml
models:
  - name: "custom-headers"
    provider: "openai-compatible"
    model: "custom-model"
    apiKey: "${API_KEY}"
    apiBase: "https://api.example.com/v1"
    requestOptions:
      headers:
        "X-Custom-Header": "value"
        "User-Agent": "ByteBuddy/1.0"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2000

Timeout and Retry Configuration

yaml
models:
  - name: "timeout-config"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    requestOptions:
      timeout: 60000 # 60 seconds
      maxRetries: 3
      retryDelay: 1000 # 1 second
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2000
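
The `maxRetries` and `retryDelay` options above describe a classic retry-with-backoff loop. The sketch below shows the general pattern in Python; it is an illustration of the concept, not ByteBuddy's actual retry implementation, and the doubling factor is an assumption.

```python
import time

def with_retries(call, max_retries=3, retry_delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call `call()`, retrying on exception with exponentially growing delays."""
    delay = retry_delay
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            sleep(delay)
            delay *= backoff
```

With `maxRetries: 3` and `retryDelay: 1000`, a request that keeps failing would wait roughly 1s, 2s, then 4s before giving up.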

Best Practices

1. Provider Selection

  • Cloud: High performance, latest models
  • Local: Privacy protection, no network needed
  • Hybrid: Combine advantages of both

2. API Key Management

  • Use environment variables to store keys
  • Rotate keys regularly
  • Limit key permissions
  • Monitor key usage
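
Since all the configs above reference keys via `${...}` environment variables, a missing variable silently becomes an authentication failure. A small fail-fast check like the hypothetical script below can catch that early; the key list is an example and should match the providers you actually configured.

```python
import os
import sys

REQUIRED_KEYS = ["OPENAI_API_KEY", "DEEPSEEK_API_KEY"]  # adjust to your providers

def check_keys(required, environ=os.environ):
    """Return the names of required environment variables that are unset or empty."""
    return [name for name in required if not environ.get(name)]

if __name__ == "__main__":
    missing = check_keys(REQUIRED_KEYS)
    if missing:
        sys.exit("missing API keys: " + ", ".join(missing))
```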

3. Performance Optimization

  • Choose geographically closest servers
  • Configure appropriate timeout values
  • Implement retry strategies
  • Enable connection pooling

4. Cost Control

  • Monitor API usage
  • Set usage limits
  • Choose cost-effective models
  • Use caching to reduce calls
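
The caching point above can be as simple as memoizing responses by (model, prompt). The sketch below shows the idea with an in-memory dictionary; it is a minimal illustration, and a real setup would add eviction and skip caching for non-deterministic (high-temperature) requests.

```python
def make_cached(call):
    """Wrap a (model, prompt) -> response function with an in-memory cache
    so repeated identical requests don't incur API spend."""
    cache = {}
    def cached(model, prompt):
        key = (model, prompt)
        if key not in cache:
            cache[key] = call(model, prompt)
        return cache[key]
    return cached
```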

Troubleshooting

Connection Issues

Solutions:

  • Check API endpoint correctness
  • Verify network connectivity
  • Check firewall settings
  • Confirm API key validity
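
The most common endpoint mistakes are mechanical: a wrong scheme, a missing host, or a trailing slash that produces double slashes in request paths. The hypothetical linter below flags these in an `apiBase` value before you ever send a request.

```python
from urllib.parse import urlparse

def lint_api_base(api_base):
    """Return a list of likely misconfigurations in an apiBase URL."""
    problems = []
    parsed = urlparse(api_base)
    if parsed.scheme not in ("http", "https"):
        problems.append("scheme should be http or https")
    if not parsed.netloc:
        problems.append("missing hostname")
    if api_base.endswith("/"):
        problems.append("trailing slash may produce double slashes in request paths")
    if parsed.scheme == "http" and parsed.hostname not in ("localhost", "127.0.0.1"):
        problems.append("plain http to a remote host sends your API key unencrypted")
    return problems
```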

Authentication Failures

Solutions:

  • Verify API key format
  • Check key permissions
  • Confirm key not expired
  • Review provider status

Performance Issues

Solutions:

  • Optimize request size
  • Increase timeout values
  • Adjust retry strategy
  • Consider switching providers

Environment Variables

bash
# ~/.bashrc or ~/.zshrc

# OpenAI-compatible providers
export CUSTOM_API_KEY="your-custom-api-key"
export DEEPSEEK_API_KEY="your-deepseek-api-key"
export PERPLEXITY_API_KEY="your-perplexity-api-key"
export FIREWORKS_API_KEY="your-fireworks-api-key"

# Cloud service providers
export AZURE_OPENAI_API_KEY="your-azure-key"
export AZURE_OPENAI_API_BASE="https://your-resource.openai.azure.com"
export AWS_ACCESS_KEY_ID="your-aws-access-key"
export AWS_SECRET_ACCESS_KEY="your-aws-secret-key"
export GOOGLE_CLOUD_PROJECT_ID="your-project-id"
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/credentials.json"

# Specialized service providers
export VOYAGE_API_KEY="your-voyage-api-key"
export JINA_API_KEY="your-jina-api-key"

With flexible custom provider configuration, you can take full advantage of a wide range of AI services and build the development environment that best suits your needs.