Embeddings Role

The embeddings role handles text vectorization and semantic search: it converts text into numerical vectors (embeddings) that power applications such as semantic code search and retrieval-augmented generation.

Configuration

Configure in config.yaml or ~/.bytebuddy/config.yaml:

yaml
models:
  - name: "embedding-model"
    provider: "openai"
    model: "text-embedding-3-large"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["embed"]

Core Features

Text Vectorization

  • Semantic Encoding: Convert text into vectors that capture meaning
  • Similarity Calculation: Measure how close two texts are in meaning
  • Clustering Analysis: Group semantically related texts
  • Semantic Search: Retrieve results by meaning rather than by keywords
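
Similarity between two embedding vectors is typically measured with cosine similarity. A minimal sketch in pure Python (the 4-dimensional vectors are made-up stand-ins for real embeddings, which have 1024-3072 dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" for illustration only
v_cat = [0.9, 0.1, 0.0, 0.2]
v_kitten = [0.8, 0.2, 0.1, 0.3]
v_car = [0.0, 0.9, 0.8, 0.1]

print(cosine_similarity(v_cat, v_kitten))  # close to 1.0: similar meaning
print(cosine_similarity(v_cat, v_car))     # much lower: unrelated meaning
```

Scores range from -1 to 1, with higher values meaning closer semantic content.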

Application Scenarios

  • Code Search: Semantic code retrieval
  • Document Retrieval: Intelligent document lookup
  • RAG Systems: Retrieval-Augmented Generation
  • Recommendation Systems: Content recommendations

Embedding Model Configurations

OpenAI Embeddings

yaml
models:
  - name: "openai-embed"
    provider: "openai"
    model: "text-embedding-3-large"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["embed"]

Small Embedding Model

yaml
models:
  - name: "openai-embed-small"
    provider: "openai"
    model: "text-embedding-3-small"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["embed"]

Cohere Embeddings

yaml
models:
  - name: "cohere-embed"
    provider: "cohere"
    model: "embed-english-v3.0"
    apiKey: "${COHERE_API_KEY}"
    roles: ["embed"]

Multilingual Embeddings

yaml
models:
  - name: "multilingual-embed"
    provider: "cohere"
    model: "embed-multilingual-v3.0"
    apiKey: "${COHERE_API_KEY}"
    roles: ["embed"]

Use Case Configurations

Code Search

yaml
models:
  - name: "code-embed"
    provider: "openai"
    model: "text-embedding-3-large"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["embed"]

Document Retrieval

yaml
models:
  - name: "doc-embed"
    provider: "cohere"
    model: "embed-english-v3.0"
    apiKey: "${COHERE_API_KEY}"
    roles: ["embed"]

RAG Application

yaml
models:
  - name: "rag-embed"
    provider: "openai"
    model: "text-embedding-3-large"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["embed"]

Multi-Model Configuration

yaml
models:
  - name: "high-quality-embed"
    provider: "openai"
    model: "text-embedding-3-large"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["embed"]

  - name: "fast-embed"
    provider: "openai"
    model: "text-embedding-3-small"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["embed"]

  - name: "multilingual-embed"
    provider: "cohere"
    model: "embed-multilingual-v3.0"
    apiKey: "${COHERE_API_KEY}"
    roles: ["embed"]

Vector Dimensions

Different models provide different vector dimensions:

  • text-embedding-3-large: 3072 dimensions
  • text-embedding-3-small: 1536 dimensions
  • text-embedding-ada-002: 1536 dimensions
  • embed-english-v3.0: 1024 dimensions

Best Practices

1. Model Selection

  • High Quality: prefer larger models such as text-embedding-3-large for best retrieval accuracy
  • Performance First: smaller models such as text-embedding-3-small trade some accuracy for speed
  • Multilingual: use a multilingual model such as embed-multilingual-v3.0 for non-English content
  • Cost Sensitive: smaller models cost less per token embedded

2. Batch Processing

  • Process texts in batches for efficiency
  • Set reasonable batch sizes
  • Handle failure cases
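
The three points above can be sketched as a batching helper. Here `embed_batch` is a hypothetical stand-in for a real embeddings API call, and the batch size and retry count are assumptions to tune:

```python
def chunk(items, batch_size):
    """Yield successive fixed-size batches from a list."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def embed_all(texts, embed_batch, batch_size=64, retries=2):
    """Embed texts in batches, retrying a failed batch instead of failing everything."""
    vectors = []
    for batch in chunk(texts, batch_size):
        for attempt in range(retries + 1):
            try:
                vectors.extend(embed_batch(batch))
                break  # batch succeeded, move on
            except Exception:
                if attempt == retries:
                    raise  # give up after the last retry
    return vectors
```

Batching cuts the number of API calls from one per text to one per batch.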

3. Vector Storage

  • Use specialized vector databases
  • Optimize indexing
  • Re-embed content when it changes or when you switch embedding models

4. Similarity Calculation

  • Use cosine similarity
  • Set appropriate thresholds
  • Consider normalization
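
If vectors are L2-normalized up front, cosine similarity reduces to a plain dot product, and a threshold can filter out weak matches. A sketch of that pattern (the threshold value is an assumption to tune per model and corpus):

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matches(query_vec, doc_vecs, threshold=0.3):
    """Return (index, score) pairs scoring above the threshold, best first."""
    q = normalize(query_vec)
    scored = [(i, dot(q, normalize(d))) for i, d in enumerate(doc_vecs)]
    return sorted([s for s in scored if s[1] >= threshold],
                  key=lambda s: s[1], reverse=True)
```

Normalizing once at indexing time saves recomputing norms on every query.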

Integration Examples

Vector Database Integration

Common vector databases:

  • Pinecone: Managed vector database
  • Weaviate: Open-source vector search engine
  • Milvus: High-performance vector database
  • ChromaDB: Lightweight vector storage
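
All of these databases implement the same core operation: store vectors under IDs and return the ones nearest a query vector. A toy in-memory version to illustrate that interface (a sketch, not a substitute for a real vector database, which adds indexing, persistence, and filtering):

```python
import math

class ToyVectorStore:
    """Minimal in-memory stand-in for a vector database."""

    def __init__(self):
        self.vectors = {}  # id -> vector

    def add(self, doc_id, vector):
        self.vectors[doc_id] = vector

    def query(self, vector, top_k=3):
        """Return the top_k stored IDs ranked by cosine similarity."""
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb)

        ranked = sorted(self.vectors.items(),
                        key=lambda kv: cosine(vector, kv[1]),
                        reverse=True)
        return [doc_id for doc_id, _ in ranked[:top_k]]
```

Real databases replace the linear scan here with approximate nearest-neighbor indexes so queries stay fast at millions of vectors.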

RAG System Configuration

yaml
models:
  # Embedding model
  - name: "embed-model"
    provider: "openai"
    model: "text-embedding-3-large"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["embed"]

  # Generation model
  - name: "chat-model"
    provider: "openai"
    model: "gpt-4"
    apiKey: "${OPENAI_API_KEY}"
    roles: ["chat"]
    defaultCompletionOptions:
      temperature: 0.7
      maxTokens: 2000
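
With those two models configured, a RAG request follows the same shape everywhere: embed the query, retrieve the most similar chunks, and prepend them to the chat prompt. A schematic sketch, where the query vector and chunk vectors are assumed to come from the configured embed model (and to be normalized, so dot product works as similarity):

```python
def build_rag_prompt(question, query_vec, chunks, top_k=2):
    """chunks: list of (text, vector) pairs precomputed with the embed model."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Retrieve: rank chunks by similarity to the query vector
    ranked = sorted(chunks, key=lambda c: dot(query_vec, c[1]), reverse=True)
    context = "\n\n".join(text for text, _ in ranked[:top_k])

    # Augment: the chat model sees the retrieved context plus the question
    return (f"Answer using the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")
```

The resulting prompt is what gets sent to the chat-role model (gpt-4 above).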

Performance Optimization

1. Batch Embedding

  • Process multiple texts at once
  • Reduce API call count
  • Improve throughput

2. Caching Strategy

  • Cache vectors for frequently used texts
  • Avoid redundant computation
  • Save costs
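
A cache keyed by a hash of the text avoids re-embedding repeated inputs. A minimal sketch, where `embed_fn` is a hypothetical single-text embedding call:

```python
import hashlib

class EmbeddingCache:
    """Memoize embeddings so identical texts are only embedded once."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.store = {}

    def embed(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self.store:
            self.store[key] = self.embed_fn(text)  # API call only on a cache miss
        return self.store[key]
```

In production the dict would typically be backed by a persistent store, and the key should also include the model name, since vectors from different models are not interchangeable.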

3. Chunking

  • Split long texts into chunks
  • Set appropriate chunk sizes
  • Maintain semantic integrity
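
A character-based chunker with overlap is the simplest version of this; the chunk size and overlap below are assumptions to tune (token-based splitting tracks model limits more precisely):

```python
def split_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character chunks.

    The overlap repeats the tail of each chunk at the head of the next,
    so sentences cut at a boundary still appear whole in one chunk.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Each chunk is then embedded separately and stored alongside a reference to its source document.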

Troubleshooting

Common Issues

  1. Dimension Mismatch

    • Ensure using the same embedding model
    • Check vector dimension settings
    • Regenerate vectors
  2. Performance Issues

    • Use batch processing
    • Optimize chunking strategy
    • Consider using smaller models
  3. High Costs

    • Use small models
    • Implement caching strategy
    • Optimize text length
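
Dimension mismatches usually surface as cryptic errors deep in the vector math, so it pays to check early. A sketch of such a guard, using the dimensions listed earlier in this page:

```python
# Output sizes of the models covered above
EXPECTED_DIMS = {
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
    "text-embedding-ada-002": 1536,
    "embed-english-v3.0": 1024,
}

def check_dimensions(model, vector):
    """Fail fast if a vector does not match the model's expected size."""
    expected = EXPECTED_DIMS[model]
    if len(vector) != expected:
        raise ValueError(
            f"{model} produces {expected}-dim vectors, got {len(vector)}; "
            "were these embeddings generated by a different model?"
        )
    return vector
```

Running this check when loading stored vectors catches stale indexes left over from a model switch.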

Environment Variables

bash
# ~/.bashrc or ~/.zshrc
export OPENAI_API_KEY="your-openai-api-key"
export COHERE_API_KEY="your-cohere-api-key"

With the embeddings role configured appropriately, you can build powerful semantic search and RAG systems.