
Running ByteBuddy Without Internet

ByteBuddy can operate effectively in offline environments by leveraging local AI models and cached resources. This guide explains how to configure ByteBuddy for offline use.

Why Run Offline?

Privacy and Security

  • Data Protection: Keep sensitive code and information local
  • Compliance: Meet regulatory requirements for data handling
  • No External Dependencies: Eliminate reliance on external services

Reliability

  • Consistent Performance: No network latency or downtime
  • Predictable Operation: Stable environment without external factors
  • Bandwidth Conservation: No data transfer costs or limits

Accessibility

  • Remote Locations: Work in areas with limited connectivity
  • Air-Gapped Environments: Operate in highly secure environments
  • Travel: Continue development while traveling

Prerequisites

Local AI Models

Install and configure local AI models:

bash
# Install Ollama for local model serving
curl -fsSL https://ollama.ai/install.sh | sh

# Pull essential models
ollama pull llama3:8b        # General purpose model
ollama pull codellama:7b     # Coding-focused model
ollama pull mistral:7b       # Fast, efficient model

Cached Dependencies

Ensure all necessary dependencies are locally available:

bash
# For Node.js projects (requires a previously populated npm cache)
npm ci --offline

# For Python projects: download wheels while still connected...
pip download -d ./packages -r requirements.txt
# ...then install them offline from the local directory
pip install --no-index --find-links ./packages -r requirements.txt

Configuration for Offline Use

Model Configuration

Configure ByteBuddy to use only local models:

yaml
# .bytebuddy/config.yaml
models:
  - name: "offline-chat"
    provider: "ollama"
    model: "llama3:8b"
    baseURL: "http://localhost:11434"
    role: "chat"
    timeout: 120 # Increase timeout for local processing

  - name: "offline-coding"
    provider: "ollama"
    model: "codellama:7b"
    baseURL: "http://localhost:11434"
    role: "chat"

  - name: "offline-fast"
    provider: "ollama"
    model: "mistral:7b"
    baseURL: "http://localhost:11434"
    role: "autocomplete"

# Disable external services
preferences:
  telemetry: false
  crashReports: false
  anonymousUsage: false

Disable External Services

Turn off internet-dependent features:

yaml
# .bytebuddy/config.yaml
preferences:
  # Disable online features
  telemetry: false
  crashReports: false
  anonymousUsage: false
  checkForUpdates: false
  downloadModels: false

  # Enable offline features
  cacheEnabled: true
  useContext: true
  useDocumentation: true
  useHistory: true

Local Documentation

Ensure documentation is available locally:

yaml
# .bytebuddy/config.yaml
documentation:
  enabled: true
  paths:
    - "README.md"
    - "docs/**/*.md"
    - "wiki/**/*.md"
  exclude:
    - "node_modules/**"
    - ".git/**"
  cacheLocally: true

Setting Up Local Model Servers

Ollama Setup

Complete offline Ollama configuration:

bash
# Start Ollama service
ollama serve

# Verify models are available
ollama list

# Test model functionality
ollama run llama3:8b "Hello, test offline mode"
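
If you want to script this check, for example in a provisioning pipeline, a small Python probe against Ollama's local HTTP API works as well. It assumes the requests package is already installed locally and that Ollama is listening on its default port:

python
# check_ollama.py - quick sanity check of a local Ollama server
import requests

BASE_URL = "http://localhost:11434"  # default Ollama address

# List the locally available models
tags = requests.get(f"{BASE_URL}/api/tags", timeout=5).json()
print("Available models:", [m["name"] for m in tags.get("models", [])])

# Run a short generation to confirm inference works without the network
payload = {"model": "llama3:8b", "prompt": "Hello, test offline mode", "stream": False}
reply = requests.post(f"{BASE_URL}/api/generate", json=payload, timeout=120)
print(reply.json().get("response", ""))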

Custom Model Server

For enterprise environments, set up custom model servers:

python
# simple-model-server.py
from flask import Flask, request, jsonify
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

app = Flask(__name__)

# Load the model once at startup from a local directory (no network access needed)
tokenizer = AutoTokenizer.from_pretrained("./models/llama3-8b")
model = AutoModelForCausalLM.from_pretrained("./models/llama3-8b")
model.eval()

@app.route('/api/generate', methods=['POST'])
def generate():
    data = request.get_json(force=True) or {}
    prompt = data.get('prompt', '')

    inputs = tokenizer(prompt, return_tensors='pt')
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model.generate(**inputs, max_new_tokens=100)
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    return jsonify({'response': response})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8000)
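
Once the server above is running, a quick request against its /api/generate endpoint confirms it responds end to end (the prompt is only an example; adjust host and port if you changed them):

python
# test_model_server.py - exercise the custom /api/generate endpoint above
import requests

resp = requests.post(
    "http://localhost:8000/api/generate",
    json={"prompt": "Write a haiku about offline development"},
    timeout=300,  # local CPU inference can be slow
)
resp.raise_for_status()
print(resp.json()["response"])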

Caching Strategies

Response Caching

Enable aggressive caching for common queries:

yaml
# .bytebuddy/config.yaml
preferences:
  cacheEnabled: true
  cacheTTL: 86400 # 24 hours
  cacheMaxSize: "1GB"

  # Cache layers
  cacheLayers:
    - type: "memory"
      maxSize: "100MB"

    - type: "disk"
      path: "~/.bytebuddy/cache"
      maxSize: "1GB"
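
As a rough illustration of how the memory and disk layers interact (a conceptual sketch only, not ByteBuddy's internal cache code), a lookup checks the fast in-memory layer first and falls back to the on-disk layer:

python
# layered_cache.py - conceptual sketch of a memory-then-disk response cache
import hashlib, json, time
from pathlib import Path

CACHE_DIR = Path.home() / ".bytebuddy" / "cache"  # matches the disk layer above
CACHE_TTL = 86400                                  # 24 hours, as configured
_memory = {}                                       # small, fast first layer

def _key(prompt: str) -> str:
    return hashlib.sha256(prompt.encode()).hexdigest()

def get(prompt: str):
    key = _key(prompt)
    # 1. Check the in-memory layer first
    entry = _memory.get(key)
    if entry and time.time() - entry["ts"] < CACHE_TTL:
        return entry["response"]
    # 2. Fall back to the disk layer
    path = CACHE_DIR / f"{key}.json"
    if path.exists():
        entry = json.loads(path.read_text())
        if time.time() - entry["ts"] < CACHE_TTL:
            _memory[key] = entry  # promote back into memory
            return entry["response"]
    return None

def put(prompt: str, response: str):
    key = _key(prompt)
    entry = {"ts": time.time(), "response": response}
    _memory[key] = entry
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    (CACHE_DIR / f"{key}.json").write_text(json.dumps(entry))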

Context Caching

Cache frequently accessed code context:

yaml
# .bytebuddy/config.yaml
rag:
  enabled: true
  codeRag:
    enabled: true
    cacheEnabled: true
    cacheTTL: 43200 # 12 hours
    cacheMaxSize: "500MB"

Documentation Caching

Cache documentation for offline access. Documentation cannot be fetched on demand once the machine is offline, so download it during the setup phase, typically with a short script that copies or mirrors the relevant files into a local cache.
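
A minimal sketch of such a script, assuming the documentation paths from the configuration above and ByteBuddy's default cache directory (both are assumptions; adapt them to your sources):

python
# cache_docs.py - copy project documentation into a local cache directory
import shutil
from pathlib import Path

# Assumed locations: project documentation sources and ByteBuddy's cache directory
SOURCES = ["README.md", "docs", "wiki"]
CACHE_DIR = Path.home() / ".bytebuddy" / "cache" / "docs"

CACHE_DIR.mkdir(parents=True, exist_ok=True)
for src in SOURCES:
    path = Path(src)
    if path.is_dir():
        shutil.copytree(path, CACHE_DIR / path.name, dirs_exist_ok=True)
    elif path.is_file():
        shutil.copy2(path, CACHE_DIR / path.name)
print(f"Documentation cached under {CACHE_DIR}")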

Offline Workflow Optimization

Pre-loading Context

Load project context at startup:

bash
# Index project documentation and code
bytebuddy index --full

# Pre-cache common queries
bytebuddy cache warmup

Batch Processing

Process multiple requests in batches:

bash
# Batch process code reviews
bytebuddy batch review --files="src/**/*.js" --output=reviews.json

# Batch generate documentation
bytebuddy batch document --files="src/**/*.js" --output=docs/

Security Considerations

Air-Gap Security

Ensure complete isolation:

yaml
# .bytebuddy/config.yaml
preferences:
  # Disable all external communications
  telemetry: false
  crashReports: false
  anonymousUsage: false
  checkForUpdates: false
  downloadModels: false
  sendCodeSnippets: false

  # Enable local security features
  sandboxMode: true
  restrictNetwork: true

Data Protection

Protect sensitive data in offline environments:

yaml
# .bytebuddy/config.yaml
security:
  encryption:
    enabled: true
    keyPath: "~/.bytebuddy/encryption.key"

  sandboxMode: true
  restrictFileAccess: true
  allowedPaths:
    - "${PROJECT_ROOT}/**"
    - "~/.bytebuddy/**"

Resource Management

Memory Optimization

Optimize memory usage for offline operation:

yaml
# .bytebuddy/config.yaml
models:
  - name: "memory-optimized"
    provider: "ollama"
    model: "llama3:8b"
    baseURL: "http://localhost:11434"
    options:
      num_ctx: 2048 # Reduce context window
      num_thread: 4 # Limit CPU threads
      num_gpu: 1 # Limit GPU usage

Storage Management

Manage storage for offline use:

bash
# Clean up old caches
bytebuddy cache clean --older-than=7d

# Monitor storage usage
bytebuddy cache stats

# Export/import caches for transfer
bytebuddy cache export --file=caches.tar.gz
bytebuddy cache import --file=caches.tar.gz

Testing Offline Capability

Connection Testing

Verify offline operation:

bash
# Test without internet
# Disconnect from network and run:

# Check model availability
curl -s http://localhost:11434/api/tags

# Test ByteBuddy functionality
echo "2+2" | bytebuddy chat --model offline-fast

Performance Testing

Benchmark offline performance:

bash
# Measure response times
bytebuddy benchmark --model offline-chat --iterations 10

# Test with different context sizes
bytebuddy benchmark --context-size 1024,2048,4096
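
If the built-in benchmark is unavailable in your setup, you can approximate it by timing repeated requests against the local Ollama endpoint; the model name and prompt below are illustrative:

python
# benchmark_offline.py - rough latency measurement against local Ollama
import time
import requests

URL = "http://localhost:11434/api/generate"
payload = {"model": "llama3:8b", "prompt": "Summarize offline mode in one sentence.", "stream": False}

timings = []
for _ in range(10):
    start = time.perf_counter()
    requests.post(URL, json=payload, timeout=300).raise_for_status()
    timings.append(time.perf_counter() - start)

print(f"avg: {sum(timings) / len(timings):.2f}s  min: {min(timings):.2f}s  max: {max(timings):.2f}s")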

Troubleshooting Offline Issues

Common Problems

Model Not Available

bash
# Check if Ollama is running
ps aux | grep ollama

# Verify models are pulled
ollama list

# Test model directly
ollama run llama3:8b "test"

Slow Performance

bash
# Check system resources
htop
nvidia-smi  # if using GPU

Adjust model parameters to match the available hardware:

yaml
models:
  - name: "optimized-offline"
    provider: "ollama"
    model: "llama3:8b"
    options:
      num_thread: 6
      num_ctx: 2048

Cache Issues

bash
# Clear and rebuild cache
bytebuddy cache clear
bytebuddy index --full

# Check cache status
bytebuddy cache stats

Debugging Commands

bash
# Enable debug mode
bytebuddy --debug --offline

# Check configuration
bytebuddy config show

# View logs
bytebuddy logs --tail=100

Best Practices

Preparation

  1. Pre-install Models: Download all needed models beforehand
  2. Cache Documentation: Ensure all documentation is locally available
  3. Test Thoroughly: Verify all functionality works offline
  4. Document Setup: Keep detailed setup instructions

Operation

  1. Monitor Resources: Keep track of CPU, memory, and storage
  2. Regular Maintenance: Update models and clear old caches
  3. Backup Data: Regularly backup important configurations and caches
  4. Security Audits: Regularly check security settings

Performance

  1. Optimize Models: Use appropriately sized models for tasks
  2. Use Caching: Enable aggressive caching for repeated queries
  3. Batch Operations: Process multiple requests together when possible
  4. Monitor Usage: Track performance and adjust as needed

Enterprise Offline Deployment

Multi-User Setup

Deploy for teams working offline:

yaml
# Shared configuration
models:
  - name: "enterprise-offline"
    provider: "ollama"
    model: "llama3:8b"
    baseURL: "http://models.internal:11434" # Internal model server
    timeout: 300

security:
  authentication:
    enabled: true
    method: "certificate"

Disaster Recovery

Plan for offline disaster recovery:

  • Regular backups of configurations, caches, and local models
  • An offline update mechanism for refreshing models and dependencies
  • Documented emergency procedures for restoring service without connectivity
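
A minimal backup sketch, assuming the default configuration locations used elsewhere in this guide, archives both the user-level and project-level ByteBuddy directories:

python
# backup_bytebuddy.py - archive configuration and caches for offline recovery
import tarfile
import time
from pathlib import Path

archive = Path(f"bytebuddy-backup-{time.strftime('%Y%m%d')}.tar.gz")

with tarfile.open(archive, "w:gz") as tar:
    # Assumed default locations: user config/cache and project config
    for label, src in [("user-config", Path.home() / ".bytebuddy"),
                       ("project-config", Path(".bytebuddy"))]:
        if src.exists():
            tar.add(src, arcname=label)
print(f"Backup written to {archive}")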

Next Steps

After setting up offline operation, explore these related guides: