Skip to content

数据库集成工具

ByteBuddy 的数据库集成功能通过 databaseClient 工具实现,允许您连接到各种数据库并执行 SQL 查询进行故障排除和数据分析。

工具参数

参数类型必需描述
databaseTypestring数据库类型 (postgresql, mysql, sqlite, oracle, sqlserver, mongodb, redis, clickhouse, bigquery, snowflake)
serverNamestring数据库服务器主机名或IP地址
portnumber数据库服务器端口
databasestring要连接的数据库名称
usernamestring数据库用户名
passwordstring数据库密码
connectionStringstring完整连接字符串(替代单独的连接参数)
queriesstring[]要执行的SQL查询数组
queryFilestring包含要执行的SQL查询的文件路径
timeoutnumber查询超时时间(秒,默认:30)
outputFormatstring查询结果的输出格式(default, csv, json, vertical)
maxRowsnumberSELECT查询返回的最大行数(默认:1000)
dryRunboolean如果为true,则只验证查询而不执行(默认:false)

支持的数据库

关系型数据库

  • PostgreSQL (postgresql) - 完整功能支持,高级 SQL 能力
  • MySQL (mysql) - 完整的 MySQL 数据库支持
  • SQLite (sqlite) - 轻量级文件型数据库支持
  • SQL Server (sqlserver) - Microsoft SQL Server 集成
  • Oracle (oracle) - Oracle 数据库连接

大数据分析

  • ClickHouse (clickhouse) - 高性能分析数据库
  • Google BigQuery (bigquery) - 云数据仓库
  • Snowflake (snowflake) - 云数据平台

NoSQL 数据库

  • MongoDB (mongodb) - 文档数据库支持
  • Redis (redis) - 内存数据结构存储

会话使用示例

在聊天中直接使用数据库工具

用户可以直接在聊天中请求数据库操作:

"请帮我检查我们用户表中的用户数量,并查看最近注册的10个用户。"

yaml
tools:
  - name: "check-users"
    tool: "databaseClient"
    args:
      databaseType: "postgresql"
      serverName: "localhost"
      database: "myapp"
      username: "postgres"
      password: "${DB_PASSWORD}"
      queries:
        - "SELECT COUNT(*) as total_users FROM users"
        - "SELECT id, name, email, created_at FROM users ORDER BY created_at DESC LIMIT 10"

结合配置文件使用

在项目中创建一个配置文件 .bytebuddy/tools.yaml 来预定义数据库连接:

yaml
tools:
  - name: "myapp-db"
    tool: "databaseClient"
    args:
      databaseType: "postgresql"
      serverName: "localhost"
      database: "myapp"
      username: "postgres"
      password: "${DB_PASSWORD}"

然后在聊天中引用这个预配置的工具:

"使用 myapp-db 工具查询我们的产品表,找出价格高于100的商品。"

yaml
tools:
  - name: "expensive-products"
    tool: "databaseClient"
    args:
      databaseType: "postgresql"
      serverName: "localhost"
      database: "myapp"
      username: "postgres"
      password: "${DB_PASSWORD}"
      queries:
        - "SELECT name, price FROM products WHERE price > 100 ORDER BY price DESC"

快速开始

基本查询

yaml
tools:
  - name: "basic-query"
    tool: "databaseClient"
    args:
      databaseType: "postgresql"
      serverName: "localhost"
      database: "myapp"
      username: "postgres"
      password: "${DB_PASSWORD}"
      queries:
        - "SELECT COUNT(*) FROM users"
        - "SELECT * FROM users ORDER BY created_at DESC LIMIT 10"

使用连接字符串

yaml
tools:
  - name: "connection-string-example"
    tool: "databaseClient"
    args:
      databaseType: "mysql"
      connectionString: "mysql://user:${DB_PASSWORD}@localhost:3306/testdb"
      queries:
        - "SHOW TABLES"
        - "DESCRIBE users"

SQLite 数据库

yaml
tools:
  - name: "sqlite-example"
    tool: "databaseClient"
    args:
      databaseType: "sqlite"
      connectionString: "/path/to/database.db"
      queries:
        - "SELECT name FROM sqlite_master WHERE type='table'"
        - "SELECT * FROM products WHERE price > 100"

高级用法

复杂分析

yaml
tools:
  - name: "complex-analytics"
    tool: "databaseClient"
    args:
      databaseType: "postgresql"
      serverName: "analytics-db.example.com"
      database: "analytics"
      username: "analyst"
      password: "${ANALYTICS_DB_PASSWORD}"
      queries:
        - "SELECT DATE_TRUNC('day', created_at) as date, COUNT(*) as signups FROM users WHERE created_at >= NOW() - INTERVAL '30 days' GROUP BY DATE_TRUNC('day', created_at) ORDER BY date"
        - "SELECT u.name, COUNT(o.id) as order_count, SUM(o.total) as total_spent FROM users u LEFT JOIN orders o ON u.id = o.user_id WHERE o.created_at >= NOW() - INTERVAL '7 days' GROUP BY u.id, u.name ORDER BY total_spent DESC LIMIT 10"
      outputFormat: "json"

基于文件的查询

yaml
tools:
  - name: "file-based-query"
    tool: "databaseClient"
    args:
      databaseType: "postgresql"
      serverName: "localhost"
      database: "analytics"
      username: "analyst"
      password: "${DB_PASSWORD}"
      queryFile: "/path/to/complex-analytics.sql"
      outputFormat: "csv"
      maxRows: 5000

Dry Run 模式

yaml
tools:
  - name: "dry-run-example"
    tool: "databaseClient"
    args:
      databaseType: "mysql"
      serverName: "localhost"
      database: "production"
      username: "admin"
      password: "${ADMIN_PASSWORD}"
      queries:
        - "UPDATE users SET status = 'active' WHERE last_login >= '2024-01-01'"
      dryRun: true # 仅验证,不执行

安全特性

查询验证

  • SQL注入防护
  • 危险操作检测
  • 权限级别检查
  • 查询复杂度分析

访问控制

  • 基于角色的访问
  • 数据库特定权限
  • 只读模式选项
  • 审计日志记录

数据保护

  • 连接加密 (SSL/TLS)
  • 凭据掩码
  • 敏感数据过滤
  • 查询结果限制

性能优化

连接池

  • 重用数据库连接
  • 可配置的池大小
  • 连接超时管理
  • 健康检查监控

查询优化

  • 自动索引建议
  • 查询计划分析
  • 性能建议
  • 慢查询检测

使用场景

📈 商业智能

从您的数据中生成商业洞察:

yaml
tools:
  - name: "business-intelligence"
    tool: "databaseClient"
    args:
      databaseType: "postgresql"
      serverName: "bi-db"
      database: "warehouse"
      username: "analyst"
      password: "${BI_PASSWORD}"
      queries:
        - "SELECT category, SUM(amount) as revenue FROM sales WHERE date >= '2024-01-01' GROUP BY category ORDER BY revenue DESC"
        - "SELECT product_id, name, COUNT(*) as sales_count FROM sales JOIN products ON sales.product_id = products.id GROUP BY product_id, name ORDER BY sales_count DESC LIMIT 20"

🔍 数据质量分析

检查数据质量和完整性:

yaml
tools:
  - name: "data-quality-analysis"
    tool: "databaseClient"
    args:
      databaseType: "mysql"
      serverName: "db-server"
      database: "production"
      username: "analyst"
      password: "${DB_PASSWORD}"
      queries:
        - "SELECT COUNT(*) as total_users, COUNT(email) as users_with_email FROM users"
        - "SELECT 'duplicates' as issue, COUNT(*) as count FROM users GROUP BY email HAVING COUNT(*) > 1"
        - "SELECT 'missing_orders' as issue, COUNT(*) as count FROM users u LEFT JOIN orders o ON u.id = o.user_id WHERE o.id IS NULL"

📊 性能监控

监控数据库性能:

yaml
tools:
  - name: "performance-monitoring"
    tool: "databaseClient"
    args:
      databaseType: "postgresql"
      serverName: "prod-db"
      database: "postgres"
      username: "monitor"
      password: "${MONITOR_PASSWORD}"
      queries:
        - "SELECT datname, numbackends, xact_commit, xact_rollback, blks_read, blks_hit FROM pg_stat_database WHERE datname NOT IN ('template0', 'template1')"
        - "SELECT query, calls, total_time, mean_time FROM pg_stat_statements ORDER BY total_time DESC LIMIT 10"

🔧 数据库维护

执行数据库维护任务:

yaml
tools:
  - name: "database-maintenance"
    tool: "databaseClient"
    args:
      databaseType: "postgresql"
      serverName: "prod-db"
      database: "app_production"
      username: "dba"
      password: "${DBA_PASSWORD}"
      queries:
        - "VACUUM ANALYZE users"
        - "REINDEX TABLE users"
        - "SELECT schemaname, tablename, n_tup_ins, n_tup_upd, n_tup_del FROM pg_stat_user_tables"

最佳实践

  1. 使用连接池:启用连接重用以获得更好的性能
  2. 设置行限制:防止大数据集压垮系统
  3. 使用Dry Run:在生产环境中执行前验证查询
  4. 监控性能:定期检查查询性能和优化
  5. 保护凭据:使用具有适当安全性的连接字符串
  6. 批处理查询:将相关查询分组在一起
  7. 选择合适的格式:使用CSV/JSON进行数据导出,使用表格进行分析