数据库集成工具
ByteBuddy 的数据库集成功能通过 databaseClient 工具实现,允许您连接到各种数据库并执行 SQL 查询进行故障排除和数据分析。
工具参数
| 参数 | 类型 | 必需 | 描述 |
|---|---|---|---|
| databaseType | string | 是 | 数据库类型 (postgresql, mysql, sqlite, oracle, sqlserver, mongodb, redis, clickhouse, bigquery, snowflake) |
| serverName | string | 否 | 数据库服务器主机名或IP地址 |
| port | number | 否 | 数据库服务器端口 |
| database | string | 否 | 要连接的数据库名称 |
| username | string | 否 | 数据库用户名 |
| password | string | 否 | 数据库密码 |
| connectionString | string | 否 | 完整连接字符串(替代单独的连接参数) |
| queries | string[] | 否 | 要执行的SQL查询数组 |
| queryFile | string | 否 | 包含要执行的SQL查询的文件路径 |
| timeout | number | 否 | 查询超时时间(秒,默认:30) |
| outputFormat | string | 否 | 查询结果的输出格式(default, csv, json, vertical) |
| maxRows | number | 否 | SELECT查询返回的最大行数(默认:1000) |
| dryRun | boolean | 否 | 如果为true,则只验证查询而不执行(默认:false) |
支持的数据库
关系型数据库
- PostgreSQL (
postgresql) - 完整功能支持,高级 SQL 能力 - MySQL (
mysql) - 完整的 MySQL 数据库支持 - SQLite (
sqlite) - 轻量级文件型数据库支持 - SQL Server (
sqlserver) - Microsoft SQL Server 集成 - Oracle (
oracle) - Oracle 数据库连接
大数据分析
- ClickHouse (
clickhouse) - 高性能分析数据库 - Google BigQuery (
bigquery) - 云数据仓库 - Snowflake (
snowflake) - 云数据平台
NoSQL 数据库
- MongoDB (
mongodb) - 文档数据库支持 - Redis (
redis) - 内存数据结构存储
会话使用示例
在聊天中直接使用数据库工具
用户可以直接在聊天中请求数据库操作:
"请帮我检查我们用户表中的用户数量,并查看最近注册的10个用户。"
yaml
tools:
- name: "check-users"
tool: "databaseClient"
args:
databaseType: "postgresql"
serverName: "localhost"
database: "myapp"
username: "postgres"
password: "${DB_PASSWORD}"
queries:
- "SELECT COUNT(*) as total_users FROM users"
- "SELECT id, name, email, created_at FROM users ORDER BY created_at DESC LIMIT 10"结合配置文件使用
在项目中创建一个配置文件 .bytebuddy/tools.yaml 来预定义数据库连接:
yaml
tools:
- name: "myapp-db"
tool: "databaseClient"
args:
databaseType: "postgresql"
serverName: "localhost"
database: "myapp"
username: "postgres"
password: "${DB_PASSWORD}"然后在聊天中引用这个预配置的工具:
"使用 myapp-db 工具查询我们的产品表,找出价格高于100的商品。"
yaml
tools:
- name: "expensive-products"
tool: "databaseClient"
args:
databaseType: "postgresql"
serverName: "localhost"
database: "myapp"
username: "postgres"
password: "${DB_PASSWORD}"
queries:
- "SELECT name, price FROM products WHERE price > 100 ORDER BY price DESC"快速开始
基本查询
yaml
tools:
- name: "basic-query"
tool: "databaseClient"
args:
databaseType: "postgresql"
serverName: "localhost"
database: "myapp"
username: "postgres"
password: "${DB_PASSWORD}"
queries:
- "SELECT COUNT(*) FROM users"
- "SELECT * FROM users ORDER BY created_at DESC LIMIT 10"使用连接字符串
yaml
tools:
- name: "connection-string-example"
tool: "databaseClient"
args:
databaseType: "mysql"
connectionString: "mysql://user:${DB_PASSWORD}@localhost:3306/testdb"
queries:
- "SHOW TABLES"
- "DESCRIBE users"SQLite 数据库
yaml
tools:
- name: "sqlite-example"
tool: "databaseClient"
args:
databaseType: "sqlite"
connectionString: "/path/to/database.db"
queries:
- "SELECT name FROM sqlite_master WHERE type='table'"
- "SELECT * FROM products WHERE price > 100"高级用法
复杂分析
yaml
tools:
- name: "complex-analytics"
tool: "databaseClient"
args:
databaseType: "postgresql"
serverName: "analytics-db.example.com"
database: "analytics"
username: "analyst"
password: "${ANALYTICS_DB_PASSWORD}"
queries:
- "SELECT DATE_TRUNC('day', created_at) as date, COUNT(*) as signups FROM users WHERE created_at >= NOW() - INTERVAL '30 days' GROUP BY DATE_TRUNC('day', created_at) ORDER BY date"
- "SELECT u.name, COUNT(o.id) as order_count, SUM(o.total) as total_spent FROM users u LEFT JOIN orders o ON u.id = o.user_id WHERE o.created_at >= NOW() - INTERVAL '7 days' GROUP BY u.id, u.name ORDER BY total_spent DESC LIMIT 10"
outputFormat: "json"基于文件的查询
yaml
tools:
- name: "file-based-query"
tool: "databaseClient"
args:
databaseType: "postgresql"
serverName: "localhost"
database: "analytics"
username: "analyst"
password: "${DB_PASSWORD}"
queryFile: "/path/to/complex-analytics.sql"
outputFormat: "csv"
maxRows: 5000Dry Run 模式
yaml
tools:
- name: "dry-run-example"
tool: "databaseClient"
args:
databaseType: "mysql"
serverName: "localhost"
database: "production"
username: "admin"
password: "${ADMIN_PASSWORD}"
queries:
- "UPDATE users SET status = 'active' WHERE last_login >= '2024-01-01'"
dryRun: true # 仅验证,不执行安全特性
查询验证
- SQL注入防护
- 危险操作检测
- 权限级别检查
- 查询复杂度分析
访问控制
- 基于角色的访问
- 数据库特定权限
- 只读模式选项
- 审计日志记录
数据保护
- 连接加密 (SSL/TLS)
- 凭据掩码
- 敏感数据过滤
- 查询结果限制
性能优化
连接池
- 重用数据库连接
- 可配置的池大小
- 连接超时管理
- 健康检查监控
查询优化
- 自动索引建议
- 查询计划分析
- 性能建议
- 慢查询检测
使用场景
📈 商业智能
从您的数据中生成商业洞察:
yaml
tools:
- name: "business-intelligence"
tool: "databaseClient"
args:
databaseType: "postgresql"
serverName: "bi-db"
database: "warehouse"
username: "analyst"
password: "${BI_PASSWORD}"
queries:
- "SELECT category, SUM(amount) as revenue FROM sales WHERE date >= '2024-01-01' GROUP BY category ORDER BY revenue DESC"
- "SELECT product_id, name, COUNT(*) as sales_count FROM sales JOIN products ON sales.product_id = products.id GROUP BY product_id, name ORDER BY sales_count DESC LIMIT 20"🔍 数据质量分析
检查数据质量和完整性:
yaml
tools:
- name: "data-quality-analysis"
tool: "databaseClient"
args:
databaseType: "mysql"
serverName: "db-server"
database: "production"
username: "analyst"
password: "${DB_PASSWORD}"
queries:
- "SELECT COUNT(*) as total_users, COUNT(email) as users_with_email FROM users"
- "SELECT 'duplicates' as issue, COUNT(*) as count FROM users GROUP BY email HAVING COUNT(*) > 1"
- "SELECT 'missing_orders' as issue, COUNT(*) as count FROM users u LEFT JOIN orders o ON u.id = o.user_id WHERE o.id IS NULL"📊 性能监控
监控数据库性能:
yaml
tools:
- name: "performance-monitoring"
tool: "databaseClient"
args:
databaseType: "postgresql"
serverName: "prod-db"
database: "postgres"
username: "monitor"
password: "${MONITOR_PASSWORD}"
queries:
- "SELECT datname, numbackends, xact_commit, xact_rollback, blks_read, blks_hit FROM pg_stat_database WHERE datname NOT IN ('template0', 'template1')"
- "SELECT query, calls, total_time, mean_time FROM pg_stat_statements ORDER BY total_time DESC LIMIT 10"🔧 数据库维护
执行数据库维护任务:
yaml
tools:
- name: "database-maintenance"
tool: "databaseClient"
args:
databaseType: "postgresql"
serverName: "prod-db"
database: "app_production"
username: "dba"
password: "${DBA_PASSWORD}"
queries:
- "VACUUM ANALYZE users"
- "REINDEX TABLE users"
- "SELECT schemaname, tablename, n_tup_ins, n_tup_upd, n_tup_del FROM pg_stat_user_tables"最佳实践
- 使用连接池:启用连接重用以获得更好的性能
- 设置行限制:防止大数据集压垮系统
- 使用Dry Run:在生产环境中执行前验证查询
- 监控性能:定期检查查询性能和优化
- 保护凭据:使用具有适当安全性的连接字符串
- 批处理查询:将相关查询分组在一起
- 选择合适的格式:使用CSV/JSON进行数据导出,使用表格进行分析