AI Services
AI Butler is model-agnostic and provider-agnostic. You bring the keys; AI Butler handles the plumbing. This page covers the optional AI services beyond the core text model.
Speech-to-Text (STT)
Used for voice messages on Telegram, Discord, Slack, WhatsApp, and the web chat microphone.
| Provider | Notes |
|---|---|
| whisper | Default. Whisper.cpp via local binary when present, otherwise Whisper API. |
| stub | No-op — useful for testing without audio |
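The fallback described in the whisper row (local binary first, hosted API otherwise) can be pictured with a small Python sketch. This is an illustration only, not AI Butler's implementation: the function name `resolve_stt_backend`, the binary names it searches for, and the labels it returns are all assumptions made for the example.

```python
import shutil
from typing import Callable, Optional, Sequence

# Hypothetical binary names; the names AI Butler actually looks for may differ.
CANDIDATE_BINARIES = ("whisper-cli", "whisper-cpp")


def resolve_stt_backend(
    provider: str,
    candidates: Sequence[str] = CANDIDATE_BINARIES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Return an illustrative label for the backend a provider setting selects."""
    if provider == "stub":
        return "stub"  # no-op provider: nothing is transcribed
    if provider != "whisper":
        raise ValueError(f"unknown stt_provider: {provider!r}")
    # Prefer a local whisper.cpp binary when one is found on PATH.
    for name in candidates:
        path = which(name)
        if path:
            return f"local:{path}"
    # No local binary found: fall back to the hosted Whisper API.
    return "api"
```

Passing `which` as a parameter keeps the lookup easy to exercise without a real binary installed; by default it uses `shutil.which` from the standard library.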

    configurations:
      voice:
        stt_provider: whisper

Check current status:

    aibutler voice status
    aibutler voice providers

Text-to-Speech (TTS)
Used for voice replies on channels that support voice messages (Telegram, Discord, Slack, WhatsApp).
| Provider | Notes |
|---|---|
| stub | Default — no-op (no audio output) |
| piper | Fully local CPU-only TTS via the Piper binary |

    configurations:
      voice:
        tts_provider: piper

Vision
Image understanding is handled by the primary model if it’s vision-capable (Claude 3+, GPT-4o, Gemini 1.5+, LLaVA via Ollama). No extra configuration is needed; just send an image in any channel.
AI Image Generation
Tool-based image generation for creative tasks.
| Provider | Tool | Vault Key |
|---|---|---|
| DALL-E 3 | image.generate | openai_api_key |
| Stable Diffusion | image.generate | stability_api_key |
| Flux | image.generate | replicate_api_key |
AI Design Tools
Integration with design platforms for generating branded assets.
| Provider | Tools | Vault Key |
|---|---|---|
| Canva | design.canva_create, design.canva_get | canva_api_key |
| Figma | design.figma_read, design.figma_comment | figma_api_key |
3D Generation
Text-to-3D and image-to-3D for creative and smart-home projects.
| Provider | Tools | Vault Key |
|---|---|---|
| Meshy | threed.meshy_text_to_3d | meshy_api_key |
| Tripo | threed.tripo_text_to_3d | tripo_api_key |
| Luma | threed.luma_genie | luma_api_key |
BYOK Pattern
Every AI service follows the same pattern. Tools are always registered, but they return "configure API key" errors until you store the credential in the vault:

    aibutler vault set canva_api_key YOUR_KEY

This lets you enable services one at a time without restarting or touching config files.
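To make the pattern concrete, here is a minimal Python sketch of "always registered, checked at call time". It is an assumption-laden illustration rather than AI Butler's code: the `Vault` class, the `call_tool` function, the result shape, and the exact error wording are hypothetical. Only the tool names and vault key names come from the tables on this page.

```python
from typing import Dict, Optional

# Tool names and vault keys as listed in the tables on this page.
REQUIRED_KEY: Dict[str, str] = {
    "design.canva_create": "canva_api_key",
    "threed.meshy_text_to_3d": "meshy_api_key",
}


class Vault:
    """In-memory stand-in for the credential vault (hypothetical)."""

    def __init__(self) -> None:
        self._secrets: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self._secrets[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._secrets.get(key)


def call_tool(vault: Vault, tool: str, prompt: str) -> dict:
    """Every tool is callable; its credential is looked up on each call."""
    vault_key = REQUIRED_KEY[tool]
    if vault.get(vault_key) is None:
        # Assumed wording; the source only says a "configure API key" error.
        return {"ok": False, "error": f"configure API key: {vault_key}"}
    # A real tool would call the provider here; the sketch only echoes.
    return {"ok": True, "tool": tool, "prompt": prompt}
```

Because the lookup happens on each call, a tool that failed a moment ago starts succeeding as soon as `vault.set("canva_api_key", ...)` has run, which mirrors the no-restart behavior described above.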
Local-First Option
For a fully-local deployment with zero API keys:
- Text model — Ollama (Llama 3.3, Mistral, Qwen, etc.)
- STT — whisper.cpp or Ollama
- TTS — Piper
- Embeddings — Ollama
- Vision — Ollama with LLaVA

    docker compose -f docker-compose.ollama.yml up -d

See Choose Your AI for a comparison of providers.
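Once the local stack is running, one way to confirm that Ollama is serving models is to query its `/api/tags` endpoint, which lists the models that have been pulled. The Python sketch below assumes the compose file publishes Ollama's default port (11434) on the host, which this page does not confirm. The helper names `model_names` and `list_local_models` are hypothetical.

```python
import json
import urllib.request
from typing import List

# Ollama's default address; assumes the compose file publishes this port.
OLLAMA_URL = "http://localhost:11434"


def model_names(tags_payload: dict) -> List[str]:
    """Extract model names from an Ollama /api/tags response body."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> List[str]:
    """Ask a running Ollama server which models it has pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return model_names(json.load(resp))
```

An empty list from `list_local_models()` would suggest the server is reachable but no models have been pulled yet. A connection error would suggest the container is not up or the port is not published.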