
Audio Council

Speech-to-text (STT) and text-to-speech (TTS) models for transcription, voice synthesis, and audio processing.
🎙️ Use case: Transcription, voice synthesis
Models: 6
Providers: 4
Capabilities: STT+TTS
Best price: Free

📊 Council Models

| Model | Provider | Type | Speed | Rating | Price | Notes |
|---|---|---|---|---|---|---|
| whisper-large-v3 | Groq | STT | 212 t/s | ★★★½★ | Free | Industry-standard transcription |
| whisper-large-v3-turbo | Groq | STT | 212 t/s | ★★★½★ | Free | Optimized for speed |
| step-asr | StepFun | STT | 93 t/s | ★★★★★ | | StepFun ASR engine |
| step-tts | StepFun | TTS | 93 t/s | ★★★★★ | | StepFun voice synthesis |
| kokoro-82m | Infermatic | TTS | 38 t/s | ★★★★★ | | Lightweight TTS, natural voice |
| Seed Speech | BytePlus | TTS | 84 t/s | ★★★½★ | | BytePlus voice synthesis |

⚖️ Pros & Cons

✅ Pros

  • Groq-hosted Whisper is free and fast (212 t/s for both Whisper models listed)
  • A full STT+TTS pipeline is available by combining providers
  • whisper-large-v3 is the industry standard for transcription
  • Most models support multiple languages
  • Real-time streaming is available

⚠️ Cons

  • Fewer model options than other councils
  • TTS quality varies significantly between providers
  • Only Groq offers free STT
  • No single provider offers both STT and TTS for free

🚀 Try This Council

Transcribe audio with Whisper (free) or synthesize speech.

# Audio Council: Transcribe audio with Whisper (free tier; rate limits apply)
curl -X POST https://api.groq.com/openai/v1/audio/transcriptions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -F "file=@recording.mp3" \
  -F "model=whisper-large-v3" \
  -F "response_format=json"

# Audio Council: Text-to-speech (this example uses Groq's playai-tts model, which is not in the Council Models table)
curl -X POST https://api.groq.com/openai/v1/audio/speech \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "playai-tts",
    "input": "Hello! Welcome to the LLM Council Audio service.",
    "voice": "Fritz-PlayAI"
  }' --output speech.mp3
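
The two requests above can be chained into a small speech-in, speech-out pipeline. The sketch below is one possible way to do that, not an official client. The function names (`json_escape`, `tts_payload`, `transcribe`, `speak`) and the `GROQ_BASE` variable are hypothetical helpers introduced for illustration, and it assumes the transcription endpoint accepts `response_format=text` so that the transcript comes back as plain text rather than JSON.

```shell
# Sketch: chain transcription and synthesis. Assumes GROQ_API_KEY is set.
# All function names here are illustrative, not part of any API.

GROQ_BASE="https://api.groq.com/openai/v1"

# Make text safe to embed in a JSON string: flatten newlines, carriage
# returns and tabs to spaces, then escape backslashes and double quotes.
# Other control characters are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | tr '\n\r\t' '   ' | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the JSON request body for the text-to-speech call.
tts_payload() {
  printf '{"model":"playai-tts","input":"%s","voice":"Fritz-PlayAI"}' \
    "$(json_escape "$1")"
}

# Transcribe an audio file and print the transcript.
# Assumption: response_format=text returns the bare transcript.
transcribe() {
  curl -sS --fail -X POST "$GROQ_BASE/audio/transcriptions" \
    -H "Authorization: Bearer $GROQ_API_KEY" \
    -F "file=@$1" \
    -F "model=whisper-large-v3" \
    -F "response_format=text"
}

# Synthesize the given text ($1) into an audio file ($2).
speak() {
  curl -sS --fail -X POST "$GROQ_BASE/audio/speech" \
    -H "Authorization: Bearer $GROQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(tts_payload "$1")" --output "$2"
}

# Usage (requires network access and a valid key):
#   text=$(transcribe recording.mp3) && speak "$text" speech.mp3
```

Escaping the transcript before embedding it matters because transcribed speech often contains quotation marks, which would otherwise produce an invalid JSON body. If `jq` is available, building the payload with it is likely more robust than the `sed`-based escaping shown here.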