Audio Council
Speech-to-text (STT) and text-to-speech (TTS) models for transcription, voice synthesis, and audio processing.
Use case: Transcription, voice synthesis
- Models: 6
- Providers: 4
- Capabilities: STT+TTS
- Best price: Free
Council Models
| Model | Provider | Type | Speed | Rating | Price | Notes |
|---|---|---|---|---|---|---|
| whisper-large-v3 | Groq | STT | 212 t/s | ★★★½☆ | Free | Industry-standard transcription |
| whisper-large-v3-turbo | Groq | STT | 212 t/s | ★★★½☆ | Free | Optimized for speed |
| step-asr | StepFun | STT | 93 t/s | | $9/mo | StepFun ASR engine |
| step-tts | StepFun | TTS | 93 t/s | | $9/mo | StepFun voice synthesis |
| kokoro-82m | Infermatic | TTS | 38 t/s | | $9/mo | Lightweight TTS, natural voice |
| Seed Speech | BytePlus | TTS | 84 t/s | ★★★½☆ | $9/mo | BytePlus voice synthesis |
Pros & Cons
Pros
- Groq's Whisper models are free and the fastest in this council (212 t/s)
- Full STT+TTS pipeline available across providers
- whisper-large-v3 is widely treated as the industry standard for transcription
- Most models support multiple languages
- Real-time streaming is available
Cons
- Fewer model options than other councils
- TTS quality varies significantly between providers
- Only Groq offers free STT
- No single provider offers both STT and TTS for free
Try This Council
Transcribe audio with Whisper (free) or synthesize speech.
# Audio Council: Transcribe audio with Whisper (free tier)
curl -X POST https://api.groq.com/openai/v1/audio/transcriptions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -F "file=@recording.mp3" \
  -F "model=whisper-large-v3" \
  -F "response_format=json"
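With `response_format=json` the transcript arrives wrapped in a JSON body, so a script usually wants just the text. The following is a minimal sketch, assuming the body carries a top-level `text` field (the OpenAI-compatible shape) and that either `jq` or `python3` is on the PATH. `extract_transcript` and `transcribe` are hypothetical helper names, not part of any API.

```shell
# Hypothetical helper: read a transcription response on stdin and print
# only the transcript. Assumes a top-level "text" field in the JSON body.
extract_transcript() {
  if command -v jq >/dev/null 2>&1; then
    jq -r '.text'
  else
    # Fallback for systems without jq (assumes python3 is installed).
    python3 -c 'import json, sys; print(json.load(sys.stdin)["text"])'
  fi
}

# Hypothetical wrapper around the transcription request.
# $1 = path to the audio file.
transcribe() {
  curl -sS -X POST https://api.groq.com/openai/v1/audio/transcriptions \
    -H "Authorization: Bearer $GROQ_API_KEY" \
    -F "file=@$1" \
    -F "model=whisper-large-v3" \
    -F "response_format=json" | extract_transcript
}
```

Once the functions are defined, `transcribe recording.mp3 > transcript.txt` would save the plain transcript, assuming `GROQ_API_KEY` is set.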
# Audio Council: Text-to-speech
curl -X POST https://api.groq.com/openai/v1/audio/speech \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "playai-tts",
    "input": "Hello! Welcome to the LLM Council Audio service.",
    "voice": "Fritz-PlayAI"
  }' --output speech.mp3
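Because `--output` saves whatever the server returns, a failed request can leave a JSON error body in `speech.mp3` instead of audio. The following is a rough sketch of one way to guard against that, using curl's `--fail` flag plus a simple first-byte check. `looks_like_audio` and `synthesize` are hypothetical helper names, and the check is a heuristic, not real format validation.

```shell
# Hypothetical helper: succeed only when the file is non-empty and does not
# begin with "{", which would suggest a JSON error body instead of audio.
looks_like_audio() {
  [ -s "$1" ] && [ "$(head -c 1 "$1")" != "{" ]
}

# Hypothetical wrapper around the text-to-speech request. --fail makes curl
# exit non-zero on HTTP errors instead of saving the error body to the file.
# $1 = JSON payload, $2 = output path.
synthesize() {
  curl -sS --fail -X POST https://api.groq.com/openai/v1/audio/speech \
    -H "Authorization: Bearer $GROQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$1" --output "$2" && looks_like_audio "$2"
}
```

A caller could then run `synthesize "$payload" speech.mp3 || echo "TTS request failed" >&2`, where `$payload` holds the same JSON body as the request above.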