Audio

Text-to-speech (TTS) and speech-to-text (STT) capabilities for voice-enabled applications.

Prism Reference

Atlas audio wraps Prism's audio API. For detailed documentation including all provider options, see Prism Audio.

Text to Speech

Convert text to spoken audio:

php

use Atlasphp\Atlas\Atlas;
use Illuminate\Support\Facades\Storage;

$response = Atlas::audio()
    ->using('openai', 'tts-1')
    ->withVoice('nova')
    ->withInput('Hello, welcome to our service!')
    ->asAudio();

// Save audio file
Storage::put('welcome.mp3', $response->audio->rawContent());

Speech to Text

Transcribe audio to text:

php

use Atlasphp\Atlas\Atlas;
use Prism\Prism\ValueObjects\Media\Audio;

$response = Atlas::audio()
    ->using('openai', 'whisper-1')
    ->withInput(Audio::fromLocalPath('/path/to/audio.mp3'))
    ->asText();

echo $response->text;  // Transcribed text

Text-to-Speech Examples

With Voice Selection

php

$response = Atlas::audio()
    ->using('openai', 'tts-1')
    ->withVoice('onyx')  // Deep, authoritative voice
    ->withInput('Important announcement.')
    ->asAudio();

OpenAI voices: alloy, echo, fable, onyx, nova, shimmer
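The voice is passed as a plain string, so a misspelled name would likely surface only when the provider rejects the request. A minimal sketch of checking the name up front, assuming the six OpenAI voices listed above; `resolveVoice()` and its fallback behaviour are hypothetical and not part of Atlas or Prism:

```php
// Hypothetical helper (not part of Atlas or Prism): checks a requested voice
// against the OpenAI voices listed in this guide and falls back to a default.
const OPENAI_VOICES = ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer'];

function resolveVoice(string $requested, string $fallback = 'alloy'): string
{
    // Normalise user input such as " Nova " before comparing.
    $voice = strtolower(trim($requested));

    return in_array($voice, OPENAI_VOICES, true) ? $voice : $fallback;
}
```

The result can be passed straight to the builder, for example `->withVoice(resolveVoice($preferredVoice))`, where `$preferredVoice` stands in for whatever your application stores.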

HD Quality

php

$response = Atlas::audio()
    ->using('openai', 'tts-1-hd')
    ->withVoice('alloy')
    ->withInput('High-definition audio quality.')
    ->asAudio();

Speech-to-Text Examples

Basic Transcription

php

use Atlasphp\Atlas\Atlas;
use Prism\Prism\ValueObjects\Media\Audio;

$response = Atlas::audio()
    ->using('openai', 'whisper-1')
    ->withInput(Audio::fromLocalPath($request->file('audio')->path()))
    ->asText();

return response()->json([
    'text' => $response->text,
]);

With Language Hint

php

use Atlasphp\Atlas\Atlas;
use Prism\Prism\ValueObjects\Media\Audio;

$response = Atlas::audio()
    ->using('openai', 'whisper-1')
    ->withInput(Audio::fromLocalPath('/path/to/audio.mp3'))
    ->withProviderOptions(['language' => 'en'])
    ->asText();
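The `language` option takes a two-letter ISO-639-1 code, while application locales often look like `en_US` or `pt-BR`. A small sketch of deriving the hint from a locale string; `languageHint()` and `transcriptionOptions()` are hypothetical helpers, not part of Atlas or Prism, and the two-letter check is only a shape check, not a lookup against the ISO list:

```php
// Hypothetical helper: reduce a locale such as "pt-BR" or "en_US" to its
// two-letter primary language subtag, or null when it has another shape.
function languageHint(string $locale): ?string
{
    // Keep only the part before the first "_" or "-".
    $primary = strtolower(preg_split('/[_-]/', trim($locale), 2)[0]);

    // ISO-639-1 codes are exactly two letters; anything else yields no hint.
    return preg_match('/^[a-z]{2}$/', $primary) === 1 ? $primary : null;
}

// Hypothetical helper: build the array for withProviderOptions(), omitting
// the "language" key entirely when no usable hint could be derived.
function transcriptionOptions(string $locale): array
{
    $language = languageHint($locale);

    return $language === null ? [] : ['language' => $language];
}
```

In a Laravel application this could be wired in as `->withProviderOptions(transcriptionOptions(app()->getLocale()))`, leaving the provider to detect the language when no hint is available.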

Examples

Example: Voice Notification Service

php

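use App\Models\User; // Assumes the default Laravel User model
use Atlasphp\Atlas\Atlas;
use Illuminate\Support\Facades\Storage;
use Illuminate\Support\Str;

class NotificationService
{
    public function sendVoiceNotification(User $user, string $message): void
    {
        // Generate the spoken message
        $response = Atlas::audio()
            ->using('openai', 'tts-1')
            ->withVoice('nova')
            ->withInput($message)
            ->asAudio();

        // Store the audio under a unique name
        $filename = 'notifications/' . Str::uuid() . '.mp3';
        Storage::put($filename, $response->audio->rawContent());

        // $this->voiceService is an application-specific telephony client (not part of Atlas);
        // the file must be stored on a publicly reachable disk for Storage::url() to be callable
        $this->voiceService->call($user->phone, Storage::url($filename));
    }
}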

Example: Meeting Transcription

php

use Atlasphp\Atlas\Atlas;
use Prism\Prism\ValueObjects\Media\Audio;

class MeetingService
{
    public function processRecording(string $audioPath): array
    {
        // Transcribe
        $transcription = Atlas::audio()
            ->using('openai', 'whisper-1')
            ->withInput(Audio::fromLocalPath($audioPath))
            ->asText();

        // Summarize with AI
        $summary = Atlas::agent('summarizer')
            ->chat("Summarize this meeting transcript:\n\n{$transcription->text}");

        return [
            'transcript' => $transcription->text,
            'summary' => $summary->text,
        ];
    }
}

Voice Characteristics

Voice	Description
`alloy`	Neutral, balanced
`echo`	Clear, confident
`fable`	Warm, expressive
`onyx`	Deep, authoritative
`nova`	Friendly, natural
`shimmer`	Clear, energetic

Pipeline Hooks

Audio operations support pipeline middleware for observability:

Pipeline	Trigger
`audio.before_audio`	Before text-to-speech
`audio.after_audio`	After text-to-speech
`audio.before_text`	Before speech-to-text
`audio.after_text`	After speech-to-text

php

use Atlasphp\Atlas\Contracts\PipelineContract;
use Closure;
use Illuminate\Support\Facades\Log;

class LogAudioGeneration implements PipelineContract
{
    public function handle(mixed $data, Closure $next): mixed
    {
        $result = $next($data);

        Log::info('Audio generated', [
            'user_id' => $data['metadata']['user_id'] ?? null,
        ]);

        return $result;
    }
}

// $registry is the Atlas pipeline registry instance (see Pipelines for how to obtain it)
$registry->register('audio.after_audio', LogAudioGeneration::class);

API Reference

php

// Text-to-Speech fluent API
Atlas::audio()
    ->using(string $provider, string $model)              // Set provider and model
    ->withVoice(string $voice)                            // Voice selection
    ->withInput(string $text)                             // Text to convert
    ->withProviderOptions(array $options)                 // Provider-specific options
    ->withMetadata(array $metadata)                       // Pipeline metadata
    ->asAudio(): AudioResponse;

// Speech-to-Text fluent API
use Prism\Prism\ValueObjects\Media\Audio;

Atlas::audio()
    ->using(string $provider, string $model)              // Set provider and model
    ->withInput(Audio::fromLocalPath(string $path))       // Audio file to transcribe
    ->withProviderOptions(array $options)                 // Provider-specific options
    ->withMetadata(array $metadata)                       // Pipeline metadata
    ->asText(): TextResponse;

// Text-to-Speech response (AudioResponse)
$response->audio->rawContent();  // Raw audio bytes (save directly to file)

// Speech-to-Text response (TextResponse)
$response->text;                 // Transcribed text

// OpenAI TTS models
->using('openai', 'tts-1')      // Standard quality
->using('openai', 'tts-1-hd')   // High definition

// OpenAI STT models
->using('openai', 'whisper-1')  // Whisper transcription

// OpenAI voices
->withVoice('alloy')    // Neutral, balanced
->withVoice('echo')     // Clear, confident
->withVoice('fable')    // Warm, expressive
->withVoice('onyx')     // Deep, authoritative
->withVoice('nova')     // Friendly, natural
->withVoice('shimmer')  // Clear, energetic

// Common provider options (via withProviderOptions)
// OpenAI TTS:
->withProviderOptions([
    'speed' => 1.0,              // 0.25 to 4.0
    'response_format' => 'mp3',  // 'mp3', 'opus', 'aac', 'flac'
])

// OpenAI STT:
->withProviderOptions([
    'language' => 'en',          // ISO-639-1 language code
    'temperature' => 0,          // 0 to 1
])
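The TTS options above have documented bounds, so it can help to normalise them before they reach the provider. A minimal sketch, assuming only the `speed` range and the `response_format` values listed in this reference; `ttsOptions()` is a hypothetical helper, not part of Atlas or Prism, and clamping (rather than rejecting) an out-of-range speed is a design choice made for illustration:

```php
// Hypothetical helper (not part of Atlas or Prism): builds the array passed to
// withProviderOptions() for OpenAI TTS, using the ranges listed in this guide.
function ttsOptions(float $speed = 1.0, string $format = 'mp3'): array
{
    $formats = ['mp3', 'opus', 'aac', 'flac'];

    if (!in_array($format, $formats, true)) {
        throw new InvalidArgumentException("Unsupported response_format: {$format}");
    }

    return [
        // Clamp into the documented 0.25 to 4.0 range instead of failing.
        'speed' => max(0.25, min(4.0, $speed)),
        'response_format' => $format,
    ];
}
```

The returned array can be passed directly, for example `->withProviderOptions(ttsOptions(1.25, 'opus'))`. If you request a format other than `mp3`, remember to save the file with a matching extension.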

Next Steps

Prism Audio — Complete audio reference
Chat — Combine with chat for voice assistants
Pipelines — Add observability to audio operations
