Processing audio
Some Language Models support generating audio, processing audio inputs, or both. The following are examples for how to use the capabilities of these types of models in your application.
For a list of models capable of audio input or output, please refer to the multimodal models on the Models page. On that page, you will can also lookup parameters which are model specific.
Transcribe audio into text
The following is an example of how you can use the audio input features to transcribe a sound file into text.
Generate audio from text (experimental)
The following is an example of how you can use the audio output: