OpenAI gpt-audio-1.5 × BibiGPT
On 2026-04-23 OpenAI shipped gpt-audio-1.5 alongside GPT-5.5 — an upgraded speech-in / speech-out model with lower latency and richer expression. BibiGPT pipes its multilingual subtitles, summaries, and podcast scripts straight into gpt-audio-1.5 to produce ready-to-publish video narration without a recording booth.
Key facts (90-second read)
OpenAI released gpt-audio-1.5 on 2026-04-23 alongside GPT-5.5 — a unified speech-in / speech-out model with lower latency and richer expressive control than gpt-audio. Pair it with BibiGPT's multilingual subtitles, AI summaries, and chaptered transcripts and you get an end-to-end video narration / dubbing / summary-to-podcast pipeline without booking voiceover talent.
Features
What is gpt-audio-1.5?
gpt-audio-1.5 is OpenAI's upgraded speech-in / speech-out model launched on 2026-04-23 alongside GPT-5.5. Same Realtime + Audio API surface, lower latency and stronger expressive control than gpt-audio.
Speech-in / speech-out in one model
Accept audio input and emit audio output without bouncing through a separate ASR + TTS stack. Cuts round-trip latency for live narration, dubbing, and conversational flows.
Tunable voice and expression
Inherits gpt-audio's expressive style controls and adds finer-grained pacing and emphasis steering — closer to studio narration without re-recording takes.
Released with GPT-5.5
Ships alongside the GPT-5.5 reasoning upgrade on 2026-04-23. Pair gpt-audio-1.5 for narration with GPT-5.5 for the underlying script and you stay inside one OpenAI stack.
Why it matters for BibiGPT users
BibiGPT already turns Bilibili / YouTube / podcasts into multilingual scripts, subtitles, and summaries. gpt-audio-1.5 is the missing last mile for narration, dubbing, and summary-to-podcast workflows.
Subtitle-driven AI narration
Pipe BibiGPT's translated subtitles or AI summary scripts into gpt-audio-1.5 and ship a redubbed video in zh / en / ja / ko without booking a voiceover artist or studio.
Long video to short narrated clip
Use BibiGPT to generate chapter highlights from a 60-minute lecture, then narrate just the highlight chunk through gpt-audio-1.5 — short-form social posts shipped in minutes.
Summary-to-podcast pipeline
Turn a BibiGPT-generated summary or follow-up Q&A into a hosted podcast episode. gpt-audio-1.5 handles the voice; BibiGPT handles the script, chaptering, and translation.
5 key changes (90-second read)
All sourced from the OpenAI API model docs and the 2026-04-23 release alongside GPT-5.5.
- 1
Released 2026-04-23 with GPT-5.5
gpt-audio-1.5 ships the same day as GPT-5.5 (codename Spud). Audio + Realtime API users picked it up day one; pricing and availability published in the OpenAI API model docs.
- 2
Speech-in / speech-out unified
One model handles both audio input understanding and audio output generation, removing the ASR + TTS round trip. Simpler stacks for live agents, dubbing, and conversational replies.
- 3
Lower latency than gpt-audio
Latency improvements compared to the original gpt-audio at the same expressive quality — better for real-time narration loops and live podcast / interview workflows.
- 4
Stronger expression and steering
Finer-grained pacing, emphasis, and emotion control versus gpt-audio. Same script can land as serious / playful / casual without re-recording takes.
- 5
Pairs with the GPT-5.5 reasoning upgrade
GPT-5.5 generates the script (Terminal-Bench 2.0 at 82.7%, FrontierMath at 35.4%); gpt-audio-1.5 narrates it. End-to-end OpenAI stack for narrated explainers, agent-driven dubbing, and summary podcasts.
3 typical scenarios for BibiGPT users
Grounded in real BibiGPT user personas; all already actionable today via the OpenAI Audio / Realtime API.
General creators — AI dubbing
Run a YouTube / Bilibili video through BibiGPT for translated subtitles in zh / en / ja / ko, then narrate the translated track via gpt-audio-1.5. One source video, four-language redub, no studio.
BibiGPT users — long video to short narrated clip
Students, teachers, and creators feed lecture or course videos into BibiGPT for chapter segmentation + highlight summaries, then narrate just the highlight chunks through gpt-audio-1.5 for short-form social posts.
Advanced combo — summary to podcast
BibiGPT summarizes a podcast episode or research video into a structured script → GPT-5.5 polishes and adds host / guest segments → gpt-audio-1.5 narrates → ship a recap podcast, entirely inside the OpenAI + BibiGPT stack.
FAQ'S
Frequently Asked Questions
Ask us anything!
Turn any video into narration-ready scripts with BibiGPT
BibiGPT summarizes YouTube, Bilibili, and podcasts into multilingual scripts and subtitles. Plug the output into OpenAI gpt-audio-1.5 (Audio / Realtime API) and you get publish-ready narration. No custom stack, no learning curve.