OpenAI gpt-audio-1.5 × BibiGPT

On 2026-04-23 OpenAI shipped gpt-audio-1.5 alongside GPT-5.5 — an upgraded speech-in / speech-out model with lower latency and richer expression. BibiGPT pipes its multilingual subtitles, summaries, and podcast scripts straight into gpt-audio-1.5 to produce ready-to-publish video narration without a recording booth.

Generate narration scripts in BibiGPT

Released · 2026-04-23 Speech-in / speech-out Ships with GPT-5.5

Key facts (90-second read)

OpenAI released gpt-audio-1.5 on 2026-04-23 alongside GPT-5.5 — a unified speech-in / speech-out model with lower latency and richer expressive control than gpt-audio. Pair it with BibiGPT's multilingual subtitles, AI summaries, and chaptered transcripts and you get an end-to-end video narration / dubbing / summary-to-podcast pipeline without booking voiceover talent.

What is gpt-audio-1.5?

gpt-audio-1.5 is OpenAI's upgraded speech-in / speech-out model launched on 2026-04-23 alongside GPT-5.5. Same Realtime + Audio API surface, lower latency and stronger expressive control than gpt-audio.

Speech-in / speech-out in one model

Accept audio input and emit audio output without bouncing through a separate ASR + TTS stack. Cuts round-trip latency for live narration, dubbing, and conversational flows.

Tunable voice and expression

Inherits gpt-audio's expressive style controls and adds finer-grained pacing and emphasis steering — closer to studio narration without re-recording takes.

Released with GPT-5.5

Ships alongside the GPT-5.5 reasoning upgrade on 2026-04-23. Pair gpt-audio-1.5 for narration with GPT-5.5 for the underlying script and you stay inside one OpenAI stack.

Why it matters for BibiGPT users

BibiGPT already turns Bilibili / YouTube / podcasts into multilingual scripts, subtitles, and summaries. gpt-audio-1.5 is the missing last mile for narration, dubbing, and summary-to-podcast workflows.

Subtitle-driven AI narration

Pipe BibiGPT's translated subtitles or AI summary scripts into gpt-audio-1.5 and ship a redubbed video in zh / en / ja / ko without booking a voiceover artist or studio.

Long video to short narrated clip

Use BibiGPT to generate chapter highlights from a 60-minute lecture, then narrate just the highlight chunk through gpt-audio-1.5 — short-form social posts shipped in minutes.

Summary-to-podcast pipeline

Turn a BibiGPT-generated summary or follow-up Q&A into a hosted podcast episode. gpt-audio-1.5 handles the voice; BibiGPT handles the script, chaptering, and translation.

5 key changes (90-second read)

All sourced from the OpenAI API model docs and the 2026-04-23 release alongside GPT-5.5.

1

Released 2026-04-23 with GPT-5.5

gpt-audio-1.5 ships the same day as GPT-5.5 (codename Spud). Audio + Realtime API users picked it up day one; pricing and availability published in the OpenAI API model docs.
2

Speech-in / speech-out unified

One model handles both audio input understanding and audio output generation, removing the ASR + TTS round trip. Simpler stacks for live agents, dubbing, and conversational replies.
3

Lower latency than gpt-audio

Latency improvements compared to the original gpt-audio at the same expressive quality — better for real-time narration loops and live podcast / interview workflows.
4

Stronger expression and steering

Finer-grained pacing, emphasis, and emotion control versus gpt-audio. Same script can land as serious / playful / casual without re-recording takes.
5

Pairs with the GPT-5.5 reasoning upgrade

GPT-5.5 generates the script (Terminal-Bench 2.0 at 82.7%, FrontierMath at 35.4%); gpt-audio-1.5 narrates it. End-to-end OpenAI stack for narrated explainers, agent-driven dubbing, and summary podcasts.

3 typical scenarios for BibiGPT users

Grounded in real BibiGPT user personas; all already actionable today via the OpenAI Audio / Realtime API.

General creators — AI dubbing

Run a YouTube / Bilibili video through BibiGPT for translated subtitles in zh / en / ja / ko, then narrate the translated track via gpt-audio-1.5. One source video, four-language redub, no studio.

BibiGPT users — long video to short narrated clip

Students, teachers, and creators feed lecture or course videos into BibiGPT for chapter segmentation + highlight summaries, then narrate just the highlight chunks through gpt-audio-1.5 for short-form social posts.

Advanced combo — summary to podcast

BibiGPT summarizes a podcast episode or research video into a structured script → GPT-5.5 polishes and adds host / guest segments → gpt-audio-1.5 narrates → ship a recap podcast, entirely inside the OpenAI + BibiGPT stack.

FAQ'S

Frequently Asked Questions

Ask us anything!

Turn any video into narration-ready scripts with BibiGPT

BibiGPT summarizes YouTube, Bilibili, and podcasts into multilingual scripts and subtitles. Plug the output into OpenAI gpt-audio-1.5 (Audio / Realtime API) and you get publish-ready narration. No custom stack, no learning curve.

Try BibiGPT free

OpenAI gpt-audio-1.5 × BibiGPT

Key facts (90-second read)

Features

What is gpt-audio-1.5?

Speech-in / speech-out in one model

Tunable voice and expression

Released with GPT-5.5

Why it matters for BibiGPT users

Subtitle-driven AI narration

Long video to short narrated clip

Summary-to-podcast pipeline

5 key changes (90-second read)

Released 2026-04-23 with GPT-5.5

Speech-in / speech-out unified

Lower latency than gpt-audio

Stronger expression and steering

Pairs with the GPT-5.5 reasoning upgrade

3 typical scenarios for BibiGPT users

General creators — AI dubbing

BibiGPT users — long video to short narrated clip

Advanced combo — summary to podcast

Frequently Asked Questions

More Free Tools

Gemini Flash TTS × BibiGPT

NotebookLM 2026 Update × BibiGPT

Cohere Transcribe 03-2026 × BibiGPT

DeepSeek-V4 1M

Turn any video into narration-ready scripts with BibiGPT