OpenClaw + BibiGPT Skill: AI Video Summary for Bilibili, Xiaohongshu & Douyin Too
OpenClaw has become the hottest AI agent framework of 2026, with 180,000+ GitHub stars. But if you're a Chinese internet user or researcher working with Chinese video platforms, there's a real problem: OpenClaw's native summarize skill doesn't support Bilibili, Xiaohongshu, or Douyin at all.
This isn't speculation. JimmyLv, the creator of BibiGPT, put it plainly:
"What I built with bibi is actually an advanced encapsulation of these requirements — for example, the native summarize definitely doesn't support Bilibili or Xiaohongshu."
bibigpt-skill is built to solve exactly this. One command gives your Claude Code or OpenClaw agent the ability to summarize not just YouTube, but Bilibili, Xiaohongshu, Douyin, podcasts, and local audio/video files.
5-minute setup (quick reference):
- Install BibiGPT Desktop (macOS / Windows)
- Run `npx skills add JimmyLv/bibigpt-skill`
- Verify: `bibi auth check`
- Test in Claude Code: "Summarize this video: <url>" (works for Bilibili & YouTube alike)
- Advanced: configure OpenClaw's Claude Code spawn mode for full automation
Why Did OpenClaw Explode in 2026?
The shift from ChatGPT to OpenClaw represents a fundamental paradigm change: from "answering questions" to "executing tasks".
OpenClaw is a self-hosted, model-agnostic Node.js agent runtime (works with Claude, GPT-4, Gemini, DeepSeek, or local Ollama models). Your data stays on your hardware. It's MIT-licensed and completely free.
Its defining capabilities:
- Execute shell commands and manage files
- Operate external systems via WhatsApp, Telegram, Slack, and email
- Heartbeat mechanism: executes tasks autonomously every 30 minutes without human input
The 2026 breakthrough was the Claude Code spawn mode: OpenClaw agents can now summon Claude Code CLI as a sub-agent for complex coding tasks. This means OpenClaw gains the entire Claude Code skills ecosystem — including bibigpt-skill's video intelligence.
The Video Blind Spot: OpenClaw's Native Summarize vs. bibigpt-skill
OpenClaw's ClawHub marketplace has a native summarize skill. But there's a critical limitation most users discover only after installation:
| Capability | OpenClaw Native Summarize | bibigpt-skill |
|---|---|---|
| YouTube | ✅ | ✅ |
| Bilibili (B站) | ❌ Not supported | ✅ Full support |
| Xiaohongshu (小红书) | ❌ Not supported | ✅ Full support |
| Douyin / TikTok China | ❌ Not supported | ✅ Supported |
| Chinese podcasts | ❌ Not supported | ✅ Supported |
| Local audio/video files | ❌ Not supported | ✅ Supported |
| Chapter-by-chapter summary | ❌ | ✅ --chapter |
| Structured JSON output | ❌ | ✅ --json |
| Subtitle/transcript only | ❌ | ✅ --subtitle |
| Async mode for long videos | ❌ | ✅ --async |
The native summarize is a general-purpose "summarize webpage content with LLM" tool. It has no support for platform-specific video APIs — Bilibili's authentication flow, Xiaohongshu's content structure, or Douyin's short video format.
bibigpt-skill is built on BibiGPT's years of deep integration with Chinese video platforms. The bibi CLI handles platform-specific subtitle extraction, transcription, and multilingual processing before passing structured content to the AI summarization layer. This is what JimmyLv means by "advanced encapsulation" — not just wrapping an LLM, but doing the hard platform work first.
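To make the "platform work first" idea concrete, here is an illustrative sketch (not bibigpt-skill's actual source) of how a CLI like `bibi` might route a URL to a platform-specific extractor before any summarization happens. The host lists include the real short-link domains `b23.tv` (Bilibili) and `xhslink.com` (Xiaohongshu); the routing logic itself is an assumption:

```javascript
// Illustrative only: route URLs to platform-specific extractors.
const PLATFORMS = [
  { name: "bilibili", hosts: ["bilibili.com", "b23.tv"] },
  { name: "xiaohongshu", hosts: ["xiaohongshu.com", "xhslink.com"] },
  { name: "douyin", hosts: ["douyin.com"] },
  { name: "youtube", hosts: ["youtube.com", "youtu.be"] },
];

function detectPlatform(url) {
  // Normalize the hostname, then match against known platform domains.
  const host = new URL(url).hostname.replace(/^www\./, "");
  const match = PLATFORMS.find((p) =>
    p.hosts.some((h) => host === h || host.endsWith("." + h))
  );
  return match ? match.name : "generic";
}

console.log(detectPlatform("https://www.bilibili.com/video/BV1xx411c7mD")); // "bilibili"
console.log(detectPlatform("https://example.com/post/1")); // "generic"
```

A generic "summarize webpage" tool stops at the `"generic"` branch; the platform-specific branches are where subtitle extraction, authentication, and transcription diverge.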
Cognitive science research suggests that structured summarization improves information retention by roughly 40% compared to passive viewing (Mayer, 2021, Cognitive Theory of Multimedia Learning). bibigpt-skill makes that structured output accessible from any AI agent pipeline.
bibigpt-skill: One Command to Give Your Agent Video Understanding

Prerequisites
Install BibiGPT Desktop (the CLI shares your login session automatically):
```bash
# macOS (recommended)
brew install --cask jimmylv/bibigpt/bibigpt

# Windows: download from bibigpt.co/download/desktop
```
Install bibigpt-skill
```bash
npx skills add JimmyLv/bibigpt-skill
```
After installation, Claude Code has full access to the bibi CLI. Verify:
```bash
bibi auth check   # Check login status
bibi --help       # View all commands
```
Command Reference
| Command | Description |
|---|---|
| `bibi summarize "<url>"` | Standard summary |
| `bibi summarize "<url>" --chapter` | Chapter-by-chapter breakdown |
| `bibi summarize "<url>" --subtitle` | Fetch transcript/subtitles only |
| `bibi summarize "<url>" --json` | Full JSON output (for programmatic use) |
| `bibi summarize "<url>" --async` | Async mode (for long videos) |
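The schema of the `--json` output is not documented in this post, but to show where it fits in a pipeline, here is a hedged sketch that renders one summary payload into a Markdown digest entry. The field names (`title`, `url`, `summary`, `chapters`) are assumptions for illustration, not the documented schema:

```javascript
// Sketch: turn one hypothetical `bibi summarize --json` payload into a
// Markdown digest entry. Field names are assumed, not documented.
function toDigestEntry(video) {
  const lines = [`## ${video.title}`, "", video.summary, ""];
  for (const ch of video.chapters ?? []) {
    lines.push(`- **${ch.start}** ${ch.heading}`);
  }
  lines.push("", `[Watch](${video.url})`);
  return lines.join("\n");
}

const entry = toDigestEntry({
  title: "Intro to Transformers",
  url: "https://www.youtube.com/watch?v=xxxxx",
  summary: "A walkthrough of attention and positional encoding.",
  chapters: [{ start: "00:00", heading: "Why attention?" }],
});
console.log(entry);
```

Checking the real shape with `bibi summarize "<url>" --json | head` before wiring up a script like this is the safer path.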
Three Practical Paths: From Developer to Fully Autonomous Agent
Path 1: Pure Claude Code (Developer Workflow)
The simplest entry point. In any Claude Code session, just say:
"Summarize this video, focusing on implementation details: https://www.youtube.com/watch?v=xxxxx"
Claude Code will automatically recognize bibigpt-skill, invoke bibi summarize, and return a structured summary.
Best for:
- Summarizing technical talks while working in a code repository
- Extracting key features from competitor product demos
- Quickly digesting meeting recordings and product walkthroughs
Path 2: OpenClaw → Claude Code → bibigpt-skill
The most elegant combination in 2026. OpenClaw spawns Claude Code via spawn mode, Claude Code invokes bibigpt-skill, forming a complete video research agent chain:
```
OpenClaw (planning/scheduling)
  → spawn Claude Code (execution)
    → bibigpt-skill (video intelligence)
      → structured research output
```

Example configuration:
```javascript
// In OpenClaw, configure a Claude Code spawn
const result = await sessions_spawn({
  mode: "claude-code",
  prompt: `Use bibigpt-skill to summarize the following video.
Output JSON with: title, core arguments (3-5), key data points, action items:
${videoUrl}`,
});
```
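Once the spawn returns, the orchestrating agent can sanity-check the reply before passing it downstream. A minimal sketch, assuming the sub-agent answers with the JSON shape the prompt requests; the `coreArguments` field name is a hypothetical convention, not a documented schema:

```javascript
// Sketch: validate a sub-agent's JSON reply against the requested shape
// (title, 3-5 core arguments). Field names are illustrative assumptions.
function validateSummary(raw) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, reason: "not valid JSON" };
  }
  if (typeof data.title !== "string") {
    return { ok: false, reason: "missing title" };
  }
  const args = data.coreArguments;
  if (!Array.isArray(args) || args.length < 3 || args.length > 5) {
    return { ok: false, reason: "expected 3-5 core arguments" };
  }
  return { ok: true, data };
}

const check = validateSummary(
  '{"title":"Demo teardown","coreArguments":["a","b","c"]}'
);
console.log(check.ok); // true
```

A failed check can simply trigger a retry prompt, which is cheap insurance in an unattended pipeline.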
Best for:
- Competitive research: auto-analyze competitor product demo videos
- Academic tracking: regularly summarize conference talk videos
- Content creation: extract material from videos for writing
Path 3: OpenClaw Heartbeat (Fully Automated Research Assistant)
The ultimate form: your AI agent "watches videos" for you automatically, every day.
Configure an OpenClaw heartbeat task (runs at 8am daily):
- Fetch RSS from your subscribed YouTube channels for the past 24 hours
- Call `bibi summarize --json` for each new video
- Aggregate the results into a Markdown daily digest
- Send it to your Slack channel or inbox
Sample OpenClaw prompt:

```
Every day at 8am:
1. Use web_fetch to get RSS from these channels:
   - https://www.youtube.com/@AndrejKarpathy/videos
   - https://www.youtube.com/@3blue1brown/videos
2. Filter for videos published in the last 24 hours
3. For each video run: bibi summarize "<url>" --chapter --json
4. Aggregate all summaries into an AI Daily Digest in Markdown
5. Send via Slack skill to #ai-daily channel
```
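Step 2 of that prompt (the 24-hour filter) is the only piece with real logic in it. A minimal sketch, assuming the RSS items have already been parsed into objects with an ISO-8601 `published` field (that field name is an assumption):

```javascript
// Sketch: keep only feed entries published within the last 24 hours.
// Entry shape ({ url, published }) is an illustrative assumption.
function publishedWithin24h(entries, now = Date.now()) {
  const DAY_MS = 24 * 60 * 60 * 1000;
  return entries.filter((e) => now - Date.parse(e.published) <= DAY_MS);
}

const entries = [
  { url: "https://www.youtube.com/watch?v=aaa", published: "2026-02-01T09:00:00Z" },
  { url: "https://www.youtube.com/watch?v=bbb", published: "2026-01-25T09:00:00Z" },
];

// Fix "now" for a reproducible example: only the first entry is fresh.
const fresh = publishedWithin24h(entries, Date.parse("2026-02-01T20:00:00Z"));
console.log(fresh.map((e) => e.url));
```

Each URL that survives the filter is then handed to `bibi summarize "<url>" --chapter --json`.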
This workflow achieves a critical shift: instead of you chasing information, the information finds you. That is automated research at scale.
What's Behind bibigpt-skill: BibiGPT's Full Power
bibi summarize is the entry point into BibiGPT's complete AI video processing pipeline. In the Web and Desktop app, there's much more:
AI Video Chat with Source Tracing

Ask questions about video content — every AI answer includes clickable timestamps that jump directly to the source clip in the video. No more worrying about AI hallucination: every information point is traceable and verifiable.
This capability can already be leveraged in agent workflows: use --json to get subtitles, then have Claude Code perform deep analysis, Q&A, and information extraction on the transcript data.
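A hedged sketch of the source-tracing idea for the YouTube case: given transcript segments with start times (a shape assumed here, not the documented `--json` schema), build deep links using YouTube's real `t` query parameter so every cited answer can jump to its clip:

```javascript
// Sketch: build a timestamped deep link for a cited transcript segment.
// YouTube's `t` parameter (seconds) is real; the segment shape is assumed.
function citationLink(videoUrl, seconds) {
  const u = new URL(videoUrl);
  u.searchParams.set("t", `${Math.floor(seconds)}s`);
  return u.toString();
}

const link = citationLink("https://www.youtube.com/watch?v=xxxxx", 83.4);
console.log(link); // https://www.youtube.com/watch?v=xxxxx&t=83s
```

Bilibili and other platforms use their own timestamp parameters, which is exactly the kind of per-platform detail the skill's "advanced encapsulation" is meant to absorb.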
AI Highlight Notes

Automatically extract timestamped highlight clips from videos, organized by theme. In OpenClaw's Path 3 scenario, the --json output includes structured key segments that can be further refined into your daily digest.
As BibiGPT gradually exposes more of these capabilities via API, bibigpt-skill will evolve from a "summary tool" into a full video intelligence platform for AI agents.
Why bibigpt-skill Matters
OpenClaw's explosion proves one thing: the value of an AI agent ecosystem is in the breadth and depth of its skills. The more types of information your agent can process, the more complex tasks it can complete.
Video is currently the largest information island in the AI agent ecosystem. YouTube receives 500 hours of uploads every minute. Bilibili processes millions of new videos daily. Most of this content will never be touched by AI agents — unless there's a bridge like bibigpt-skill.
That's the core motivation behind building bibigpt-skill: let AI agents truly understand the knowledge inside videos, not just relay a URL.
Get Started Now
```bash
# 1. Install BibiGPT Desktop
brew install --cask jimmylv/bibigpt/bibigpt   # macOS

# 2. Install the skill
npx skills add JimmyLv/bibigpt-skill

# 3. Verify
bibi auth check

# 4. Test in Claude Code:
# > Summarize this video: https://www.youtube.com/watch?v=xxxxx
```
bibigpt-skill on GitHub: github.com/JimmyLv/bibigpt-skill
Start your AI efficient learning journey now:
- 🌐 Official Website: https://aitodo.co
- 📱 Mobile Download: https://aitodo.co/app
- 💻 Desktop Download: https://aitodo.co/download/desktop
- ✨ Learn More Features: https://aitodo.co/features
BibiGPT Team