top | item 46368162

Show HN: A JSON API for YouTube Transcript with MCP Support

2 points| nikhonit | 2 months ago |transcriptapi.com

I built this because running yt-dlp in production (especially on serverless/Vercel) is a nightmare of IP blocks, cold starts, and binary dependencies.

TranscriptAPI is a lightweight wrapper that handles the extraction, formatting, and proxy rotation. It prioritizes manual captions over auto-generated ones and returns clean JSON with timestamps, ready for RAG pipelines.

The MCP (Model Context Protocol) Integration: I recently added native MCP support. If you use Claude Desktop or other MCP-compliant agents, you can add this API as a tool to 'watch' videos directly in your chat context without manually copying transcripts.

Technical Stack:

Backend: Python (FastAPI) on AWS Lambda (for burst scaling)

Caching: Redis (to prevent hitting YouTube for the same video twice)

Challenge: Handling 'drifting' timestamps in long livestreams where the auto-generated captions lose sync with the video frame.

It has a free tier for hobbyists. I’m curious to hear how you’re handling the context-window limits when feeding full 3-hour transcripts to LLMs

discuss

order

paidev|1 month ago

[deleted]