FramesCLI is an MCP server that lets AI assistants and agents "watch" and analyze videos by extracting timestamped frames and transcripts from any video file. It processes videos locally and outputs the visual and audio content as structured JSON artifacts that any Model Context Protocol (MCP)-compatible client, such as Claude, Cursor, Cline, or Windsurf, can consume and search. It suits agents that review screen recordings, meetings, tutorials, or debugging sessions, and it provides commands for previewing extraction cost, chunked transcription, and batch processing. FramesCLI runs on Go and uses ffmpeg for frame extraction and Whisper for audio transcription.
Visit FramesCLI's official website for product details and getting-started instructions.