Claude knows FFmpeg, but has no idea where it is

I spent last weekend exploring, just for fun, what an agentic video editor would look like: connect Claude Desktop to an MCP server that wraps FFmpeg, and let it edit video autonomously.
It would trim clips, join them, normalize audio, and package them for the web. No timeline. No GUI. Just tell Claude what you want and get the final result back as a link.
FFmpeg is a great stress test for any user. It is famously hostile. Every operation is a command with dozens of flags. Errors go to stderr, mixed with progress output, in a format that occasionally looks like a failure even when everything is fine.
When you see something like this in the output
frame= 240 fps= 22 q=28.0 size= 2048kB time=00:00:08.04 bitrate=2087.0kbits/s speed=0.736x
you might wonder if it worked or broke. But that’s just what a status update looks like.
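To make the distinction concrete, here is a minimal Python sketch that recognizes a progress line like the one above and splits it into fields. The function name parse_progress and the "starts with frame=" heuristic are assumptions for illustration (audio-only jobs, for example, print progress lines that do not start with frame=), not part of the server described here.

```python
import re
from typing import Optional

# Key=value pairs in a progress line; FFmpeg pads some values with spaces.
PROGRESS_PAIR = re.compile(r"(\w+)=\s*(\S+)")


def parse_progress(line: str) -> Optional[dict]:
    """Return the fields of a video progress line, or None for any other line."""
    # Assumption: video progress lines begin with "frame=".
    if not line.lstrip().startswith("frame="):
        return None
    return dict(PROGRESS_PAIR.findall(line))


sample = (
    "frame= 240 fps= 22 q=28.0 size= 2048kB "
    "time=00:00:08.04 bitrate=2087.0kbits/s speed=0.736x"
)
fields = parse_progress(sample)
print(fields["time"], fields["speed"])
```

Anything that parses this way is a status update and can be ignored when looking for the actual error.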
My first experiment was to give Claude a raw run_ffmpeg tool through an MCP server. Claude writes the command, the server executes it, and the raw stdout and stderr come back.
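A raw tool of this kind can be very small. The following is a sketch under assumptions, not the server's actual code: the function signature, the returned dictionary keys, and the timeout are all hypothetical, and the MCP wiring is left out.

```python
import subprocess


def run_ffmpeg(args, binary="ffmpeg", timeout=600):
    """Run FFmpeg with the given argument list and return the raw result.

    `binary` and `timeout` are illustrative parameters, not part of any real API.
    """
    completed = subprocess.run(
        [binary, *args],      # argument list, so no shell is involved
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode output as text
        timeout=timeout,
    )
    return {
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }
```

Returning the exit code alongside the raw streams matters here: since progress and errors both land on stderr, the exit code is the clearer signal of whether the command succeeded.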
I planned to watch it fail, document the failures, and build a self-correcting loop that feeds errors back so Claude could diagnose and retry. I suspected I might see hallucinated flags, confused syntax, and Claude inventing FFmpeg features that don't exist.
But that didn't happen.
What Claude got right
Across four sessions and several different editing tasks, Claude produced zero hallucinated FFmpeg flags. Not a single one.
When I prompted it to “Join these three clips with crossfade transitions”, it correctly reached for xfade, and the offset math was correct on the very first attempt.
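For reference, the offset math works like this. The xfade filter's offset option is the time, relative to the first input, at which the transition starts, and each transition shortens the running output by its own length. The sketch below uses made-up clip durations and a made-up one-second transition; the function name is hypothetical.

```python
def xfade_offsets(durations, transition):
    """Return the xfade offset for each join when chaining clips in order.

    The k-th join starts at the summed duration of the first k clips
    minus k transition lengths, because every earlier crossfade overlaps
    two clips and shortens the output by `transition` seconds.
    """
    offsets = []
    elapsed = 0.0
    for k, duration in enumerate(durations[:-1], start=1):
        elapsed += duration
        offsets.append(elapsed - k * transition)
    return offsets


# Illustrative numbers: three clips of 10 s, 8 s and 6 s with 1 s crossfades.
print(xfade_offsets([10.0, 8.0, 6.0], 1.0))  # [9.0, 16.0]
```

Three clips need two transitions, so the list has two entries: the first crossfade starts at 9 s, and the second at 16 s into the already-shortened output.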
When asked to transcode in multiple steps it did that correctly too, and got all the parameters right.
When the conversion of my test file, the Big Buck Bunny clip commonly used for video testing, failed because of its uneven pixel dimensions, Claude self-corrected. In the next session it even caught the problem before running the command that would have failed.
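The post does not show the exact fix, so the following is only a sketch of one common approach. It assumes the failure was an odd width or height, which encoders using 4:2:0 chroma subsampling (libx264 with yuv420p, for example) reject. The helper name and the sample dimensions are made up.

```python
# Common FFmpeg scale expression that rounds width and height down to even values.
EVEN_SCALE_FILTER = "scale=trunc(iw/2)*2:trunc(ih/2)*2"


def even_dimensions(width, height):
    """Round each dimension down to the nearest even number (same effect as the filter)."""
    return width - (width % 2), height - (height % 2)


# Illustrative frame size with an odd width.
print(even_dimensions(853, 480))  # (852, 480)

# The filter string would be passed to FFmpeg via -vf; file names are placeholders.
args = ["-i", "input.mov", "-vf", EVEN_SCALE_FILTER, "output.mp4"]
```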
It turned out that Claude's mental model of FFmpeg was sound. However, its model of the execution environment was not.
What Claude got wrong
Claude would constantly assume it was in a specific execution environment with certain directories available. This happened in every session without exception.
The list_assets tool returns filenames (e.g., “big_buck_bunny_480p_h264.mov”) and Claude would automatically assume this file could be found in an “assets” directory.
However, after trying it, Claude would read the error and correctly diagnose it as a path problem. Its clever fix was to reuse the absolute path it had seen earlier in a probe_metadata response. It self-corrected successfully.
But it still hit this failure in each and every session, because nothing in the tool interface told it where the working directory was relative to the asset directory. It constructed an assumption, and that assumption was wrong.
In the same way, Claude assumed a parallel “output” directory would exist. It didn't. When the file failed to open, Claude's next attempt was to create the directory using mkdir.
The MCP server blocked this, only accepting commands that started with ffmpeg for security reasons. And then Claude adapted, dropped the directory creation, and wrote to a relative path instead. Problem solved.
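A check like this could look roughly as follows. This is a sketch, not the server's actual code: it compares the first parsed token rather than a raw string prefix, which is slightly stricter than "starts with ffmpeg", and the function name is hypothetical.

```python
import shlex


def is_allowed(command: str) -> bool:
    """Accept a command only if its first token is exactly 'ffmpeg'."""
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return False
    return bool(argv) and argv[0] == "ffmpeg"


print(is_allowed("ffmpeg -i in.mov out.mp4"))  # True
print(is_allowed("mkdir output"))              # False
```

A check on the first token only helps if the parsed argument list is then executed without a shell; otherwise something like "ffmpeg ... && mkdir output" would pass the check and still run the second command.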
Claude's failures weren't about a lack of knowledge about FFmpeg. They were about flawed inference and assumptions.
These are entirely reasonable assumptions to make, both for a human and an LLM. They just happened to be wrong here.
Abstraction beats self-correction
So the next step was to abstract this further.
My current experiment exposes only abstracted tools instead of direct FFmpeg access. There are no paths anywhere. Claude passes an asset key and a time range, and eventually a structure of editing directions.
The server handles the FFmpeg invocation, the output location, and the verification. Claude gets back a job ID and, eventually, a public URL.
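As a rough illustration of that shape, the sketch below turns an asset key and a time range into a job ID plus an FFmpeg argument list. Everything in it is an assumption: the asset registry, the directory paths, the function name, and the choice of flags are placeholders, and running the job, verifying the output, and publishing a URL are left out.

```python
import uuid
from pathlib import Path

# Hypothetical server-side registry; the model only ever sees the keys.
ASSETS = {"bunny": Path("/srv/media/assets/big_buck_bunny_480p_h264.mov")}
OUTPUT_DIR = Path("/srv/media/out")  # hypothetical output location


def build_trim_job(asset_key, start, end):
    """Translate an asset key and time range (seconds) into a job ID and argv."""
    if asset_key not in ASSETS:
        raise KeyError(f"unknown asset key: {asset_key}")
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end")
    job_id = uuid.uuid4().hex
    output = OUTPUT_DIR / f"{job_id}.mp4"
    argv = [
        "ffmpeg",
        "-ss", str(start),        # seek in the input before decoding
        "-i", str(ASSETS[asset_key]),
        "-t", str(end - start),   # keep this many seconds of output
        str(output),
    ]
    return job_id, argv           # only job_id would be returned to the model


job_id, argv = build_trim_job("bunny", 2.0, 7.5)
print(job_id, argv)
```

The paths exist only inside the server. The model's side of the exchange is a key, two numbers, and an opaque job ID.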
There is nothing to infer incorrectly about the filesystem because the filesystem is no longer visible. If you prevent the model from making an incorrect assumption, it doesn't need to recover from one.
Claude knows FFmpeg, but has no idea where it is was first published 2026‑05‑04