Turn short videos into clear voiceovers with one run. The automation pulls a video, writes a simple script from its frames, and produces audio you can store and share from Google Drive. Great for content teams, social clips, product demos, and quick explainers.
The workflow is started manually and first downloads a video from a URL. Python with OpenCV captures up to 90 evenly spaced frames from the clip. The frames are split into groups of 15 (at most six groups), resized, and sent to a vision-capable OpenAI model that drafts one part of the narration per group. Each round builds on the narration written so far, so the script stays consistent from start to finish. A wait step between rounds helps avoid rate limits. The final script goes to OpenAI text-to-speech, and the resulting MP3 file is uploaded to Google Drive.
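The sampling, grouping, and round-by-round scripting steps can be sketched in Python roughly as follows. This is a minimal illustration, not the workflow's actual code: the function names, the 512-pixel resize width, and the `describe_batch` callback (a stand-in for the OpenAI vision request) are all assumptions made for the example. Only the 90-frame cap and the group size of 15 come from the description above.

```python
import time
from typing import Callable, Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_FRAMES = 90  # frame cap from the description
BATCH_SIZE = 15  # group size from the description


def evenly_spaced_indices(total_frames: int, max_frames: int = MAX_FRAMES) -> List[int]:
    """Return up to max_frames frame indices spread evenly across the clip."""
    if total_frames <= 0 or max_frames <= 0:
        return []
    count = min(total_frames, max_frames)
    # Integer arithmetic keeps every index inside [0, total_frames).
    return [i * total_frames // count for i in range(count)]


def batches(items: Sequence[T], size: int = BATCH_SIZE) -> Iterator[Sequence[T]]:
    """Yield consecutive groups of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def sample_frames(video_path: str, width: int = 512) -> list:
    """Read the selected frames with OpenCV and resize them to a fixed width.

    The 512-pixel width is an assumed value chosen for illustration.
    """
    import cv2  # imported here so the helpers above work without OpenCV

    capture = cv2.VideoCapture(video_path)
    total = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    try:
        for index in evenly_spaced_indices(total):
            capture.set(cv2.CAP_PROP_POS_FRAMES, index)
            ok, frame = capture.read()
            if not ok:
                continue  # skip frames that cannot be decoded
            height = max(1, round(frame.shape[0] * width / frame.shape[1]))
            frames.append(cv2.resize(frame, (width, height)))
    finally:
        capture.release()
    return frames


def build_script(
    frame_batches: Iterable[Sequence[T]],
    describe_batch: Callable[[Sequence[T], str], str],
    wait_seconds: float = 0.0,
) -> str:
    """Grow the narration one group at a time.

    `describe_batch` is a hypothetical callback standing in for the vision
    model request; it receives the current group and the script so far.
    """
    script = ""
    for number, batch in enumerate(frame_batches):
        if number > 0 and wait_seconds > 0:
            time.sleep(wait_seconds)  # pause between rounds to ease rate limits
        part = describe_batch(batch, script)
        script = f"{script} {part}".strip()
    return script
```

A run would then look like `build_script(batches(sample_frames("clip.mp4")), describe_batch, wait_seconds=5)`, where `clip.mp4`, the callback, and the five-second pause are placeholders. Passing the script so far into each round is what lets later groups continue the story instead of restarting it.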
You will need an OpenAI API key and a Google Drive account. Keep videos short or lower the frame count to avoid high memory use. For short clips, expect manual scripting and voice recording to drop from hours to minutes. This setup fits teams that need frequent narrated content without studio work.