Turn any voice message into a helpful spoken answer. A caller or app sends audio to a public URL, and the system replies with clean, natural speech in seconds. Great for hotlines, help centers, and voice chat on websites.
Here is how it works. A webhook receives the audio file. OpenAI speech-to-text transcribes the voice into text. The memory nodes load past messages so answers stay on topic. An aggregate step combines that context, and the Gemini model generates the reply through a basic LLM chain. Both the user text and the AI reply are saved back to memory for the next turn. A limit node keeps only one item, then ElevenLabs converts the reply text into audio. The webhook returns that audio to the caller as a binary file.
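The order of the steps above can be sketched in plain Python. This is only an illustration of the data flow, not the workflow itself: `ConversationMemory`, `handle_voice_turn`, and the three injected callables (`transcribe`, `generate`, `synthesize`) are hypothetical names standing in for the memory nodes, OpenAI speech-to-text, the Gemini chain, and ElevenLabs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ConversationMemory:
    """Hypothetical stand-in for the memory nodes: a list of (role, text) turns."""
    turns: List[Tuple[str, str]] = field(default_factory=list)

    def load(self) -> List[Tuple[str, str]]:
        return list(self.turns)

    def save(self, role: str, text: str) -> None:
        self.turns.append((role, text))


def handle_voice_turn(
    audio: bytes,
    memory: ConversationMemory,
    transcribe: Callable[[bytes], str],      # assumed wrapper around speech-to-text
    generate: Callable[[str, str], str],     # assumed wrapper around the Gemini chain
    synthesize: Callable[[str], bytes],      # assumed wrapper around ElevenLabs
) -> bytes:
    # 1. Speech to text.
    user_text = transcribe(audio)

    # 2. Load past messages and aggregate them into one context string.
    history = memory.load()
    context = "\n".join(f"{role}: {text}" for role, text in history)

    # 3. Generate the reply from the context plus the new message.
    reply = generate(context, user_text)

    # 4. Save both sides of the exchange for the next turn.
    memory.save("user", user_text)
    memory.save("assistant", reply)

    # 5. Stand-in for the limit node: keep a single item so that
    #    exactly one reply is synthesized.
    items = [reply][:1]

    # 6. Text to speech; the bytes are what the webhook would return.
    return synthesize(items[0])


if __name__ == "__main__":
    memory = ConversationMemory()
    audio_out = handle_voice_turn(
        b"fake-audio",
        memory,
        transcribe=lambda a: "What are your opening hours?",
        generate=lambda ctx, q: "We are open from 9 to 5.",
        synthesize=lambda t: t.encode("utf-8"),
    )
    print(len(audio_out), "bytes of (fake) audio;", len(memory.turns), "turns saved")
```

Passing the three services in as callables keeps the sketch runnable without any API keys; in a real deployment each one would wrap the corresponding provider call.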
Setup requires API keys for OpenAI, Google Gemini, and ElevenLabs, plus a voice ID from ElevenLabs. Put the voice ID into the URL of the HTTP request node and add your ElevenLabs key in the xi-api-key header. Expected benefits are faster replies, fewer simple support tickets, and more consistent answers. This is useful for support hotlines, kiosks, or a voice FAQ that speaks in your brand voice.
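For reference, the text-to-speech request described above might look like the following when built by hand. This is a sketch under assumptions: the endpoint path follows the ElevenLabs public API as commonly documented (voice ID as the last URL segment), the helper name `build_tts_request` is hypothetical, and the voice ID and key shown are placeholders. The request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL for the ElevenLabs text-to-speech endpoint.
ELEVENLABS_TTS_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(voice_id: str, api_key: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that asks ElevenLabs to speak `text`."""
    # The voice ID goes into the URL path.
    url = f"{ELEVENLABS_TTS_BASE}/{urllib.parse.quote(voice_id, safe='')}"
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "xi-api-key": api_key,               # the API key header from the setup step
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own voice ID and key.
    req = build_tts_request("YOUR_VOICE_ID", "YOUR_API_KEY", "Hello, how can I help?")
    print(req.get_method(), req.full_url)
    # To actually call the API, something like:
    #   with urllib.request.urlopen(req) as resp:
    #       audio_bytes = resp.read()
```

Keeping the key in a credential store or environment variable, rather than in the URL or workflow text, avoids leaking it when the workflow is shared.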