Turn a folder of crop images in cloud storage into a searchable image index. The flow reads image files, creates embeddings, and stores them in a vector database for fast visual search and anomaly checks. It suits teams that manage large photo sets and need quick quality control.
The run begins by setting the cluster details and checking whether the collection already exists. If it does not, the flow creates a Qdrant collection with a named vector called voyage that uses cosine distance, then builds a payload index on the crop_name field for fast filtering and counting. Images are listed from Google Cloud Storage by prefix, each file is mapped to its public URL, and the crop name is taken from the folder path. Tomato images are filtered out before embedding to support a clean anomaly test. The remaining items are split into batches; for each batch, unique IDs are generated and the images are sent to Voyage AI for multimodal embeddings. The results are uploaded to Qdrant in one request, with vectors and payloads kept in the same order as their point IDs.
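The preparation steps described above can be sketched in plain Python. This is a minimal illustration, not the workflow's own code: the bucket name, the batch size, the helper names, and the rule that the crop name is the folder directly containing the file are all assumptions, and the embedding call itself is left out. The upload body uses the column-oriented "batch" form, one of the formats Qdrant's upsert endpoint accepts.

```python
import uuid
from urllib.parse import quote

# Illustrative assumptions, not values taken from the workflow.
BUCKET = "my-crop-bucket"      # hypothetical bucket name
EXCLUDED_CROPS = {"tomato"}    # excluded to support the anomaly test
BATCH_SIZE = 4                 # tune to the embedding API's limits


def public_url(bucket, object_name):
    """Build the HTTPS URL of a publicly readable Cloud Storage object."""
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name)}"


def crop_name(object_name):
    """Assume the crop is the folder that directly contains the file."""
    parts = object_name.split("/")
    return parts[-2] if len(parts) >= 2 else ""


def prepare_items(object_names):
    """Turn object names into items, skipping folders and excluded crops."""
    items = []
    for name in object_names:
        if name.endswith("/"):          # folder placeholder, not an image
            continue
        crop = crop_name(name)
        if not crop or crop.lower() in EXCLUDED_CROPS:
            continue
        items.append({"url": public_url(BUCKET, name), "crop_name": crop})
    return items


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_upsert_body(batch, embeddings):
    """Align IDs, vectors, and payloads by position for one upsert request."""
    if len(batch) != len(embeddings):
        raise ValueError("expected one embedding per item")
    return {
        "batch": {
            "ids": [str(uuid.uuid4()) for _ in batch],
            "vectors": {"voyage": list(embeddings)},   # named vector
            "payloads": [
                {"crop_name": item["crop_name"], "url": item["url"]}
                for item in batch
            ],
        }
    }
```

In use, each slice from `batches(prepare_items(names), BATCH_SIZE)` would be embedded, and the returned vectors passed with that slice to `build_upsert_body`. Because the three lists share one order, the first ID, first vector, and first payload always describe the same image.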
Prepare a Google Cloud Storage bucket and prefix, a Voyage AI key, and a Qdrant cluster URL and API key. Tune the batch size, and set the embedding dimension to match the output size of the chosen model, since the collection's vector size is fixed when it is created. The result is a quicker path to image search, consistently structured payloads, and a collection ready for counting and filtering by crop type.
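One way to keep these settings together is a small configuration object that also produces the two Qdrant setup requests. The sketch below is an illustration under stated assumptions: the environment variable names, the collection name crops, and the default batch size are hypothetical, and the default dimension of 1024 is an assumed value that should be checked against the chosen model's documentation. The request paths and bodies follow Qdrant's REST API for creating a collection and a keyword payload index; nothing is sent over the network here.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    bucket: str
    prefix: str
    voyage_api_key: str
    qdrant_url: str
    qdrant_api_key: str
    collection: str = "crops"   # hypothetical collection name
    batch_size: int = 8         # assumed default, tune as needed
    dimension: int = 1024       # assumed; must equal the model's output size


def load_settings(env=os.environ):
    """Read settings from a mapping; variable names are assumptions."""
    return Settings(
        bucket=env["GCS_BUCKET"],
        prefix=env["GCS_PREFIX"],
        voyage_api_key=env["VOYAGE_API_KEY"],
        qdrant_url=env["QDRANT_URL"],
        qdrant_api_key=env["QDRANT_API_KEY"],
        batch_size=int(env.get("BATCH_SIZE", "8")),
        dimension=int(env.get("EMBEDDING_DIM", "1024")),
    )


def setup_requests(settings):
    """Return (method, url, json_body) tuples for collection and index setup."""
    base = settings.qdrant_url.rstrip("/")
    collection_url = f"{base}/collections/{settings.collection}"
    return [
        # Named vector "voyage" with cosine distance.
        ("PUT", collection_url,
         {"vectors": {"voyage": {"size": settings.dimension,
                                 "distance": "Cosine"}}}),
        # Keyword index on crop_name for filtering and counting.
        ("PUT", f"{collection_url}/index",
         {"field_name": "crop_name", "field_schema": "keyword"}),
    ]
```

Each tuple can be sent with any HTTP client, passing the Qdrant key in the `api-key` header. Deriving the collection's vector size from the same `dimension` setting used for embedding keeps the two from drifting apart.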