Finding outliers at scale is hard. This flow sets a center point and a distance cutoff for each crop group, so unusual items can be flagged later. Teams in data labeling, agritech, and model ops can keep datasets clean with far less manual review.
It loads collection data from Qdrant, counts all items, and lists the unique crop names. For each crop, it calls the Qdrant distance matrix API, then a Python step builds a sparse matrix from the result and picks the medoid, the most central point in the group. The flow marks this medoid in Qdrant, fetches its vector, and runs a search to set a cutoff score based on the nth furthest neighbor. A second branch creates short text descriptions of each crop, embeds them with Voyage AI, finds a text-based medoid, and stores its threshold as well. Both branches then rejoin to complete the setup.
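The medoid and cutoff steps can be sketched in plain Python. This is a minimal illustration, not the flow's actual code. It assumes the scores are similarities (higher means closer) and that the matrix arrives as parallel lists of row offsets, column offsets, and scores, plus a list of point ids. The function names find_medoid and cutoff_from_scores, the list names, and the toy data are all hypothetical.

```python
from collections import defaultdict


def find_medoid(offsets_row, offsets_col, scores, ids):
    """Return the id whose column has the highest total similarity.

    Assumption: scores are similarities (higher means closer), and the
    three parallel lists hold the non-zero entries of a sparse matrix.
    """
    column_totals = defaultdict(float)
    for _row, col, score in zip(offsets_row, offsets_col, scores):
        column_totals[col] += score
    # The most central point is the one the others are, in total, closest to.
    medoid_index = max(column_totals, key=column_totals.get)
    return ids[medoid_index]


def cutoff_from_scores(similarities, n_furthest):
    """Return the similarity of the nth furthest item from the medoid."""
    if not similarities:
        raise ValueError("need at least one similarity score")
    ordered = sorted(similarities)  # ascending, so the furthest item comes first
    n = min(max(n_furthest, 1), len(ordered))  # clamp n to the valid range
    return ordered[n - 1]


# Toy data, made up for illustration: three points with ids 10, 11, 12.
ids = [10, 11, 12]
offsets_row = [0, 0, 1, 1, 2, 2]
offsets_col = [1, 2, 0, 2, 1, 0]
scores = [0.9, 0.2, 0.9, 0.8, 0.8, 0.2]

medoid_id = find_medoid(offsets_row, offsets_col, scores, ids)

# Similarities between the medoid and other items in the same crop group.
cutoff = cutoff_from_scores([0.9, 0.8, 0.95, 0.4, 0.6], n_furthest=2)
```

With this toy data, medoid_id is 11 and cutoff is 0.6. Under the similarity assumption, an item scoring below the cutoff against its crop's medoid would be a candidate outlier. If the collection uses a distance metric where lower means closer, the comparisons would need to be reversed.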
You need Qdrant access and a Voyage AI key. Expect faster setup, repeatable thresholds, and fewer false alarms when data changes. This fits dataset audits, incoming batch checks, and monitoring of crop images or notes in production.