Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk.

Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings for approximate nearest neighbor search. It builds large, read-only, file-based index structures that are memory-mapped, so multiple processes can share the same data. Annoy supports Euclidean, Manhattan, cosine (named `angular` in the API), Hamming, and dot (inner) product distances. It handles high-dimensional data well (up to about 1,000 dimensions), such as vector representations of users and items in recommendation systems.

Index creation is decoupled from loading, so an index can be built once and then shared and distributed as a file. The memory footprint is small, which suits applications where memory usage is a primary concern, and indexes can be built directly on disk for datasets that do not fit in memory. Spotify uses Annoy for music recommendations: matrix factorization produces vector representations of users and items, and Annoy finds the most similar ones.
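To make the supported distances concrete, the sketch below defines each one in plain Python. These helper functions are illustrative assumptions and are not Annoy's implementation; the `angular` formula, sqrt(2(1 - cos(u, v))), is the one the Annoy README gives for its cosine-based metric.

```python
import math

def euclidean(u, v):
    # Straight-line (L2) distance.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan(u, v):
    # Sum of absolute coordinate differences (L1).
    return sum(abs(a - b) for a, b in zip(u, v))

def dot(u, v):
    # Inner product; larger means more similar.
    return sum(a * b for a, b in zip(u, v))

def cosine_similarity(u, v):
    # Inner product of the two vectors after normalizing their lengths.
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def angular(u, v):
    # Cosine-based distance as described in the Annoy README.
    # max() guards against tiny negative values from rounding.
    return math.sqrt(max(0.0, 2.0 * (1.0 - cosine_similarity(u, v))))

def hamming(u, v):
    # Number of positions where the two vectors differ.
    return sum(1 for a, b in zip(u, v) if a != b)
```

Because `angular` depends only on the angle between vectors, scaling a vector does not change its distance to others, which is often what is wanted for embedding vectors.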
Indexes are memory-mapped, allowing multiple processes to share the same data structure, reducing memory overhead.
Supports Euclidean, Manhattan, cosine, Hamming, and Dot (Inner) Product distances, providing flexibility for different types of data and applications.
Allows building indexes on disk, enabling the processing of datasets that exceed available RAM.
The `search_k` parameter allows tuning the balance between search speed and accuracy during querying.
Supports multi-threaded index building using the `n_jobs` parameter, reducing the time required to create indexes.
Install Annoy via pip: `pip install --user annoy`
For C++, clone the repo and `#include "annoylib.h"`
Create an `AnnoyIndex` object, specifying the vector length `f` and the metric (e.g., `AnnoyIndex(f, 'angular')`)
Add items to the index with their corresponding vectors using `add_item(i, v)`
Build the index by creating a forest of trees using `build(n_trees)`
Save the index to disk using `save('index.ann')`
Load the index from disk using `load('index.ann')` for fast memory-mapped access
Query for nearest neighbors using `get_nns_by_item(item_id, n)` or `get_nns_by_vector(vector, n)`