The supported way to install Faiss is through conda. Stable releases are pushed regularly to the pytorch conda channel, as well as pre-release nightly builds.
- The CPU-only faiss-cpu conda package is currently available on Linux (x86-64 and aarch64), OSX (arm64 only), and Windows (x86-64)
- faiss-gpu, containing both CPU and GPU indices, is available on Linux (x86-64 only) for CUDA 11.4 and 12.1
- faiss-gpu-raft [1], a package containing GPU indices provided by NVIDIA RAFT version 24.06, is available on Linux (x86-64 only) for CUDA 11.8 and 12.4.
To install the latest stable release:
# CPU-only version
$ conda install -c pytorch faiss-cpu=1.9.0
# GPU(+CPU) version
$ conda install -c pytorch -c nvidia faiss-gpu=1.9.0
# GPU(+CPU) version with NVIDIA RAFT
$ conda install -c pytorch -c nvidia -c rapidsai -c conda-forge faiss-gpu-raft=1.9.0
# GPU(+CPU) version using AMD ROCm not yet available
For faiss-gpu, the nvidia channel is required for CUDA, which is not published in the main anaconda channel.
For faiss-gpu-raft, the rapidsai, conda-forge and nvidia channels are required.
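After installing one of the packages above, a quick check from Python can confirm which build the interpreter actually sees. The following is a minimal sketch, not part of the official instructions; the helper name describe_faiss_install is hypothetical, and it assumes that get_num_gpus is only exposed by GPU-enabled builds, so it is looked up defensively.

```python
def describe_faiss_install():
    """Return a small report about the Faiss package visible to this interpreter."""
    try:
        import faiss
    except ImportError:
        # Faiss is not installed (or cannot be loaded) in this environment.
        return {"installed": False}

    report = {
        "installed": True,
        "version": getattr(faiss, "__version__", "unknown"),
    }
    # Assumption: CPU-only builds may not expose get_num_gpus at all.
    get_num_gpus = getattr(faiss, "get_num_gpus", None)
    report["gpus"] = get_num_gpus() if callable(get_num_gpus) else 0
    return report


if __name__ == "__main__":
    print(describe_faiss_install())
```

A report with "installed": False usually means the package landed in a different conda environment than the one currently activated.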
Nightly pre-release packages can be installed as follows:
# CPU-only version
$ conda install -c pytorch/label/nightly faiss-cpu
# GPU(+CPU) version
$ conda install -c pytorch/label/nightly -c nvidia faiss-gpu=1.9.0
# GPU(+CPU) version with NVIDIA cuVS (package built with CUDA 12.4)
conda install -c pytorch -c rapidsai -c conda-forge -c nvidia pytorch/label/nightly::faiss-gpu-cuvs 'cuda-version>=12.0,<=12.5'
# GPU(+CPU) version with NVIDIA cuVS (package built with CUDA 11.8)
conda install -c pytorch -c rapidsai -c conda-forge -c nvidia pytorch/label/nightly::faiss-gpu-cuvs 'cuda-version>=11.4,<=11.8'
# GPU(+CPU) version using AMD ROCm not yet available
In the commands above, adding pytorch-cuda=11 or pytorch-cuda=12 selects a specific CUDA version, if one is required.
A combination of versions that installs GPU Faiss with CUDA and Pytorch (as of 2024-05-15):
conda create --name faiss_1.8.0
conda activate faiss_1.8.0
conda install -c pytorch -c nvidia faiss-gpu=1.8.0 pytorch=*=*cuda* pytorch-cuda=11 numpy
Faiss is also being packaged by conda-forge, the community-driven packaging ecosystem for conda. The packaging effort is collaborating with the Faiss team to ensure high-quality package builds.
Due to the comprehensive infrastructure of conda-forge, it may even happen that certain build combinations are supported in conda-forge that are not available through the pytorch channel. To install, use
# CPU version
$ conda install -c conda-forge faiss-cpu
# GPU version
$ conda install -c conda-forge faiss-gpu
# NVIDIA cuVS and AMD ROCm version not yet available
You can tell which channel your conda packages come from by using conda list.
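The channel lookup can also be scripted. The sketch below filters the output of conda list --json for Faiss packages; it assumes each JSON entry carries "name" and "channel" keys, and the helper name faiss_channels is hypothetical.

```python
import json
import shutil
import subprocess


def faiss_channels(conda_list_json):
    """Map each installed package whose name contains 'faiss' to its channel."""
    packages = json.loads(conda_list_json)
    return {
        pkg["name"]: pkg.get("channel", "unknown")
        for pkg in packages
        if "faiss" in pkg.get("name", "")
    }


if __name__ == "__main__":
    if shutil.which("conda"):
        proc = subprocess.run(
            ["conda", "list", "--json"], capture_output=True, text=True
        )
        if proc.returncode == 0:
            print(faiss_channels(proc.stdout))
    else:
        print("conda not found on PATH")
```

A result such as a faiss-cpu entry mapped to pytorch or to conda-forge indicates where an issue with that package should be reported.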
If you are having problems using a package built by conda-forge, please raise
an issue on the
conda-forge package "feedstock".
Faiss can be built from source using CMake.
Faiss is supported on x86-64 machines on Linux, OSX, and Windows. It has been found to run on other platforms as well, see other platforms.
The basic requirements are:
- a C++17 compiler (with support for OpenMP version 2 or higher),
- a BLAS implementation (on Intel machines we strongly recommend using Intel MKL for best performance).
The optional requirements are:
- for GPU indices:
  - nvcc,
  - the CUDA toolkit,
- for AMD GPUs:
  - AMD ROCm,
- for using NVIDIA cuVS implementations:
  - libcuvs=24.12
- for the python bindings:
  - python 3,
  - numpy,
  - and swig.
Indications for specific configurations are available in the troubleshooting section of the wiki.
The libcuvs dependency should be installed via conda:
- With CUDA 12.0 - 12.5:
conda install -c rapidsai -c conda-forge -c nvidia libcuvs=24.12 'cuda-version>=12.0,<=12.5'
- With CUDA 11.4 - 11.8:
conda install -c rapidsai -c conda-forge -c nvidia libcuvs=24.12 'cuda-version>=11.4,<=11.8'
For more ways to install cuVS 24.12, refer to the RAPIDS Installation Guide.
$ cmake -B build .
This generates the system-dependent configuration/build files in the build/
subdirectory.
Several options can be passed to CMake, among which:
- general options:
  - -DFAISS_ENABLE_GPU=OFF in order to disable building GPU indices (possible values are ON and OFF),
  - -DFAISS_ENABLE_PYTHON=OFF in order to disable building python bindings (possible values are ON and OFF),
  - -DFAISS_ENABLE_CUVS=ON in order to use the NVIDIA cuVS implementations of the IVF-Flat, IVF-PQ and CAGRA GPU-accelerated indices (default is OFF, possible values are ON and OFF). Note: -DFAISS_ENABLE_GPU must be set to ON when enabling this option.
  - -DBUILD_TESTING=OFF in order to disable building C++ tests,
  - -DBUILD_SHARED_LIBS=ON in order to build a shared library (possible values are ON and OFF),
  - -DFAISS_ENABLE_C_API=ON in order to enable building C API (possible values are ON and OFF),
- optimization-related options:
  - -DCMAKE_BUILD_TYPE=Release in order to enable generic compiler optimization options (enables -O3 on gcc for instance),
  - -DFAISS_OPT_LEVEL=avx2 in order to enable the required compiler flags to generate code using optimized SIMD/Vector instructions. Possible values are below:
    - On x86-64, generic, avx2, avx512, and avx512_spr (for avx512 features available since Intel(R) Sapphire Rapids), by increasing order of optimization,
    - On aarch64, generic and sve, by increasing order of optimization,
  - -DFAISS_USE_LTO=ON in order to enable Link-Time Optimization (default is OFF, possible values are ON and OFF).
- BLAS-related options:
  - -DBLA_VENDOR=Intel10_64_dyn -DMKL_LIBRARIES=/path/to/mkl/libs to use the Intel MKL BLAS implementation, which is significantly faster than OpenBLAS (more information about the values for the BLA_VENDOR option can be found in the CMake docs),
- GPU-related options:
  - -DCUDAToolkit_ROOT=/path/to/cuda-10.1 in order to hint to the path of the CUDA toolkit (for more information, see CMake docs),
  - -DCMAKE_CUDA_ARCHITECTURES="75;72" for specifying which GPU architectures to build against (see CUDA docs to determine which architecture(s) you should pick),
  - -DFAISS_ENABLE_ROCM=ON in order to enable building GPU indices for AMD GPUs. -DFAISS_ENABLE_GPU must be ON when using this option (possible values are ON and OFF),
- python-related options:
  - -DPython_EXECUTABLE=/path/to/python3.7 in order to build a python interface for a different python than the default one (see CMake docs).
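Because some options depend on others (the cuVS and ROCm options both require -DFAISS_ENABLE_GPU=ON), it can help to assemble the configure command programmatically. The following is an illustrative sketch only: the helper cmake_configure_command is hypothetical, it uses just the option names listed above, and it requires FAISS_ENABLE_GPU to be set to ON explicitly rather than relying on any default.

```python
import shlex

# Options that the text above says require -DFAISS_ENABLE_GPU=ON.
NEEDS_GPU = ("FAISS_ENABLE_CUVS", "FAISS_ENABLE_ROCM")


def cmake_configure_command(options, source_dir=".", build_dir="build"):
    """Return the argument list for `cmake -B <build_dir> ... <source_dir>`."""
    for name in NEEDS_GPU:
        if options.get(name) == "ON" and options.get("FAISS_ENABLE_GPU") != "ON":
            raise ValueError(f"{name}=ON requires FAISS_ENABLE_GPU=ON")
    args = ["cmake", "-B", build_dir]
    args += [f"-D{key}={value}" for key, value in options.items()]
    args.append(source_dir)
    return args


# Example: an optimized CPU-only build with python bindings.
cmd = cmake_configure_command({
    "FAISS_ENABLE_GPU": "OFF",
    "FAISS_ENABLE_PYTHON": "ON",
    "CMAKE_BUILD_TYPE": "Release",
    "FAISS_OPT_LEVEL": "avx2",
})
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run directly.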
$ make -C build -j faiss
This builds the C++ library (libfaiss.a by default, and libfaiss.so if -DBUILD_SHARED_LIBS=ON was passed to CMake).
The -j option enables parallel compilation of multiple units, leading to a faster build, but increasing the chances of running out of memory, in which case it is recommended to set the -j option to a fixed value (such as -j4).
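One way to choose a fixed -j value is to cap the job count by available memory as well as by CPU count. The sketch below is a heuristic, not a Faiss recommendation: the helper pick_jobs is hypothetical, and the budget of 2 GiB per compile job is an assumption that should be adjusted to the machine and compiler in use.

```python
import os


def pick_jobs(cpu_count, total_mem_bytes, mem_per_job_bytes=2 * 1024**3):
    """Cap parallel jobs by CPU count and by an assumed memory budget per job."""
    by_memory = max(1, total_mem_bytes // mem_per_job_bytes)
    return max(1, min(cpu_count, by_memory))


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    try:
        # Available on Linux and most Unix systems; not on Windows.
        mem = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        mem = 8 * 1024**3  # fallback assumption when memory cannot be queried
    print(f"make -C build -j{pick_jobs(cpus, mem)} faiss")
```

For example, under this assumption a 16-core machine with 8 GiB of RAM would be limited to 4 jobs.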
If making use of optimization options, build the correct target before swigfaiss.
For AVX2:
$ make -C build -j faiss_avx2
For AVX512:
$ make -C build -j faiss_avx512
For AVX512 features available since Intel(R) Sapphire Rapids:
$ make -C build -j faiss_avx512_spr
This will ensure the creation of necessary files when building and installing the python package.
$ make -C build -j swigfaiss
$ (cd build/faiss/python && python setup.py install)
The first command builds the python bindings for Faiss, while the second one generates and installs the python package.
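Once the python package is installed, a short functional check can confirm that the bindings load and search correctly. This is a minimal sketch under the assumption that numpy is available alongside faiss; the function name smoke_test and the sizes used are illustrative.

```python
def smoke_test(d=16, n=100):
    """Return True if an exact index finds each vector as its own nearest neighbour.

    Returns None when faiss or numpy cannot be imported.
    """
    try:
        import numpy as np
        import faiss
    except ImportError:
        return None

    rng = np.random.default_rng(0)
    xb = rng.random((n, d), dtype=np.float32)  # Faiss expects float32 data

    index = faiss.IndexFlatL2(d)  # exact L2 search, no training needed
    index.add(xb)
    _, ids = index.search(xb[:5], 1)
    # Each query is a database vector, so its nearest neighbour is itself.
    return bool((ids[:, 0] == np.arange(5)).all())


if __name__ == "__main__":
    print(smoke_test())
```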
$ make -C build install
This will make the compiled library (either libfaiss.a or libfaiss.so on Linux) available system-wide, as well as the C++ headers. This step is not needed to install the python package only.
To run the whole test suite, make sure that cmake was invoked with -DBUILD_TESTING=ON, and run:
$ make -C build test
$ (cd build/faiss/python && python setup.py build)
$ PYTHONPATH="$(ls -d ./build/faiss/python/build/lib*/)" pytest tests/test_*.py
A basic usage example is available in demos/demo_ivfpq_indexing.cpp.
It creates a small index, stores it and performs some searches. A normal runtime is around 20s. With a fast machine and Intel MKL's BLAS it runs in 2.5s.
It can be built with
$ make -C build demo_ivfpq_indexing
and subsequently run with
$ ./build/demos/demo_ivfpq_indexing
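For readers using the python bindings, the same sequence of steps (create an IVFPQ index, store it, search it) can be sketched as follows. This is not a translation of the C++ demo: the sizes nlist, m and nbits, the random data, and the function name ivfpq_roundtrip are illustrative assumptions.

```python
import os
import tempfile


def ivfpq_roundtrip(d=32, nb=10000, nq=5, k=4):
    """Train, store, reload and search a small IVFPQ index; return the result shape."""
    try:
        import numpy as np
        import faiss
    except ImportError:
        return None

    rng = np.random.default_rng(0)
    xb = rng.random((nb, d), dtype=np.float32)
    xq = xb[:nq]

    nlist, m, nbits = 16, 4, 8        # illustrative sizes, not those of the C++ demo
    quantizer = faiss.IndexFlatL2(d)  # coarse quantizer for the inverted lists
    index = faiss.IndexIVFPQ(quantizer, d, nlist, m, nbits)
    index.train(xb)
    index.add(xb)

    # Store the index on disk and read it back.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ivfpq.index")
        faiss.write_index(index, path)
        index = faiss.read_index(path)

    index.nprobe = 4                  # number of inverted lists visited per query
    _, ids = index.search(xq, k)
    return ids.shape


if __name__ == "__main__":
    print(ivfpq_roundtrip())
```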
$ make -C build demo_ivfpq_indexing_gpu
$ ./build/demos/demo_ivfpq_indexing_gpu
This produces the GPU code equivalent to the CPU demo_ivfpq_indexing. It also shows how to translate indexes from/to a GPU.
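In the python bindings, moving an index to and from a GPU can be sketched as below. The example assumes a GPU-enabled build with at least one visible device and returns None otherwise; the function name gpu_roundtrip and the data sizes are illustrative.

```python
def gpu_roundtrip(d=16, n=1000):
    """Copy a flat index to GPU 0, search there, and copy it back to the CPU."""
    try:
        import numpy as np
        import faiss
    except ImportError:
        return None
    # Assumption: CPU-only builds do not expose the GPU helpers used below.
    if not hasattr(faiss, "StandardGpuResources") or faiss.get_num_gpus() == 0:
        return None

    xb = np.random.default_rng(0).random((n, d), dtype=np.float32)
    cpu_index = faiss.IndexFlatL2(d)
    cpu_index.add(xb)

    res = faiss.StandardGpuResources()                     # GPU memory and streams
    gpu_index = faiss.index_cpu_to_gpu(res, 0, cpu_index)  # copy to GPU 0
    _, ids = gpu_index.search(xb[:5], 1)

    back_on_cpu = faiss.index_gpu_to_cpu(gpu_index)        # copy back to host
    return back_on_cpu.ntotal == n and bool((ids[:, 0] == np.arange(5)).all())


if __name__ == "__main__":
    print(gpu_roundtrip())
```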
A longer example runs and evaluates Faiss on the SIFT1M dataset. To run it, please download the ANN_SIFT1M dataset from http://corpus-texmex.irisa.fr/ and unzip it to the subdirectory sift1M at the root of the source directory for this repository.
Then compile and run the following (after ensuring you have installed faiss):
$ make -C build demo_sift1M
$ ./build/demos/demo_sift1M
This is a demonstration of the high-level auto-tuning API. You can try setting a different index_key to find the indexing structure that gives the best performance.
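To get a feel for how the choice of index affects accuracy without downloading SIFT1M, a small python experiment on random data can compare a few keys. The sketch assumes that index_key is an index factory string accepted by faiss.index_factory; the keys, sizes and the function name compare_index_keys are illustrative, and random data is not representative of real descriptors.

```python
def compare_index_keys(keys=("Flat", "IVF16,Flat"), d=16, nb=2000, nq=20):
    """Return recall@1 of each factory key against exact search on random data."""
    try:
        import numpy as np
        import faiss
    except ImportError:
        return None

    rng = np.random.default_rng(0)
    xb = rng.random((nb, d), dtype=np.float32)
    xq = rng.random((nq, d), dtype=np.float32)

    # Ground truth from exact L2 search.
    exact = faiss.IndexFlatL2(d)
    exact.add(xb)
    _, gt = exact.search(xq, 1)

    recall = {}
    for key in keys:
        index = faiss.index_factory(d, key)
        index.train(xb)  # a no-op for indexes that need no training
        index.add(xb)
        _, ids = index.search(xq, 1)
        recall[key] = float((ids[:, 0] == gt[:, 0]).mean())
    return recall


if __name__ == "__main__":
    print(compare_index_keys())
```

The "Flat" key performs exact search and so matches the ground truth; approximate keys trade some recall for speed.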
The following script extends the demo_sift1M test to several types of indexes. This must be run from the root of the source directory for this repository:
$ mkdir tmp # graphs of the output will be written here
$ python demos/demo_auto_tune.py
It will cycle through a few types of indexes and find optimal operating points. You can play around with the types of indexes.
The example above also runs on GPU. Edit demos/demo_auto_tune.py at line 100 with the values
keys_to_test = keys_gpu
use_gpu = True
and you can run
$ python demos/demo_auto_tune.py
to test the GPU code.
Footnotes
1. The vector search and clustering algorithms in NVIDIA RAFT have been formally migrated to NVIDIA cuVS. This package is being renamed to faiss-gpu-cuvs in the next stable release, which will use these GPU implementations from the pre-compiled libcuvs=24.12 binary.