FPSim2 is a small, NumPy-centric, RDKit-based Python/C++ package for running fast compound similarity searches. It performs better at high search thresholds (>=0.7). It is currently used in the ChEMBL and SureChEMBL interfaces.
Highlights:
- Uses the CPU POPCNT instruction
- Uses the bounds for sublinear speedups from doi:10.1021/ci600358f
- A compressed file format with optimised read speed, based on PyTables and Blosc
- Fast multicore CPU and GPU similarity searches
- In-memory and on-disk search modes
- Distance matrix calculation
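To illustrate the first two highlights, here is a minimal pure-Python sketch of a popcount-based Tanimoto search with popcount-bound pruning. It is not FPSim2's implementation or API: the function names (`popcount`, `tanimoto`, `popcount_bounds`, `search`) and the use of Python integers as fingerprints are assumptions made for this example. The pruning rule is the Tanimoto bound from the paper cited above: for a query with `a` bits set and a threshold `t`, only fingerprints whose bit count lies between `a * t` and `a / t` can reach the threshold.

```python
import math


def popcount(fp):
    """Count the set bits of a fingerprint stored as a Python int."""
    return bin(fp).count("1")


def tanimoto(a, b):
    """Tanimoto similarity: |a AND b| / |a OR b| (0.0 if both are empty)."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 0.0


def popcount_bounds(query_count, threshold):
    """Inclusive popcount range that can still reach `threshold`.

    A small epsilon guards against floating-point error so the
    range is never made too narrow.
    """
    eps = 1e-9
    lower = math.ceil(query_count * threshold - eps)
    upper = math.floor(query_count / threshold + eps)
    return lower, upper


def search(query, db, threshold):
    """Return (index, score) hits with score >= threshold, best first."""
    low, high = popcount_bounds(popcount(query), threshold)
    hits = []
    for idx, fp in enumerate(db):
        # Skip fingerprints that cannot reach the threshold at all.
        if not low <= popcount(fp) <= high:
            continue
        score = tanimoto(query, fp)
        if score >= threshold:
            hits.append((idx, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    toy_db = [0b11110000, 0b11100000, 0b00001111, 0b11111111]
    print(search(0b11110000, toy_db, 0.7))
```

The allowed popcount range narrows as the threshold approaches 1, so more of the database can be skipped without computing a similarity. This is one plausible reason why high thresholds are faster. A real implementation would also store fingerprints sorted by popcount, so that the range maps to a contiguous slice rather than a per-fingerprint check.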
Install with pip:
pip install fpsim2
or with conda:
conda install -c conda-forge fpsim2
Documentation is available at https://chembl.github.io/FPSim2/
To try out FPSim2 interactively in your web browser, click on the Binder icon.