Commit

more efficient BitStrToIntList (#111)
* more efficient BitStrToIntList

* add SureChEMBL in the README
eloyfelix authored Oct 31, 2024
1 parent 5f8fd40 commit 97904ae
Showing 3 changed files with 9 additions and 4 deletions.
2 changes: 1 addition & 1 deletion FPSim2/io/chem.py
@@ -1,4 +1,4 @@
-from typing import Any, Callable, Iterable as IterableType, Dict, List, Tuple, Union
+from typing import Any, Callable, Iterable as IterableType, Dict, Tuple, Union
from FPSim2.FPSim2lib.utils import BitStrToIntList, PyPopcount
from collections.abc import Iterable
from rdkit.Chem import rdMolDescriptors
9 changes: 7 additions & 2 deletions FPSim2/src/utils.cpp
@@ -23,8 +23,13 @@ namespace utils {

 py::list BitStrToIntList(const std::string &bit_string) {
     py::list efp;
-    for (size_t i = 0; i < bit_string.length(); i += 64) {
-        efp.append(std::stoull(bit_string.substr(i, 64), 0, 2));
+    size_t len = bit_string.length();
+    for (size_t i = 0; i < len; i += 64) {
+        uint64_t value = 0;
+        for (size_t j = 0; j < 64 && (i + j) < len; ++j) {
+            value = (value << 1) | (bit_string[i + j] - '0');
+        }
+        efp.append(value);
     }
     return efp;
 }
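
For readers who want to try the change outside the extension module, here is a minimal sketch of both approaches without the pybind11 dependency. The function names `bit_str_to_words`, `bit_str_to_words_stoull`, and `check_equivalence` are hypothetical and not part of FPSim2, and the `std::vector<std::uint64_t>` return type stands in for `py::list`. It assumes the input contains only `'0'` and `'1'` characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical standalone counterpart of BitStrToIntList: returns a
// std::vector instead of a py::list so it builds without pybind11.
// Assumes bit_string holds only '0' and '1' characters.
std::vector<std::uint64_t> bit_str_to_words(const std::string &bit_string) {
    std::vector<std::uint64_t> words;
    const std::size_t len = bit_string.length();
    words.reserve((len + 63) / 64);
    for (std::size_t i = 0; i < len; i += 64) {
        std::uint64_t value = 0;
        // Shift in one bit per character, most significant bit first.
        for (std::size_t j = 0; j < 64 && (i + j) < len; ++j) {
            value = (value << 1) |
                    static_cast<std::uint64_t>(bit_string[i + j] - '0');
        }
        words.push_back(value);
    }
    return words;
}

// The pre-change approach, for comparison: one substr copy and one
// std::stoull call per 64-character chunk.
std::vector<std::uint64_t> bit_str_to_words_stoull(const std::string &bit_string) {
    std::vector<std::uint64_t> words;
    for (std::size_t i = 0; i < bit_string.length(); i += 64) {
        words.push_back(std::stoull(bit_string.substr(i, 64), nullptr, 2));
    }
    return words;
}

// Sanity check: both versions agree on well-formed, non-empty chunks.
void check_equivalence(const std::string &bits) {
    assert(bit_str_to_words(bits) == bit_str_to_words_stoull(bits));
}
```

The shift-based loop avoids the temporary `std::string` that `substr` allocates for every chunk. One behavioral difference to keep in mind: `std::stoull` throws `std::invalid_argument` when a chunk does not begin with a valid binary digit, whereas the shift-based loop performs no validation and produces a meaningless value for malformed input.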
2 changes: 1 addition & 1 deletion README.md
@@ -7,7 +7,7 @@

# FPSim2: Simple package for fast molecular similarity searches

-FPSim2 is a small NumPy centric Python/C++ RDKit based package to run fast compound similarity searches. FPSim2 performs better with high search thresholds (>=0.7). Currently used in the [ChEMBL](http://www.ebi.ac.uk/chembl/) interface.
+FPSim2 is a small NumPy centric Python/C++ RDKit based package to run fast compound similarity searches. FPSim2 performs better with high search thresholds (>=0.7). Currently used in the [ChEMBL](http://www.ebi.ac.uk/chembl/) and [SureChEMBL](https://www.surechembl.org/) interfaces.

Highlights:
- Using CPU POPCNT instruction