API Overview
OpenDHT offers the following features:
- Distributed shared key->value data-store.
- IPv4 and IPv6 support.
- Storage of arbitrary binary values up to 64 KiB. Keys are 160 bits long.
- Different values under the same key can be distinguished by a key-unique 64-bit ID.
- Every value also has a "value type". Each value type defines potentially complex storage, editing and expiration policies, allowing for instance different value expiration times. The set of supported "value types" is hardcoded and known by every node.
Note that OpenDHT is not compatible with the Mainline BitTorrent DHT.
An optional public-key cryptography layer on top of the DHT allows putting signed or encrypted data on the DHT. Signed values can then be edited, but only by their owner (as verified cryptographically). Signed values retrieved from the DHT are automatically checked and are only presented to the user if the signature verification succeeds.
The identity layer also publishes a (usually self-signed) certificate on the DHT that can be used to encrypt data for other nodes. Encrypted values are always signed, and the signature is part of the encrypted data, to hide the signer identity during transmission. For this reason, like other non-signed values, encrypted values can't be edited (because storage nodes can't verify the identity of the author).
OpenDHT uses the dht C++ namespace and is composed of a few major classes:
- InfoHash represents a key or a node ID, which are 20-byte (160-bit) bitstrings. InfoHash instances can be compared with the comparison operator ==. The user can compute hashes from strings or binary data using the static methods InfoHash::get(); for instance, InfoHash::get("my_key") returns the SHA-1 hash of the string "my_key".
- Value represents a value potentially stored on the DHT. dht::Value is the result type of get operations and the argument type of put operations. A dht::Value can be easily built from any binary object, for instance using the constructor dht::Value::Value(const std::vector<uint8_t>&) or, C-style, with dht::Value::Value(const uint8_t* ptr, size_t len).
- ValueType defines how data is stored on the DHT: preservation time, storage and editing constraints, etc. Every stored Value has an associated value type. Note that ValueType usually has no impact on data serialization.
- Value::Filter is a class inheriting from std::function<bool(const Value&)>. It lets you define whether a value should be returned to the user. It also defines some useful methods like chain(Value::Filter&&) and chainOr(Value::Filter&&).
- Query, much like the filters, lets you filter values, but also select fields in each value. It roughly corresponds to the SELECT and WHERE clauses of an SQL statement. In fact, one of its constructors takes an SQL-like formatted string as parameter. Fields on which SELECT and WHERE operations are permitted are listed in Value::Fields. This is a subset of the fields a Value contains. The most meaningful distinction between the query and the filter is that the query is executed by the remote nodes, giving you better control over the traffic triggered by your usage of the library.
- Dht is the class implementing the actual distributed hash table and providing basic operations. It requires an already-open UDP socket to send packets. When used alone, the Dht::periodic method must be called regularly and whenever a packet is received.
- SecureDht is a child class of dht::Dht that exposes its APIs, transparently checks signed values (for get and listen operations), decrypts encrypted values, and provides additional methods to publish signed or encrypted values.
- DhtRunner provides a thread-safe interface to SecureDht and manages UDP sockets. DhtRunner is what most applications using OpenDHT should use: the instance can be safely shared and used independently by various components or threads, with networking managed transparently. DhtRunner can launch a dedicated thread or be integrated in the program loop.
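The chain and chainOr helpers mentioned for Value::Filter can be pictured as plain predicate composition. The following is a minimal standalone sketch, not OpenDHT code: ToyValue, ToyFilter, accepts and the free functions chain and chainOr are hypothetical stand-ins, and the rule that an empty filter accepts every value is an assumption made for this model.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for dht::Value, reduced to two fields.
struct ToyValue {
    std::string user_type;
    std::vector<uint8_t> data;
};

// Hypothetical stand-in for Value::Filter.
using ToyFilter = std::function<bool(const ToyValue&)>;

// Assumption of this model: an empty filter accepts every value.
bool accepts(const ToyFilter& f, const ToyValue& v) {
    return !f || f(v);
}

// Logical AND of two filters.
ToyFilter chain(ToyFilter a, ToyFilter b) {
    if (!a) return b;
    if (!b) return a;
    return [a = std::move(a), b = std::move(b)](const ToyValue& v) {
        return a(v) && b(v);
    };
}

// Logical OR of two filters.
ToyFilter chainOr(ToyFilter a, ToyFilter b) {
    if (!a || !b) return {}; // one side accepts everything, so the OR does too
    return [a = std::move(a), b = std::move(b)](const ToyValue& v) {
        return a(v) || b(v);
    };
}
```

With a "user type is cool" predicate and a "data is smaller than 64 bytes" predicate, chain accepts only values matching both, while chainOr accepts values matching either one.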
Get/listen operations take a callback argument of type GetCallback or GetCallbackSimple (both can be used):
using GetCallback = std::function<bool(const std::vector<std::shared_ptr<dht::Value>>& values)>;
using GetCallbackSimple = std::function<bool(const std::shared_ptr<dht::Value>& value)>;
Listen operations can also take a callback argument of type ValueCallback, which reports both when a new value is found and when it expires (expired is false when the value is first found and true when it expires):
using ValueCallback = std::function<bool(const std::vector<std::shared_ptr<Value>>& values, bool expired)>;
Query operations take a callback argument of type QueryCallback, defined as:
using QueryCallback = std::function<bool(const std::vector<std::shared_ptr<dht::FieldValueIndex>>& fields)>;
Many operations also use an "operation completed" callback DoneCallback, defined as:
using DoneCallback = std::function<void(bool success)>;
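As an illustration of the expired flag carried by ValueCallback, the sketch below keeps a set of the value IDs that are currently alive. It is a standalone model under stated assumptions: ToyValue, ToyValueCallback and LiveSet are hypothetical names, and only the callback shape is taken from the definitions above.

```cpp
#include <cstdint>
#include <functional>
#include <memory>
#include <set>
#include <vector>

// Hypothetical stand-in for dht::Value, reduced to its ID.
struct ToyValue {
    uint64_t id;
};

// Same shape as ValueCallback, using the stand-in value type.
using ToyValueCallback =
    std::function<bool(const std::vector<std::shared_ptr<ToyValue>>& values, bool expired)>;

// Tracks the IDs of values that have been found and have not expired yet.
struct LiveSet {
    std::set<uint64_t> ids;

    // Builds a callback that could be handed to a listen-style operation.
    // The LiveSet must outlive the returned callback.
    ToyValueCallback callback() {
        return [this](const std::vector<std::shared_ptr<ToyValue>>& values, bool expired) {
            for (const auto& v : values) {
                if (expired)
                    ids.erase(v->id);   // the value is gone
                else
                    ids.insert(v->id);  // the value was just found
            }
            return true; // keep listening
        };
    }
};
```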
The Dht class provides the core API. Important methods are:
- Constructor
Dht::Dht(int s, int s6, const InfoHash& id)
The constructor takes open IPv4 and IPv6 UDP sockets used to send packets, and the node ID. At least one open socket must be provided for the Dht instance to be considered running. If no valid socket is available for one of the protocols, the value -1 should be passed instead.
Most apps using OpenDHT should use the class DhtRunner that will instantiate Dht, handle networking transparently and provide a thread-safe interface to the dht instance.
- Get
void Dht::get(const InfoHash& key, GetCallback cb, DoneCallback donecb={}, Value::Filter f = {}, Query q = {});
Get initiates a search on the network for values associated with the provided key. Results are provided during the search through the second argument, cb. The callback is called each time new values are found on the network, until the search ends or the callback returns false. An optional DoneCallback is called on operation completion (success or failure), after which no further callback is called.
Filter: optional predicate to pre-filter values before they are passed to the callback.
Query: optional query to filter values on remote nodes.
Example using Dht::get:
// node is a running instance of dht::Dht
node.get(
    dht::InfoHash::get("some_key"),
    [](const std::vector<std::shared_ptr<dht::Value>>& values) {
        for (const auto& v : values)
            std::cout << "Got value: " << *v << std::endl;
        return true; // keep looking for values
    },
    [](bool success) {
        std::cout << "Get finished with " << (success ? "success" : "failure") << std::endl;
    }
);
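The callback contract described for get (repeated calls with new values, stopping once the callback returns false, then a single completion call) can be modeled without any networking. This is a minimal sketch, not the library's search logic: deliver, ToyValue and the Toy callback aliases are hypothetical, and the model always reports success to the completion callback.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for dht::Value.
struct ToyValue {
    int payload;
};

using ToyBatch = std::vector<std::shared_ptr<ToyValue>>;
using ToyGetCallback = std::function<bool(const ToyBatch& values)>;
using ToyDoneCallback = std::function<void(bool success)>;

// Hands each batch to cb in order and stops early once cb returns false.
// The completion callback, if any, is called exactly once at the end.
// Returns the number of batches that were handed to cb.
std::size_t deliver(const std::vector<ToyBatch>& batches,
                    const ToyGetCallback& cb,
                    const ToyDoneCallback& done)
{
    std::size_t delivered = 0;
    for (const auto& batch : batches) {
        ++delivered;
        if (!cb(batch))
            break; // the user is not interested in further values
    }
    if (done)
        done(true); // assumption of this model: always reported as success
    return delivered;
}
```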
- Query
void Dht::query(const InfoHash& key, QueryCallback cb, DoneCallback done_cb = {}, Query&& q = {});
Query initiates a search on the network at the provided key for specific value fields. Results are provided during the search through the second argument, cb. The callback is called each time new field value indexes are found on the network, until the search ends or the callback returns false. An optional DoneCallback is called on operation completion (success or failure), after which no further callback is called.
Query: optional query to select fields and filter values on remote nodes.
Example using Dht::query:
// node is a running instance of dht::Dht
node.query(
    dht::InfoHash::get("some_key"),
    [](const std::vector<std::shared_ptr<dht::FieldValueIndex>>& fields) {
        for (const auto& i : fields)
            std::cout << "Got index: " << *i << std::endl;
        return true; // keep looking for field value indexes
    },
    [](bool success) {
        std::cout << "Query finished with " << (success ? "success" : "failure") << std::endl;
    }
);
- Put
void Dht::put(const InfoHash& key, const std::shared_ptr<Value>& value, DoneCallback cb = {});
Put initiates publication of a value on the network at the provided key. See Data serialization for more information about how to build a dht::Value instance. An optional DoneCallback is called on operation completion (success or failure).
If the value ID is dht::Value::INVALID_ID (0) when put is called, the Value::id field is set during the operation to identify the value.
A value remains on the network for its lifetime (default 10 minutes).
Use put with the same key and value to refresh the expiration deadline.
Values can't be edited by default (with the exception of signed values).
If a value with the same value ID exists on the network, the new value is by default ignored by the network.
Example using Dht::put:
const char* my_data = "42 cats";
// node is a running instance of dht::Dht
node.put(
    dht::InfoHash::get("some_key"),
    dht::Value((const uint8_t*)my_data, std::strlen(my_data))
);
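The put rules listed above (ID assignment, lifetime, refresh, and no edits for plain values) can be summarized in a small in-memory model of a storage node. This is only a sketch under stated assumptions, not how OpenDHT stores data: ToyStore, ToyValue and the random ID generation are hypothetical, and "same value" is taken to mean identical content.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <random>
#include <string>
#include <utility>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for dht::Value.
struct ToyValue {
    uint64_t id = 0;   // 0 plays the role of dht::Value::INVALID_ID
    std::string data;
};

// Hypothetical model of the storage rules for plain (unsigned) values.
struct ToyStore {
    struct Entry {
        std::string data;
        Clock::time_point expires;
    };

    std::map<std::pair<std::string, uint64_t>, Entry> entries;
    std::chrono::minutes lifetime{10};  // default lifetime quoted in the text
    std::mt19937_64 rng{42};            // fixed seed keeps the sketch deterministic

    // Returns true if the value was stored or refreshed, false if it was ignored.
    bool put(const std::string& key, ToyValue& v, Clock::time_point now) {
        while (v.id == 0)
            v.id = rng();               // assign an ID when none was set
        const auto k = std::make_pair(key, v.id);
        auto it = entries.find(k);
        if (it == entries.end()) {
            entries.emplace(k, Entry{v.data, now + lifetime});
            return true;
        }
        if (it->second.data != v.data)
            return false;               // plain values cannot be edited
        it->second.expires = now + lifetime;  // same value: refresh the deadline
        return true;
    }
};
```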
- Listen
size_t Dht::listen(const InfoHash& key, ValueCallback cb, Value::Filter f = {}, Query q = {});
Listen initiates a search on the network to find values associated with the provided key and keeps the caller informed of new values published at key, calling the provided callback function cb every time there is a new or changed value at key, until the callback cb returns false or the operation is canceled with bool cancelListen(const InfoHash& key, size_t token), where token is the return value from listen. Calling cancelListen has the same effect as returning false from the callback.
Example using Dht::listen:
auto key = dht::InfoHash::get("some_key");
auto token = node.listen(key,
    [](const std::vector<std::shared_ptr<dht::Value>>& values, bool expired) {
        for (const auto& v : values)
            std::cout << "Found value: " << *v << ", " << (expired ? "expired" : "added") << std::endl;
        return true; // keep listening
    }
);
// later
node.cancelListen(key, std::move(token));
Listen with type template for automatic deserialization:
struct Cloud {
    uint32_t altitude;
    double width, height;
    bool rainbow;
    MSGPACK_DEFINE_MAP(altitude, width, height, rainbow);
};
std::vector<Cloud> found_clouds;
auto key = dht::InfoHash::get("some_key");
auto token = node.listen<Cloud>(key, [&](Cloud&& value) {
    // warning: called from another thread
    found_clouds.emplace_back(std::move(value));
    return true; // keep listening
});
// later
node.cancelListen(key, token);
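The relationship between listen, its returned token and cancelListen can be illustrated with a small in-memory registry. This is a hedged sketch of the bookkeeping only, with no networking: ToyListeners, ToyCallback, publish and count are hypothetical names, and the token scheme (a simple counter) is an assumption.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical callback type: receives a value, returns false to stop listening.
using ToyCallback = std::function<bool(const std::string& value)>;

class ToyListeners {
public:
    // Registers cb for key and returns a token identifying this listener.
    std::size_t listen(const std::string& key, ToyCallback cb) {
        const std::size_t token = ++last_token_;
        listeners_[key][token] = std::move(cb);
        return token;
    }

    // Removes the listener identified by token; returns true if it existed.
    bool cancelListen(const std::string& key, std::size_t token) {
        auto it = listeners_.find(key);
        return it != listeners_.end() && it->second.erase(token) > 0;
    }

    // Delivers a value to every listener of key.
    // A listener returning false is dropped, exactly as if it had been canceled.
    void publish(const std::string& key, const std::string& value) {
        auto it = listeners_.find(key);
        if (it == listeners_.end())
            return;
        auto& per_key = it->second;
        for (auto l = per_key.begin(); l != per_key.end();) {
            if (l->second(value))
                ++l;
            else
                l = per_key.erase(l);
        }
    }

    // Number of active listeners for key.
    std::size_t count(const std::string& key) const {
        auto it = listeners_.find(key);
        return it == listeners_.end() ? 0 : it->second.size();
    }

private:
    std::size_t last_token_ = 0;
    std::map<std::string, std::map<std::size_t, ToyCallback>> listeners_;
};
```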
A filter is a std::function<bool(const dht::Value&)> predicate used to filter values.
auto coolValueFilter = [](const dht::Value& v) {
    return v.user_type == "cool" and v.data.size() < 64;
};
node.get(
    dht::InfoHash::get("coolKey"),
    [](const std::shared_ptr<dht::Value>& value) {
        std::cout << "That's a cool value: " << *value << std::endl;
        return true; // keep looking for values
    },
    [](bool success) {
        std::cout << "Op went " << (success ? "cool" : "not cool") << std::endl;
    },
    coolValueFilter);
As you can see, the Value::Filter class is really flexible. However, this filtering is only processed on the local node, upon receiving values in a response. What if you know that the storage you're interested in hosts a large number of values and you don't want to trigger heavy traffic? Use queries!
A similar get operation using a query, which is evaluated by the remote nodes, looks as follows:
dht::Where w;
w.id(5); /* the same as dht::Where w("WHERE id=5"); */
node.get(
    dht::InfoHash::get("some_key"),
    [](const std::vector<std::shared_ptr<dht::Value>>& values) {
        for (const auto& v : values)
            std::cout << "This value has passed through the remote filters: " << *v << std::endl;
        return true; // keep looking for values
    },
    [](bool success) {
        std::cout << "Get finished with " << (success ? "success" : "failure") << std::endl;
    }, {}, w
);
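The practical difference between a filter and a query is where the predicate runs. The following standalone sketch counts how many values cross the simulated network in each case; it is a toy model, not a measurement of OpenDHT, and ToyValue, Predicate, Result, getWithLocalFilter and getWithRemoteQuery are all hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for dht::Value, reduced to its ID.
struct ToyValue {
    uint64_t id;
};

using Predicate = std::function<bool(const ToyValue&)>;

struct Result {
    std::size_t transferred;          // values sent over the simulated network
    std::vector<ToyValue> delivered;  // values handed to the user callback
};

// Filter: the remote node sends everything, the predicate runs locally.
Result getWithLocalFilter(const std::vector<ToyValue>& remote, const Predicate& keep) {
    Result r{remote.size(), {}};
    for (const auto& v : remote)
        if (keep(v))
            r.delivered.push_back(v);
    return r;
}

// Query: the predicate runs on the remote node, only matches are sent.
Result getWithRemoteQuery(const std::vector<ToyValue>& remote, const Predicate& where) {
    Result r{0, {}};
    for (const auto& v : remote) {
        if (where(v)) {
            ++r.transferred;
            r.delivered.push_back(v);
        }
    }
    return r;
}
```

Both functions deliver the same values to the user; only the number of transferred values differs, which is the traffic saving the text attributes to queries.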
All available fields are listed below:
Field |
---|
Id |
ValueType |
OwnerPk |
UserType |
Note: field names used in string initialization are written in snake case (for example value_type or user_type)!
A query can tell whether it is satisfied by another query. For example:
Query q1;
q1.where.id(5); // the whole value with id=5 will be sent
Query q2 {{"SELECT value_type"}};
// q2 the same as Query q("SELECT value_type WHERE value_type=10,user_type=foo_type");
q2.where.valueType(10).userType("foo_type");
Query q3("SELECT id WHERE id=5"); // only the id=5 will be sent
q1.isSatisfiedBy(q3); // false
q2.isSatisfiedBy(q1); // false
q3.isSatisfiedBy(q1); // true because q1 yields all the response data q3 would have or more
q2.isSatisfiedBy(q3); // false
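One plausible reading of the rule behind these results is: a query is satisfied by another one if the other query returns at least the requested fields and is not more restrictive. The sketch below encodes that reading and reproduces the four results shown above. It is a simplified model, not the library's implementation: ToyQuery is a hypothetical stand-in, constraints are plain string equality, and the model does not check whether a stricter WHERE could be re-applied locally.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for dht::Query.
struct ToyQuery {
    std::set<std::string> select;              // empty: the whole value is requested
    std::map<std::string, std::string> where;  // field name -> required value

    // True if other returns at least the fields requested here
    // and applies no constraint that this query does not also apply.
    bool isSatisfiedBy(const ToyQuery& other) const {
        if (!other.select.empty()) {
            if (select.empty())
                return false;  // we want whole values, other returns only some fields
            for (const auto& field : select)
                if (other.select.count(field) == 0)
                    return false;  // a requested field is missing from other
        }
        for (const auto& constraint : other.where) {
            auto it = where.find(constraint.first);
            if (it == where.end() || it->second != constraint.second)
                return false;  // other is more restrictive than this query
        }
        return true;
    }
};
```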
The SecureDht class extends dht::Dht and provides the same API methods (get, put, listen). It adds a public-key cryptography layer on top of the DHT.
A user-provided Identity (RSA key pair and optional Certificate) can be used for signing and decryption.
If SecureDht is configured with a Certificate, it will be published on the DHT, and automatically retrieved by other nodes in order to identify, authenticate and encrypt values exchanged on the DHT.
Values returned by SecureDht::get and SecureDht::listen are checked beforehand and filtered: signed values are dropped if their signature verification fails. Similarly, encrypted values that can't be decrypted are dropped.
As a layer on top of Dht, SecureDht can also be used for plain values. Methods like get and put will behave the same as Dht for non-encrypted and non-signed values.
The user can know if a dht::Value provided by ::get and ::listen is signed by checking the owner field of the Value (which would be the public key of the signer). The public key ID of the signer can then be checked with value->owner->getId() or value->owner->getLongId().
The following fields of dht::Value are authenticated by the signature:
- owner
- recipient
- seq
- user_type
- type
- data
Note that the value ID is not part of the signed data and is not authenticated by the signature.
SecureDht adds the following method:
- PutSigned
void putSigned(const InfoHash& hash, const std::shared_ptr<Value>& val, DoneCallback callback);
This method requires SecureDht to be configured with a private key, used for signing.
Value editing is only possible with signed values. It allows replacing the content stored at a specific key and value ID with different content.
To edit a value, perform multiple calls to putSigned with the same key and value ID.
It is possible to reuse and modify the same dht::Value instance, as the signature is recomputed at every call to putSigned. In that case, avoid modifying the value instance between the call to putSigned and the completion callback.
Value editing in OpenDHT enforces the following requirements:
- The value must keep the same signer.
- The seq field of the value must be increasing. This is done automatically when performing putSigned multiple times on the same SecureDht instance.
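The two editing requirements can be expressed as a single acceptance check. The sketch below is a minimal model under stated assumptions, not OpenDHT's storage policy: ToySignedValue and acceptsEdit are hypothetical, the signer is represented by a plain string instead of a public key, signature verification is assumed to have already succeeded, and "increasing" is interpreted as strictly greater.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for a signed dht::Value.
struct ToySignedValue {
    uint64_t id;
    std::string owner;  // stands in for the signer's public key
    unsigned seq;
    std::string data;
};

// True if incoming may replace stored.
// Assumes the signature of incoming has already been verified.
bool acceptsEdit(const ToySignedValue& stored, const ToySignedValue& incoming) {
    return incoming.id == stored.id        // same value ID
        && incoming.owner == stored.owner  // same signer
        && incoming.seq > stored.seq;      // strictly increasing sequence number
}
```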
The user can know if a value was received encrypted by checking the recipient field of the Value (which would be our public key ID if the value was encrypted for us).
In OpenDHT, encrypted values are always also privately signed (the signature is only visible to the recipient). Every field of dht::Value authenticated by the signature is also concealed from everyone but the recipient when using encryption.
SecureDht adds the following methods:
- PutEncrypted
void putEncrypted(const InfoHash& hash, const InfoHash& to, std::shared_ptr<Value> val, DoneCallback callback, bool permanent = false);
This method requires SecureDht to be configured with a private key, used for signing.
The public key of the recipient (to) will be searched using the provided certificate or public key lookup callback, or automatically on the DHT.
void putEncrypted(const InfoHash& hash, const crypto::PublicKey& to, Sp<Value> val, DoneCallback callback, bool permanent = false);
This overload takes the full recipient public key directly.
DhtRunner provides thread-safe access to the running DHT instance and exposes all methods from SecureDht. For more information, see: Running a node in your program