perf: Add support for multi get operation for database queries #2396
// https://github.com/FuelLabs/fuel-core/issues/2344
let result = tx_ids
pub async fn transactions(&self, tx_ids: &[TxId]) -> Vec<StorageResult<Transaction>> {
let on_chain_results: BTreeMap<_, _> = tx_ids
It is nice that this function handles this hard use case.
In most cases, we have only `on_chain` data unless we did regenesis. It would be nice if this did not affect performance for the primary use case.
Can we first request values from the on-chain database and, if any of them are not found, fall back to the logic with `enumerate` and `BTreeMap`?
Good idea, I'll do it.
Done in 5de80b7 let me know what you think.
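For readers following the thread, here is a minimal sketch of the fast-path/fallback idea discussed above. It is not the code from 5de80b7: `MockDb`, the `u32`/`String` stand-ins for `TxId` and `Transaction`, and the use of `Option` in place of `StorageResult` are all assumptions made to keep the example self-contained.

```rust
use std::collections::{BTreeMap, HashMap};

// Stand-ins for the real types (assumptions for this sketch).
type TxId = u32;
type Transaction = String;

/// Hypothetical database with a batched lookup.
struct MockDb(HashMap<TxId, Transaction>);

impl MockDb {
    fn get_batch(&self, ids: &[TxId]) -> Vec<Option<Transaction>> {
        ids.iter().map(|id| self.0.get(id).cloned()).collect()
    }
}

/// Queries the on-chain database first and only consults the
/// off-chain database for ids that were not found.
fn transactions(
    on_chain: &MockDb,
    off_chain: &MockDb,
    ids: &[TxId],
) -> Vec<Option<Transaction>> {
    let mut results = on_chain.get_batch(ids);

    // Fast path: everything was found on-chain, so no extra bookkeeping.
    if results.iter().all(Option::is_some) {
        return results;
    }

    // Slow path: remember the position of every miss so that off-chain
    // results can be written back in the caller's original order.
    let missing: BTreeMap<usize, TxId> = results
        .iter()
        .enumerate()
        .filter(|(_, r)| r.is_none())
        .map(|(i, _)| (i, ids[i]))
        .collect();

    let missing_ids: Vec<TxId> = missing.values().copied().collect();
    let off_chain_results = off_chain.get_batch(&missing_ids);

    // `keys()` and `values()` iterate in the same (sorted) order,
    // so the zip lines each result up with its original index.
    for (index, found) in missing.keys().zip(off_chain_results) {
        results[*index] = found;
    }
    results
}
```

With this shape, the `enumerate`/`BTreeMap` work is only paid when at least one id is missing on-chain, which matches the reviewer's point about the primary use case.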
crates/storage/src/kv_store.rs
Outdated
/// Returns multiple values from the storage.
fn get_batch<'a>(
    &'a self,
    keys: BoxedIter<'a, Vec<u8>>,
It is very sad that we need to convert `&[u8]` into a vector. Are you sure that we can't pass `Cow<'a, [u8]>`?
Thanks for suggesting. Yeah, I think a `Cow` can be used here. Originally I wanted to take `impl AsRef<[u8]>`, but again that's a problem with object safety.
I'll see if I hit any walls with the `Cow` solution.
Phew, had to battle some lifetimes in `blueprint.rs`, but now I've got the `Cow`-based implementation done in f27bdbb.
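A rough sketch of why `Cow<'a, [u8]>` fits here, under simplifying assumptions: the trait below is a cut-down, hypothetical version of `KeyValueInspect` (no columns, no `StorageResult`), and `MemoryStore` and `lookup` are made-up names. The point is that a boxed iterator of `Cow` keys keeps the method usable through `dyn`, while callers that already hold `&[u8]` can pass `Cow::Borrowed` without allocating.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

/// Boxed iterator alias, loosely modeled on the `BoxedIter` in the thread.
type BoxedIter<'a, T> = Box<dyn Iterator<Item = T> + 'a>;

/// Simplified, hypothetical key-value trait. Taking a boxed iterator
/// instead of `impl Iterator` keeps the trait usable as `dyn KeyValueInspect`.
trait KeyValueInspect {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;

    /// Default batch lookup: one `get` per key, evaluated lazily.
    fn get_batch<'a>(
        &'a self,
        keys: BoxedIter<'a, Cow<'a, [u8]>>,
    ) -> BoxedIter<'a, Option<Vec<u8>>> {
        // `&key` derefs from `&Cow<[u8]>` to `&[u8]`.
        Box::new(keys.map(move |key| self.get(&key)))
    }
}

/// In-memory store that relies on the default `get_batch`.
struct MemoryStore(HashMap<Vec<u8>, Vec<u8>>);

impl KeyValueInspect for MemoryStore {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.0.get(key).cloned()
    }
}

/// Borrowed keys go in as `Cow::Borrowed`, so no `Vec<u8>` is
/// allocated per key on the caller's side.
fn lookup<'a>(
    store: &'a dyn KeyValueInspect,
    keys: &'a [&'a [u8]],
) -> Vec<Option<Vec<u8>>> {
    let borrowed = keys.iter().map(|k| Cow::Borrowed(*k));
    store.get_batch(Box::new(borrowed)).collect()
}
```

Callers that have to build keys on the fly can still pass `Cow::Owned`, so the same signature serves both cases.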
This reverts commit 491c452.
…use it in get_batch
Proposals to the #2396
crates/storage/src/kv_store.rs
Outdated
@@ -41,7 +44,7 @@ pub trait StorageColumn: Copy + core::fmt::Debug {

 /// The definition of the key-value inspection store.
 #[impl_tools::autoimpl(for<T: trait> &T, &mut T, Box<T>)]
-pub trait KeyValueInspect {
+pub trait KeyValueInspect: Send + Sync {
It is very sad that we need to require `Send + Sync` only because `BoxedIter` requires `Send`. Maybe we can define a non-`Send` boxed iterator instead and use it (it seems we don't need the `Send` feature here, but maybe I'm wrong)?
Yeah, this bound is just to satisfy the `BoxedIter` requirement. I can see if I can define a non-`Send` one.
Hitting some walls with this implementation. Will try again tomorrow with fresh eyes.
Alright, my mistake was trying to change some iterators that we turn into streams, and those need to be `Send`. I have now managed to get rid of these trait bounds in cbb6efc.
Now we have two boxed iterators: `BoxedIter`, which doesn't require `Send`, and `BoxedIterSend` for the cases when you need `Send`.
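To illustrate the split, here is a small sketch of what two such aliases could look like. Only the alias names `BoxedIter` and `BoxedIterSend` come from the thread; the helper traits `IntoBoxedIter`/`IntoBoxedIterSend` and the two `sum_*` functions are assumptions for the example and may not match cbb6efc.

```rust
use std::rc::Rc;
use std::thread;

/// Boxed iterator without a `Send` bound.
type BoxedIter<'a, T> = Box<dyn Iterator<Item = T> + 'a>;
/// Boxed iterator that may be moved to another thread.
type BoxedIterSend<'a, T> = Box<dyn Iterator<Item = T> + Send + 'a>;

/// Hypothetical helper to box any iterator.
trait IntoBoxedIter<'a, T> {
    fn into_boxed(self) -> BoxedIter<'a, T>;
}

impl<'a, T, I> IntoBoxedIter<'a, T> for I
where
    I: Iterator<Item = T> + 'a,
{
    fn into_boxed(self) -> BoxedIter<'a, T> {
        Box::new(self)
    }
}

/// Hypothetical helper to box iterators that are also `Send`.
trait IntoBoxedIterSend<'a, T> {
    fn into_boxed_send(self) -> BoxedIterSend<'a, T>;
}

impl<'a, T, I> IntoBoxedIterSend<'a, T> for I
where
    I: Iterator<Item = T> + Send + 'a,
{
    fn into_boxed_send(self) -> BoxedIterSend<'a, T> {
        Box::new(self)
    }
}

/// Needs `Send`: the iterator is consumed on another thread,
/// similar to an iterator that is turned into a stream.
fn sum_on_thread(iter: BoxedIterSend<'static, u64>) -> u64 {
    thread::spawn(move || iter.sum::<u64>())
        .join()
        .expect("worker thread panicked")
}

/// Does not need `Send`: the closure captures an `Rc`, so this
/// iterator could not be boxed as `BoxedIterSend` at all.
fn sum_local(shared: Rc<Vec<u64>>) -> u64 {
    let iter: BoxedIter<'static, u64> =
        (0..shared.len()).map(move |i| shared[i]).into_boxed();
    iter.sum()
}
```

Keeping the plain alias free of `Send` means storage traits that only iterate locally do not have to be `Send + Sync` themselves.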
<Self as StorageBatchInspect<OldTransactions>>::get_batch(self, ids)
    .map(|result| result.and_then(|opt| opt.ok_or(not_found!(OldTransactions))))
    .into_boxed()
If you want, you can add the same syntactic sugar that we did with `storage_as_ref` to avoid the `<Self as StorageInspect<M>>` usage.
Oh interesting, that would be nice. I'll look into it.
Turns out this is not as trivial as I had thought, as we're hitting some lifetime issues.
The problem
`get_batch` returns an iterator bound to the lifetime of the `self` parameter, i.e. the storage we call it on. This is necessary since we do `self.get(...)` within the `KeyValueInspect::get_batch` implementation.
So while we can implement `StorageBatchInspect` for the `StorageRef` type as:
impl<'a, S, Type> StorageBatchInspect<Type> for StorageRef<'a, S, Type>
where
    S: StorageBatchInspect<Type>,
    Type: Mappable,
{
    #[inline(always)]
    fn get_batch<'b, Iter>(
        &'b self,
        _keys: Iter,
    ) -> impl Iterator<Item = Result<Option<Type::OwnedValue>>> + 'b
    where
        Iter: 'b + IntoIterator<Item = &'b Type::Key>,
        Type::Key: 'b,
    {
        None.into_iter() // Note that we'd need access to `self.0` here which requires VM changes, or copying the `StorageRef` type into this crate. But that's a separate, and very manageable issue
    }
}
when we try to use it as
fn old_transactions<'a>(
    &'a self,
    ids: BoxedIter<'a, &'a TxId>,
) -> BoxedIter<'a, StorageResult<Transaction>> {
    self.storage::<OldTransactions>()
        .get_batch(ids)
        .map(|result| result.and_then(|opt| opt.ok_or(not_found!(OldTransactions))))
        .into_boxed()
}
we're hitting this issue
error[E0716]: temporary value dropped while borrowed
--> crates/fuel-core/src/service/adapters/graphql_api/off_chain.rs:183:9
|
179 | fn old_transactions<'a>(
| -- lifetime `'a` defined here
...
183 | self.storage::<OldTransactions>()
| -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| _________creates a temporary value which is freed while still in use
| |
184 | | .get_batch(ids)
| |___________________________- argument requires that borrow lasts for `'a`
...
187 | }
| - temporary value is freed at the end of this statement
Going forward
For this PR I suggest we don't go down this rabbit hole, but I'd be happy to do it as a follow-up if you think it's possible/interesting to explore it. Seems like a quite low prio though.
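For anyone who wants to poke at the lifetime problem in isolation, the following is a stripped-down sketch with made-up types: `Storage`, a non-generic `StorageRef<'a>`, and `usize` keys are all stand-ins, and the by-value `into_get_batch` is only one possible direction for a follow-up, not something this PR does. The commented-out `broken` function has the same shape as `old_transactions` above and is expected to be rejected by the borrow checker because the wrapper is a temporary.

```rust
type BoxedIter<'a, T> = Box<dyn Iterator<Item = T> + 'a>;

/// Stand-in for the underlying storage.
struct Storage {
    values: Vec<u32>,
}

/// Short-lived wrapper that plays the role of `StorageRef`.
struct StorageRef<'a>(&'a Storage);

impl Storage {
    fn storage(&self) -> StorageRef<'_> {
        StorageRef(self)
    }
}

impl<'a> StorageRef<'a> {
    /// Borrows the wrapper itself: the iterator cannot outlive `&'b self`.
    fn get_batch<'b>(&'b self, keys: &'b [usize]) -> BoxedIter<'b, Option<u32>> {
        Box::new(keys.iter().map(move |k| self.0.values.get(*k).copied()))
    }

    /// Consumes the wrapper: the iterator is tied only to the
    /// underlying storage lifetime `'a` (hypothetical alternative).
    fn into_get_batch(self, keys: &'a [usize]) -> BoxedIter<'a, Option<u32>> {
        let storage: &'a Storage = self.0;
        Box::new(keys.iter().map(move |k| storage.values.get(*k).copied()))
    }
}

// Same shape as `old_transactions`; uncommenting this should fail to
// compile, since `s.storage()` is a temporary dropped at the end of
// the statement while the returned iterator still borrows it.
//
// fn broken<'a>(s: &'a Storage, keys: &'a [usize]) -> BoxedIter<'a, Option<u32>> {
//     s.storage().get_batch(keys)
// }

/// Works: the wrapper is a local that outlives the iterator, and the
/// results are collected before the function returns.
fn collect_with_local(s: &Storage, keys: &[usize]) -> Vec<Option<u32>> {
    let storage_ref = s.storage();
    let out: Vec<_> = storage_ref.get_batch(keys).collect();
    out
}

/// Works: the by-value method does not borrow the temporary wrapper.
fn working<'a>(s: &'a Storage, keys: &'a [usize]) -> BoxedIter<'a, Option<u32>> {
    s.storage().into_get_batch(keys)
}
```

Neither workaround is free: collecting gives up laziness, and a by-value method would need access to the wrapper's inner reference, which is the `self.0` issue noted in the comment above.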
It would be nice to get some benchmarks (I guess you need to add a new one that uses GraphQL) =)
Since this PR adds `into_bytes` for the encoder, we can optimize the batch mutate operations as well #2396 (review)
Sure thing. I'm not sure it makes sense to use criterion/cargo bench for end-to-end performance testing though, as it is more geared towards micro-benchmarking, and we need to do something like the following.
I'll look into our options for this type of workload. Let me know if you have any opinions or thoughts on this.
I'll aim to define these workloads as a stand-alone binary, which would allow us to use hyperfine to run the benchmarks and interpret the results.
@@ -177,7 +177,7 @@ impl LowerHex for TxPointer {
 }
 }

-#[derive(cynic::Scalar, Debug, Clone)]
+#[derive(cynic::Scalar, Debug, Clone, PartialEq, Eq)]
Are these changes needed? I can't seem to find a place where these additional derives are necessary.
@@ -194,7 +194,7 @@ impl Deref for HexString {
 }
 }

-#[derive(Debug, Clone)]
+#[derive(Debug, Clone, PartialEq, Eq)]
Same as above.
}
}

pub async fn extend_with_off_chain_results(
-pub async fn extend_with_off_chain_results(
+async fn extend_with_off_chain_results(
 }

 /// The trait encodes the type to the bytes and passes it to the `Encoder`,
 /// which stores it and provides a reference to it. That allows gives more
 /// flexibility and more performant encoding, allowing the use of slices and arrays
 /// instead of vectors in some cases. Since the [`Encoder`] returns `Cow<[u8]>`,
 /// it is always possible to take ownership of the serialized value.
-pub trait Encode<T: ?Sized> {
+pub trait Encode<T: ?Sized>: 'static {
I'm not exactly sure why we need the `'static` bound here.
Linked Issues
Closes #2344
Description
This PR adds support for getting batches of values from the storage through a new `StorageBatchInspect` trait in `fuel-core-storage`. This trait is implemented for `StructuredStorage` and `GenericDatabase` through a new `get_batch` method added to the `KeyValueInspect` and `BluePrintInspect` traits. The `get_batch` method is implemented with the `multi-get` operation for the RocksDB based storage implementations, and uses a default blanket implementation for in-memory implementations.
Checklist
Before requesting review