Support for query pipelining? #155
This would be cool for sure. I wonder how to get pipelining and keep the crystal/db interface. It seems like keeping track of which result sets go where would be a bit tricky. I haven’t thought about this much though, there might be some nice, easy way.
…On Wed, Nov 7, 2018 at 00:37 Vladislav Zarakovsky wrote:
Recently released Round 17 of the TechEmpower Framework Benchmarks has an interesting change:
https://www.techempower.com/blog/2018/10/30/framework-benchmarks-round-17/

As you review Round 17 results, you'll notice that Postgres database tests are stratified—there are groups of test implementations that seem implausibly faster than other test implementations.

The underlying cause of this is use of a Postgres protocol feature we have characterized as "query pipelining" because it is conceptually similar to HTTP pipelining. We call it pipelining, but you could also call it multiplexing. It's a feature of the "Extended Query" protocol in Postgres.

Query pipelining allows database clients to send multiple queries without needing to wait for each response before sending the next. It's similar to batching but provided invisibly in the driver.
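To make the idea concrete, here is a small self-contained Crystal sketch (not the Postgres wire protocol and not crystal-pg's API) that simulates the difference: the client writes several queries before reading any replies, and the replies come back in the order the queries were sent.

```crystal
# Simulated "server": answers queries in the order they arrive.
queries   = Channel(String).new(4)
responses = Channel(String).new(4)

spawn do
  loop do
    sql = queries.receive
    responses.send "result of #{sql}"
  end
end

# Pipelined: send all the queries up front without waiting for results...
["SELECT 1", "SELECT 2", "SELECT 3"].each { |sql| queries.send sql }

# ...then read the replies; they arrive in the order the queries were sent.
3.times { puts responses.receive }
```

Over a real network connection this is what collapses N round trips into roughly one.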
This would be fantastic. Right now the DB connection is the only thing in my crystal webapp which feels a bit slow.
Would love to work with someone on this. While looking for documentation I found the following explanation and implementation:
Also, has anyone considered binding LibPQ instead? https://www.postgresql.org/docs/9.5/libpq.html It seems to solve all of these concerns already. We could bind to the library and create a Crystal-DB interface to it. Thoughts?
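For reference, a direct binding would look roughly like the sketch below. This is only an illustration of Crystal's C-binding syntax against a handful of real libpq functions (PQsendQuery / PQgetResult being the split send/receive calls such an approach would lean on); it is not code from this shard.

```crystal
# Hypothetical sketch of binding a few libpq functions directly from Crystal.
@[Link("pq")]
lib LibPQ
  alias PGconn = Void*
  alias PGresult = Void*

  fun connectdb = PQconnectdb(conninfo : LibC::Char*) : PGconn
  fun send_query = PQsendQuery(conn : PGconn, command : LibC::Char*) : LibC::Int
  fun get_result = PQgetResult(conn : PGconn) : PGresult
  fun clear = PQclear(res : PGresult)
  fun finish = PQfinish(conn : PGconn)
end

conn = LibPQ.connectdb("postgres:///")
LibPQ.send_query(conn, "SELECT 42")        # returns immediately, no waiting
until (res = LibPQ.get_result(conn)).null? # PQgetResult returns null when done
  LibPQ.clear(res)
end
LibPQ.finish(conn)
```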
This started out as a binding to libpq but moved away from it to avoid some nasty linking problems on some platforms, and to take easier advantage of Crystal’s async IO.
I think the community can benefit from a native Crystal implementation too.
Just posted #253 to support query pipelining, if y'all could please have a look. I've been evaluating TimescaleDB for a side project and being able to pipeline …
@jgaskins Looks nice. Regarding your questions about the API design, perhaps it makes sense to focus on a very low level API at first. This enables using the feature, but you need to know what you're doing. Later there could be a more high level API on top of that. For a high level API I think it could be useful to combine the query and result interpretation into a promise object. I think for this there's not even a need for wrapping this into a block:

```crystal
query1 = connection.pipeline.query_one "SELECT 42", as: Int32
query2 = connection.pipeline.query_all "SELECT * FROM posts LIMIT $1", 10, as: Post

answer = query1.result
posts = query2.result
```

Behind the scenes, …
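As an illustration of the promise idea (purely hypothetical, not crystal-pg's API): each pipelined query could hand back a small promise object whose `result` blocks on a channel until the driver has decoded that query's response.

```crystal
# Hypothetical sketch: a promise backed by a one-slot channel.
class QueryPromise(T)
  def initialize
    @channel = Channel(T).new(1)
  end

  # Called by the pipeline once this query's result has been decoded.
  def resolve(value : T) : Nil
    @channel.send value
  end

  # Called by user code; blocks until the result is available.
  def result : T
    @channel.receive
  end
end

promise = QueryPromise(Int32).new
spawn { promise.resolve(42) } # stand-in for the driver decoding "SELECT 42"
puts promise.result           # => 42
```

The interesting part is what sits behind `resolve`: something still has to read responses off the socket in the order the queries were written.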
@straight-shoota That was the first thing I tried. I wanted the pipeline to use the …
How so? I would think this should be perfectly representable in Crystal's type system.
I thought it would be, too, until I tried it and realized that returning objects of a specific type is easy but storing them in memory for processing later is not. Feel free to try it out if you're curious, but it's a moot point anyway. That's not how query pipelines are used in practice, and interleaving pipelined vs immediate queries like you suggested would increase cognitive load and/or would be prone to bugs. For example, consider this:

```crystal
posts = pg.pipeline.query_one "...", as: Post
comments = pg.pipeline.query_all "...", as: Comment
author = pg.query_one "...", as: User
reactions = pg.pipeline.query_all "...", as: Reaction

posts = posts.result
comments = comments.result
reactions = reactions.result
```

If you send the first two queries before the third, this code breaks. If you don't send the first two queries before the third, it'd be very confusing when debugging as to why the queries are appearing out of order. The pipeline block in #253 communicates a clear start and end point of the pipelined queries to anyone reading the code.

Even if the queries are not being interleaved like that, the structure provided by the block is still better than without it:

```crystal
posts = pg.pipeline.query "..."
comments = pg.pipeline.query "..."
author = pg.pipeline.query "..."
reactions = pg.pipeline.query "..."

posts = posts.read_one(Post)
comments = comments.read_all(Comment)
author = author.read_one(User)
reactions = reactions.read_all(Reaction)

# vs

rs = pg.pipeline do |pipe|
  pipe.query "..."
  pipe.query "..."
  pipe.query "..."
  pipe.query "..."
end

posts = rs.read_one(Post)
comments = rs.read_all(Comment)
author = rs.read_one(User)
reactions = rs.read_all(Reaction)
```

The only difference between these two is promise vs block and the second is still significantly easier to grok.
My concern about a structure where query and reading the result are completely separate is that you have to make sure to use the same order in both. But this probably doesn't matter too much. Or at least it's already great to have a functioning lower level API. For such an advanced feature this might actually be good enough. You'll just have to make sure to use it properly.
I hear you on this, but a query pipeline is effectively a channel, so this makes sense. That you're thinking about the problem in a different way is not a bad thing, but while a bespoke async I/O implementation with promises can be implemented on top of a pipeline, it is not itself a pipeline.
Agreed that it's pretty advanced, and will likely rarely be used. I imagine most folks are using ORMs for most DB interactions and will only dip into an API like this at all when they really need the performance.
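A toy illustration of the ordering point (simulated with a plain Channel, not crystal-pg types): because a pipeline hands results back in the order the queries were written, reading them back in a different order silently pairs results with the wrong variables.

```crystal
results = Channel(String).new(2)

spawn do
  results.send "rows for: SELECT * FROM posts"    # first query sent
  results.send "rows for: SELECT * FROM comments" # second query sent
end

comments = results.receive # oops: this is really the posts result
posts    = results.receive # and this is really the comments result
puts comments # => "rows for: SELECT * FROM posts"
puts posts    # => "rows for: SELECT * FROM comments"
```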