Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate MLS DB To SQLC #380

Merged
merged 9 commits into from
Apr 25, 2024
Merged

Migrate MLS DB To SQLC #380

merged 9 commits into from
Apr 25, 2024

Conversation

neekolas
Copy link
Collaborator

@neekolas neekolas commented Apr 25, 2024

tl;dr

Bun has been bothering me for a while, so I've been working on a migration that will get rid of our ORM altogether and just use boring SQL queries for everything.

sqlc is a very slick tool to generate code based on plain SQL queries using placeholders for arguments. It's not perfect...I had to do some gymnastics to make a few of the query types work.

But the fact that there is no runtime other than the standard SQL driver and some generated code outweighs its limitations IMO. There's no fancy ORM library to worry about mangling your queries, and the learning curve is basically just "how well do you know SQL".

What's wrong with Bun?

  • No support for serializable transactions
  • The SQL driver is not as well maintained as PGX
  • High learning curve to build complex queries, even if you know SQL well
  • Relations system is not very powerful and ends up doing N+1 queries a lot of the time.
  • Configuring the database with struct tags is errorprone, and there aren't great checks to make sure the struct tags actually match the schema.

Things that suck right now with sqlc

  • I can't find a good way to have dynamic ORDER BY expressions. So I literally have separate queries for ASC and DESC versions. It's not the end of the world, but it's very frustrating. There's an issue to fix it, and some hacky workarounds using CASE statements, but it's not great.
  • Making the filters play nice with json_populate_recordset is a bit of a pain. Switching to the pgx driver helped, since I think there was a bug in Bun's pgdriver.

Migration plan

We use Bun in a lot of places and for a lot of things today.

  • It powers the authz database and all the migrations there
  • It powers the migrations for the message database (but not the queries)
  • It powers the mls database and all the queries in the mlsstore.

The priority right now is to remove it from the mlsstore. We will still use it for migrations (sqlc can read Bun migrations just fine).

This involves replacing the bun pgdriver with pgx (done in this PR) and replacing all the Bun ORM queries with sqlc queries. I have most of the queries written, but I'll split up the actual migration over several PRs. This can be done incrementally, but once the process is complete we can delete the Bun models.

We aren't using any of the fancy sqlc cloud features and have no plans to.

What knucklehead brought Bun into our codebase?

Ummmm. 😬. That was me.

Copy link
Collaborator Author

neekolas commented Apr 25, 2024

@neekolas neekolas changed the title Initial commit Migrate MLS DB To SQLC Apr 25, 2024
ORDER BY sequence_id ASC FOR UPDATE;

-- name: GetInboxLogFiltered :many
SELECT a.* FROM inbox_log AS a
Copy link
Collaborator Author

@neekolas neekolas Apr 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably the fanciest query we have.

We need to take in an array of (inbox_id, sequence_id) pairs and get the results of each inbox that are greater than the sequence_id. We are currently doing this in one query per inbox, but this change makes it all happen in one query total by joining against a virtual table.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null::inbox_filter is a custom SQL data type (not a table) that we decode the JSON serialized filters into

@@ -0,0 +1,472 @@
// Code generated by sqlc. DO NOT EDIT.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is entirely generated based the the queries.sql

@neekolas neekolas marked this pull request as ready for review April 25, 2024 15:55
@neekolas neekolas requested a review from a team April 25, 2024 16:02
@insipx insipx mentioned this pull request Apr 25, 2024
Copy link

@37ng 37ng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Configuring the database with struct tags is errorprone, and there aren't great checks to make sure the struct tags actually match the schema.

How does sqlc manage db schema? From pkg/mls/store/queries/models.go I see that it's generating go structs based on migration history?

Copy link
Collaborator Author

neekolas commented Apr 25, 2024

Merge activity

  • Apr 25, 3:27 PM EDT: @neekolas started a stack merge that includes this pull request via Graphite.
  • Apr 25, 3:28 PM EDT: @neekolas merged this pull request with Graphite.

@neekolas neekolas merged commit ab5ea92 into main Apr 25, 2024
4 checks passed
@neekolas neekolas deleted the nm/sqlc-experiment branch April 25, 2024 19:28
neekolas added a commit that referenced this pull request Apr 25, 2024
Bun has been bothering me for a while, so I've been working on a migration that will get rid of our ORM altogether and just use boring SQL queries for everything.

[`sqlc`](https://sqlc.dev/) is a very slick tool to generate code based on plain SQL queries using placeholders for arguments. It's not perfect...I had to do some gymnastics to make a few of the query types work.

But the fact that there is no runtime other than the standard SQL driver and some generated code outweighs its limitations IMO. There's no fancy ORM library to worry about mangling your queries, and the learning curve is basically just "how well do you know SQL".

- No support for serializable transactions
- The SQL driver is not as well maintained as PGX
- High learning curve to build complex queries, even if you know SQL well
- Relations system is not very powerful and ends up doing N+1 queries a lot of the time.
- Configuring the database with struct tags is errorprone, and there aren't great checks to make sure the struct tags actually match the schema.

- I can't find a good way to have dynamic ORDER BY expressions. So I literally have separate queries for ASC and DESC versions. It's not the end of the world, but it's very frustrating. There's an [issue to fix it](sqlc-dev/sqlc#2061), and some hacky workarounds using CASE statements, but it's not great.
- Making the filters play nice with `json_populate_recordset` is a bit of a pain. Switching to the `pgx` driver helped, since I think there was a bug in Bun's pgdriver.

We use Bun in a lot of places and for a lot of things today.

- It powers the `authz` database and all the migrations there
- It powers the migrations for the `message` database (but not the queries)
- It powers the `mls` database and all the queries in the `mlsstore`.

The priority right now is to remove it from the `mlsstore`. We will still use it for migrations (`sqlc` can read Bun migrations just fine).

This involves replacing the bun `pgdriver` with `pgx` (done in this PR) and replacing all the Bun ORM queries with `sqlc` queries. I have most of the queries written, but I'll split up the actual migration over several PRs. This can be done incrementally, but once the process is complete we can delete the Bun models.

We aren't using any of the fancy `sqlc` cloud features and have no plans to.

Ummmm. 😬. That was me.
neekolas added a commit that referenced this pull request Apr 25, 2024
…382)

* Initial commit

* Set up sqlc

* Initial commit

* Set up sqlc

* Migrate first store methods to sqlc

* Migrate first store methods to sqlc

* Migrate first store methods to sqlc

* Add GetAddressLog SQL

* Add GetAddressLog SQL

* Merge branch '04-24-migrate_first_store_methods_to_sqlc' of github.com:xmtp/xmtp-node-go into 04-24-migrate_first_store_methods_to_sqlc

* Add GetAddressLog SQL

* Initial commit

* Set up sqlc

* Initial commit

* Set up sqlc

* Add InsertLog Query (#383)

* Add insertlog query

* validation service

* insert log

* revocation for removed members

* lint

* remove unnecessary log

* change test to use query from sqlc

* remove comments

* fix tests

* Migrate first store methods to sqlc

* Migrate MLS DB To SQLC (#380)

Bun has been bothering me for a while, so I've been working on a migration that will get rid of our ORM altogether and just use boring SQL queries for everything.

[`sqlc`](https://sqlc.dev/) is a very slick tool to generate code based on plain SQL queries using placeholders for arguments. It's not perfect...I had to do some gymnastics to make a few of the query types work.

But the fact that there is no runtime other than the standard SQL driver and some generated code outweighs its limitations IMO. There's no fancy ORM library to worry about mangling your queries, and the learning curve is basically just "how well do you know SQL".

- No support for serializable transactions
- The SQL driver is not as well maintained as PGX
- High learning curve to build complex queries, even if you know SQL well
- Relations system is not very powerful and ends up doing N+1 queries a lot of the time.
- Configuring the database with struct tags is errorprone, and there aren't great checks to make sure the struct tags actually match the schema.

- I can't find a good way to have dynamic ORDER BY expressions. So I literally have separate queries for ASC and DESC versions. It's not the end of the world, but it's very frustrating. There's an [issue to fix it](sqlc-dev/sqlc#2061), and some hacky workarounds using CASE statements, but it's not great.
- Making the filters play nice with `json_populate_recordset` is a bit of a pain. Switching to the `pgx` driver helped, since I think there was a bug in Bun's pgdriver.

We use Bun in a lot of places and for a lot of things today.

- It powers the `authz` database and all the migrations there
- It powers the migrations for the `message` database (but not the queries)
- It powers the `mls` database and all the queries in the `mlsstore`.

The priority right now is to remove it from the `mlsstore`. We will still use it for migrations (`sqlc` can read Bun migrations just fine).

This involves replacing the bun `pgdriver` with `pgx` (done in this PR) and replacing all the Bun ORM queries with `sqlc` queries. I have most of the queries written, but I'll split up the actual migration over several PRs. This can be done incrementally, but once the process is complete we can delete the Bun models.

We aren't using any of the fancy `sqlc` cloud features and have no plans to.

Ummmm. 😬. That was me.

* Migrate first store methods to sqlc

* Add InsertLog Query (#383)

* Add insertlog query

* validation service

* insert log

* revocation for removed members

* lint

* remove unnecessary log

* change test to use query from sqlc

* remove comments

* fix tests

* Add GetAddressLog SQL

* Merge branch '04-24-migrate_first_store_methods_to_sqlc' of github.com:xmtp/xmtp-node-go into 04-24-migrate_first_store_methods_to_sqlc

* Add InsertLog Query (#383)

* Add insertlog query

* validation service

* insert log

* revocation for removed members

* lint

* remove unnecessary log

* change test to use query from sqlc

* remove comments

* fix tests

* Add InsertLog Query (#383)

* Add insertlog query

* validation service

* insert log

* revocation for removed members

* lint

* remove unnecessary log

* change test to use query from sqlc

* remove comments

* fix tests

---------

Co-authored-by: Andrew Plaza <[email protected]>
@neekolas
Copy link
Collaborator Author

How does sqlc manage db schema? From pkg/mls/store/queries/models.go I see that it's generating go structs based on migration history?

That's right. You just point it at your migrations folder and it reads all of them to assemble the schema. You can also point it at a live database if you want, but we don't have that configured since the migration folder seems to work just fine.

@37ng
Copy link

37ng commented Apr 26, 2024

That's right. You just point it at your migrations folder and it reads all of them to assemble the schema. You can also point it at a live database if you want, but we don't have that configured since the migration folder seems to work just fine.

@neekolas So it's reversed Bun? Bun generates schema from defined go structs/tags.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants