Feature request: goose seed #725

Open
mfridman opened this issue Mar 19, 2024 · 8 comments

@mfridman
Collaborator

Opening this as more of a discussion on whether it's a good (or not) idea to have a goose seed command.

Quite often I find myself crafting the same queries over and over to populate a database with arbitrary data. Since goose knows the entire schema, I wonder if it's possible to automagically populate a database from the given schema.

... this idea needs to be fleshed out a bit more, but I'm using this issue as a placeholder.

@mfridman changed the title from "feat: goose seed" to "Feature request: goose seed" on Mar 19, 2024
@rquadling

I've been using Phinx for about a decade and the seeding capability / concept is pretty much essential.

One of the issues I had with Phinx is that the built-in DBAL is not as expressive as we want. So I added the template system to Phinx to allow writing your own SQL. But I don't need the DBAL, as I only use one RDBMS per project. There are pros and cons to using or not using a DBAL.

I've not used Goose yet (my team does, so I'll be getting there eventually), but I feel that having seeds is important.

Pro: One set of credentials/config for incrementing (migrations) and setting (seeds) the content in a DB (actually, if you extract the DB aspect, it can be for ANY store - migrations are NOT just for RDBMS!).
Con: Keeping seeds up to date with a changing schema can be a brand new pain to be endured by the developers.

@mfridman
Collaborator Author

Fwiw goose does support "seeding" in the sense that you can apply ad-hoc migrations without versioning.

https://pressly.github.io/goose/blog/2021/no-version-migrations/

The idea is that you point goose at a separate directory and run it with --no-versioning, which applies the migrations without recording them in the version table.


Maybe I mislabeled this issue, but it's more about running a single command and having goose somehow figure out a sane set of data to apply to your database, without you having to actually write the logic.

@rquadling

rquadling commented Mar 26, 2024

Seeding, in general, is one of those areas that has as many ideas/issues as any other thing.

It's less about running the seed and more about what the seed should contain and how it is implemented and maintained.

If Goose can handle a directory of seeds, then 1 command to run the seeds after the migrations ... is that good enough?

One area that Phinx (and I think Goose) DOESN'T cover is "repeatables": views, stored procedures, triggers, UDFs, etc. I had to write my own for this. But you're still up against the RDBMS here. Some don't allow any sort of atomic operation on DML, so you temporarily need to drop the trigger to add the replacement. The best option is to NOT use triggers!

You can drop this comment if it is going off track - I tend to go off-track!

@mfridman
Collaborator Author

mfridman commented Apr 26, 2024

Regarding the repeatable bit, there's an issue (#472) tracking this and it's on our radar.

I haven't quite figured out a nice UX for this, so if you have comments/suggestions that's a good place to drop them.

This issue is specific to some command(s) that can be used in testing code to populate the database automagically. Since goose already knows the entire schema and all the types, this would effectively be a bunch of INSERT statements that model the relationships in some semi-clever way.
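
To make the "INSERT statements that model the relationships" idea concrete, here is a minimal Go sketch. It is not goose code: the `Table` type, `insertOrder`, and `seedSQL` are hypothetical names, and the sketch assumes every table has an integer `id` primary key. It orders tables so that referenced tables are populated first, then emits one placeholder row per table.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Table is a hypothetical, simplified schema description. goose does not
// expose anything like this today; it is assumed here for illustration.
type Table struct {
	Name    string
	Columns []string          // non-key columns to fill with placeholder text
	Refs    map[string]string // foreign-key column -> referenced table
}

// insertOrder returns table names so that every referenced table comes
// before the tables that point at it (Kahn's algorithm). It returns an
// error on unknown references and on cycles, including self-references.
func insertOrder(tables []Table) ([]string, error) {
	pending := map[string]int{}       // table -> number of unmet references
	children := map[string][]string{} // referenced table -> dependent tables
	for _, t := range tables {
		pending[t.Name] = 0
	}
	for _, t := range tables {
		for _, parent := range t.Refs {
			if _, ok := pending[parent]; !ok {
				return nil, fmt.Errorf("table %q references unknown table %q", t.Name, parent)
			}
			pending[t.Name]++
			children[parent] = append(children[parent], t.Name)
		}
	}
	var ready []string
	for name, n := range pending {
		if n == 0 {
			ready = append(ready, name)
		}
	}
	sort.Strings(ready) // map iteration is random; sort for stable output
	var order []string
	for len(ready) > 0 {
		name := ready[0]
		ready = ready[1:]
		order = append(order, name)
		for _, child := range children[name] {
			pending[child]--
			if pending[child] == 0 {
				ready = append(ready, child)
			}
		}
	}
	if len(order) != len(tables) {
		return nil, fmt.Errorf("foreign keys form a cycle")
	}
	return order, nil
}

// seedSQL emits one INSERT per table in dependency order. Every row gets
// id 1 (an assumption), so each foreign key can point at id 1 of its parent.
func seedSQL(tables []Table) ([]string, error) {
	order, err := insertOrder(tables)
	if err != nil {
		return nil, err
	}
	byName := map[string]Table{}
	for _, t := range tables {
		byName[t.Name] = t
	}
	var stmts []string
	for _, name := range order {
		t := byName[name]
		cols, vals := []string{"id"}, []string{"1"}
		for _, c := range t.Columns {
			cols = append(cols, c)
			vals = append(vals, fmt.Sprintf("'%s-1'", c))
		}
		var fks []string
		for fk := range t.Refs {
			fks = append(fks, fk)
		}
		sort.Strings(fks)
		for _, fk := range fks {
			cols = append(cols, fk)
			vals = append(vals, "1")
		}
		stmts = append(stmts, fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s);",
			name, strings.Join(cols, ", "), strings.Join(vals, ", ")))
	}
	return stmts, nil
}

func main() {
	tables := []Table{
		{Name: "posts", Columns: []string{"title"}, Refs: map[string]string{"author_id": "users"}},
		{Name: "users", Columns: []string{"name"}},
		{Name: "comments", Columns: []string{"body"}, Refs: map[string]string{"post_id": "posts", "author_id": "users"}},
	}
	stmts, err := seedSQL(tables)
	if err != nil {
		panic(err)
	}
	for _, s := range stmts {
		fmt.Println(s)
	}
}
```

With the example input, the statements come out in the order users, posts, comments. A real implementation would also need to deal with cyclic or self-referencing foreign keys, composite keys, and constraints such as UNIQUE or CHECK, none of which this sketch attempts.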

@stringintech

@mfridman I assume the primary goal of the proposed command is to establish correct relationships between entities in the database. However, I’m curious whether generating realistic dummy data for specific fields (e.g., meaningful strings for a varchar column containing a name for something) is also a consideration, or if the focus is exclusively on the relationships.
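
One way to picture the "realistic dummy data" question is a generator that looks at the column name first and falls back to the column type. The Go sketch below is only an illustration under assumptions: `Column`, `fakeValue`, `newRand`, and the name list are all hypothetical, and the name and type heuristics are deliberately crude.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// Column is a hypothetical description of a single column.
type Column struct {
	Name string
	Type string // SQL type name, e.g. "varchar", "integer"
}

// firstNames is a tiny illustrative pool, not a real dataset.
var firstNames = []string{"Ada", "Grace", "Linus", "Margaret"}

// newRand returns a seeded generator so that seed runs are repeatable.
func newRand(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// fakeValue returns a SQL literal for col. For text columns, hints taken
// from the column name win; otherwise the value depends on the type only.
// Unknown types yield NULL, which assumes the column is nullable.
func fakeValue(r *rand.Rand, col Column) string {
	name := strings.ToLower(col.Name)
	typ := strings.ToLower(col.Type)
	isText := strings.Contains(typ, "char") || typ == "text"
	switch {
	case isText && strings.Contains(name, "email"):
		return fmt.Sprintf("'user%d@example.com'", r.Intn(1000))
	case isText && strings.Contains(name, "name"):
		return "'" + firstNames[r.Intn(len(firstNames))] + "'"
	case isText:
		return fmt.Sprintf("'%s-%d'", name, r.Intn(1000))
	case strings.HasPrefix(typ, "int") || typ == "bigint" || typ == "smallint":
		return fmt.Sprintf("%d", r.Intn(1000))
	case typ == "bool" || typ == "boolean":
		if r.Intn(2) == 0 {
			return "FALSE"
		}
		return "TRUE"
	default:
		return "NULL"
	}
}

func main() {
	r := newRand(42)
	cols := []Column{
		{Name: "first_name", Type: "varchar"},
		{Name: "email", Type: "text"},
		{Name: "age", Type: "integer"},
		{Name: "is_admin", Type: "boolean"},
		{Name: "payload", Type: "jsonb"},
	}
	for _, c := range cols {
		fmt.Printf("%s (%s) -> %s\n", c.Name, c.Type, fakeValue(r, c))
	}
}
```

Seeding the generator keeps test fixtures stable between runs, which matters if the seeded data is used in assertions.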

@mfridman
Collaborator Author

mfridman commented Aug 16, 2024

I think the motivation for opening this issue was to explore whether it's possible to do something similar to https://www.npmjs.com/package/@snaplet/seed, but with Go.

"Automatically seed your database with production-like dummy data based on your schema for local development and testing."

I started exploring this area a bit and ran into this snaplet command after reading "A quick look at Snaplet Seed".

@stringintech

Thanks for the clarification. I’ll take a look at Snaplet Seed, and if I come across any ideas or questions, I’ll be sure to share them here.

@stringintech

stringintech commented Aug 18, 2024

@mfridman I’ve been exploring Snaplet Seed to get a better understanding of how it handles the seeding process. Before the actual seeding, it goes through an initialization phase that includes introspecting the database to understand the schema and then generating TypeScript types, which helps make customizing the seeding process easier.

However, I’m curious about something you mentioned earlier:

"goose already knows the entire schema and all the types"

From what I’ve seen in Goose so far, it doesn’t seem to track the schema state explicitly as it runs migrations. Could you clarify what you meant by Goose knowing the schema? Are you suggesting that Goose can infer or track the schema as it applies migrations? Or do you think we need a separate introspection phase like Snaplet Seed to ensure we accurately understand the schema before seeding?
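
If a separate introspection phase turned out to be necessary, one possible shape is to read the standard `information_schema.columns` view after migrations have run and build an in-memory schema from the rows. The Go sketch below covers only the grouping step and is an assumption-heavy illustration: `columnRow`, `Column`, `Schema`, and `buildSchema` are hypothetical names, the `'public'` schema in the comment assumes PostgreSQL, and the rows are hard-coded rather than read through `database/sql`.

```go
package main

import (
	"fmt"
	"sort"
)

// columnRow mirrors one row of a query such as:
//
//	SELECT table_name, column_name, data_type, is_nullable
//	FROM information_schema.columns
//	WHERE table_schema = 'public'
//	ORDER BY table_name, ordinal_position;
//
// In real use the rows would be scanned from the database.
type columnRow struct {
	Table      string
	Column     string
	DataType   string
	IsNullable string // "YES" or "NO", as reported by information_schema
}

// Column is the introspected view of a single column.
type Column struct {
	Name     string
	Type     string
	Nullable bool
}

// Schema maps a table name to its columns, in the order the rows arrived.
type Schema map[string][]Column

// buildSchema groups flat introspection rows by table.
func buildSchema(rows []columnRow) Schema {
	s := Schema{}
	for _, r := range rows {
		s[r.Table] = append(s[r.Table], Column{
			Name:     r.Column,
			Type:     r.DataType,
			Nullable: r.IsNullable == "YES",
		})
	}
	return s
}

func main() {
	rows := []columnRow{
		{Table: "posts", Column: "id", DataType: "integer", IsNullable: "NO"},
		{Table: "posts", Column: "title", DataType: "text", IsNullable: "YES"},
		{Table: "users", Column: "id", DataType: "integer", IsNullable: "NO"},
	}
	schema := buildSchema(rows)

	// Sort table names because Go map iteration order is random.
	names := make([]string, 0, len(schema))
	for name := range schema {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Println(name)
		for _, c := range schema[name] {
			fmt.Printf("  %s %s nullable=%v\n", c.Name, c.Type, c.Nullable)
		}
	}
}
```

Foreign keys would need a similar pass over the constraint views, which this sketch leaves out.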


3 participants