This is an umbrella for many issues related to adding a Python VM to the database. This will require work in the API, the CLI, and the internals. There should be an easy way for users to define Python-based plugins that run inside the database and can receive data, process it, interact with third-party APIs and services, and send data back into the database. Ideally, the runtime would be able to import libraries from the broader Python ecosystem and work with them.
This issue is by no means exhaustive, but can serve as a jumping off point for further refinement and detail.
Here are the contexts under which we'd want plugins to run (a rough sketch of how a plugin might register for these triggers follows the list):
On write (or rather, on WAL flush, send data to the VM)
On Parquet persist (when persistence is triggered, we'll want to either persist and then run the persisted data through the plugin, or run the data through the plugin and persist the output)
On a schedule (like cron)
Ad-hoc (submitted like a query)
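As a rough illustration, a plugin might declare which of these triggers it handles through decorators or named entry points. Everything below (the `influxdb3_plugin` module, the decorator names, and the argument shapes) is hypothetical, sketched only to make the trigger model concrete:

```python
# Hypothetical plugin API: module name, decorators, and argument
# shapes are illustrative only, not a committed interface.
import influxdb3_plugin as plugin

@plugin.on_wal_flush
def handle_writes(db, table_batches):
    # Called with the batches of rows from the latest WAL flush.
    for batch in table_batches:
        ...

@plugin.on_parquet_persist
def handle_persist(db, parquet_files):
    # Called around Parquet persistence (before or after, per the options above).
    ...

@plugin.on_schedule("*/5 * * * *")  # cron-style schedule
def every_five_minutes(db):
    ...

@plugin.on_request
def ad_hoc(db, params):
    # Invoked ad hoc, like a query submitted through the API.
    ...
```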
Within the plugin context, the Python script should have an API automatically available that lets it query the database or write data back into it. We'll also want an in-memory key/value store accessible by the script; each script should have its own sandboxed store.
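Inside a trigger function, that automatically available API might look roughly like this. The `db` handle, its `query`/`write` methods, and the per-plugin `cache` store are all assumptions for illustration, not a committed interface:

```python
from datetime import datetime, timezone

import influxdb3_plugin as plugin  # hypothetical module, as above

@plugin.on_schedule("0 * * * *")
def hourly_rollup(db):
    # Hypothetical query API exposed to the script.
    rows = db.query(
        "SELECT host, max(usage) AS peak FROM cpu "
        "WHERE time > now() - INTERVAL '1 hour' GROUP BY host"
    )

    # Hypothetical per-plugin, sandboxed in-memory key/value store.
    db.cache.put("last_run", datetime.now(timezone.utc).isoformat())

    # Hypothetical write API for sending data back into the database.
    for row in rows:
        db.write("cpu_hourly", tags={"host": row["host"]}, fields={"peak": row["peak"]})
```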
We'll need an API and CLI for submitting new Python scripts to the DB. We'll also want to collect logs from the scripts and make them accessible via system tables in the query API.
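One possible shape for script submission is an HTTP endpoint with a thin CLI wrapper over it. The endpoint path and payload below are placeholders to show the flow, not a proposed design:

```python
import requests

# Read the plugin source and register it with the database.
with open("hourly_rollup.py") as f:
    source = f.read()

resp = requests.post(
    "http://localhost:8181/api/v3/plugins",  # placeholder endpoint
    json={"name": "hourly_rollup", "trigger": "schedule", "source": source},
)
resp.raise_for_status()
```

Logs emitted by the script could then surface through the query API via a system table (something along the lines of `system.plugin_logs`).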
We'll need a method for storing secrets and accessing them from plugins when connecting to external services.
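That could be as simple as a read-only lookup available in the plugin context; the `db.secrets` name below is hypothetical:

```python
import influxdb3_plugin as plugin  # hypothetical module, as above

@plugin.on_wal_flush
def forward_writes(db, table_batches):
    # Hypothetical read-only secrets lookup, populated via the API/CLI.
    token = db.secrets.get("KAFKA_PASSWORD")
    ...
```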
We ultimately want these scripts to be user-defined plugins or functions. We'll run a service (like crates.io) for hosting these and will want a method to quickly and easily bring plugins or functions from that service into the database.
Some plugin ideas:
Consume data from Kafka to write into the DB
Collect system metrics to write into the DB
Monitor and alert for specific conditions (threshold values, deadman alerts, etc)
Write to an Iceberg Catalog (i.e. on Parquet persist)
We should create these plugins ourselves to test drive the developer experience of plugin creation, operation, and debugging.
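As a concrete starting point, the monitoring/alerting idea above might look roughly like this, again using the hypothetical plugin API and secrets lookup sketched earlier:

```python
import requests

import influxdb3_plugin as plugin  # hypothetical module, as above

CPU_THRESHOLD = 90.0  # example threshold for illustration

@plugin.on_wal_flush
def cpu_threshold_alert(db, table_batches):
    # Check incoming writes against a threshold and notify on breach.
    for batch in table_batches:
        if batch.table != "cpu":  # hypothetical batch shape
            continue
        for row in batch.rows:
            if row["usage"] > CPU_THRESHOLD:
                webhook = db.secrets.get("ALERT_WEBHOOK_URL")
                requests.post(webhook, json={"host": row["host"], "usage": row["usage"]})
                # Record the alert back into the database as well.
                db.write(
                    "alerts",
                    tags={"host": row["host"], "rule": "cpu_threshold"},
                    fields={"usage": row["usage"]},
                )
```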