Rework Replies #20

right-mouse · 2023-03-04T20:49:23Z

Reworks the "reply" portion of eelbot so that it uses the database for selecting a reply from. Replies are also more dynamic in nature and support matching using regex.

boarnoah · 2023-07-01T21:05:11Z

This PR proposes the following IIUC,

A reply is defined as a regex pattern + some config (the prompt), along with a list of replies (some prompts might only have one reply but others might have lots) that Eelbot will reply with once there is a match.

The way this data is organized in the database is a unconventional. Conventional thinking to me suggests organizing the data as follows

A prompts table, which stores regex pattern(s) & the associated config
A replies table with a FK to the prompt table, by which we can select a reply that is suitable for a given prompt

This design is:

Prompts are stored in config, the pattern & the associated config
Prompts point to replies.<reply-type> tables, so in general there would be N reply tables for N prompts (you could have multiple prompts point to the same reply table)

There isn't any obvious examples of people doing this in relational DBs, and while I personally do feel a more conventional design would have suited our purpose, it will be interesting to investigate this further, to see

if it has been done before
if there are advantages to this approach when dealing with large data sets

So a bunch of thoughts & follow up work (not necessarily needed for this PR to merge) in no particular order

Some of these questions / points are better thought of in the context of "how would something similar to this behave with a very very large dataset, the typical: 'How would you build twitter?' realm of questions"

this looks a lot like how document databases typically do horizontal scaling by implementing the concept of a sharding with partition key
- not similar to a relational database's concept of partition, which is generally used to split a homogenous dataset by time etc... into separate tables under the hood.
- I can't find too many examples of a design like this with relational databases, but would like to gather them to see any pros/cons, closest I found was: https://news.ycombinator.com/item?id=28776786, there is plenty of material about read replicas for high availability for relational DBs out there, this is more complex than that
minor point, wonder if this will cause DB tools like PGAdmin etc..., to keel over quite soon since the number of tables in the schema grows with the database, in fairness I suspect if this kind of design was done at a large scale you are likely using bespoke tools to interact with it anyway
as you mentioned this makes maintaining a gapless sequence much easier
- in a typical one large table design, there is IIUC always a hard limit with trying to maintain a gapless sequence & scaling up for multiple writers, since with this design a typical write interacts with a much smaller table, this is not an issue
- in fairness to Postgres, https://www.postgresql.org/files/developer/concurrency.pdf as I understand it typical works at a row level, which means in most cases you can already do concurrent writes, you would need to sacrifice having a gapless sequence however. Curious if that is a worthy sacrifice in most situations
with our current design this is built around all the data being organized into tables that share the same table design, this does not require it
- similar to typical document databases, this means we could if we want organize data differently by generation or type if desired
- Wonder if this could be an advantage with large databases compared to a more conventional design. Where your time for migrations that mutate existing data (ex: splitting data off from a table etc...) increases with the amount of data you have (presumably linearly)

boarnoah · 2023-07-01T21:07:30Z

configs/eelbot/config.yaml

+  # Add custom replies by creating a table under the `reply` schema in the database first, then adding an entry here to
+  # reference it. Check the README for more details.
+  - table_name: "myreply"
+    regexps:
+    - "a{3,6}"
+    percent: 33
+    min_delay: 3
+    max_delay: 6
+    timeout: 600


If we are going to do this, IMO this needs to live in the DB not in config, it doesn't fully achieve the goal of being able to extend replies to -> easily / with eelbot-ui this way.

If you do this lets see if we can avoid having so many long type switches for all the numerical types in golang :P

The reason I did this is because the config is loaded once at startup whereas the DB tables can be changed at runtime (replies added/removed etc.). I felt like putting the config in the DB could imply that this too could be changed on the fly. I'll move the config to the DB now, and in a future MR perhaps, I think hot reloading of the config can be handled in the app by watching for changes (not sure how easy it is in Postgres).

(not sure how easy it is in Postgres).

Not difficult IIUC. But I figure what we will more likely want to consider is having eelbot-ui signal to eelbot about changes rather than a DB level event.

I felt like putting the config in the DB could imply that this too could be changed on the fly

Yep, suspected this was the reason. But I do think we should build this around the hopefully realistic future case where eelbot-ui is going to be used to enter in a lot of new reply types without having to re-deploy eelbot often.

right-mouse · 2024-04-19T20:31:07Z

@CodiumAI-Agent /review

CodiumAI-Agent · 2024-04-19T20:32:07Z

PR Review

⏱️ Estimated effort to review [1-5]	3, because the PR involves changes across multiple files with modifications that include database interactions, regex operations, and handling of different data types which require careful review to ensure correctness and efficiency.
🧪 Relevant tests	No
🔍 Possible issues	Possible Bug: The conversion of all numeric types to `float64` in the `toStrKeys` function might lead to precision issues or unexpected behavior when dealing with large integers.
🔍 Possible issues	Performance Concern: The function `getDuration` in `commands/command.go` could be optimized by using a type switch instead of multiple if-else conditions.
🔒 Security concerns	No

Code feedback:

relevant file	cmd/eelbot/main.go
suggestion	Consider using a type switch in the `toStrKeys` function to handle different numeric types more cleanly and efficiently. This approach reduces the redundancy of multiple case statements and improves readability. [important]
relevant line	case int:

relevant file	commands/command.go
suggestion	Optimize the `getDuration` function by implementing a type switch to handle different types of numeric inputs. This will make the code cleaner and potentially more performant by reducing the number of type assertions. [important]
relevant line	case int:

relevant file	cmd/eelbot/main.go
suggestion	Ensure that the conversion of numeric types to `float64` does not cause precision issues, especially with large integers. Consider adding a check or a different handling mechanism for very large numbers. [medium]
relevant line	case uint64:

relevant file	commands/command.go
suggestion	Refactor the repeated code in `getDuration` by using a helper function or a loop to handle the conversion of different integer types to `time.Duration`, reducing code duplication. [medium]
relevant line	case int:

✨ Review tool usage guide:

Overview:
The review tool scans the PR code changes, and generates a PR review which includes several types of feedbacks, such as possible PR issues, security threats and relevant test in the PR. More feedbacks can be added by configuring the tool.

The tool can be triggered automatically every time a new PR is opened, or can be invoked manually by commenting on any PR.

When commenting, to edit configurations related to the review tool (pr_reviewer section), use the following template:

/review --pr_reviewer.some_config1=... --pr_reviewer.some_config2=...

With a configuration file, use the following template:

[pr_reviewer]
some_config1=...
some_config2=...

See the review usage page for a comprehensive guide on using this tool.

Rework replies

bc524bc

right-mouse requested review from boarnoah, SirRam and aashkir March 4, 2023 20:49

Upgrade to golangci-lint v1.51.1

543fd1a

SirRam approved these changes Mar 13, 2023

View reviewed changes

boarnoah reviewed Jul 1, 2023

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rework Replies #20

Rework Replies #20

right-mouse commented Mar 4, 2023

boarnoah commented Jul 1, 2023

boarnoah Jul 1, 2023 •

edited

Loading

right-mouse Jul 1, 2023

boarnoah Jul 1, 2023

right-mouse commented Apr 19, 2024

CodiumAI-Agent commented Apr 19, 2024

Rework Replies #20

Are you sure you want to change the base?

Rework Replies #20

Conversation

right-mouse commented Mar 4, 2023

boarnoah commented Jul 1, 2023

boarnoah Jul 1, 2023 • edited Loading

Choose a reason for hiding this comment

right-mouse Jul 1, 2023

Choose a reason for hiding this comment

boarnoah Jul 1, 2023

Choose a reason for hiding this comment

right-mouse commented Apr 19, 2024

CodiumAI-Agent commented Apr 19, 2024

PR Review

boarnoah Jul 1, 2023 •

edited

Loading