🇫🇷 Large scale verb with frequencies ingestion feature #7

Open
1 of 4 tasks
beverage opened this issue Jun 25, 2024 · 0 comments
Labels
language feature 🇫🇷 Language-specific features

beverage commented Jun 25, 2024

Right now only a subset of interesting verbs (auxiliaries, common irregulars, and a selection of pronominal verbs that also have non-pronominal usages) is hard-coded for ingestion. Given a dump of all French words and their frequencies (we have been provided frequencies in books and in movies), it should be possible to build a frequency-weighted list of infinitives and so make question generation more useful for the user.

  • Extract the list of verb infinitives and their frequencies
  • Update the verb schema to include usage frequencies
  • Tune a combination of the book and movie frequencies to suit the user (how is TBD)
  • Incorporate the frequencies into the verb ingestion process
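
The tasks above could be sketched roughly as follows. This is only an illustration under stated assumptions: the layout of the frequency dump is not described in this issue, so the tab-separated format, the column names (`lemma`, `pos`, `freq_books`, `freq_films`), the `VER` part-of-speech tag, and the `alpha` blending parameter are all hypothetical, as are the names `VerbFrequency`, `extract_infinitives`, and `pick_verbs`. A linear blend is just one possible way to combine the two frequencies, since the tuning method is still to be decided.

```python
import csv
import io
import random
from dataclasses import dataclass

# Hypothetical sample of the dump: one row per word form, tab-separated.
SAMPLE = (
    "word\tlemma\tpos\tfreq_books\tfreq_films\n"
    "suis\têtre\tVER\t100.0\t200.0\n"
    "être\têtre\tVER\t50.0\t50.0\n"
    "maison\tmaison\tNOM\t80.0\t90.0\n"
    "mange\tmanger\tVER\t10.0\t30.0\n"
)


@dataclass(frozen=True)
class VerbFrequency:
    """An infinitive with its summed book and film frequencies."""

    infinitive: str
    freq_books: float
    freq_films: float

    def weight(self, alpha: float = 0.5) -> float:
        # Linear blend: alpha=1.0 uses books only, alpha=0.0 films only.
        return alpha * self.freq_books + (1.0 - alpha) * self.freq_films


def extract_infinitives(tsv_text: str) -> list[VerbFrequency]:
    """Keep verb rows only and sum frequencies of all forms per infinitive."""
    totals: dict[str, list[float]] = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if row["pos"] != "VER":  # assumed tag for verbs
            continue
        entry = totals.setdefault(row["lemma"], [0.0, 0.0])
        entry[0] += float(row["freq_books"])
        entry[1] += float(row["freq_films"])
    return [
        VerbFrequency(lemma, books, films)
        for lemma, (books, films) in totals.items()
    ]


def pick_verbs(verbs, k, alpha=0.5, rng=None):
    """Draw k infinitives (with replacement), weighted by blended frequency."""
    rng = rng or random.Random()
    weights = [v.weight(alpha) for v in verbs]
    return [v.infinitive for v in rng.choices(verbs, weights=weights, k=k)]


if __name__ == "__main__":
    verbs = extract_infinitives(SAMPLE)
    for v in verbs:
        print(v.infinitive, v.weight())
    print(pick_verbs(verbs, k=5))
```

Storing the two raw frequencies on the verb record, rather than a single precomputed weight, would let the blend be retuned later without re-ingesting the dump.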

Note that this is not a blocker for sentence generation: we can continue to use the hard-coded lists for that. Every verb on those lists is important, so selecting from them without frequency weighting loses nothing.

@beverage beverage added the enhancement New feature or request label Jun 25, 2024
@beverage beverage changed the title 💪 Large scale verb ingestion feature 💪 Large scale verb with frequencies ingestion feature Jun 25, 2024
@beverage beverage changed the title 💪 Large scale verb with frequencies ingestion feature 🇫🇷 Large scale verb with frequencies ingestion feature Jun 25, 2024
@beverage beverage added language feature 🇫🇷 Language-specific features and removed enhancement New feature or request labels Jul 9, 2024