Popularity filtering #1141

Merged
merged 19 commits into from
Aug 16, 2021

palfrey
Collaborator

@palfrey palfrey commented Aug 10, 2021

In line with some of the stuff I'd talked about in #981, and with the explicit objective of making this list easier to maintain going forward, I've implemented a popularity filter for the list. The first time it sees an entry, it tries to get the GitHub stars and Cargo downloads for that entry. If they exceed the documented thresholds (50 and 2000 respectively), it stores that result so we never have to check that project's numbers again (they almost never go down, or at least not by much).
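
A minimal sketch of the check described above, not the code from this PR. The names `Popularity`, `PopularityCache`, and `check` are hypothetical, the network lookups are replaced by a caller-supplied closure, and two details are assumptions: that clearing either threshold is enough, and that "exceed" means "greater than or equal to".

```rust
use std::collections::HashMap;

// Thresholds quoted in the description above.
const MIN_GITHUB_STARS: u64 = 50;
const MIN_CARGO_DOWNLOADS: u64 = 2000;

/// Popularity numbers for one entry; either source may be missing.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Popularity {
    github_stars: Option<u64>,
    cargo_downloads: Option<u64>,
}

/// Assumption: clearing either threshold is enough, and "exceed" is
/// read as "greater than or equal to".
fn is_popular(p: &Popularity) -> bool {
    p.github_stars.map_or(false, |s| s >= MIN_GITHUB_STARS)
        || p.cargo_downloads.map_or(false, |d| d >= MIN_CARGO_DOWNLOADS)
}

/// Hypothetical cache of entries that have already passed, keyed by URL.
#[derive(Default)]
struct PopularityCache {
    passed: HashMap<String, Popularity>,
}

impl PopularityCache {
    /// Returns true if `url` passes. `fetch` stands in for the real
    /// network lookups and is only called for entries not yet cached.
    fn check<F>(&mut self, url: &str, fetch: F) -> bool
    where
        F: FnOnce(&str) -> Popularity,
    {
        if self.passed.contains_key(url) {
            return true; // already known to be popular, skip the lookup
        }
        let numbers = fetch(url);
        if is_popular(&numbers) {
            self.passed.insert(url.to_string(), numbers);
            true
        } else {
            false // failures are not cached, so they are rechecked next run
        }
    }
}
```

Only passing results are cached in this sketch, which matches the reasoning that the numbers rarely go down: a project that fails today can still pass on a later run.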

I've removed all the projects that didn't meet this threshold (and added a few cargo links to help some that should have been allowed). There's also a POPULARITY_OVERRIDES list at the top of main.rs that can be used to add projects that are that popular but that the current detection mechanisms don't pick up.
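
As a rough illustration of how an override list like this can work, here is a sketch. The type and contents of the real `POPULARITY_OVERRIDES` in main.rs are not shown in this thread, so the string-slice type, the example URL, and the helpers `is_overridden` and `passes_filter` are all assumptions.

```rust
/// Hypothetical contents; the real list lives at the top of main.rs.
const POPULARITY_OVERRIDES: &[&str] = &[
    "https://example.com/popular-but-undetected",
];

/// True if the entry is on the manual override list.
fn is_overridden(url: &str) -> bool {
    POPULARITY_OVERRIDES.iter().any(|o| *o == url)
}

/// An entry is accepted if it is overridden or if the automatic
/// detection (stars or downloads) already found it popular.
fn passes_filter(url: &str, detected_popular: bool) -> bool {
    is_overridden(url) || detected_popular
}
```

Checking the override first means an overridden entry never needs a network lookup at all.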

If merged, this would let us reject probably about half of the current PRs, as they are below those thresholds. Once people fix the other ones, it should also let us decide fairly rapidly whether a project should be added, which should massively speed up future PR review.

@kud1ing @luciusmagn Thoughts?

@palfrey palfrey marked this pull request as draft August 10, 2021 22:56
@palfrey palfrey marked this pull request as ready for review August 11, 2021 06:36
@kud1ing
Contributor

kud1ing commented Aug 11, 2021

I think automation is vital to reduce the maintainers' burden.
I also think it's very important that the content stays high quality. This means saying no to additions and removing entries. This will always be subjective and controversial. See e.g. #738 (comment)

Your attempt seems like a good start. After only a cursory browsing, some random thoughts:

  • maybe be more generous with some underrepresented and/or educational niches like games
  • maybe park the removed entries somewhere so that the information is not lost.
  • removed entries are likely to be added again by someone in the future. Maybe check the PRs with the parking lot of previously removed entries.
  • maybe automatically remove the foo/bar namespace so that it is only bar for very popular entries.

Thank you for keeping this alive.

@palfrey
Collaborator Author

palfrey commented Aug 12, 2021

> I also think it's very important that the content stays high quality. This means saying no to additions and removing entries. This will always be subjective and controversial. See e.g. #738 (comment)

In a way, the removals are easier because we've already got a well-defined line on them (and low-quality removals are less of a problem) and even if that line is controversial, it's fairly clear, which is good from a "maintainer time" perspective.

> * maybe be more generous with some underrepresented and/or educational niches like games

Agreed. There were definitely some low-quality ones there, but also two projects with 40+ stars that got removed. Obviously there are always going to be projects that just miss whatever bar we set, but in this particular case it's worth adding some lowered thresholds for some areas. I'd say that overall this shouldn't be done for most things, and if an area gets its numbers up then the bar should be reset, but I'll do this for games and see if there are any other notable areas with multiple "almost good enough" items.
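
One way the lowered per-area thresholds could be expressed, as a sketch only. The `Thresholds` struct, the `thresholds_for` and `passes` helpers, the section name "Games", and the lowered star count of 40 are illustrative assumptions, not values decided in this thread. As before, clearing either threshold is assumed to be enough.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Thresholds {
    min_stars: u64,
    min_downloads: u64,
}

// Default thresholds from the PR description.
const DEFAULT_THRESHOLDS: Thresholds = Thresholds {
    min_stars: 50,
    min_downloads: 2000,
};

/// Hypothetical per-section overrides; the lowered value is illustrative.
fn thresholds_for(section: &str) -> Thresholds {
    match section {
        "Games" => Thresholds { min_stars: 40, min_downloads: 2000 },
        _ => DEFAULT_THRESHOLDS,
    }
}

/// Assumption: clearing either threshold for the section is enough.
fn passes(section: &str, stars: u64, downloads: u64) -> bool {
    let t = thresholds_for(section);
    stars >= t.min_stars || downloads >= t.min_downloads
}
```

Keeping the exceptions in one `match` makes it easy to reset a section to the default later by deleting its arm.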

> * maybe park the removed entries somewhere so that the information is not lost.
> * removed entries are likely to be added again by someone in the future. Maybe check the PRs with the parking lot of previously removed entries.

I'm not sure about this one. So, firstly if they get re-added and still fail, then they should get rapidly closed, so I'm less concerned about that. I'm also wondering where to sensibly park them? The option I can think of is some sort of extra file in the repo, and that almost seems mean-spirited ("here's a list of projects that aren't good enough" kinda thing). If you've got some other idea that aids discovery I'd love to hear it.

> * maybe automatically remove the `foo/bar` namespace so that it is only `bar` for very popular entries.

Worth looking into, but more of a "fix/unify display of items" issue (which I might get to eventually)

> Thank you for keeping this alive.

I like this list, and I'm hoping that adding in things like this will reduce the maintenance burden and so reduce the odds of me burning out again. Maybe we can even get some new maintainers eventually :)

@palfrey
Collaborator Author

palfrey commented Aug 14, 2021

In the absence of other feedback, I'm going to merge this on Monday 16th (i.e. in two days' time) and then start closing out PRs that fail the criteria.

@palfrey palfrey merged commit 1166032 into rust-unofficial:master Aug 16, 2021
@palfrey palfrey deleted the popularity branch August 16, 2021 18:59
@palfrey palfrey mentioned this pull request Jan 23, 2022
@palfrey palfrey mentioned this pull request Apr 16, 2022

3 participants