Extension popularity monitoring #4219

Closed
pchaigno opened this issue Aug 4, 2018 · 38 comments

@pchaigno
Contributor

pchaigno commented Aug 4, 2018

I'm opening this issue to track the popularity of submitted file extensions (pull requests with the label Pending Popularity). I've used Harvester to count the number of repositories for all extensions where it makes sense (i.e., if the number of files is below 200 we already know there won't be hundreds of repositories as required). All numbers were updated today, in the last few hours.

I'll update the numbers in a few months and will probably stop counting for extensions whose number of repositories has stagnated or decreased.

Extension            PR     Files  Repo.  Date            +/- as of Feb. 2021
.mxl                 #3651  106    -      Aug. 4, 2018    +0 files
.exw                 #3754  618    41     Aug. 4, 2018    +8 repo.
.exu                 #3754  54     -      Aug. 4, 2018    +4 files
.sarl                #3772  252    49     Aug. 4, 2018    +5 repo.
.hsig                #3855  459    64     Aug. 4, 2018    +21 repo.
.imba                #3869  611    58     Aug. 4, 2018    +57 repo.
.coco                #3872  641    79     Aug. 4, 2018    +11 repo.
.pbf                 #3926  62     -      Aug. 4, 2018    +0 files
.pbp                 #3926  133    -      Aug. 4, 2018    +11 files
.smk                 #3953  1k     157    Aug. 4, 2018    +76 repo.
.zig                 #4005  904    80     Aug. 4, 2018    +144 repo.
.rho                 #4071  222    19     Aug. 4, 2018    +33 repo.
.wren                #4088  112    -      Aug. 4, 2018    +110 files, 55 repo.
.archimate           #4128  581    258    Aug. 4, 2018    +49 repo.
.varlink             #4164  43     -      Aug. 4, 2018    +5 files
.pq                  #4191  701    147    Aug. 4, 2018    +24 repo.
.pqm                 #4191  51     -      Aug. 4, 2018    +26 files
.m                   #4191  84     -      Aug. 4, 2018    -19 files
.asx                 #4193  706    122    Aug. 4, 2018    +5 repo.
.jsonnet             #2653  7k     93     Oct. 3, 2018    +18 repo.
.htmlx               #4323  5      -      Nov. 14, 2018   +0 files
.rego                #4371  296    80     Jan. 20, 2019   -
cabal-ghcjs.project  #4419  71     70     Mar. 3, 2019    -
.asddls              #4614  804    109    Aug. 31, 2019   +140 repo.
.daml                #4523  155    37     Aug. 31, 2019   +122 repo.
.aplf                #4526  2709   33     Aug. 31, 2019   +16 repo.
.carp                #4530  362    75     Aug. 31, 2019   +466 files, -4 repo. *
.scilla              #4635  559    61     Sept. 15, 2019  +21 repo.
.raku                #4731  74     29     Jan. 7, 2020    -
.rakumod             #5168  200    9      Jan. 7, 2020    +259 repo.
.curry               #5111  4437   56     Jan. 6, 2021    +403 files, -18 repo. *
.ispc                #5191  6790   142    Feb. 9, 2021    -
.isph                #5191  3074   158    Feb. 9, 2021    -
.hla                 #5194  12618  71     Feb. 9, 2021    -

Please avoid discussing issues with specific extensions (such as an improved search query) here and prefer the associated pull request.

* GitHub's search indexing changed at the end of 2020 to only index repositories active in the last year. A decrease in repositories indicates less activity and a potential decrease in overall popularity.
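
The pre-filter mentioned at the top of this comment (skip the repository count when an extension has fewer than 200 files) can be sketched as follows. This is only an illustration, not Harvester itself: the `Candidate` type, the `needs_repo_count` helper and the `MIN_FILES` name are hypothetical, and only the 200-file cut-off and the sample file counts come from the table above.

```python
from dataclasses import dataclass

# Cut-off quoted in the issue: with fewer than 200 files there
# cannot be hundreds of repositories using the extension.
MIN_FILES = 200


@dataclass
class Candidate:
    extension: str
    files: int


def needs_repo_count(candidate: Candidate, min_files: int = MIN_FILES) -> bool:
    """Return True when a per-repository count is worth running."""
    return candidate.files >= min_files


# A few rows copied from the table (extension, file count).
candidates = [
    Candidate(".mxl", 106),
    Candidate(".exw", 618),
    Candidate(".exu", 54),
    Candidate(".sarl", 252),
]

to_count = [c.extension for c in candidates if needs_repo_count(c)]
print(to_count)  # ['.exw', '.sarl']
```

Later rows in the table show that the cut-off was not applied strictly to every submission (some extensions with fewer files still have a repository count), so treat it as a rule of thumb rather than a hard gate.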

@andrewrk

andrewrk commented Aug 4, 2018

What are the acceptance criteria? archimate has over 200 repositories, yet it's in this list with its PR closed.

@pchaigno
Contributor Author

pchaigno commented Aug 4, 2018

What are the acceptance criteria?

That's going to depend on the extension. Hundreds of repositories is a rule of thumb; for a very specific extension with little chance of conflicts, such as .archimate, I'd be in favor of adding it now, with only 200-300 repositories; others will need a few hundred more (think .m, which already has 7 languages associated with it).

archimate has over 200 repositories and yet it's in this list with its PR closed.

I'm working on a new XML Strategy that should handle the .archimate case. If it doesn't work out, I'll reopen the pull request and I'll invite its author to update the branch. In any case, we'll discuss it in #4128.

@tajmone

tajmone commented Aug 5, 2018

Excellent idea @pchaigno! And very useful too.

Thanks.

@mattmasson

@pchaigno thanks for doing this!

Can we remove the .m file extension from consideration for #4191 then? The code change only includes .pq and .pqm file support (specifically because we didn't want to conflict with the existing .m file highlighters). I think .m was mentioned in the PR comments because that was our old convention (and no longer used).

@andrewrk

andrewrk commented Nov 4, 2018

Zig update - Nov 4, 2018 - 1547 files, 151 repos
Zig update - Nov 21, 2018 - 1812 files, 186 repos

@pchaigno pchaigno mentioned this issue Nov 13, 2018
16 tasks
@andrewrk

andrewrk commented Dec 1, 2018

Hello @pchaigno

I have just run Harvester on the Zig extension and came up with these results:

  • 1861 files
  • 203 repos
  • 93 unique users

https://gist.github.com/andrewrk/33650712e74a65873dd84d82d25f449a

It is time to re-open #4005

@stale

stale bot commented Dec 31, 2018

This issue has been automatically marked as stale because it has not had activity in a long time. If this issue is still relevant and should remain open, please reply with a short explanation (e.g. "I have checked the code and this issue is still relevant because ___."). Thank you for your contributions.

@pchaigno
Contributor Author

pchaigno commented Apr 4, 2020

@xfix It looks like .rakutest is still used by a handful of users only. The purpose of this issue is to track usage for extensions that we expect to reach our threshold soon. It doesn't look like that's the case for .rakutest. Don't hesitate to ping me if that changes.

@stefanobaghino

stefanobaghino commented Apr 24, 2020

Update on .daml (#4523) as of 2020.04.24:

  • 1338 unique files (was 155, +1183 since 2019.08.31)
  • 135 unique repositories (was 37, +98 since 2019.08.31)

Ran Harvester with extension:daml Party.

Raw data: https://gist.github.com/stefanobaghino/6439072413623b0a9b05aabeaeebb4e6

Unique repository count method

awk -F '/' '{print $1 "//www.github.com/" $4 "/" $5}' daml.txt | sort -u | wc -l
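
For readers who would rather not depend on awk field positions, the same count can be sketched in Python. It assumes, as the awk command does, that each line of the Harvester output is a GitHub file URL of the form https://github.com/OWNER/REPO/...; the function name and the sample URLs are made up for illustration.

```python
from urllib.parse import urlparse


def unique_repositories(urls):
    """Return the set of "owner/repo" names found in GitHub file URLs."""
    repos = set()
    for line in urls:
        line = line.strip()
        if not line:
            continue
        # The URL path is expected to look like /owner/repo/blob/<ref>/<file>.
        parts = urlparse(line).path.strip("/").split("/")
        if len(parts) >= 2:
            repos.add(f"{parts[0]}/{parts[1]}")
    return repos


# Hypothetical sample input; real input would be the lines of daml.txt.
sample = [
    "https://github.com/alice/project/blob/master/src/Main.daml",
    "https://github.com/alice/project/blob/master/src/Lib.daml",
    "https://github.com/bob/other/blob/main/Test.daml",
]

repos = unique_repositories(sample)
print(len(repos))  # 2
```

Collecting the names in a set plays the role of `sort -u`, and `len()` replaces `wc -l`.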

Disclaimer: with my other account @stefanobaghino-da, I'm a member of @digital-asset, which is the author of the DAML language. I ran Harvester with my private account (the one I'm using right now) to exclude private repositories.

@Alhadis
Collaborator

Alhadis commented Jan 5, 2021

@pchaigno I've added .curry to the list. Just an FYI in case you need to update any extension lists locally.

@lildude Could we pin this issue?

@lildude lildude pinned this issue Jan 5, 2021
@lildude
Member

lildude commented Jan 5, 2021

@lildude Could we pin this issue?

✅ Done.

@Nixinova
Contributor

Can all of the entries be updated? It's been a year.

@benstigsen

benstigsen commented Feb 9, 2021

Can all of the entries be updated? It's been a year.

I'd like this as well, specifically for Wren support.

@KamilaBorowska
Contributor

Could .rakutest be added to extension tracking? It has 584 files.

@sartimo

sartimo commented Oct 3, 2021

Hi 👋🏻

How can we add a language to the pending popularity section?

@lildude
Member

lildude commented Oct 4, 2021

How can we add a language to the pending popularity section?

Languages are normally added when a PR is opened that doesn't quite yet meet the documented usage requirements, though we've not really been maintaining this list much as PRs no longer auto-close.

@github-linguist github-linguist deleted a comment from Pakon2543 Dec 8, 2021
@lildude lildude closed this as completed Nov 24, 2022
@lildude lildude unpinned this issue Nov 24, 2022
@stefanobaghino

@lildude I would be interested in checking whether Daml now meets the popularity requirements. Should I report my findings back in #4523 and re-open it, or open a new one?

@lildude
Member

lildude commented Nov 24, 2022

@lildude I would be interested in checking whether Daml now meets the popularity requirements. Should I report my findings back in #4523 and re-open it, or open a new one?

That PR is over three years old, so it's probably best to start a new PR, though from a quick look... usage is still incredibly low with only 698 files which is still waaaay too low.

@stefanobaghino-da

Thanks for checking that! By the way, the query you linked actually returns an empty result for me. Is that kind of path: search limited to internal use at GitHub or something like that?

@lildude
Member

lildude commented Nov 24, 2022

Thanks for checking that! By the way, the query you linked actually returns an empty result for me. Is that kind of path: search limited to internal use at GitHub or something like that?

Nope. It's the new Codesearch beta

@stefanobaghino

That PR is over three years old, so it's probably best to start a new PR, though from a quick look... usage is still incredibly low with only 698 files which is still waaaay too low.

I finally got access to Codesearch. How did you get that number? Clicking on your query (which accounts for a similarly named XML-based configuration format) I get ~5.9K results and scoping it down to include a couple of keywords (like so) narrows it down to ~2.9K.

Do you have a rough figure for the numbers a language is expected to reach before it can be included in Linguist?

@lildude
Member

lildude commented Jan 10, 2023

I finally got access to Codesearch. How did you get that number?

It is what was returned at the time. It's certainly possible there are more files returned now as the Codesearch beta has been expanding and indexing more and more files.

Do you have a rough figure for the numbers a language is expected to reach before it can be included in Linguist?

This is documented in the CONTRIBUTING.md file.

From a quick look at the results, it looks like support can be added, so feel free to open a new PR. You're going to need to add support to XML for this extension at the same time, in the same PR (it looks popular enough too, as it appears to be about one per repo), to ensure those files remain classified as XML with the correct syntax highlighting.

@stefanobaghino-da

stefanobaghino-da commented Jan 10, 2023

This is documented in the CONTRIBUTING.md file.

Apologies, it's been a long time and I guess I missed it.

From a quick look at the results, it looks like support can be added so feel free to open a new PR.

This is great news, thanks!

From a quick look at the results, it looks like support can be added, so feel free to open a new PR. You're going to need to add support to XML for this extension at the same time, in the same PR (it looks popular enough too, as it appears to be about one per repo), to ensure those files remain classified as XML with the correct syntax highlighting.

To clarify, while an XML-based configuration format with the same name exists, the daml I'm talking about is a Haskell dialect. 🙂

@lildude
Member

lildude commented Jan 10, 2023

To clarify, while an XML-based configuration format with the same name exists, the daml I'm talking about is a Haskell dialect. 🙂

Yup, but as there are clearly two different languages using the same extension, you will need to add support for both at the same time; otherwise they'll all be classified as the one language you add, in this case the Haskell dialect, which won't be correct for the XML files. Think about it the other way: you wouldn't want all the Haskell-dialect .daml files to be classified as XML, which they would be if the extension were only added to XML.
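
To illustrate why both languages have to be registered together, here is a deliberately naive content check in Python. It is not Linguist's implementation, nor the rule a real PR would use; the function name, the "DAML" label and the "starts with `<`" test are assumptions made only for this sketch.

```python
def classify_daml(content: str) -> str:
    """Guess which of the two languages a .daml file belongs to.

    Naive assumption for illustration: an XML file starts with '<'
    (XML declaration, comment, or root element) once a BOM and
    leading whitespace are removed; anything else is treated as
    the Haskell-dialect language.
    """
    stripped = content.lstrip("\ufeff \t\r\n")
    return "XML" if stripped.startswith("<") else "DAML"


# Hypothetical file contents, made up for this example.
xml_sample = '<?xml version="1.0"?>\n<config />\n'
daml_sample = "module Main where\n"

print(classify_daml(xml_sample))   # XML
print(classify_daml(daml_sample))  # DAML
```

If only one language claimed the extension, no such check would run and every .daml file would receive that single label regardless of its content, which is the misclassification described above.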

@stefanobaghino-da

Yup, but as there are clearly two different languages using the same extension, you will need to add support for both at the same time; otherwise they'll all be classified as the one language you add, in this case the Haskell dialect, which won't be correct for the XML files. Think about it the other way: you wouldn't want all the Haskell-dialect .daml files to be classified as XML, which they would be if the extension were only added to XML.

Got it, thanks.

@github-linguist github-linguist locked as resolved and limited conversation to collaborators Jun 17, 2024