Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Publishing actions in GitHub Marketplace should suggest repositories should be thin #35436

Closed
1 task done
Tracked by #35771 ...
jsoref opened this issue Nov 26, 2024 · 12 comments
Closed
1 task done
Tracked by #35771 ...
Labels
actions This issue or pull request should be reviewed by the docs actions team content This issue or pull request belongs to the Docs Content team SME reviewed An SME has reviewed this issue/PR

Comments

@jsoref
Copy link
Contributor

jsoref commented Nov 26, 2024

Code of Conduct

What article on docs.github.com is affected?

https://docs.github.com/en/actions/sharing-automations/creating-actions/publishing-actions-in-github-marketplace

What part(s) of the article would you like to see updated?

https://docs.github.com/en/actions/sharing-automations/creating-actions/publishing-actions-in-github-marketplace#about-publishing-actions

When you plan to publish your action to GitHub Marketplace, you'll need to ensure that the repository only includes the metadata file, code, and files necessary for the action. Creating a single repository for the action allows you to tag, release, and package the code in a single unit. GitHub also uses the action's metadata on your GitHub Marketplace page.

This should really include advice not to have 300mb of stuff in a release (github/codeql-action currently does this and it means that anyone using the action has to wait 6 seconds for it to download).

Additional information

Repositories with large amounts of content that would have to be checked out at a specific revision and then collected into a tarball and then downloaded into a runner take much longer to retrieve making workflows slower to start.

@jsoref jsoref added the content This issue or pull request belongs to the Docs Content team label Nov 26, 2024
@github-actions github-actions bot added the triage Do not begin working on this issue until triaged by the team label Nov 26, 2024
@nguyenalex836 nguyenalex836 added actions This issue or pull request should be reviewed by the docs actions team waiting for review Issue/PR is waiting for a writer's review needs SME This proposal needs review from a subject matter expert and removed triage Do not begin working on this issue until triaged by the team labels Nov 26, 2024
@nguyenalex836
Copy link
Contributor

@jsoref Thank you for raising this issue! I'll get this triaged for review ✨ Our team will provide feedback regarding the best next steps for this issue - thanks for your patience! 💛

Copy link
Contributor

Thanks for opening an issue! We've triaged this issue for technical review by a subject matter expert 👀

Copy link
Contributor

This is a gentle bump for the docs team that this issue is waiting for technical review.

@github-actions github-actions bot added the SME stale The request for an SME has staled label Dec 25, 2024
@nguyenalex836 nguyenalex836 added SME reviewed An SME has reviewed this issue/PR and removed waiting for review Issue/PR is waiting for a writer's review needs SME This proposal needs review from a subject matter expert SME stale The request for an SME has staled labels Jan 14, 2025
@nguyenalex836
Copy link
Contributor

Hi @jsoref Thank you for your patience while our SME team reviewed, as well as opening this issue and for the docs and CodeQL feedback! ⚡

We definitely understand your frustration with waiting for large actions to download. In this instance, because engineers are generally already conscious of minimizing file size and because this feedback is about a specific action, we don't think this requires a docs update. Developers will likely already be considering the tradeoffs between building an unwieldy action and ensuring the action can do what it is intended to do, and we don't think updating the docs would change this behavior. We also understand that you have left feedback with the CodeQL team, and think that is the best place for this conversation.

Here is some feedback we received from the SME team:

  • Yes, Action repos should aim to be as minimal for their purpose as possible where possible. Sometimes the use case is more complicated and there is not a single Action within the repo but several that work together. This is the case with github/codeql-action as it is a collection of coarse and fine grained actions that allow users to tailor their GitHub Actions workflows with GHAS in various ways. If you profile the disk usage within the repo, you'll see that a majority of it comes from Node dependencies with the bulk being https://www.npmjs.com/package/@protobuf-ts/plugin

  • The overall feedback is fair - the codeql-action repo is large. This is because it has to contain a copy of its entire node_modules dependencies in order to be runnable as-is in any customer workflow without an npm install step. It is also not one action, but several, and for this reason is not on the Marketplace.

Thank you again for your contributions! 💛

@jsoref
Copy link
Contributor Author

jsoref commented Jan 14, 2025

The feedback really isn't about that action. I'm well aware that the action is bad. And I've already done the profiling and already made a PR to fix it. That isn't going anywhere.

I'm trying to improve the documentation to discourage anyone else from making the same mistakes that that team made. It's a lot easier for me to point to official documentation to get other people to do the right thing, or for people to read the official documentation and not do the wrong thing in the first place, than it is for me to have to try to explain things to each offender individually and have them ask why they should listen to me.

@joshmgross
Copy link
Member

👋 Hey @jsoref, I'm on the Actions engineering team.

I totally understand the desire for actions you depend on to be as lightweight as possible to limit the impact they have on your workflows. For 1st party actions that GitHub owns - we have a responsibility to reduce the burden on users who need our actions for core functionality (checking out a repository, artifacts, caching, etc).

I do not think we can definitely say how large a repository should be though - that's going to entirely depend on the use case and how sensitive users are to increased runtime. Container actions, for example, are likely much larger than most Javascript actions. A six second increase in the runtime of each job is going to matter to some users, but I suspect most would just consider it negligible.

If there are external actions that you want to follow best practices, then I'd encourage you to point them to how 1st party actions like actions/checkout and actions/cache are built & bundled.

@jsoref
Copy link
Contributor Author

jsoref commented Jan 16, 2025

There are dozens of actions that pull in codeql-action. I started a sample repository to test (I have a queue of others from the list, but they're in this cell phone and I have other demands on my time, so I may not get the time to test them):

https://github.com/check-spelling-sandbox/vigilant-octo-rotary-phone/actions

There's no advice here saying something akin to "the size of your repository and its direct dependencies will have a measurable impact on workflow start time which would be billable for private repositories, larger runners, or self-hosted runners as well as an impact on public runner slot availability."

As someone who writes/maintains actions and workflows, this wasn't immediately obvious to me and clearly isn't obvious to anyone else.

@jsoref
Copy link
Contributor Author

jsoref commented Jan 16, 2025

As for codeql, as it's a GitHub API to a GitHub API, one would hope it cares about this too. But that isn't directly in your scope. I've provided a PR for that team that would drastically improve the cost, but, again, the point of this issue is to give advice to other authors not to make the mistakes that everyone (myself, historically included) has made.

I do plan to make a public action for the marketplace for the purpose of uploading sarif that people can use instead of the unlisted one that is referenced by everyone, but, that's really separate from telling people not to ship giant actions.

@joshmgross
Copy link
Member

There's no advice here saying something akin to "the size of your repository and its direct dependencies will have a measurable impact on workflow start time which would be billable for private repositories, larger runners, or self-hosted runners as well as an impact on public runner slot availability."

What is that measurable impact though?

Dozens of seconds added to a workflow execution that can last of minutes or even hours is not significant.

If we are going to offer advice here, we need to back it up with clear data.

@jsoref
Copy link
Contributor Author

jsoref commented Jan 17, 2025

check-spelling generally runs in the span of ~1 minute or maybe 2, it tends to run on every push. If the extra 6 seconds is the difference between 1 and 2 minutes or between 2 minutes and 3 minutes, then that's one extra billable minute per job (check-spelling often is configured to run 2 jobs, so those six seconds can easily be 2 extra billable minutes).

Imagine using the pricing calculator: https://github.com/pricing/calculator (and ignoring that it's incredibly buggy and doesn't understand how trash cans should work, but that's a separate bug), if a workflow runs 1000 times per day and has 2 jobs each of which run over the minute break because of those 6 seconds, that's an extra $480/mo:
Image

@nebuk89
Copy link
Contributor

nebuk89 commented Jan 20, 2025

It's fair feedback @jsoref , but the same is also true for every Docker image, node module and really anything else someone could want or need to put onto their machine which isn't there when we boot :)
I guess my stance coming into this (a little late, apologies I have been away on personal leave) is that our prior assumption was that folks knew by this point that 'if you depend on big things in your CI, it will take longer'.

That said there would be no harm in putting something like "the size of your repository and its direct dependencies will have an impact on the time it takes for your workflow to complete" If that will address your concerns :)

Do you think we need to just make that this explicit somewhere in our docs? We are planning a refactor of them currently including a new performance focused section for Actions, where this is probably a really good topic in general.

@jsoref
Copy link
Contributor Author

jsoref commented Jan 20, 2025

@nebuk89: I think that's more or less what I'm looking for. Note that for actions, it isn't your workflow, it's your consumer's workflows.

It really would be nice if that warning called out that the time will either have an impact on the ability to run other jobs (in the case of free repositories linking to the aforementioned Concurrent jobs) or to that item about job cost.

It took me 16 releases to really understand how expensive Docker was and that it was worth replacing (although I don't think composite was available when I first started).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
actions This issue or pull request should be reviewed by the docs actions team content This issue or pull request belongs to the Docs Content team SME reviewed An SME has reviewed this issue/PR
Projects
None yet
Development

No branches or pull requests

4 participants