feat: add draft next-gen Python workflows #25
base: master
Conversation
Did a quick review of the PR as-is. Overall, it looks like there's a lot of good functionality already built out, but there are also quite a few pieces I'm not sure we'll need, or that would require supporting additional platform integrations (specifically Coveralls).
My suggestion is to build this out incrementally rather than in large commits and review cycles, especially as we'll need to sync on design throughout the process. If you feel you have enough of the functionality defined to use as a minimum viable workflow, then let's schedule some time to work out naming and file structures in more detail, to match the rest of the library files.
    # This workflow is designed to run various types of tests and analyses on the codebase.
    # It is triggered from other workflows using `workflow_call`, with specific ref inputs.
    # Different jobs cover tests across multiple operating systems (Ubuntu, Windows, macOS) and Python versions.
As we do not have CI/CD runners for Windows and macOS environments right now, this may not be possible. Is this a strict need for RCSB Python projects?
I definitely think it's ideal for packages we intend for wider audiences. How hard would this be?
It's not a question of difficulty; it's a question of effort versus value. When we get to a point where we want to start distributing and maintaining packages for more environments, we can evaluate this integration then.
    preliminary:
      name: Start Markdown table
      runs-on: ubuntu-latest
FYI, the current process is to create and define Docker images for runners in the https://github.com/rcsb/devops-runner-images repo, associate them via a label with runners in our Kubernetes cluster (https://github.com/rcsb/rcsb-kube/blob/master/helm/gh-actions-runner-controller/values/west.yaml#L45), and then refer to them in jobs by their label. We'll want to avoid or minimize the use of GitHub-hosted runners as much as possible.
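For illustration, a job under that model might look like the following (the label name here is made up; real labels come from the runner deployment config linked above):

```yaml
jobs:
  preliminary:
    # hypothetical label bound to a Docker image from devops-runner-images
    runs-on: python-build-runner
```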
What would be a good way to generate textual reports (like this Markdown table)?
This would probably be the easier option: https://github.blog/news-insights/product-news/supercharging-github-actions-with-job-summaries/. Otherwise, building a Markdown file as an output build artifact could work, provided you use an IDE or some other tool to view the Markdown content.
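For reference, a minimal job-summary step could look like this; anything appended to the file at `$GITHUB_STEP_SUMMARY` is rendered as Markdown on the run's summary page (the table content here is illustrative):

```yaml
- name: Write Markdown table to the job summary
  run: |
    {
      echo "| Suite | Result |"
      echo "|-------|--------|"
      echo "| pytest | passed |"
    } >> "$GITHUB_STEP_SUMMARY"
```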
    - name: Install dependencies and run tests
      id: test
      run: |
        pip install hatch~=1.12
Related to another comment in this review, package installs will be handled in the Docker runner image build process. This way, we avoid running the same package install commands multiple times across different jobs, and we ensure that the package versions used are consistent throughout the pipeline.
Checking I understand. Currently, multiple `_test.yaml` jobs install Python, build the main package (let's say `pkg`), and download and install `pkg`'s dependencies along with some extra packages:

- `pytest` (builds `pkg`; installs dependencies, Hatch, and test frameworks)
- `docker` (builds `pkg`; installs dependencies and Hatch)
- `docs` (installs Hatch and doc-building tools)
- `security` and `codeql` also set up Python

The `pytest` and `docker` redundancy isn't great. We could run tests inside Docker to avoid it. The downside is with testing non-containerized RCSB Python packages (i.e. libraries, clients, etc.): currently, excluding a Dockerfile disables that step, while tests are still run. If we move tests into Docker, things are less straightforward.
As for having to build separately for all 4 jobs -- if I understand correctly, you're proposing a solution for this?
`hatch`, test frameworks, and doc-building tools would be installed as part of the Python runner image. When the pipeline runs for `pkg`, it'll pull and use the Python runner image and avoid having to download and install those common build tools. It'll just have to install the `pkg`-specific dependencies and build `pkg`.
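A sketch of what a job could look like under that model, assuming a runner image that already ships Python, Hatch, and the test frameworks (the runner label and the `hatch run test` script name are hypothetical):

```yaml
test:
  runs-on: python-runner  # label bound to the prebuilt Python runner image
  steps:
    - uses: actions/checkout@v4
    - name: Install pkg-specific dependencies and run tests
      run: |
        # hatch and the test frameworks are already baked into the image,
        # so only the project's own dependencies are resolved here
        pip install -e .
        hatch run test
```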
    - uses: actions/setup-python@v5
      with:
        python-version: ${{ matrix.python }}
Hmm, do you envision that the pipeline will need to support multiple Python versions within the same run? If so, we'll want to look into something like `pyenv` within the containerized runner environment, rather than setting up Python each time on each run.
No hard objection, and that would undoubtedly be faster, but using only pyenv is limited, and using both a matrix and pyenv gets complicated pretty quickly. For instance, to test against both FastAPI v0.x.x and v1.x.x, where v1 requires Python > 3.10:
    strategy:
      matrix:
        os:
          - ubuntu-latest
          - windows-latest
          - macos-latest
        python:
          - '3.10'
          - '3.13'
        fastapi-major:
          - '0'
          - '1'
        exclude:
          - python: '3.10'
            fastapi-major: '1'
If we've reached this point for the `FastAPI` example, then more likely than not, the matrix would be defined within the workflow of the specific project, which would then call the workflow in this library, passing in the appropriate inputs for a build of a specific Python version and dependency.
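As a sketch of that division of responsibility (the library workflow filename and input name are hypothetical):

```yaml
# In the project repo: define the matrix locally and call the shared workflow
jobs:
  test:
    strategy:
      matrix:
        python: ['3.11', '3.13']
    uses: rcsb/devops-cicd-github-actions/.github/workflows/workflow-python-test.yaml@master
    with:
      python-version: ${{ matrix.python }}
```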
    hatch publish
    env:
      HATCH_INDEX_USER: __token__
      HATCH_INDEX_AUTH: ${{ secrets.PYPI_TOKEN }}
Secrets management is a bit more of a complicated setup. Because we are using the free-tier GitHub organization, we instead pull secret values from our HashiCorp Vault platform and include them in the runner deployment setup in Kubernetes. We can walk through that in more detail once we're ready to start deploying these workflows.
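Under that setup, the publish step would read an environment variable injected into the runner pod rather than GitHub's `secrets` context; a rough sketch (the variable name is hypothetical):

```yaml
- name: Publish to PyPI
  run: |
    # PYPI_TOKEN is assumed to be present in the runner's environment,
    # placed there by the Kubernetes deployment from Vault
    export HATCH_INDEX_USER=__token__
    export HATCH_INDEX_AUTH="$PYPI_TOKEN"
    hatch publish
```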
    HATCH_INDEX_AUTH: ${{ secrets.PYPI_TOKEN }}
    if: secrets.PYPI_TOKEN != ''

    publish-docker:
We have other library workflows already for building and pushing Docker images to our internal Harbor container registry (https://github.com/rcsb/devops-cicd-github-actions/blob/master/.github/workflows/workflow-docker.yaml), so we'll have to sync and see what can be re-used or changed to support the Python workflows.
    git commit -m "style: auto-format"
    continue-on-error: true

    - name: Push changes
Leaving this more as a reminder for myself, but we'll definitely want to review auto-commits from the pipeline in more detail.
Thanks for reviewing! I opened this PR to facilitate review, with no plan for it to be merged.
I'll use one PR per workflow. They're all independent, except for
Some features are just retained from old workflows (copied from another repo). Some we can and should drop, but others I'd like to keep. For example, I like Codecov and Coveralls -- those tasks are simply skipped unless secrets for them are provided.
My thought is that we should drop anything that won't be used, but use a skip-if-not-configured approach for features that will be used in only some repos. I prefer this over fine-grained workflows because the logic lives in one place (here) rather than being copied, with modifications, across many project repos. Though I don't object to splitting large workflows inside here; e.g.
Python CI/CD overhaul RO-4317
This CommonMark isn't being rendered correctly. Sorry for the inexplicable line breaks.
This is a draft! Some of the workflows have been tested (successfully), but others still need work. These are so time-consuming to test and fix that I'll put them on hold for a few weeks. If you have time, reviews and/or bugfixes would be helpful -- I'm still very much not a GH workflows expert. Alternatively, you can wait until I've gotten them mostly working.
Not a great description; will improve.
Scope
Interdependent companion to RO-4311.
Supported by rules in `pyproject.toml`, `.prettierrc`, and pre-commit, along with a very modern build. If pre-commit was installed as instructed, auto-formatting occurs per commit; this always also happens in GitHub workflows. That setup is correct and working fine.
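For context, per-commit auto-formatting with pre-commit is wired up through a `.pre-commit-config.yaml`; a minimal illustrative fragment (the hook and pin shown here are examples, not necessarily what this PR uses):

```yaml
repos:
  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: v3.1.0
    hooks:
      - id: prettier  # picks up the repo's .prettierrc
```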
These workflows
GH settings: GitHub settings will need to be adjusted per project (but not here). There is a GH plugin that can make those changes via a `settings.yaml`, which I've tested. (Alternatively, calls to the GH GraphQL API.)
Workflows and behaviors:

- `/please-test` and `/please-check-*` PR comments (see the sketch after this list). These use status checks, which should be marked as required in GitHub settings.
- (a subset of conventional commits; must pass to merge).
- and/or GitHub Releases with elegant auto-generated release notes.
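As referenced above, a rough sketch of how a `/please-test` comment could trigger a run (the command handling shown here is illustrative, not necessarily what these workflows implement):

```yaml
on:
  issue_comment:
    types: [created]

jobs:
  test:
    # react only to PR comments that start with the command
    if: github.event.issue.pull_request && startsWith(github.event.comment.body, '/please-test')
    runs-on: ubuntu-latest
    steps:
      # note: on issue_comment, checkout defaults to the default branch;
      # checking out the PR head takes extra steps, omitted here
      - uses: actions/checkout@v4
```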