bump to 24.1 - Single-container setup #607

Draft
wants to merge 12 commits into
base: master

@jyotipm29 jyotipm29 commented Oct 3, 2024

Upgrades:

  • Base Ubuntu Image: Upgraded from version 18.04 to 22.04
  • Galaxy: Upgraded from version 20.09 to 24.1
  • PostgreSQL: Upgraded from version 11 to 15
  • Python3: Upgraded from version 3.7 to 3.10 (Python 3.10 is set as the default interpreter)

Updates:

  • New Service Support:

    • Gunicorn: Replaces uWSGI as the web server for Galaxy. Installed by default inside Galaxy's virtual environment. Nginx is configured to proxy requests to Gunicorn, which listens on port 4001.
    • Celery: Installed by default inside Galaxy's virtual environment. Celery is enabled for distributed task queues and Celery Beat for periodic tasks. RabbitMQ serves as the broker for Celery (if RabbitMQ is disabled, the broker defaults to the PostgreSQL database connection). Redis is used as the backend for Celery (if Redis is disabled, it defaults to a SQLite database). The Flower service is added for monitoring and debugging Celery.
    • RabbitMQ Management: Enabled the RabbitMQ management plugin on port 15672 for managing and monitoring the RabbitMQ server. The dashboard is exposed via Nginx and is accessible at the /rabbitmq/ path. The default access credentials are admin:admin.
    • Redis: Added Redis server on port 6379 as a backend for Celery.
    • Flower: Added Flower service on port 5555 for monitoring and debugging Celery. The dashboard is exposed via Nginx and is available at the /flower/ path. The default access credentials are admin:admin.
    • TUSd: Added TUSd server on port 1080 to support fault-tolerant uploads; Nginx is configured to proxy TUSd.
    • gx-it-proxy: Added gx-it-proxy service on port 4002 to support Interactive Tools.
  • Ansible Playbooks:

    • Added configure_rabbitmq_users.yml Ansible playbook, which removes the default guest user and adds admin, galaxy, and flower users for RabbitMQ during container startup.
  • Environment Variables:

    • Added GUNICORN_WORKERS and CELERY_WORKERS magic environment variables to set the number of Gunicorn and Celery workers, respectively, during container startup.
  • Configuration Changes:

    • Replaced the Galaxy Reports sample configuration file.
    • Removed galaxy_web, handlers, reports, and ie_proxy services from Supervisor.
    • Added Gravity for managing Galaxy services such as Gunicorn, Celery, gx-it-proxy, TUSd, reports, and handlers. It uses Supervisor as the process manager, with the configuration file located at /etc/galaxy/gravity.yml.
    • Added support for dynamic handlers (set as the default handler type).
    • Redis and Flower services are now managed by Supervisor.
    • Since Galaxy Interactive Environments are deprecated, they have been replaced by Interactive Tools (ITs). The sample configuration file tools_conf_interactive.xml.sample is placed inside GALAXY_CONFIG_DIR. Nginx is also configured to support both domain and path-based ITs.
    • Switched to using the cvmfs-config.galaxyproject.org repository for automatic configuration and updates of Galaxy project CVMFS repositories. Updated tool data table config path to include CVMFS locations from data.galaxyproject.org in --privileged mode.
    • Enabled IPv6 support in Nginx for ports 80 and 443.
    • Added Subject Alternative Name (SAN) extension (DNS:localhost and IP:127.0.0.1) while generating a self-signed SSL certificate.
    • Ensured the Nginx SSL certificate is trusted system-wide by adding it to the CA store.
    • Updated Galaxy extra dependencies.
    • Added docker_net, docker_auto_rm, and docker_set_user parameters for Docker-enabled job destinations.
    • Added update_yaml_value.py script to update nested key values in a YAML file.
    • Replaced ie_proxy with gx-it-proxy.
    • Replaced nginx_upload_module with TUSd for delegated uploads.
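
The GUNICORN_WORKERS and CELERY_WORKERS variables and the update_yaml_value.py helper listed above suggest a startup step that writes worker counts into /etc/galaxy/gravity.yml. The following is a minimal sketch of that idea, not the script from this PR. The function names, the dotted-key syntax, and the key paths `gravity.gunicorn.workers` and `gravity.celery.concurrency` are assumptions for illustration, and PyYAML is assumed to be installed for the file step.

```python
import os


def set_nested_value(data, dotted_key, value):
    """Set data[a][b][c] = value for dotted_key 'a.b.c', creating dicts as needed."""
    keys = dotted_key.split(".")
    node = data
    for key in keys[:-1]:
        # Create an intermediate mapping (or replace a non-dict value with one).
        if not isinstance(node.get(key), dict):
            node[key] = {}
        node = node[key]
    node[keys[-1]] = value
    return data


# Hypothetical mapping from the "magic" environment variables to Gravity keys.
ENV_TO_KEY = {
    "GUNICORN_WORKERS": "gravity.gunicorn.workers",
    "CELERY_WORKERS": "gravity.celery.concurrency",
}


def apply_worker_env(config, environ=None):
    """Copy worker counts from the environment into a Gravity-style config dict."""
    environ = os.environ if environ is None else environ
    for env_name, dotted_key in ENV_TO_KEY.items():
        raw = environ.get(env_name)
        if raw:  # unset or empty variables leave the config untouched
            set_nested_value(config, dotted_key, int(raw))
    return config


def update_yaml_file(path, environ=None):
    """Load a YAML file, apply the environment overrides, and write it back."""
    import yaml  # PyYAML; imported lazily so the helpers above work without it

    with open(path) as handle:
        config = yaml.safe_load(handle) or {}
    apply_worker_env(config, environ)
    with open(path, "w") as handle:
        yaml.safe_dump(config, handle, default_flow_style=False)
```

Note that a load-and-dump round trip with PyYAML drops comments from the file, so a real helper may prefer a comment-preserving YAML library.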

@bgruening
Owner

Very cool, I triggered a test run. It would be nice if we could get the tests to turn green at some point, but they probably also need to be updated.

Thanks a lot!

@jyotipm29 jyotipm29 marked this pull request as draft October 16, 2024 19:29
@bgruening
Owner

As a first step, feel free to concentrate on the single-container option. That makes the PRs easier to review.

@jyotipm29
Author

jyotipm29 commented Oct 30, 2024

The single-container changes are done, and I’ve also updated the compose files. Everything seems to be working well, and all tests in the forked repo have passed. I also updated the tool versions for the workflow tests, as the old ones seemed incompatible.

Next, I’ll work on adding the Rustus service, integrating interactive tools, and replacing Nginx with Traefik in the compose setup. Let me know if you have any feedback on the current changes. Thanks!

@bgruening
Owner

@jyotipm29 really impressive work. Thanks a lot.
Please split this PR into multiple PRs, so that we can get the single-container setup merged with documentation while you continue to hack on the compose stuff.

@mira-miracoli

This looks really cool, @jyotipm29. Thanks a lot! 🚀
From what I see, the tests are failing due to GitHub walltime and storage limits.
If you are sure that it works (we could also test it manually again), we could already merge this. What do you think, @sanjaysrikakulam?

> Next, I’ll work on adding the Rustus service, integrating interactive tools, and replacing Nginx with Traefik in the compose setup. Let me know if you have any feedback on the current changes. Thanks!

More important than the compose setup and Traefik would be to successively replace ansible-galaxy-extras with the roles that are maintained and used, e.g., in usegalaxy-eu/infrastructure-playbook or on the org server. Sorry that we did not come up with this earlier and that you already updated the role in your fork. Maybe that makes the replacement easier, because you can replace the roles one by one and run the CI tests in between.
Sorry, I think we currently lack a use case for the compose setup, and we should have told you earlier.

@jyotipm29
Author

Thanks! I will check that out.
The current PR will track the single-container setup and this will track the compose one.

@jyotipm29 jyotipm29 changed the title bump to 24.1 bump to 24.1 - Single-container setup Oct 30, 2024
@sanjaysrikakulam

> This looks really cool, @jyotipm29. Thanks a lot! 🚀 From what I see, the tests are failing due to GitHub walltime and storage limits. If you are sure that it works (we could also test it manually again), we could already merge this. What do you think, @sanjaysrikakulam?
>
> > Next, I’ll work on adding the Rustus service, integrating interactive tools, and replacing Nginx with Traefik in the compose setup. Let me know if you have any feedback on the current changes. Thanks!
>
> More important than the compose setup and Traefik would be to successively replace ansible-galaxy-extras with the roles that are maintained and used, e.g., in usegalaxy-eu/infrastructure-playbook or on the org server. Sorry that we did not come up with this earlier and that you already updated the role in your fork. Maybe that makes the replacement easier, because you can replace the roles one by one and run the CI tests in between. Sorry, I think we currently lack a use case for the compose setup, and we should have told you earlier.

Sure! Björn already left his comments and suggestions, and I thought Jyoti would probably want to address them. However, as you pointed out, this can be merged.

@jyotipm29 Excellent work! Thank you! :)

@bgruening
Owner

bgruening commented Oct 30, 2024

@jyotipm29 what do you think about postponing the CI compose tests until the single-container tests are green? This way we save a bit of CI time and you can iterate faster on the single-container setup.

@jyotipm29
Author

> @jyotipm29 what do you think about postponing the CI compose tests until the single-container tests are green? This way we save a bit of CI time and you can iterate faster on the single-container setup.

Yes, good idea. I will temporarily disable those tests in the next commit.

@bgruening
Owner

If someone wants to test it quickly :)

docker run -p 8080:80 quay.io/bgruening/galaxy:24.1-beta
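
Galaxy takes a while to come up after the container starts. A small polling sketch follows, assuming the port mapping from the command above and Galaxy's `/api/version` endpoint; the function names, attempt count, and delay are illustrative choices, not part of this PR.

```python
import http.client
import time
import urllib.error
import urllib.request


def galaxy_is_up(url="http://localhost:8080/api/version", timeout=5):
    """Return True if the URL answers with HTTP 200, False if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeouts, and non-2xx answers all count as "not up yet".
        return False


def wait_until(check, attempts=60, delay=5.0, sleep=time.sleep):
    """Call check() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(1, attempts + 1):
        if check():
            return attempt  # number of tries it took
        if attempt < attempts:
            sleep(delay)
    return None  # the service never answered


# Usage (polls for up to about five minutes with the defaults):
#   tries = wait_until(galaxy_is_up)
#   print("Galaxy is up" if tries else "Galaxy did not respond in time")
```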

@bgruening
Owner

I get ...

PermissionError: [Errno 13] Permission denied: '/home/galaxy/.config/conda/.condarc'

So installing tools does not work with this container. It's strange; I thought we had a test for this.
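
One way to narrow down an error like this is to find the path component that the current user cannot write to. Below is a minimal diagnostic sketch, meant to be run inside the container as the user that runs Galaxy; the helper names are hypothetical, and the path in the usage comment is the one from the traceback.

```python
import os
import stat


def first_blocking_path(path):
    """Return the existing path that blocks writing to `path`, or None if writable."""
    current = os.path.abspath(path)
    # Walk up until we reach something that exists on disk.
    while not os.path.lexists(current):
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent
    if os.path.isdir(current):
        # Creating entries below a directory needs write and search permission.
        allowed = os.access(current, os.W_OK | os.X_OK)
    else:
        allowed = os.access(current, os.W_OK)
    return None if allowed else current


def describe(path):
    """Return 'mode uid:gid path' for an existing path, similar to `ls -ld`."""
    info = os.lstat(path)
    return f"{stat.filemode(info.st_mode)} {info.st_uid}:{info.st_gid} {path}"


# Usage:
#   blocked = first_blocking_path("/home/galaxy/.config/conda/.condarc")
#   print(describe(blocked) if blocked else "path is writable")
```

If the reported path is owned by a different user than the one running Galaxy, the ownership set during the image build would be the first thing to check.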

@jyotipm29
Author

> I get ...
>
> PermissionError: [Errno 13] Permission denied: '/home/galaxy/.config/conda/.condarc'
>
> So installing tools does not work with this container. It's strange; I thought we had a test for this.

This is weird. The tool installation worked in my environment.

@bgruening
Owner

Did you run with or without --privileged=true?

@jyotipm29
Copy link
Author

jyotipm29 commented Oct 31, 2024

It worked both ways. I can even see in the CI test logs that the tool installation worked.

@bgruening
Owner

Which tool are you trying to install?

@jyotipm29
Author

I tested cherry_pick_fasta and abyss. Is there any particular tool that you want me to check?

@jyotipm29
Author

> This looks really cool, @jyotipm29. Thanks a lot! 🚀 From what I see, the tests are failing due to GitHub walltime and storage limits. If you are sure that it works (we could also test it manually again), we could already merge this. What do you think, @sanjaysrikakulam?
>
> > Next, I’ll work on adding the Rustus service, integrating interactive tools, and replacing Nginx with Traefik in the compose setup. Let me know if you have any feedback on the current changes. Thanks!
>
> More important than the compose setup and Traefik would be to successively replace ansible-galaxy-extras with the roles that are maintained and used, e.g., in usegalaxy-eu/infrastructure-playbook or on the org server. Sorry that we did not come up with this earlier and that you already updated the role in your fork. Maybe that makes the replacement easier, because you can replace the roles one by one and run the CI tests in between. Sorry, I think we currently lack a use case for the compose setup, and we should have told you earlier.

Just to confirm, the idea is to completely phase out ansible-galaxy-extras and instead use individual roles like usegalaxy_eu.nginx, usegalaxy_eu.htcondor etc, similar to how we currently use galaxyproject.postgresql in this repository. Is this correct?

@bgruening
Owner

> Just to confirm, the idea is to completely phase out ansible-galaxy-extras and instead use individual roles like usegalaxy_eu.nginx, usegalaxy_eu.htcondor etc, similar to how we currently use galaxyproject.postgresql in this repository. Is this correct?

Yes :-). And now that the tests work, I would do that one commit at a time and check that the tests still pass.

@jyotipm29
Author

Installing the CVMFS client via the ansible-cvmfs role during docker build requires autofs to be running, and autofs can't run without privileged mode. Do you have any suggestions for this?

@sanjaysrikakulam

> Installing the CVMFS client via the ansible-cvmfs role during docker build requires autofs to be running, and autofs can't run without privileged mode. Do you have any suggestions for this?

Maybe this might help: https://github.com/cvmfs/cvmfsexec

@bgruening
Owner

See #609 for some more information.
