Augur cannot be installed natively for ARM64 on Bioconda even though it is labeled "noarch" #3
Comments
This is presumably the same for our conda runtime, given that it installs augur?
Yes and no. Augur needs a tree builder for most workflows 😉 But there are many tools that are workflow specific, and this list will grow. At some point we need to move towards workflows being able to specify these rather than including them within
This seems reasonable to me given our current directions with runtimes and project packaging. Note that it will be a breaking change for anyone used to installing Augur with Conda and finding that they now need to also install some other stuff they never had to before. Note also that we didn't originally define these deps in the Conda packaging of Augur: they were in the original third-party packaging.
This can be done in tandem with an Augur release and mentioned in the changelog. My thinking is that it would be a "note" to add after the usual changelog entries, without effect on semver. Example:
I'd also do this after nextstrain/docs.nextstrain.org#157 is merged and create another PR to update the FAQ section there.
If you're talking about the inability to use ARM64, yes. From my limited trials, an inherent problem with Conda is that it doesn't play well when you try to install packages built for different architectures into the same environment. This is because most packages (such as Augur currently) define dependencies, and dependency resolution is limited to one architecture. This contrasts with the Docker image that we build, where we have the option to build/install packages individually for the native architecture. Because of this, the Conda runtime uses an Intel-based Miniconda installation for all Macs regardless of processor architecture (src).
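To make the "one architecture per resolution" point concrete, here is a minimal sketch of how a platform maps to the Conda subdirectories a solver would normally consider. The function names `conda_subdir` and `candidate_subdirs` are hypothetical helpers for illustration, not part of Conda or Augur; the subdir names themselves (`osx-arm64`, `osx-64`, `linux-64`, `linux-aarch64`, `noarch`) are the real ones.

```python
import platform

# (system, machine) pairs as reported by the `platform` module, mapped to
# the matching Conda subdir name. Only the platforms discussed here are listed.
_SUBDIRS = {
    ("Darwin", "arm64"): "osx-arm64",
    ("Darwin", "x86_64"): "osx-64",
    ("Linux", "x86_64"): "linux-64",
    ("Linux", "aarch64"): "linux-aarch64",
}


def conda_subdir(system=None, machine=None):
    """Return the Conda subdir for a platform (defaults to the current one)."""
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        return _SUBDIRS[(system, machine)]
    except KeyError:
        raise ValueError(f"no subdir known for {system}/{machine}") from None


def candidate_subdirs(system=None, machine=None):
    """Subdirs searched for one environment: one architecture plus noarch."""
    return [conda_subdir(system, machine), "noarch"]
```

A Python interpreter from an Intel-based Miniconda running under Rosetta reports `x86_64` from `platform.machine()`, so on an Apple silicon Mac this sketch would return `osx-64` for such an installation, which matches the emulated behavior described above.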
This comment was marked as off-topic.
Which dependencies do not have ARM64 support? Is that information published somewhere explicitly? I can't tell from the recipe meta YAML files for these packages alone...
I just looked and none of
Thanks, @victorlin! I noticed the badges but never saw one with ARM64 specified. That Bioconda issue explains why! :D Since none of Bioconda supports ARM64, that makes this outcome less desirable for users:
As @jameshadfield pointed out, the features users would be missing include an aligner and tree builder which are pretty key to most Augur workflows. I would really prefer to not have to install/upgrade IQ-TREE, mafft, etc. manually and outside of my Conda environments.
Instead of removing the architecture-specific dependencies from the Bioconda recipe, could we instead recommend the same approach as above of using the Intel-based Miniconda on Mac regardless of actual architecture? This is how I have Augur installed "ambiently" now. Even though this approach slows down workflows because of emulation, the other non-Docker alternative is the managed Conda environment, which will be equally slow because of emulation. It also seems like Bioconda will eventually support ARM64, so we could recommend using the ARM64-based Miniconda in the future...
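As a hedged aside, a separate Intel-based Miniconda is not the only way to get an emulated environment. Conda also reads the `CONDA_SUBDIR` environment variable, which lets an ARM64 installation create an `osx-64` environment. This is an alternative to the approach described in this thread, not what the Conda runtime does. The sketch below only builds the command and environment; the helper name `intel_env_command` and the environment name are hypothetical.

```python
import os


def intel_env_command(env_name, packages, channels=("conda-forge", "bioconda")):
    """Build (but do not run) a `conda create` call that targets osx-64.

    Returns the process environment and the argument list, suitable for
    passing to subprocess.run(cmd, env=env, check=True).
    """
    # CONDA_SUBDIR makes the solver look at osx-64 (plus noarch) packages,
    # even when the Conda installation itself is native ARM64.
    env = dict(os.environ, CONDA_SUBDIR="osx-64")
    cmd = ["conda", "create", "--yes", "--name", env_name]
    for channel in channels:
        cmd += ["--channel", channel]
    cmd += list(packages)
    return env, cmd


# Example: inspect the command without executing anything.
env, cmd = intel_env_command("augur-intel", ["augur"])
print(" ".join(cmd))
```

After activating such an environment, the usual follow-up is `conda config --env --set subdir osx-64`, so that later installs into it keep using the same architecture.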
@huddlej I see your point. I was thinking of In other words, the average user might expect That said, I think the deeper problem is that the average Apple silicon user is unaware of the emulation subtleties (because macOS hides them very well), meaning they aren't aware of potentially faster options. I think the best we can do here is:
Hmm. I'm confused. Maybe I'm missing something?
Why would this change force you to install them outside of your Conda environments? It wouldn't prevent you from doing what you do now (namely, using "Intel-based Miniconda on Mac regardless of actual architecture"). It would force you to manually install/upgrade them separately from Augur itself.
Yeah, I see how that was confusing; I'm currently a Linux, Mac OS X Intel, and ARM64 user of Augur, so here's how I'm thinking about the experience:
If I'm a new/current Linux or Mac OS X Intel user, I would have to manually install/upgrade Augur and its compiled dependencies separately.
If I'm a new/current Mac OS X ARM64 user, I would have to manually install/upgrade Augur in a Conda environment and then manually install/upgrade the dependencies outside of the Conda environment with Homebrew, etc.
For all architectures, this setup means I need to know that the additional compiled dependencies have to be installed separately and I need to know what those dependencies are and which versions to install. When I install Augur from Conda, I don't necessarily know that it can't/won't install the required compiled dependencies for me unless I'm following the docs or I try to run
For the Linux and Mac OS X Intel users, the compiled dependencies would at least still be in a Conda environment. The new setup would be a downgrade of the current user experience where I can just install Augur on both of those systems and all dependencies are managed for me.
For the ARM64 users, their dependency management is much more complicated and loses the benefits of self-managed Conda environments. If we propose in the docs that ARM64 users install Augur with ARM64 Conda, they would not think they have an option to avoid this complexity.
The experience for the ARM64 user right now is not good at all, since they can't install Augur with Conda at all and don't get any explanation from Conda about why not. The proposed solution of removing the compiled dependencies from the Bioconda Augur recipe marginally improves the situation by allowing them to install an incomplete installation of Augur. This comes at the cost of a poorer user experience for Linux and Mac OS X Intel users (everyone gets an incomplete installation of Augur).
It seems like there are two other alternate installation paths for ARM64 users that don't degrade the experience for non-ARM64 users:
When Bioconda finally supports ARM64 builds, the problem for ARM64 users goes away and Linux and OS X Intel users don't notice the difference.
Ok, I think I see your point now: Removing MAFFT, IQ-TREE, etc. deps from the Conda packaging results in a degraded experience for anyone not in an arm64 Conda env, and that isn't made worth it by the gained ability to get an incomplete (and thus not super useful) Augur install in an arm64 Conda env. That makes sense to me. It does make me think that instead of making the Conda package like the Python package and removing these deps, we could actually make the Python package more like the Conda package and bundle fasttree, iqtree, mafft, raxml, or vcftools into the wheels.
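Whichever packaging route is chosen, a user can check which of the external tools named above are actually reachable. The following is a minimal sketch using only the standard library. The executable names are assumptions copied from the package names in this thread (real installs may expose names such as `iqtree2` or `FastTree`), and `missing_tools` is a hypothetical helper, not an Augur function.

```python
import shutil

# Assumed executable names, taken from the package names mentioned above.
# Adjust them to whatever your installation actually provides.
EXTERNAL_TOOLS = ("mafft", "iqtree", "fasttree", "raxml", "vcftools")


def missing_tools(tools=EXTERNAL_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH, in the given order."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools found.")
```

The `which` parameter exists only so the lookup can be swapped out when testing; in normal use the default `shutil.which` searches the current `PATH`, which includes the active Conda environment's `bin` directory.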
Mixing Channel Subdirectories
Theoretically, the
    mamba create -n foo --override-channels -c conda-forge -c conda-forge/osx-64 -c bioconda/osx-64 -c bioconda/noarch augur
This assumes:
If the system base is osx-64 (but still on Apple Silicon), the meaning of straight
    mamba create -n foo --override-channels -c conda-forge/osx-arm64 -c conda-forge -c bioconda augur
Note that
Probably better would be to simply spell out everything explicitly in a YAML:
augur-arm64.yaml
    channels:
      - conda-forge/osx-arm64  # prefer native
      - conda-forge/noarch     # or noarch
      - conda-forge/osx-64     # otherwise, emulate
      - bioconda/noarch
      - bioconda/osx-64
      - nodefaults             # equivalent of "--override-channels"
    dependencies:
      - augur
    mamba env create -n foo -f augur-arm64.yaml
I've never really played with this, so I suggest it only for experimentation.
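The intent of the channel ordering in the YAML above ("prefer native, or noarch, otherwise emulate") can be illustrated with a toy model. This is a deliberately simplified sketch: it picks the first channel in priority order that offers a package and ignores versions, builds, and dependency solving, so it is not how mamba or conda actually resolve environments. The availability table is hypothetical and exists only for illustration.

```python
# Channel order copied from the YAML above ("nodefaults" is omitted because
# it is a flag-like entry, not a channel that provides packages).
CHANNELS = [
    "conda-forge/osx-arm64",
    "conda-forge/noarch",
    "conda-forge/osx-64",
    "bioconda/noarch",
    "bioconda/osx-64",
]

# Hypothetical availability, made up for this example.
AVAILABLE = {
    "conda-forge/osx-arm64": {"numpy"},
    "conda-forge/osx-64": {"numpy"},
    "bioconda/noarch": {"augur"},
    "bioconda/osx-64": {"mafft"},
}


def pick_channels(requested, channels, available):
    """Map each package to the first channel offering it, or None if absent."""
    chosen = {}
    for package in requested:
        chosen[package] = next(
            (ch for ch in channels if package in available.get(ch, ())),
            None,
        )
    return chosen


picks = pick_channels(["numpy", "augur", "mafft"], CHANNELS, AVAILABLE)
for package, channel in picks.items():
    print(f"{package}: {channel}")
```

In this toy table, the package that exists for both architectures comes from the native channel, while the one that exists only for Intel falls through to the emulated channel. That mixed outcome is exactly what makes the resulting environment experimental.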
@mfansler interesting idea! I just tried it on my M1. The environment creation was successful, and I can see that packages were pulled from a mix of the Output
However, upon running
From the detailed logs above, the obvious reason is that Even if this were to work, it might be unnecessary hassle for users who are just trying to Thank you for the suggestion though, the capability of conda to solve with mixed archs is good to know!
Thanks for reporting back the results of your testing! Yes, I wasn't too hopeful and suspected that while Rosetta could handle launching new processes in emulation, it might not cover dynamic library callouts. Particularly problematic would be when two packages of different architectures both link against something like BLAS. One would reference the osx-arm64 BLAS and the other the osx-64 BLAS. Since Conda can't install both versions, I'd guess this could result in missing symbol references. Anyway, I suppose it was worth the try! I do find the
ARM builds are coming, first linux aarch64, but maybe also soon M1: bioconda/bioconda-docs#16 What I do is a little hacky but it works well for me:
Augur is now natively installable on
    micromamba install -c conda-forge -c bioconda augur
Note that
Nice! Just a minor note: I see the environment installs OpenBLAS by default. From what I've benchmarked, it significantly underperforms Accelerate on osx-arm64, so I would encourage trying to include a
@mfansler thanks for sharing, that is useful info. We don't use BLAS directly in Augur – it is the numpy dependency that could potentially use it. I think the proper solution is for numpy to declare that dependency on
but it seems that maintainers are reluctant to use Accelerate as the default due to compatibility issues with SciPy.
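To see which BLAS a given environment's numpy was built against, `numpy.show_config()` prints the build configuration. The sketch below captures that output as a string and returns `None` when numpy is not installed. The wrapper name `numpy_build_config` is hypothetical, and the exact layout of the report differs between numpy versions, so treat it as something to read by eye.

```python
import contextlib
import io


def numpy_build_config():
    """Return numpy's build configuration report, or None without numpy."""
    try:
        import numpy
    except ImportError:
        return None
    buffer = io.StringIO()
    # show_config() prints to stdout, so redirect it to capture the text.
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()


if __name__ == "__main__":
    report = numpy_build_config()
    print(report if report is not None else "numpy is not installed")
```

Looking for names such as `openblas` or `accelerate` in the report is a quick way to confirm which implementation is in use before and after changing the environment.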
Background
Augur's Bioconda recipe page shows "noarch", presumably because it is a pure-Python package that can be run natively on any architecture. In practice, this can be useful for running computationally expensive pure-Python Augur subcommands such as augur filter.
However, installing from Bioconda in an ARM64 context is currently impossible (I thought there might be a chance with --no-deps but it didn't work for me). The reason is that the recipe defines the dependencies explicitly, at least one dependency is not available for ARM64, and a single conda install command can only resolve packages under a single architecture. This forces all Augur subcommands to use emulation when only certain dependencies require emulation.
Possible solutions
1. Remove dependencies from Augur's Bioconda recipe
Since these dependencies are only used in a small part of Augur, I'd like to think of them as optional dependencies that can be installed independently. In other words, one would install them separately, not necessarily through Conda, if they want to use those features. This seems to be what the current Augur-specific installation page implies.
This is also how it's done for building the Docker image (1, 2) and for anyone who is using Augur via pip install.
Proposed changes:
2. Bundle required dependencies into Augur's wheels
See #3 (comment).
3. Wait for all dependencies to be available for ARM64
See #3 (comment).