Possibility of adding support for Linux for Apple AMX1 and AMX2 #4
Comments
My understanding is that to make this all come together, we'd need the following: assembler support for Apple AMX in LLVM/GCC, followed by any system/software needing support for those 3 things before being able to support development of compute kernels/code using the ISA.
Links to the relevant conversations from Asahi and OpenBLAS folks here:
Asahi: https://mast.hpc.social/@fclc/109914828822965657
OpenBLAS: https://twitter.com/FelixCLC_/status/1627404588574818304?s=20
This bit is not strictly required; aarch64.h works with unmodified compilers.
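To illustrate why assembler support is not strictly needed: a header can compute each instruction word itself and emit it as raw data, so the compiler never has to know the mnemonics. The sketch below assumes the encoding as I understand it from the reverse-engineering notes (a fixed base in otherwise reserved A64 space, a 5-bit opcode and a 5-bit operand); the names `AMX_BASE` and `amx_encode` are hypothetical and not taken from aarch64.h.

```c
#include <stdint.h>

/* Assumed base of the reserved A64 encoding space used by AMX.
 * This comes from reverse-engineering notes, not from Apple documentation. */
#define AMX_BASE 0x00201000u

/* Hypothetical helper: build the 32-bit instruction word from a 5-bit
 * opcode and a 5-bit operand (a GPR number or a small immediate).
 * Out-of-range inputs are masked down to 5 bits. */
static inline uint32_t amx_encode(uint32_t op, uint32_t operand)
{
    return AMX_BASE | ((op & 0x1Fu) << 5) | (operand & 0x1Fu);
}
```

A real header would place the resulting word in the instruction stream with an inline-assembly `.word` directive, which any unmodified AArch64 compiler accepts. Executing such a word still requires hardware and an OS that support AMX.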
On the technical front, you've got extra state that needs saving/restoring on context switches. The political front seems more concerning to the Asahi folk though.
After talking with a few people at the vendors in question, as well as the Asahi folks, this looks very much to be a political issue, with concerns around what Arm might do in a scorched-earth scenario. For now I'm finishing my x86 FP16 work before investing too much time and energy into this. It's worth mentioning that BLIS already has a "research" version of AMX support.
Hi all,
I'm in the process of researching Apple AMX as a potential way of speeding up IEEE FP BLAS kernels in OpenBLAS.
On the macOS side, it seems that between this repository and other resources, I have all I need to be able to write the kernels.
The issue as of now is Linux. In speaking with the folks supporting/developing Asahi Linux (see Mastodon thread here: https://mast.hpc.social/@fclc/109914828822965657), it came up that Asahi has no plans to support the EL0 CPU state required for AMX.
I'm of the opinion that it may be possible to implement a Linux kernel module to allow for the usage of AMX on M1, M2 and the various SKUs based on those SoCs.
This would probably require a fairly tight understanding of AMX and its underlying operations.
I was hoping for insight from any of the folks working on this present project.