ARM Tech Preview (1.2)
This area houses the info/tips regarding the ARM tech preview included in the OpenHPC 1.2 release (November 2016). This document has been updated to reflect 1.3.1 here: https://github.com/openhpc/ohpc/wiki/ARM-Tech-Preview-(1.3.1)
The provided packages target 64-bit ARM server platforms. They are being released as a Tech Preview initially because there are known issues with provisioning and with a subset of the development libraries when exercised on the target OS distribution versions and tested hardware platforms.
The information on this page is intended to supplement the aarch64 OpenHPC Installation Guides for SLES-12-SP1 and CentOS-7.2. In particular, the provisioning steps outlined with Warewulf (most of the steps in sections 4.3 through 4.9) are not directly usable without additional modification to the PXE boot configuration.
- GSL: a small subset of the tests shipped with the GSL library failed precision-related checks. This is currently attributed to the fact that the tests included in GSL are tuned for x86, which uses 80-bit extended precision.
- PAPI: hardware counters may not be available, depending on the underlying ARM platform.
- MPI: available hardware for this Tech Preview release was ethernet only. The available MPI stacks reflect this test environment.
- mpiP: appears to have trouble collecting certain information in certain scenarios, causing it to fail integration tests.
- Nagios and Ganglia: do not work on SLES-12-SP1 due to missing PHP5 dependencies.
- Warewulf: the ARM Standard Base Boot Requirements and Standard Base System Architecture require specific UEFI support during the boot process, which doesn't seem to be compatible with the way Warewulf currently provisions worker nodes. There is a workaround, but it requires some manual intervention during installation and deployment of the nodes.
SLES 12 SP1 AARCH64 support was a Beta release which has now been superseded by the commercial release of 64-bit ARM support in SLES 12 SP2. However, to match the x86_64 versions of the packages, we built and tested against this beta for the 1.2 release of OpenHPC. If you were not already part of the SLES 12 SP1 beta, you may have trouble getting access to the necessary base OS installation. We are in the process of validating that these packages may be used on SLES 12 SP2 and hope to move to the commercially available release in the next version of OpenHPC.
64-bit ARM support has been made available as an altarch variation of the CentOS 7.2 release. There is some information here and ISO images available here. You will also need access to certain EL7 packages in order to complete a full installation of OpenHPC; these can be found here.
On the three tested system configurations we found disparities in the availability of performance counters. Since all ARM 64-bit hardware has performance counters and PAPI has support for ARM64 performance counter hardware, this is likely a problem with either the kernel or the device tree passed to the kernel from the firmware. You can check whether counters are accessible in your platform's configuration by running papi_avail(1):
# module load papi
# papi_avail
...
Number Hardware Counters : 0 (Xgene Mustang)
-- or --
Number Hardware Counters : 6 (Softiron Seattle)
-- or --
Number Hardware Counters : 0 (Cavium ThunderX)
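The check above can be scripted. The following is a minimal sketch, assuming the `Number Hardware Counters` line has the form shown in the output above; the function names `hw_counter_count` and `check_hw_counters` are hypothetical helpers, not part of PAPI.

```shell
#!/bin/sh
# Print the value of the "Number Hardware Counters" line read from stdin.
hw_counter_count() {
    awk -F: '/Number Hardware Counters/ { split($2, f, " "); print f[1]; exit }'
}

# Report whether PAPI can see any counters.
# Exit status: 0 if at least one counter is visible, 1 otherwise.
check_hw_counters() {
    count=$(hw_counter_count)
    if [ "${count:-0}" -gt 0 ]; then
        echo "PAPI sees ${count} hardware counters"
    else
        echo "PAPI sees no hardware counters; check the kernel and device tree"
        return 1
    fi
}

# Only query the real tool when it is on the PATH (after `module load papi`).
if command -v papi_avail >/dev/null 2>&1; then
    papi_avail | check_hw_counters
fi
```

A non-zero exit status makes it easy to flag affected nodes from a site health check before users try to profile on them.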
The other thing to note is that while the ARM Architecture specifies a core set of performance counters, many more may be available depending on the microarchitecture. We are in the process of working with the various silicon partners to make sure support for these is available in PAPI. As we discover workarounds enabling additional counter support for various platforms we will include them here.
Lustre client support has been available for a while on both the 32-bit and 64-bit ARM platforms. However, because different ARM platforms require kernels other than the standard ones found in the SLES-12-SP1 and CentOS-7.2 distributions, we couldn't easily build a Lustre client that would work for specific platform configurations. As better ARM support is added to commercial distributions (like the support now in SLES-12-SP2), this will become easier. For now, you'll have to build your own kernel if you want Lustre support.
While both of these libraries build, we discovered anomalies during testing that we have not yet been able to resolve. Once we have a workaround we will include it here.
MVAPICH packages compiled, but we did not have InfiniBand hardware in our testbeds at the time of this release to validate the packages or any instructions relating to them. We are working with platform vendors to acquire sufficient hardware to test this in the future. If you have working InfiniBand support on your ARM platform, you may be able to get the existing libraries to work on your own.
SLES 12 SP1 did not contain the PHP5 packages required for Nagios and Ganglia; CentOS-7.2 works fine. You may be able to work around this by building PHP5 yourself, or by finding a compatible PHP5 package that can be installed on SLES 12 SP1.
Network booting is a bit different on ARM platforms. ARM servers must all use UEFI firmware, so to network boot them at the moment you must netboot a GRUB2 EFI image, which then fetches the kernel and RAMFS from the server over TFTP. It is also important to remember that current ARM servers may use a different kernel than the one provided by a distribution. The best chance of success is to take the kernel and modules that come installed on the server and use those for network booting with a Warewulf-created ramdisk. Basic PXE boot instructions can be found on the Linaro website: https://wiki.linaro.org/LEG/Engineering/Kernel/UEFI/UEFI_Network_Booting . However, the Linaro instructions are specific to running on ARM emulators; specific instructions for the OpenHPC test platforms follow:
Obtain the kernel and modules from your current platform and place them in the /warewulf/bootstrap directory and /lib/modules. You'll need an Image kernel rather than a vmlinuz kernel; if your current platform doesn't have one, you'll need to build it from source.
Obtain a bootnetaa64.efi GRUB2 image from your distribution or build it yourself and put in directory
- Install grub2 packages:
% rpm -ihv http://build.openhpc.community/home:/eric/SLE_12_SP1/aarch64/grub2-2.02~beta3-1.1.aarch64.rpm http://build.openhpc.community/home:/eric/SLE_12_SP1/aarch64/grub2-arm64-efi-2.02~beta3-1.1.aarch64.rpm
- Create a working grub2 EFI binary and copy it into /srv/tftpboot/aarch64/grub.efi:
% grub2-mkimage -O arm64-efi -o grub.efi -p /aarch64/boot/grub2 `ls /usr/lib/grub2/arm64-efi/*.mod | cut -d . -f 1`
- Edit /srv/tftpboot/aarch64/boot/grub2/grub.cfg, adjusting the bootstrap path to match your Warewulf-generated setup:
echo Now booting ${net_efinet0_hostname} with Warewulf bootstrap
echo Loading kernel...
linux (tftp)/warewulf/bootstrap/6/Image ro wwhostname=$net_efinet0_hostname quiet wwmaster=$net_default_server \
    wwipaddr=$net_efinet0_ip wwnetmask=255.255.0.0 wwnetdev=eth0
echo Done!
echo Loading initrd...
initrd (tftp)/warewulf/bootstrap/6/initfs.gz
echo Done!
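Because the bootstrap number in the paths (6 in the fragment above) changes with your Warewulf setup, it can be convenient to generate the file rather than edit it by hand. The following is a minimal sketch that reproduces only the fragment shown above; `render_grub_cfg` and its arguments are hypothetical names, and the netmask and device defaults are simply the values from the example.

```shell
#!/bin/sh
# Emit a grub.cfg fragment for a given Warewulf bootstrap ID.
# Arguments: bootstrap ID, netmask (optional), network device (optional).
render_grub_cfg() {
    bootstrap_id=$1
    netmask=${2:-255.255.0.0}
    netdev=${3:-eth0}
    # \$ keeps GRUB's own variables literal; only the three shell
    # variables above are expanded when the file is generated.
    cat <<EOF
echo Now booting \${net_efinet0_hostname} with Warewulf bootstrap
echo Loading kernel...
linux (tftp)/warewulf/bootstrap/${bootstrap_id}/Image ro wwhostname=\$net_efinet0_hostname quiet wwmaster=\$net_default_server wwipaddr=\$net_efinet0_ip wwnetmask=${netmask} wwnetdev=${netdev}
echo Done!
echo Loading initrd...
initrd (tftp)/warewulf/bootstrap/${bootstrap_id}/initfs.gz
echo Done!
EOF
}
```

For example, `render_grub_cfg 6 > /srv/tftpboot/aarch64/boot/grub2/grub.cfg` would write the configuration for bootstrap 6 to the path used in the step above.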
# Override for ARM Servers
if substring (option vendor-class-identifier, 15, 5) = "00011" {
filename "aarch64/grub.efi";
}
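The DHCP override above keys on the PXE vendor class identifier that the client firmware sends. The sketch below mimics that match in shell to show what the `substring(..., 15, 5)` test is looking at. It assumes identifiers of the usual `PXEClient:Arch:NNNNN:UNDI:...` form, where architecture type 00011 is the code registered for 64-bit ARM UEFI. The sample identifier strings, the function names, and the `pxelinux.0` fallback are illustrative assumptions.

```shell
#!/bin/sh
# Extract the 5-digit architecture code from a PXE vendor class identifier.
# The DHCP substring offset is 0-based; cut is 1-based, hence characters 16-20.
pxe_arch_code() {
    printf '%s\n' "$1" | cut -c16-20
}

# Pick the boot file the way the override does: ARM64 UEFI clients get the
# GRUB2 EFI image, everything else falls through to a placeholder default.
boot_filename() {
    if [ "$(pxe_arch_code "$1")" = "00011" ]; then
        echo "aarch64/grub.efi"
    else
        echo "pxelinux.0"
    fi
}

boot_filename "PXEClient:Arch:00011:UNDI:003000"   # sample 64-bit ARM UEFI client
boot_filename "PXEClient:Arch:00000:UNDI:002001"   # sample legacy BIOS x86 client
```

If a worker ignores the override, comparing the identifier it actually sends (visible in the DHCP server log or a packet capture) against this pattern is a reasonable first debugging step.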
Right now it doesn't appear that IPMI PXE commands affect the UEFI boot configuration settings, so you'll have to interrupt the boot on the serial console and configure PXE manually on each worker. This is also a good time to capture the hardware MAC address to give to DHCP and Warewulf if you don't know it already.
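Since the MAC address is copied by hand from the serial console, a quick sanity check before registering the node can save a failed boot cycle. The sketch below validates the address and prints, without running it, a registration command in the form used by the OpenHPC installation guides. The helper names, node name, and IP address are placeholders, and the `wwsh` options should be checked against your installed Warewulf version.

```shell
#!/bin/sh
# Succeed only for a colon-separated, six-octet hexadecimal MAC address.
is_mac() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# Print (do not execute) a Warewulf node registration command.
# Arguments: node name, IP address, MAC address, provisioning device (optional).
print_node_cmd() {
    name=$1
    ip=$2
    mac=$3
    dev=${4:-eth0}
    if ! is_mac "$mac"; then
        echo "invalid MAC address: $mac" >&2
        return 1
    fi
    echo "wwsh -y node new $name --ipaddr=$ip --hwaddr=$mac -D $dev"
}

print_node_cmd c1 172.16.1.1 00:1a:2b:3c:4d:5e
```

Printing the command instead of executing it leaves a chance to review each line before it changes the Warewulf datastore.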
- Cavium ThunderX uArchitecture, armv8
- ThunderX (version a1)
- 2 socket, 48-core, 128GB of Memory
- Linux Version 4.4.21-64-default
- Tested against SLES-12-SP1 install
- EFI v2.40 by Cavium Thunder cn88xx EFI ThunderX-Firmware-Release-1.22.9-15-gcc66a09 Aug 4 2016 16:55:45
APM X-C1 Server Development Platform (Mustang)
- APM X-Gene uArchitecture, armv8
- APM X-Gene-1
- 1 socket, 8 core, 16GB of memory
- Linux Version 4.4.11-reference.135.aarch64
- Tested against CentOS-7.2 install
- EFI v2.40 by X-Gene Mustang Board EFI Nov 24 2015 13:22:41
- ARM Cortex-A57 uArchitecture, armv8
- AMD Seattle Processor (Rev.B0)
- 1 socket, 8 core, 16GB of Memory
- Linux version 4.4.21-64-default
- Tested against SLES-12-SP1 install
- EFI v2.40 by American Megatrends
Please feel free to send any questions related to this Tech Preview to the OpenHPC mailing list (https://groups.io/g/openhpc-users). We will do our best to answer them and to post the responses so that others can benefit.