
Conference notes for 2021 07 06

Robert McLay edited this page Aug 3, 2021 · 1 revision

Agenda

  • Introduction
  • Purpose of these meetings
  • Limited to 1 hour
  • Discuss "ml overview" vs "ml -d av"
  • Poll attendees for other topics of interest

Attendees (15)

  • Robert McLay (TACC)
  • Kenneth Hoste (HPC-UGent)
  • Bennet Fauber (Univ. of Michigan)
  • Lev Gorenstein
  • Maxime Boissonneault (Compute Canada)
  • Jeremy Siadal (Intel)
  • Shahzeb Siddiqui (NERSC?)
  • Zhoufei Hou
  • Benjamin Cronheim (Univ. of Georgia)
  • Cecilia Villaveces
  • Joey Dumont
  • Paul Brunk (Univ. of Georgia)
  • Shan-Ho Tsai
  • Shelly Johnson
  • Matt Thompson

After meeting comments:

  • Thanks to all of you for coming to yet another Zoom meeting. The next meeting will be on August 3, 2021 at 9 am Central (GMT-5).
  • See the Lmod Wiki Home Page for the Zoom link.

avail

  • ideas for "module avail"
      - "ml -d avail": only show defaults
          - Maxime: it could make sense to let Lmod produce output only for default versions in "module avail" (configurable)
          - or only show the latest version of every module
              - R: that's a bigger change
              - could be done by implementing your own logic in a hook
              - special cases due to funky versioning schemes (cf. LAMMPS)
      - new command: "ml overview"
          - only lists module names (without versions) plus a version count
          - same information as "module avail" (modules that can be loaded), but summarized
          - Bennet: how does this relate to "module spider"?
              - R: spider already gives you a list of matching modules
          - problem at TACC: lots of bioinfo containers
              - 8k modules, and growing
              - looking for ways to limit the amount of info produced
              - also groups of modules that only become visible after loading the "bioinfo" module (separate cache)
          - Shahzeb: output similar to "module avail"?
              - could be just an option to "module avail"

                      $ module avail
                      GCC/9.3   GCC/10.3   GCC/11.1
    
                      $ module overview
                      GCC (3)
    
                      - Shahzeb: how about condensed view with versions:
    
                              $ module --short avail
                              GCC: 9.3, 10.3, 11.1
    
                              - R: this could get complicated with different module naming schemes like <cat>/<version>, etc.
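
The two proposed summaries can be sketched with a few lines of shell and awk. This is only an illustration of the intended output, not how Lmod implements (or would implement) either command. It assumes a plain list of `name/version` entries, one per line, which is exactly the simple naming scheme R. warns does not always hold. The function names `overview` and `short_avail` and the `OpenMPI` entry are made up for the example.

```shell
#!/bin/sh
# Sample "name/version" entries, one per line. The GCC versions come from the
# notes; the OpenMPI entry is invented to show grouping across names.
modules='GCC/9.3
GCC/10.3
GCC/11.1
OpenMPI/4.1.1'

# overview: print each module name followed by its number of versions.
# Assumes exactly one "/" separating name and version.
overview() {
    awk -F/ '{ count[$1]++ }
             END { for (n in count) printf "%s (%d)\n", n, count[n] }' | sort
}

# short_avail: print each module name followed by its versions, in input order.
short_avail() {
    awk -F/ '
        {
            if ($1 in versions) {
                versions[$1] = versions[$1] ", " $2
            } else {
                order[++k] = $1
                versions[$1] = $2
            }
        }
        END { for (i = 1; i <= k; i++) printf "%s: %s\n", order[i], versions[order[i]] }'
}

printf '%s\n' "$modules" | overview
printf '%s\n' "$modules" | short_avail
```

A scheme such as `<cat>/<name>/<version>` would need a different split rule, which is the complication raised in the discussion.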
    
  • Maxime: would filtering based on properties be possible?

      $ module avail --prop=gpu
    
      - R: please suggest an interface for how this could look
      - R: list of properties? (colon-separated?)
              $ module avail --prop=gpu:bio
    
              - how about AND vs OR?
              $ module avail --prop=gpu+bio (AND)
              $ module avail --prop=gpu,bio (OR)
    
                      - idea Kenneth: --prop-any vs --prop-all?
    
              - Bennet: special whatis lines to be able to filter based on categories via "module keyword"
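
The AND versus OR question can be made concrete with a small sketch. None of the flags discussed above exist in Lmod as far as these notes say; the `filter_props` function, its `all`/`any` modes (mirroring Kenneth's --prop-all and --prop-any idea), the input format, and the module names are all hypothetical.

```shell
#!/bin/sh
# Hypothetical sample data: "module-name property[:property...]" per line,
# with "-" meaning "no properties". All names are placeholders.
sample='alpha/1.0 gpu
beta/2.0 gpu:bio
gamma/3.0 bio
delta/4.0 -'

# filter_props MODE PROPS
#   MODE   "all" (AND) or "any" (OR)
#   PROPS  colon-separated list of wanted properties, e.g. "gpu:bio"
# Reads sample lines on stdin and prints the names of matching modules.
filter_props() {
    awk -v mode="$1" -v wanted="$2" '
        BEGIN { nw = split(wanted, want, ":") }
        {
            np = split($2, have, ":")
            hits = 0
            # Count how many wanted properties this module has.
            for (i = 1; i <= nw; i++)
                for (j = 1; j <= np; j++)
                    if (want[i] == have[j]) { hits++; break }
            if ((mode == "all" && hits == nw) || (mode == "any" && hits > 0))
                print $1
        }'
}

printf '%s\n' "$sample" | filter_props all gpu:bio   # AND: only beta/2.0
printf '%s\n' "$sample" | filter_props any gpu:bio   # OR: alpha, beta, gamma
```

Two explicit modes avoid having to remember which separator means AND and which means OR.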
    
  • Shahzeb: "module --mt" shows the list of active families; is there a better way to get a list of all known families?
      - R: this is probably available in the output of the "spider" command (not "module spider")
      - related to Cray: lots of families used there (perftools, hugepages, PrgEnv, ...)
      - R: please make a specific suggestion of how this could look
          - via avail or spider?
          - which question do you want answered?
              - correlation of module names & families
          - how to display the result?

  • Bennet: how do "categories" of modules fit in with the module hierarchy?
      - (Kenneth: this is actually "moduleclass" in EasyBuild)
      - would be nice to be able to query for modules in a specific category
      - R: should this be an option to keyword?
      - Bennet: something like "ml keyword --category"?

  • Bennet: query whether a particular module is also available for another compiler+MPI combo
      - R: "module spider foo/1.2.3" output tells you this

  • Shahzeb: using an Lmod hook to set up properties dynamically
      - properties to show for each module are determined when running "module list"
      - based on the ComputeCanada approach for families
      - output only pops up in "module list", not in "module avail"
      - Maxime: properties show up in both "avail" and "list" output for us
          - need to call the same function in both the "avail" and "list" hooks
          - load_hook and spider_hook
          - load_hook is triggered every time the module file is interpreted by Lmod ("loaded" in the broad sense)
      - R: you could also define your own function, and call it from all module files
      - Shahzeb: could be a case of better documentation
      - R: use "ml -D" to figure out what's going on
      - Bennet: setting up a small module tree in a GitHub repo to demonstrate something to Robert can help a lot

https://docs.nersc.gov/environment/lmod/#module-properties

elvis@perlmutter> module list

Currently Loaded Modules:
   1) nvidia/20.9     (g,c)   4) libfabric/1.11.0.3.66   7) perftools-base/21.02.0                    (dev)   10) PrgEnv-nvidia/8.0.0 (cpe)  13) cray-pmi/6.0.10
   2) craype/2.7.6    (c)     5) craype-network-ofi      8) xpmem/2.2.40-7.0.1.0_1.9__g1d7a24d.shasta (H)     11) xalt/2.10.2                14) cray-pmi-lib/6.0.10
   3) craype-x86-rome         6) cray-dsmml/0.1.4        9) cray-libsci/21.04.1.1                     (math)  12) darshan/3.2.1       (io)   15) Default

   Where:
    g:     built for GPU
    cpe:   Cray Programming Environment Modules
    math:  Mathematical libraries
    io:    Input/output software
    c:     Compiler
    dev:   Development Tools and Programming Languages
    H:     Hidden Module

https://github.com/ComputeCanada/software-stack-config/blob/main/lmod/SitePackage.lua#L261-L272

local function load_hook(t)
        local valid = validate_license(t)
        set_props(t)
        set_family(t)
        default_module_change_warning(t)
        log_module_load(t,true)
        set_local_paths(t)
end
local function spider_hook(t)
        set_props(t)
        set_local_paths(t)
end

  • Jeremy: conversion of Tcl module files: making isavail available?
      - R: no.
          - would require rewriting all of Lmod in Tcl...
          - the conversion should cover the most basic functionality that people use it for
          - the one-name rule that's baked into Lmod makes things complicated
              - figuring out which part of a module file path is the name and which is the version is not trivial
          - should ask Xavier to support tryload in Environment Modules 4.x (Tmod)
              - this would avoid the need for isavail in Lmod entirely...
              - typical usage of isavail is to check whether a module is available before loading it
      - Jeremy: couldn't this be implemented by essentially running a "module avail" from the module file (which is what isavail does)?
          - using isavail in an if statement requires that the isavail check be done in Tcl itself
          - it can't be done after the conversion of the Tcl module file to Lua
          - similar problem for functions in Tcl module files
      - Joey: are the local variables available during the evaluation of the (Tcl) module file?
          - R: yes
              - may need a specific example to understand the question better
              - a Tcl module file is evaluated entirely in Tcl
              - tcl2lua generates print statements to make changes in the environment, which are applied by Lmod
      - Jeremy: is there a workaround to check whether a module can be loaded without actually loading it?
          - can be done easily in Lua syntax, using the isavail Lua function
          - potential hacks:
              - implement an isavail Tcl function in Lmod: lots of work
              - call Lua stuff from Tcl: not trivial either
      - Jeremy: oneAPI includes a bunch of Tcl environment modules
          - Kenneth: how about providing module files in both Tcl and Lua syntax?
              - Lmod will ignore the Tcl module file if there's a corresponding Lua one
              - Tmod will ignore Lua module files because they don't include the "#%Module" magic string
              - seems feasible if module files are generated
          - Maxime: are there many sites that use vendor-generated module files?
              - Compute Canada doesn't
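
Kenneth's suggestion of shipping both syntaxes relies on the "#%Module" magic string. The sketch below shows that check in isolation. It is an illustration of the idea, not Tmod's or Lmod's actual code; the helper name `has_tcl_magic`, the file names `1.0` and `1.0.lua`, and the file contents are assumptions made for the demo.

```shell
#!/bin/sh
# has_tcl_magic FILE: succeed if the first line of FILE starts with "#%Module".
has_tcl_magic() {
    first_line=''
    # read fails on an empty file; first_line then stays empty.
    IFS= read -r first_line < "$1" || true
    case $first_line in
        '#%Module'*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Demo with two throwaway files: a Tcl module file and a Lua counterpart.
tmpdir=$(mktemp -d)
printf '#%%Module1.0\nsetenv FOO bar\n' > "$tmpdir/1.0"
printf 'setenv("FOO", "bar")\n'         > "$tmpdir/1.0.lua"

for f in "$tmpdir/1.0" "$tmpdir/1.0.lua"; do
    if has_tcl_magic "$f"; then
        echo "${f##*/}: has the magic string (Tcl module file)"
    else
        echo "${f##*/}: no magic string (a Tcl-only tool would skip it)"
    fi
done
rm -rf "$tmpdir"
```

If the module files are generated, emitting both variants from one source keeps them in sync, which is why the notes call the approach feasible in that case.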

  • Bennet: fleshing out reporting of Lmod to syslog
      - R: good topic for next time?
      - currently at TACC: module usage data is used to decide which software installs can be cleaned up
      - Bennet: we log some custom information, such as whether a "module load" was triggered by a human or by another module
          - R: there are examples of this in the contrib/ directory (contributed by Ward)
      - Bennet: can information like that be included in the syslog messages?
          - yes, see HPC-UGent (https://github.com/hpcugent/Lmod-UGent/blob/master/SitePackage.lua#L40)

  • same Zoom link for next monthly meeting
      - see Lmod wiki: https://github.com/TACC/Lmod/wiki
      - planning to set up a hackmd.io document to collect questions up front

  • another monthly meeting will be set up for XALT, probably in the middle of the month
