Explain how to read record of jobs from the database #80
Architecture
The job scheduling daemons and the user interface applications in Slurm all synchronize the system state through a database. SlurmDBD (the Slurm Database Daemon) is the interface between the database and the services and applications that access its data. The reasons behind the use of
Consider for instance the
There is a single database for all clusters on site (UL HPC follows the same architecture). This configuration allows access across clusters, for instance in applications that submit jobs and job dependencies across clusters. The back end is a MySQL database (the default and most tested); PostgreSQL is also supported, but with a few front-end features missing.
Data access patterns
All
Always update the Slurm database daemon (
Data scheme
The basic unit in the Slurm database is an association: a combination of cluster, account, user name, and (optionally) partition name. Data are maintained per association. For instance, each association can have different limits assigned to it.
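To make the association idea concrete, here is a minimal Python sketch of an association used as a lookup key for limits. It is only an illustration, not Slurm's actual schema: the class name `Association`, the `limits` dictionary, the limit name `max_jobs`, and all cluster, account, and user names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Association:
    """Hypothetical model of an association: cluster, account, user, optional partition."""
    cluster: str
    account: str
    user: str
    partition: Optional[str] = None  # the partition name is optional


# Limits are stored per association; "max_jobs" is a made-up limit name.
limits: Dict[Association, Dict[str, int]] = {}

a = Association("cluster_a", "project_x", "alice")
b = Association("cluster_a", "project_x", "alice", partition="gpu")

# The same user and account form a different association once a partition is added,
# so the two entries can carry different limits.
limits[a] = {"max_jobs": 100}
limits[b] = {"max_jobs": 10}
```

Because the dataclass is frozen, instances are hashable and compare by value, which is what lets the tuple of fields act as the key.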
Association and account management
Account names are hierarchical, but no account name can be repeated (i.e. no cycles). Coordinators can manage users lower in the hierarchy. Resources are inherited from parent accounts. The command to manage accounts and associations is
Fair Share Scheduling
Apart from hard absolute limits for associations, the fair share scheduling plugin allows for more flexible management of access to resources. Soft limits, for instance setting the resources available to an association to a fraction of the total, allow jobs of the association to be scheduled up to the limit; once the limit is exceeded, the priority of all further jobs of that association decreases.
Resources:
Tools of Slurm to monitor jobs
Slurm provides a set of tools to monitor and analyze jobs during their execution and after their completion.
Derived scripts
The UL HPC provides some convenient shortcuts for a few common commands.
Monitoring tools |
The command
can be used to print details of active and past jobs from the database. The relevant section should be extended to explain the meaning of the reported fields.
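As a starting point for documenting the fields, here is a rough Python sketch of how job records could be read into dictionaries keyed by field name. It assumes the command in question is `sacct` and that its output was produced with `--parsable2`, which prints pipe-delimited columns with a header row. The function name `parse_sacct` and the sample records are made up for illustration and are not taken from a real cluster.

```python
from typing import Dict, List


def parse_sacct(text: str) -> List[Dict[str, str]]:
    """Parse pipe-delimited text with a header row into one dict per record."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split("|")
    # Pair each column value with its header name.
    return [dict(zip(header, ln.split("|"))) for ln in lines[1:]]


# Made-up sample in the shape of:
#   sacct --parsable2 --format=JobID,JobName,State,Elapsed
SAMPLE = """JobID|JobName|State|Elapsed
1001|train|COMPLETED|00:10:00
1001.batch|batch|COMPLETED|00:10:00
1002|test|FAILED|00:00:05
"""

records = parse_sacct(SAMPLE)
failed = [r["JobID"] for r in records if r["State"] == "FAILED"]
```

Each job step (such as `1001.batch` in the sample) appears as its own record, so the documentation should probably explain the difference between job and step rows as well.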