Test the job grouping capability of CiGri #14

Open
jgaida opened this issue Jul 10, 2015 · 1 comment

jgaida commented Jul 10, 2015

CiGri provides two options for grouping tasks into OAR jobs:

  • dimensional_grouping: allows several jobs to run in parallel within a
    single submission, if possible
  • temporal_grouping: allows several jobs to run one after the other
    within a single submission. The number of jobs is computed automatically
    by CiGri
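
To make the idea of temporal grouping concrete, here is a minimal Ruby sketch. It is not CiGri code: `group_temporally` and the fixed `batch_size` are hypothetical names chosen for illustration, whereas CiGri is described as computing the number of jobs per submission automatically.

```ruby
# Hypothetical illustration, not CiGri code: pack tasks into batches so that
# each batch could become one OAR submission whose tasks run sequentially.
def group_temporally(tasks, batch_size)
  raise ArgumentError, "batch_size must be positive" unless batch_size.positive?

  # each_slice cuts the list into consecutive chunks of at most batch_size.
  tasks.each_slice(batch_size).to_a
end

tasks = (1..10).map { |i| "task_#{i}" }
batches = group_temporally(tasks, 4)

batches.each_with_index do |batch, i|
  puts "submission #{i}: #{batch.join(', ')}"
end
```

With 10 tasks and a batch size of 4, this yields three submissions instead of ten, which is the kind of reduction in OAR job count that temporal_grouping is meant to provide.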

Before advising users to use this feature, we should check if these options work as expected.

jgaida self-assigned this Jul 10, 2015
jgaida added this to the ASAP milestone Jul 10, 2015

jgaida commented Sep 29, 2015

I investigated this issue and here is what I found.

  • both dimensional_grouping and temporal_grouping are documented in doc_jdl.rst.
  • dimensional_grouping is not implemented (cf. comments in lib/cigri-joblib.rb)
  • temporal_grouping seems to be implemented by submit_batch_job (in lib/cigri-joblib.rb) but I was unable to use it.

For temporal_grouping, my campaign script looks like this:

  {
    "name": "Some campaign",
    "nb_jobs": 40,

    "resources": "nodes=1",
    "properties": "",
    "exec_file": "$HOME/script.sh",
    "exec_directory": "$HOME",
    "temporal_grouping": "yes",

    "clusters": {
      "nancy": {}
    }
  }

The temporal_grouping property is properly inserted into the database:

cigri=# select * from campaign_properties where campaign_id=90;
  id  | cluster_id | campaign_id |          name           | value
------+------------+-------------+-------------------------+-----------------
 5743 |          5 |          90 | checkpointing_type      | None
 5744 |          5 |          90 | dimensional_grouping    | false
 5745 |          5 |          90 | exec_file               | $HOME/script.sh
 5746 |          5 |          90 | output_gathering_method | None
 5747 |          5 |          90 | properties              |
 5748 |          5 |          90 | resources               | nodes=1
 5749 |          5 |          90 | temporal_grouping       | yes
 5750 |          5 |          90 | walltime                | 03:00:00
 5751 |          5 |          90 | type                    | best-effort
 5752 |          5 |          90 | test_mode               | false
 5753 |          5 |          90 | project                 |
 5754 |          5 |          90 | exec_directory          | $HOME
(12 rows)

After that, the submit_batch_job function is supposed to do the actual grouping. It is called by submit2 if temporal_grouping is in the option list. Inside the by_options_jobs.each do |runner_options,jobs| loop of submit2, I only get the following options:

{"besteffort"=>true, "batch_id"=>2}

I think these options come from the default configuration of the runner rather than from the campaign properties (?). Bottom line: submit_batch_job is never called, because temporal_grouping is absent from the option list.
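
The behaviour described above can be modelled with a small stand-alone Ruby sketch. This is not the code from lib/cigri-joblib.rb: the `dispatch` helper, the job hashes and the returned symbols are assumptions made for illustration. Only the names submit_batch_job and temporal_grouping, and the observed option hash, come from the investigation.

```ruby
# Hypothetical model of the dispatch described above; not CiGri's actual code.
def dispatch(jobs)
  # Group jobs by their runner options, loosely mirroring the
  # by_options_jobs loop mentioned in the comment.
  by_options_jobs = jobs.group_by { |job| job[:runner_options] }

  by_options_jobs.map do |runner_options, grouped|
    if runner_options.key?("temporal_grouping")
      [:submit_batch_job, grouped.length]   # grouping path
    else
      [:submit_single_jobs, grouped.length] # fallback path (name is made up)
    end
  end
end

# Options actually observed inside the loop: no temporal_grouping key.
observed = { "besteffort" => true, "batch_id" => 2 }
observed_jobs = Array.new(3) { |i| { id: i, runner_options: observed } }
p dispatch(observed_jobs)

# Options one would expect if the campaign property were propagated.
expected = observed.merge("temporal_grouping" => "yes")
expected_jobs = Array.new(3) { |i| { id: i, runner_options: expected } }
p dispatch(expected_jobs)
```

In this model the grouping path is only reached when the campaign property is copied into the runner options, which suggests that the place to look is wherever those options are built from campaign_properties.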
