fix: sbatch stderr parsing #161
@johanneskoester care to have a look? @fgvieira can you test this on your cluster, too? If either one of you can test this, it would be great. I do not trust the CI enough to consider this a merciless test. Also, I would like to have feedback from @freekvh on how this code behaves on Snellius.
Actionable comments posted: 1
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (1)
snakemake_executor_plugin_slurm/__init__.py
(1 hunks)
🧰 Additional context used
🪛 Ruff
snakemake_executor_plugin_slurm/__init__.py
233-235: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
🔇 Additional comments (2)
snakemake_executor_plugin_slurm/__init__.py (2)
220-231: Chain exceptions to preserve traceback information
When re-raising exceptions, it's recommended to use `from e` to maintain the original traceback. This provides better context for debugging.
Apply this diff to chain the exception:
     except subprocess.CalledProcessError as e:
         raise WorkflowError(
             f"SLURM job submission failed. The error message was {e.output}"
    -    )
    +    ) from e
Likely invalid or redundant comment.
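To illustrate what the suggested `from e` changes at runtime, here is a minimal sketch. It assumes a locally defined `WorkflowError` as a stand-in for Snakemake's exception class, a hypothetical helper `run_checked`, and a small Python one-liner in place of a failing `sbatch` call; none of these are taken from the plugin itself.

```python
import subprocess
import sys


class WorkflowError(Exception):
    """Stand-in for Snakemake's WorkflowError (hypothetical local definition)."""


def run_checked(cmd):
    """Run cmd; re-raise a failure as WorkflowError chained to the original error."""
    try:
        return subprocess.check_output(cmd, text=True, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as e:
        # "from e" stores the original exception in __cause__, so the
        # traceback shows both the wrapper and the underlying failure.
        raise WorkflowError(
            f"SLURM job submission failed. The error message was {e.output}"
        ) from e


# Stand-in for a failing sbatch call: prints a message and exits with code 1.
failing_cmd = [sys.executable, "-c", "import sys; print('boom'); sys.exit(1)"]

try:
    run_checked(failing_cmd)
except WorkflowError as err:
    chained = err.__cause__  # the original CalledProcessError
```

With the chain in place, the traceback reports the `CalledProcessError` as the direct cause of the `WorkflowError`, which is what Ruff's B904 rule asks for.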
236-239: Ensure `stderr` indicates an error before raising `WorkflowError`
SLURM may write informational messages to `stderr` even when the submission is successful. Raising an error based solely on `stderr` content may cause false positives. Consider checking if `stderr` truly indicates a failure.
Run the following script to assess if `sbatch` emits non-critical messages to `stderr`:
This script submits a test job and captures any stderr output to determine if it contains non-error messages.
✅ Verification successful
Let me analyze the code further to understand how errors are handled in the codebase.
Based on the search results, I can now provide a final response:
The error handling approach is correct and should be maintained
Looking at the codebase's error handling patterns, the current implementation is appropriate because:
- The code already handles subprocess errors properly by checking return codes first: `if process.returncode != 0: raise subprocess.CalledProcessError(...)`
- The `err` check is a secondary validation that comes after checking return codes, serving as an additional safety net for cases where SLURM might indicate errors through stderr without failing the return code.
- Throughout the codebase, stderr messages from SLURM commands (sacct, sinfo, etc.) are consistently treated as error indicators when combined with non-zero exit codes, showing this is the established pattern.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Submit a simple job and capture stderr to check for benign messages.
sbatch_output=$(sbatch --wrap="echo 'Test Job'" 2>/tmp/sbatch_stderr.log)
echo "Captured sbatch stderr output:"
cat /tmp/sbatch_stderr.log
Length of output: 231
Script:
#!/bin/bash
# Search for error handling patterns in SLURM-related code
rg -A 5 "stderr|err" snakemake_executor_plugin_slurm/
Length of output: 9925
Script:
#!/bin/bash
# Look for SLURM job submission patterns
ast-grep --pattern 'if $err:
$$$
'
Length of output: 38
Script:
#!/bin/bash
# Check for any test files that might show error handling
fd test snakemake_executor_plugin_slurm/
Length of output: 42
Hi @cmeesters, I'm very much willing to test (on Snellius), but I am not getting this PR to work with poetry (see my comment here: #157 (comment)). If you can advise how I can do this some other way, or fix the datrie-package related issues, I'd be happy to take another look.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 2
🔇 Additional comments (1)
snakemake_executor_plugin_slurm/__init__.py (1)
246-248: LGTM: Job ID parsing looks good
The implementation properly handles:
- Multicluster submissions by splitting on semicolon
- Empty job ID validation
- Whitespace trimming
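The three points above can be sketched as follows. This is only an illustration, not the plugin's code: the function name `parse_sbatch_jobid` is hypothetical, `ValueError` stands in for whatever exception the plugin raises, and the input is assumed to be the stdout of `sbatch --parsable`, which prints the job ID followed by `;` and the cluster name when a cluster is given.

```python
def parse_sbatch_jobid(stdout: str) -> str:
    """Extract the job ID from `sbatch --parsable` style output (hypothetical helper)."""
    # Multicluster submissions print "jobid;clustername"; keep only the ID part
    # and trim surrounding whitespace such as the trailing newline.
    jobid = stdout.strip().split(";")[0].strip()
    if not jobid:
        # An empty ID means the output could not be interpreted as a submission.
        raise ValueError("sbatch returned an empty job ID")
    return jobid


single = parse_sbatch_jobid("12345\n")
multi = parse_sbatch_jobid("67890;clusterA\n")
```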
I tested this on Snellius and had issues. I added the details to our original issue, #157; see #157 (comment).
next attempt to fix the behaviour described in #161

## Summary by CodeRabbit

- **Bug Fixes**
  - Improved error handling for SLURM job submissions, providing clearer indications of submission failures.
  - Enhanced error reporting with detailed messages for job submission failures.
- **Chores**
  - Minor adjustments to comments and formatting for better clarity.

---------

Co-authored-by: Filipe G. Vieira <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
🤖 I have created a release *beep* *boop*

---

## [0.11.2](v0.11.1...v0.11.2) (2024-11-07)

### Bug Fixes

* sbatch stderr parsing ([#161](#161)) ([0368197](0368197))
* sbatch stderr parsing [#2](#2) ([#165](#165)) ([348e537](348e537))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
will hopefully fix #157
The issue is that the submission joined `stderr` and `stdout` of the `sbatch` call. Without add-ons, `sbatch` emits only to `stdout`, and to `stderr` only in the case of an error. However, admins can add informative messages to `stderr`; when this occurs, parsing the message for the JobID failed. Now, `stderr` and `stdout` are considered separately.
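A rough sketch of the idea of handling the two streams separately is shown below. It is not the code from this PR: the helper `submit` is hypothetical, `RuntimeError` stands in for `WorkflowError`, and a Python one-liner plays the role of an `sbatch` that prints a job ID to `stdout` while a site-specific notice goes to `stderr`. How the plugin itself treats non-empty `stderr` on a successful submission is not modelled here.

```python
import subprocess
import sys


def submit(cmd):
    """Run a submission command, keeping stdout and stderr apart (hypothetical helper)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise RuntimeError(f"submission failed: {proc.stderr.strip()}")
    # Only stdout is parsed for the job ID, so extra text on stderr
    # cannot corrupt the parsed value.
    jobid = proc.stdout.strip().split(";")[0]
    return jobid, proc.stderr.strip()


# Stand-in for sbatch: job ID on stdout, an informative admin message on stderr.
fake_sbatch = [
    sys.executable,
    "-c",
    "import sys; print('4242;cluster'); "
    "print('NOTE: maintenance on Friday', file=sys.stderr)",
]
jobid, notice = submit(fake_sbatch)
```

Had the two streams been merged, the notice would have ended up in the same string as the job ID, which is the parsing failure described above.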