Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Instanovo #51796

Open
wants to merge 19 commits into
base: master
Choose a base branch
from
Open

Add Instanovo #51796

wants to merge 19 commits into from

Conversation

BioGeek
Copy link
Contributor

@BioGeek BioGeek commented Oct 29, 2024

Add instanovo, a package for de novo peptide sequencing


Please read the guidelines for Bioconda recipes before opening a pull request (PR).

General instructions

  • If this PR adds or updates a recipe, use "Add" or "Update" appropriately as the first word in its title.
  • New recipes not directly relevant to the biological sciences need to be submitted to the conda-forge channel instead of Bioconda.
  • PRs require reviews prior to being merged. Once your PR is passing tests and ready to be merged, please issue the @BiocondaBot please add label command.
  • Please post questions on Gitter or ping @bioconda/core in a comment.

Instructions for avoiding API, ABI, and CLI breakage issues

Conda is able to record and lock (a.k.a. pin) dependency versions used at build time of other recipes.
This way, one can avoid that expectations of a downstream recipe with regards to API, ABI, or CLI are violated by later changes in the recipe.
If not already present in the meta.yaml, make sure to specify run_exports (see here for the rationale and comprehensive explanation).
Add a run_exports section like this:

build:
  run_exports:
    - ...

with ... being one of:

Case run_exports statement
semantic versioning {{ pin_subpackage("myrecipe", max_pin="x") }}
semantic versioning (0.x.x) {{ pin_subpackage("myrecipe", max_pin="x.x") }}
known breakage in minor versions {{ pin_subpackage("myrecipe", max_pin="x.x") }} (in such a case, please add a note that shortly mentions your evidence for that)
known breakage in patch versions {{ pin_subpackage("myrecipe", max_pin="x.x.x") }} (in such a case, please add a note that shortly mentions your evidence for that)
calendar versioning {{ pin_subpackage("myrecipe", max_pin=None) }}

while replacing "myrecipe" with either name if a name|lower variable is defined in your recipe or with the lowercase name of the package in quotes.

Bot commands for PR management

Please use the following BiocondaBot commands:

Everyone has access to the following BiocondaBot commands, which can be given in a comment:

@BiocondaBot please update Merge the master branch into a PR.
@BiocondaBot please add label Add the please review & merge label.
@BiocondaBot please fetch artifacts Post links to CI-built packages/containers.
You can use this to test packages locally.

Note that the @BiocondaBot please merge command is now depreciated. Please just squash and merge instead.

Also, the bot watches for comments from non-members that include @bioconda/<team> and will automatically re-post them to notify the addressed <team>.

Copy link
Contributor

coderabbitai bot commented Oct 29, 2024

📝 Walkthrough
📝 Walkthrough

Walkthrough

The pull request introduces an updated meta.yaml file for the instanovo package, which includes essential metadata and build instructions for the Python project. The package is named "instanovo" with a version set to "1.0.0". It specifies the source URL for downloading the package's tarball from PyPI and includes a SHA-256 checksum for integrity verification. The build section indicates that the package is architecture-independent and provides a script for installation via pip, without dependencies or build isolation. The requirements section details both host and runtime dependencies, including specific version constraints for Python and various libraries, with notable updates such as a new dependency on pytorch and a modified version constraint for numpy. The test section outlines commands for verifying the installation and checking for dependency issues. Additional information includes the project's homepage, a summary of its purpose, licensing details, and a list of maintainers, ensuring comprehensive documentation of the package's structure and requirements.

Possibly related PRs

Suggested labels

please review & merge


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@BioGeek
Copy link
Contributor Author

BioGeek commented Oct 29, 2024

@coderabbitai full review

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
recipes/instanovo/meta.yaml (1)

47-54: Consider adding functional tests.

While the current tests verify installation and dependencies, consider adding basic functional tests to ensure core functionality works. For example, test loading a model or processing a small example.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between ba3cde9 and d846578.

📒 Files selected for processing (1)
  • recipes/instanovo/meta.yaml (1 hunks)
🧰 Additional context used
🪛 yamllint
recipes/instanovo/meta.yaml

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)

🔇 Additional comments (6)
recipes/instanovo/meta.yaml (6)

12-18: LGTM! Build configuration follows best practices.

The build configuration correctly:

  • Uses noarch: python for platform independence
  • Includes run_exports to prevent ABI incompatibilities
  • Uses appropriate pip installation flags

62-64: LGTM! Maintainer section is properly configured.


56-60: Verify license file existence.

Please ensure that LICENSE.md exists in the source distribution.

✅ Verification successful

License file verification successful

The LICENSE.md file exists in the source distribution at instanovo-1.0.0/LICENSE.md, confirming that the license file reference in meta.yaml is correct.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if LICENSE.md exists in the source distribution
VERSION="1.0.0"
PACKAGE="instanovo"
URL="https://pypi.io/packages/source/${PACKAGE:0:1}/$PACKAGE/$PACKAGE-$VERSION.tar.gz"
curl -sL "$URL" | tar tz | grep -i license

Length of output: 240


8-10: Verify source checksum.

The SHA256 checksum should be verified against the PyPI source package.


1-6: Verify package version compatibility.

The package is being introduced at version 1.0.0. Please ensure this matches the latest stable release on PyPI and the referenced GitHub repository.

🧰 Tools
🪛 yamllint

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)


20-24: Consider broadening Python version support.

The current Python version constraint (>=3.10,<3.12) seems restrictive. If possible, consider testing with Python 3.9 to broaden compatibility.

Comment on lines 25 to 45
run:
- click >=8.1.7
- datasets >=3.0.1
- hydra-core >=1.3.2
- jaxtyping >=0.2.34
- jiwer >=3.0.4
- matchms >=0.27.0
- neptune >=1.12.0
- numpy >=1.26.4
- omegaconf >=2.3.0
- pandas >=2.2.3
- polars >=1.9.0
- pyopenms >=3.2.0
- python-dotenv >=1.0.1
- pytorch-lightning >=2.4.0
- s3fs >=2024.6.1
- scikit-learn >=1.5.2
- spectrum_utils >=0.4.2
- tensorboard >=2.18.0
- pytorch >=2.4.1
- tqdm >=4.66.5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Based on the verification results, I can now provide a final response about the dependency availability concerns:

Several dependencies require attention in conda channels

Some required dependencies are not readily available in the standard conda channels:

  • pytorch package name might be pytorch-cpu or pytorch-gpu in conda-forge
  • matchms, spectrum-utils, and pyopenms were not found in conda-forge or bioconda
  • Other core dependencies like numpy, pandas, scikit-learn, pytorch-lightning, jiwer, and jaxtyping are available in conda-forge

Please ensure these packages are available in the specified versions through conda channels or consider providing alternative installation methods in the documentation.

🔗 Analysis chain

Verify dependency availability in conda channels.

Please ensure all runtime dependencies are available in conda-forge or bioconda channels with the specified versions.


Since the sandbox environment doesn't have conda installed, I'll try to verify the package availability through a different approach by checking the conda-forge feedstock repositories which contain the package build recipes.


Let me check the bioconda channel for the missing packages, particularly for the bioinformatics-related ones like matchms, spectrum-utils, and pyopenms.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check availability of key dependencies
for pkg in pytorch-lightning pytorch numpy pandas scikit-learn; do
  echo "Checking $pkg..."
  conda search -c conda-forge $pkg
done

Length of output: 1322


Script:

#!/bin/bash
# Check existence of conda-forge feedstock repositories for the dependencies
for pkg in pytorch-lightning pytorch numpy pandas scikit-learn matchms spectrum-utils pyopenms jiwer jaxtyping; do
  echo "Checking $pkg feedstock..."
  gh repo view conda-forge/${pkg}-feedstock --json name,description,url 2>/dev/null || echo "Not found: $pkg"
done

Length of output: 3695


Script:

#!/bin/bash
# Check existence of bioconda packages
for pkg in matchms spectrum-utils pyopenms; do
  echo "Checking $pkg in bioconda..."
  gh repo view bioconda/bioconda-recipes --json name,description,url 2>/dev/null && \
  rg -l "name:\s*$pkg\s*$" recipes/
done

# Also check for PyTorch which wasn't found in conda-forge
echo "Checking for pytorch package..."
rg -l "name:\s*pytorch\s*$" recipes/

Length of output: 1224

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
recipes/instanovo/meta.yaml (1)

48-56: Consider enhancing test coverage.

The current test configuration is good but could be improved by adding more comprehensive tests.

Consider adding these test commands:

  commands:
    - pip check
    - python -c "import instanovo; print(instanovo.__version__)"
+   - python -c "from instanovo import models"
+   - python -c "from instanovo import data"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between d846578 and f6a93d4.

📒 Files selected for processing (1)
  • recipes/instanovo/meta.yaml (1 hunks)
🧰 Additional context used
🪛 yamllint
recipes/instanovo/meta.yaml

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)

🔇 Additional comments (4)
recipes/instanovo/meta.yaml (4)

1-7: LGTM: Package metadata is well-defined.

The package name and version are correctly specified using Jinja2 templating.

🧰 Tools
🪛 yamllint

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)


12-18: LGTM: Build configuration follows best practices.

The build configuration is well-structured:

  • Uses noarch: python for architecture independence
  • Correctly uses --no-deps for pip installation
  • Includes run_exports for version pinning

57-67: LGTM: Package metadata is complete.

The about and extra sections provide all necessary information:

  • License and license file are specified
  • GitHub repository is linked
  • DOI is included
  • Maintainer is listed

8-11: Verify the source URL and checksum.

The source configuration looks correct, but let's verify the URL and checksum.

✅ Verification successful

Source URL and checksum are valid

The SHA256 checksum matches exactly with the one specified in the meta.yaml file, confirming the integrity of the package source configuration.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify PyPI package existence and checksum
curl -s "https://pypi.org/pypi/instanovo/1.0.0/json" | jq -r '.urls[] | select(.packagetype=="sdist") | .digests.sha256'

Length of output: 185

Comment on lines 20 to 46
requirements:
host:
- python >=3.10,<3.12
- setuptools >=69.1.1
- pip
run:
- click >=8.1.7
- datasets >=3.0.1
- hydra-core >=1.3.2
- jaxtyping >=0.2.34
- jiwer >=3.0.4
- matchms >=0.27.0
- neptune >=1.12.0
- numpy >=1.23,<1.27
- omegaconf >=2.3.0
- pandas >=2.2.3
- polars >=1.9.0
- pyopenms >=3.2.0
- python-dotenv >=1.0.1
- pytorch-lightning >=2.4.0
- s3fs >=2024.6.1
- scikit-learn >=1.5.2
- spectrum_utils >=0.4.2
- tensorboard >=2.18.0
- pytorch >=2.4.1
- tqdm >=4.66.5
- eigen >=3.3.4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Address dependency availability issues.

Based on previous review comments and analysis:

  1. Several dependencies are not available in conda channels:
    • matchms
    • spectrum_utils
    • pyopenms
  2. Package name adjustments needed:
    • pytorch should be pytorch-cpu or pytorch-gpu in conda-forge
  3. System dependency considerations:
    • eigen is typically a system-level dependency

Please make the following adjustments:

  1. Add these packages to conda-forge first or provide alternative installation methods
  2. Update pytorch package name according to conda-forge conventions
  3. Consider if eigen should be handled differently as a system dependency

Would you like assistance in creating conda-forge feedstock PRs for the missing packages?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
recipes/instanovo/meta.yaml (1)

47-54: Consider adding more comprehensive tests.

While the current tests verify basic installation and importing, consider adding:

  1. Basic functionality tests
  2. Command-line interface tests if applicable
  3. Key feature verification tests

Would you like assistance in generating additional test commands?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between f6a93d4 and 5b33ce3.

📒 Files selected for processing (1)
  • recipes/instanovo/meta.yaml (1 hunks)
🧰 Additional context used
🪛 yamllint
recipes/instanovo/meta.yaml

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)

🔇 Additional comments (4)
recipes/instanovo/meta.yaml (4)

1-6: LGTM!

The package metadata section follows conda-forge best practices with proper templating.

🧰 Tools
🪛 yamllint

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)


8-10: LGTM!

The source configuration correctly specifies the PyPI package source with proper checksum verification.


56-66: LGTM!

The package information section is complete with all required fields, proper license specification, and citation information.


16-17: Consider if run_exports is necessary for this package.

The run_exports section with max_pin="x.x" might be unnecessarily strict for a Python package. This is typically used for compiled libraries where ABI compatibility is important. For pure Python packages, this level of version pinning is usually not required and might make dependency resolution more difficult.

Comment on lines 25 to 45
run:
- click >=8.1.7
- datasets >=3.0.1
- hydra-core >=1.3.2
- jaxtyping >=0.2.34
- jiwer >=3.0.4
- matchms >=0.27.0
- neptune >=1.12.0
- numpy >=1.23,<1.27
- omegaconf >=2.3.0
- pandas >=2.2.3
- polars >=1.9.0
- pyopenms >=3.2.0
- python-dotenv >=1.0.1
- pytorch-lightning >=2.4.0
- s3fs >=2024.6.1
- scikit-learn >=1.5.2
- spectrum_utils >=0.4.2
- tensorboard >=2.18.0
- pytorch-gpu >=2.4.1
- tqdm >=4.66.5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

⚠️ Potential issue

Package availability and PyTorch specification need attention

The verification confirms critical issues with the package requirements:

  • The following packages are not available in conda-forge:

    • matchms
    • spectrum_utils
    • pyopenms
    • pytorch-gpu (this specific name is not a valid conda-forge package)
  • For PyTorch:

    • Neither pytorch-gpu nor the main pytorch package exists as conda-forge feedstocks
    • Only pytorch-cpu is found in conda-forge
    • The current specification pytorch-gpu >=2.4.1 will not work

Recommendations:

  1. Use PyPI versions for matchms, spectrum_utils, and pyopenms by adding them to pip requirements
  2. Replace pytorch-gpu with the correct package name from the channel where you intend to source it (likely pytorch from the pytorch channel)
🔗 Analysis chain

Address dependency availability and specification issues.

  1. Several dependencies are not available in the conda channels:

    • matchms
    • spectrum_utils
    • pyopenms
  2. The pytorch-gpu specification might be too restrictive:

    • Some users might prefer CPU-only installations
    • Consider using a metapackage or providing CPU/GPU variants

Consider these solutions:

  1. First ensure these packages are available in conda-forge:
    • Submit feedstock PRs for missing packages
    • Or use PyPI fallback for these specific packages
  2. For PyTorch, either:
    • Use pytorch as a metapackage
    • Or create CPU/GPU variants of this recipe

Let me verify the PyTorch package availability and naming conventions in conda-forge to provide a complete response.


Let me check for the main PyTorch package to complete the verification.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify package availability in conda-forge
for pkg in matchms spectrum_utils pyopenms; do
  echo "Checking $pkg in conda-forge..."
  gh repo view conda-forge/${pkg}-feedstock --json name,description,url 2>/dev/null || echo "Not found: $pkg"
done

Length of output: 771


Script:

#!/bin/bash
# Check pytorch packages in conda-forge
echo "Checking pytorch packages..."
gh search repos "pytorch-feedstock" --owner conda-forge --json name,description,url

# Check if pytorch-gpu exists as a separate package
echo -e "\nChecking pytorch-gpu specifically..."
gh repo view conda-forge/pytorch-gpu-feedstock --json name,description,url 2>/dev/null || echo "pytorch-gpu feedstock not found"

Length of output: 5521


Script:

#!/bin/bash
# Check main pytorch package
echo "Checking main pytorch package..."
gh repo view conda-forge/pytorch-feedstock --json name,description,url 2>/dev/null || echo "pytorch feedstock not found"

Length of output: 212

@hechth
Copy link
Contributor

hechth commented Nov 6, 2024

@hechth for reference

@hechth
Copy link
Contributor

hechth commented Nov 6, 2024

@BioGeek I've updated matchms' dependencies to newer versions (py>3.10, numpy>2, numba>0.60). Currently I'm waiting for sparsestack to get updated on conda-forge, then I will make the next release for matchms with the newer dependencies and then will update the matchms bioconda package. Then you can base your package on matchms >=0.28.0 and I hope things should be fine.

@BioGeek
Copy link
Contributor Author

BioGeek commented Nov 6, 2024

@hechth Thanks for the support!

recipes/instanovo/meta.yaml Outdated Show resolved Hide resolved
recipes/instanovo/meta.yaml Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 5b33ce3 and a5bff4f.

📒 Files selected for processing (1)
  • recipes/instanovo/meta.yaml (1 hunks)
🧰 Additional context used
🪛 yamllint
recipes/instanovo/meta.yaml

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)

🔇 Additional comments (5)
recipes/instanovo/meta.yaml (5)

1-7: LGTM: Package metadata is well-structured.

The package name and version are correctly defined using Jinja2 templating.

🧰 Tools
🪛 yamllint

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)


8-11: LGTM: Source configuration is properly defined.

The PyPI source URL and SHA256 checksum are correctly specified.


12-18: LGTM: Build configuration follows best practices.

The build section correctly specifies:

  • Platform independence with noarch: python
  • Proper pip installation command
  • run_exports to prevent ABI breakage

47-55: LGTM: Test configuration is comprehensive.

The test section properly includes:

  • Package import verification
  • Dependency checks via pip
  • Version verification

56-66: LGTM: Package information is complete.

The about and extra sections include all necessary metadata:

  • Project homepage and summary
  • License information
  • Maintainer
  • DOI identifier

Comment on lines 20 to 46
requirements:
host:
- python >=3.10,<3.12
- setuptools >=69.1.1
- pip
run:
- click >=8.1.7
- datasets >=3.0.1
- hydra-core >=1.3.2
- jaxtyping >=0.2.34
- jiwer >=3.0.4
- matchms >=0.28.1
- neptune >=1.12.0
- numpy >=2.0.0
- omegaconf >=2.3.0
- pandas >=2.2.3
- polars >=1.9.0
- pyopenms >=3.2.0
- python-dotenv >=1.0.1
- pytorch-lightning >=2.4.0
- s3fs >=2024.6.1
- scikit-learn >=1.5.2
- spectrum_utils >=0.4.2
- tensorboard >=2.18.0
- pytorch-gpu >=2.4.1
- tqdm >=4.66.5

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Critical: Several dependencies need attention.

Based on previous verification:

  1. The following packages are not available in conda channels:

    • matchms
    • spectrum_utils
    • pyopenms
  2. pytorch-gpu is not a valid conda-forge package name. Consider:

    • Using pytorch with CUDA variants
    • Or specifying the correct channel for GPU support

Consider these solutions:

  1. First ensure these packages are available in conda-forge:
    • Submit feedstock PRs for missing packages
    • Or use PyPI fallback for these specific packages
  2. For PyTorch, either:
    • Use pytorch as the package name
    • Or specify the correct channel for GPU support

Would you like assistance in:

  • Creating conda-forge feedstock PRs for the missing packages?
  • Updating the PyTorch dependency specification?

@hechth
Copy link
Contributor

hechth commented Nov 7, 2024

@BioGeek seems like pyOpenMS is also on numpy < 2 - do you really need that library?

@hechth
Copy link
Contributor

hechth commented Nov 7, 2024

xref #51992

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
recipes/instanovo/meta.yaml (1)

48-56: Consider enhancing test coverage.

While the basic import and version checks are good, consider adding functional tests to verify core features, especially for a scientific package.

Example additions:

 test:
   imports:
     - instanovo
   commands:
     - pip check
     - python -c "import instanovo; print(instanovo.__version__)"
+    # Verify core functionality
+    - python -c "from instanovo import __version__; assert __version__ == '{{ version }}'"
+    - python -c "from instanovo.model import InstaNovo"
   requires:
     - pip
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a5bff4f and ac53d18.

📒 Files selected for processing (1)
  • recipes/instanovo/meta.yaml (1 hunks)
🧰 Additional context used
🪛 yamllint
recipes/instanovo/meta.yaml

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)

🔇 Additional comments (3)
recipes/instanovo/meta.yaml (3)

1-10: LGTM! Package metadata is well-defined.

The package metadata follows conda-forge conventions with proper Jinja2 templating and includes all required fields.

🧰 Tools
🪛 yamllint

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)


57-67: LGTM! Package information is complete.

The about section includes all required fields with appropriate metadata, license information, and maintainer details.


20-47: ⚠️ Potential issue

Critical: Address dependency availability and GPU requirements.

Several critical issues need attention:

  1. Missing packages in conda channels:

    • matchms
    • spectrum_utils
    • pyopenms
  2. GPU dependencies:

    • pytorch-gpu is not a valid conda-forge package
    • Forcing GPU dependencies (cudatoolkit) may not be suitable for all users

Suggested changes:

  1. Replace GPU-specific dependencies with more flexible options:
-    - pytorch-gpu >=2.4.1
-    - cudatoolkit
+    - pytorch >=2.4.1
+    - cudatoolkit # [cuda]
  1. For the missing packages, either:
    • Submit them to conda-forge first, or
    • Add them as pip dependencies in the build script

Would you like assistance in:

  • Creating conda-forge feedstock PRs for the missing packages?
  • Implementing a more flexible GPU/CPU configuration?

Comment on lines +12 to +18
build:
noarch: python
script: {{ PYTHON }} -m pip install . -vv --no-deps --no-build-isolation
number: 0
run_exports:
- {{ pin_subpackage(name | lower, max_pin="x.x") }}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Remove unnecessary run_exports section.

The run_exports section is typically used for C/C++ libraries to ensure ABI compatibility. For a pure Python package (noarch: python), this section is unnecessary and could cause unintended version pinning issues.

Apply this diff to remove the section:

 build:
   noarch: python
   script: {{ PYTHON }} -m pip install . -vv --no-deps --no-build-isolation
   number: 0
-  run_exports:
-    - {{ pin_subpackage(name | lower, max_pin="x.x") }}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
build:
noarch: python
script: {{ PYTHON }} -m pip install . -vv --no-deps --no-build-isolation
number: 0
run_exports:
- {{ pin_subpackage(name | lower, max_pin="x.x") }}
build:
noarch: python
script: {{ PYTHON }} -m pip install . -vv --no-deps --no-build-isolation
number: 0

@hechth
Copy link
Contributor

hechth commented Nov 7, 2024

@BioGeek seems like the problem is now somewhere with cuda - maybe @bgruening can confirm or has an idea of how to fix this? I did not wrap any packages using CUDA before.

Add TODO comments

dependencies of matchms have been updated

Remove duplicate python lines, remove uper version bound

Add upper python bound again

unpin all packages

re-add pinned packages, add eigen, add doi

remove eigen, use pytorch-gpu

Bump version of matchms to 0.28.0
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ac53d18 and 01c40d1.

📒 Files selected for processing (1)
  • recipes/instanovo/meta.yaml (1 hunks)
🧰 Additional context used
🪛 yamllint
recipes/instanovo/meta.yaml

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)

- hydra-core >=1.3.2
- jaxtyping >=0.2.34
- jiwer >=3.0.4
- matchms >=0.28.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Critical: Unavailable dependencies in conda channels

The following dependencies are not available in conda-forge or bioconda channels:

  • matchms
  • pyopenms
  • spectrum_utils

These packages need to be added to conda-forge first. Until then, consider:

  1. Submitting feedstock PRs for these packages to conda-forge
  2. Or using PyPI fallback for these specific packages

Would you like assistance in creating conda-forge feedstock PRs for these packages?

Also applies to: 37-37, 42-42

- scikit-learn >=1.5.2
- spectrum_utils >=0.4.2
- tensorboard >=2.18.0
- pytorch >=2.4.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Update PyTorch and CUDA configuration

Current issues:

  1. pytorch version >=2.4.1 is not yet available
  2. cudatoolkit should have a version constraint
  3. Consider making CUDA optional for CPU-only installations

Recommendations:

  1. Use a more conservative PyTorch version that's available in conda-forge
  2. Add version constraint for cudatoolkit
  3. Consider splitting into CPU/GPU variants

Apply this diff to update the configuration:

-    - pytorch >=2.4.1
-    - cudatoolkit
+    - pytorch >=2.0.0
+    - cudatoolkit >=11.8

Also applies to: 46-46

Comment on lines +16 to +17
run_exports:
- {{ pin_subpackage(name | lower, max_pin="x.x") }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Remove unnecessary run_exports section

The run_exports section is typically used for C/C++ libraries to ensure ABI compatibility. For a pure Python package (noarch: python), this section is unnecessary and could cause unintended version pinning issues.

Apply this diff to remove the section:

 build:
   noarch: python
   script: {{ PYTHON }} -m pip install . -vv --no-deps --no-build-isolation
   number: 0
-  run_exports:
-    - {{ pin_subpackage(name | lower, max_pin="x.x") }}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
run_exports:
- {{ pin_subpackage(name | lower, max_pin="x.x") }}
build:
noarch: python
script: {{ PYTHON }} -m pip install . -vv --no-deps --no-build-isolation
number: 0

@BioGeek
Copy link
Contributor Author

BioGeek commented Nov 7, 2024

@hechth , I tried with using pytorch instead of pytorch-gpu , but that doesn't seem to have helped either.

Thanks for your support so far already!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
recipes/instanovo/meta.yaml (1)

31-42: Address dependency availability in conda channels.

Several dependencies are not available in conda channels:

  • matchms
  • spectrum_utils
  • pyopenms

Options to resolve this:

  1. Submit these packages to conda-forge first (recommended)
  2. Use PyPI fallback for these specific packages
  3. Wait for the updated matchms package (as mentioned by hechth in PR comments)

Would you like assistance in:

  1. Creating conda-forge feedstock PRs for the missing packages?
  2. Setting up a staged deployment plan to coordinate with the matchms update?
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 01c40d1 and 177596f.

📒 Files selected for processing (1)
  • recipes/instanovo/meta.yaml (1 hunks)
🧰 Additional context used
🪛 yamllint
recipes/instanovo/meta.yaml

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)

🔇 Additional comments (5)
recipes/instanovo/meta.yaml (5)

1-10: LGTM! Package metadata is well-defined.

The package metadata section is correctly configured with proper templating and source verification.

🧰 Tools
🪛 yamllint

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)


12-18: Remove unnecessary run_exports section.

The run_exports section is typically used for C/C++ libraries to ensure ABI compatibility. For a pure Python package (noarch: python), this section is unnecessary.


48-56: LGTM! Test configuration is appropriate.

The test section includes necessary checks for package installation and importing.


57-67: LGTM! Package information is complete.

The about section contains all necessary metadata including license, homepage, and maintainer information.


44-46: ⚠️ Potential issue

Update PyTorch and CUDA configuration.

Current issues:

  1. PyTorch version 2.4.1 is not yet available
  2. cudatoolkit should be optional and version-constrained

Recommendations:

-    - pytorch >=2.4.1
-    - cudatoolkit
+    - pytorch >=2.0.0
+    - cudatoolkit >=11.8  # Optional: Add selectors for GPU/CPU variants

Consider splitting into CPU/GPU variants to support both use cases.

Likely invalid or redundant comment.

- jiwer >=3.0.4
- matchms >=0.28.0
- neptune >=1.12.0
- numpy ==2.1.0rc1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Update numpy version constraint.

Using a release candidate version (2.1.0rc1) in production is risky. Consider using a stable version:

-    - numpy ==2.1.0rc1
+    - numpy >=2.0.0
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- numpy ==2.1.0rc1
- numpy >=2.0.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants