Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

first push for sema-check feature #125

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Conversation

cedricdcc
Copy link
Member

@cedricdcc cedricdcc commented Oct 9, 2024

tackles issue #111 , #112

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a new HTML file (test_results.html) for structured display of test results.
    • Added a new HTML file (results.html) to present test results in a tabular format.
    • Enhanced YAML entries to represent test outcomes, confirming successful and failed executions.
    • Implemented a command-line interface for running tests with specified input and output formats (CSV, HTML, YAML).
  • Documentation

    • Updated README.md with details on YAML format for test configurations and added a flowchart for the testing process.
  • Tests

    • Added comprehensive unit tests for the service and sink modules, ensuring functionality and output accuracy.

Copy link
Contributor

coderabbitai bot commented Oct 9, 2024

📝 Walkthrough

Walkthrough

The pull request introduces several new files and functionalities related to a testing framework. Key additions include a structured HTML file (results.html) and a YAML file (results.yml) to represent test results, along with a command-line interface in __main__.py for executing tests. The check.py file handles YAML file loading and test execution, while various sink modules (csv_sink.py, html_sink.py, yml_sink.py) manage output formatting. Additionally, new test files validate the functionalities of these components.

Changes

File Path Change Summary
results.html New file created with a structured HTML table for displaying test results.
results.yml Two new entries added representing successful test outcomes.
sema/check/README.md New section added detailing YAML format for test configurations, description of CheckBase dataclass, and flowchart.
sema/check/init.py New file created to initialize the module and import various components.
sema/check/main.py New file created for CLI functionality, parsing command-line arguments for test execution.
sema/check/check.py New file introducing Check and CheckResult classes for processing tests and managing results.
sema/check/base.py New file defining CheckResult and CheckBase classes for encapsulating test results and abstracting check implementations.
sema/check/implementations/test_example.py New file containing ExampleTest class for executing tests and returning results.
sema/check/sinks/csv_sink.py New file containing write_csv function for outputting results in CSV format.
sema/check/sinks/html_sink.py New file containing write_html function for outputting results in HTML format.
sema/check/sinks/yml_sink.py New file containing write_yml function for outputting results in YAML format.
tests/check/out/test_results.html New file created presenting test results in a tabular HTML format.
tests/check/test_check_service.py New file introducing unit tests for loading YAML files and running checks.
tests/check/test_sema_check_cli.py New file introducing tests for the CLI of the sema.check module.
tests/check/test_sinks.py New file containing unit tests for the output functions in the sinks module.

Possibly related PRs

  • Kinda fixes #66 #124: The changes in this PR involve modifications to the testing framework, which may indirectly relate to the overall structure of test results, but do not directly connect to the creation of the results.html file or the entries in results.yml.

Suggested reviewers

  • bulricht
  • mpo-vliz

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@cedricdcc cedricdcc changed the title first push for sema feature first push for sema-check feature Oct 9, 2024
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 64

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE

📥 Commits

Files that changed from the base of the PR and between b1edc54 and bcf933e.

⛔ Files ignored due to path filters (2)
  • results.csv is excluded by !**/*.csv
  • tests/check/out/output.csv is excluded by !**/*.csv
📒 Files selected for processing (21)
  • results.html (1 hunks)
  • results.yml (1 hunks)
  • sema/check/README.md (1 hunks)
  • sema/check/init.py (1 hunks)
  • sema/check/main.py (1 hunks)
  • sema/check/check.py (1 hunks)
  • sema/check/service.py (1 hunks)
  • sema/check/sinks/init.py (1 hunks)
  • sema/check/sinks/csv_sink.py (1 hunks)
  • sema/check/sinks/html_sink.py (1 hunks)
  • sema/check/sinks/yml_sink.py (1 hunks)
  • sema/check/testing/base.py (1 hunks)
  • sema/check/testing/test_example.py (1 hunks)
  • sema/subyt/main.py (1 hunks)
  • tests/check/out/output.yml (1 hunks)
  • tests/check/out/test_results.html (1 hunks)
  • tests/check/test_files/good_yaml.yml (1 hunks)
  • tests/check/test_files/should not be processed.txt (1 hunks)
  • tests/check/test_sema_check_cli.py (1 hunks)
  • tests/check/test_service.py (1 hunks)
  • tests/check/test_sinks.py (1 hunks)
🧰 Additional context used
🪛 Markdownlint
sema/check/README.md

1-1: null
First line in a file should be a top-level heading

(MD041, first-line-heading, first-line-h1)

🪛 Ruff
sema/check/__main__.py

4-4: sys imported but unused

Remove unused import: sys

(F401)


49-49: Missing type annotation for *args_list

(ANN002)


50-50: Logging statement uses f-string

(G004)


58-58: Do not catch blind exception: Exception

(BLE001)


59-59: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


59-59: Logging statement uses f-string

(G004)


63-63: Missing return type annotation for public function main

Add return type annotation: None

(ANN201)


63-63: Missing type annotation for function argument args

(ANN001)


64-64: Logging statement uses f-string

(G004)


66-66: f-string without any placeholders

Remove extraneous f prefix

(F541)


66-66: Logging statement uses f-string

(G004)


67-67: Logging statement uses f-string

(G004)

sema/check/check.py

12-12: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


12-12: Missing type annotation for self in method

(ANN101)


16-16: Missing type annotation for self in method

(ANN101)


23-23: Missing type annotation for self in method

(ANN101)


35-35: Logging statement uses f-string

(G004)


35-35: Trailing comma missing

Add trailing comma

(COM812)


38-38: Use of assert detected

(S101)


39-39: Use of assert detected

(S101)


44-44: Missing type annotation for self in method

(ANN101)


48-48: Logging statement uses f-string

(G004)


55-55: Logging statement uses f-string

(G004)


57-57: Do not catch blind exception: Exception

(BLE001)


58-58: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


58-58: Logging statement uses f-string

(G004)


61-61: return inside finally blocks cause exceptions to be silenced

(B012)


61-61: Don't use return in try-except and finally

(SIM107)

sema/check/service.py

5-5: .testing.base.TestBase imported but unused

Remove unused import: .testing.base.TestBase

(F401)


13-13: Missing return type annotation for public function load_yaml_files

(ANN201)


13-13: Missing type annotation for function argument input_folder

(ANN001)


14-14: Logging statement uses f-string

(G004)


18-18: Call endswith once with a tuple

Merge into a single endswith call

(PIE810)


22-22: Logging statement uses f-string

(G004)


24-24: Unnecessary open mode parameters

Remove open mode parameters

(UP015)


30-30: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


30-30: Logging statement uses f-string

(G004)


32-32: Logging statement uses f-string

(G004)


32-32: Redundant exception object included in logging.exception call

(TRY401)


33-33: Logging statement uses f-string

(G004)


37-37: Missing return type annotation for public function instantiate_test

(ANN201)


37-37: Missing type annotation for function argument rule

(ANN001)


38-38: Logging statement uses f-string

(G004)


40-40: Logging statement uses f-string

(G004)


48-48: Unnecessary else after return statement

Remove unnecessary else

(RET505)


49-49: Avoid specifying long messages outside the exception class

(TRY003)


52-52: Missing return type annotation for public function run_tests

(ANN201)


52-52: Missing type annotation for function argument input_folder

(ANN001)


54-54: Logging statement uses f-string

(G004)

sema/check/sinks/csv_sink.py

8-8: Missing return type annotation for public function write_csv

Add return type annotation: None

(ANN201)


22-22: Trailing comma missing

Add trailing comma

(COM812)

sema/check/sinks/html_sink.py

7-7: Missing return type annotation for public function write_html

Add return type annotation: None

(ANN201)

sema/check/sinks/yml_sink.py

8-8: Missing return type annotation for public function write_yml

Add return type annotation: None

(ANN201)

sema/check/testing/base.py

17-17: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


17-17: Missing type annotation for self in method

(ANN101)


30-30: Missing type annotation for self in method

(ANN101)

sema/check/testing/test_example.py

7-7: Missing type annotation for self in method

(ANN101)


11-11: Trailing comma missing

Add trailing comma

(COM812)


18-18: Do not catch blind exception: Exception

(BLE001)

sema/subyt/__main__.py

156-156: Missing return type annotation for public function main

Add return type annotation: None

(ANN201)


156-156: Missing type annotation for *cli_args

(ANN002)

tests/check/test_sema_check_cli.py

9-9: Missing return type annotation for public function test_main

Add return type annotation: None

(ANN201)


10-10: f-string without any placeholders

Remove extraneous f prefix

(F541)


10-10: Logging statement uses f-string

(G004)


20-20: Consider ["sema-check", *cli_line.split()] instead of concatenation

Replace with ["sema-check", *cli_line.split()]

(RUF005)


21-21: Local variable success is assigned to but never used

Remove assignment to unused variable success

(F841)

tests/check/test_service.py

1-1: sys imported but unused

Remove unused import: sys

(F401)


9-9: Missing return type annotation for public function test_load_yaml_file

Add return type annotation: None

(ANN201)


29-29: Missing return type annotation for public function test_instantiate_tests

Add return type annotation: None

(ANN201)


41-41: Missing return type annotation for public function test_instantiate_unknown_type

Add return type annotation: None

(ANN201)


49-49: Assertion always fails, replace with pytest.fail()

(PT015)


49-49: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)


51-51: Found assertion on exception e in except block, use pytest.raises() instead

(PT017)


54-54: Missing return type annotation for public function test_run_tests

Add return type annotation: None

(ANN201)

tests/check/test_sinks.py

2-2: os imported but unused

Remove unused import: os

(F401)


9-9: Missing return type annotation for public function test_write_csv

Add return type annotation: None

(ANN201)


51-51: Missing return type annotation for public function test_write_html

Add return type annotation: None

(ANN201)


75-75: Unnecessary open mode parameters

Remove open mode parameters

(UP015)


95-95: Missing return type annotation for public function test_write_yml

Add return type annotation: None

(ANN201)


116-116: Unnecessary open mode parameters

Remove open mode parameters

(UP015)


121-121: Avoid equality comparisons to True; use if loaded_results[0]["success"]: for truth checks

Replace with loaded_results[0]["success"]

(E712)


122-122: Comparison to None should be cond is None

Replace with cond is None

(E711)


126-126: Avoid equality comparisons to False; use if not loaded_results[1]["success"]: for false checks

Replace with not loaded_results[1]["success"]

(E712)

🪛 LanguageTool
tests/check/test_files/should not be processed.txt

[style] ~1-~1: To form a complete sentence, be sure to include a subject.
Context: should not be processed by sema-check

(MISSING_IT_THERE)

🔇 Additional comments (19)
tests/check/test_files/should not be processed.txt (1)

1-1: LGTM! Test file added as expected.

The content of this file is appropriate for its purpose as a test file. The statement "should not be processed by sema-check" clearly indicates the expected behavior for this file when encountered by the sema-check tool.

Note:

  1. The spaces in the filename appear to be intentional, likely to test how the tool handles filenames with spaces.
  2. The content is not meant to be a complete sentence, as it's a test file and not documentation.
🧰 Tools
🪛 LanguageTool

[style] ~1-~1: To form a complete sentence, be sure to include a subject.
Context: should not be processed by sema-check

(MISSING_IT_THERE)

tests/check/test_files/good_yaml.yml (1)

1-9: LGTM! The YAML structure looks good.

The file contains two well-structured entries, each with a URL, type, and options. This appears to be a good example of configuration data for testing purposes.

To ensure this file complies with the expected schema, please run the following verification:

This script will help verify the structure against a schema (if available) and check for any YAML parsing errors.

sema/check/sinks/__init__.py (1)

3-5: Imports look good!

The imports are well-structured and follow Python best practices:

  • Using relative imports for intra-package modules is appropriate.
  • The naming convention for the imported functions is consistent.
results.yml (1)

1-12: LGTM: Overall structure is well-formed

The YAML structure is correct and consistent. The file contains two test result entries with proper indentation and syntax.

sema/check/__init__.py (1)

1-17: Overall, the __init__.py file looks good

This __init__.py file is well-structured and serves its purpose effectively. It properly imports the necessary components and exposes them through the __all__ list. The suggested minor improvements (updating the module name comment and reordering the __all__ list) will enhance consistency and clarity, but the current implementation is functional and follows good practices for module initialization.

sema/check/testing/base.py (3)

3-4: LGTM: Import statements are appropriate.

The import statements are correct and necessary for the classes defined in this file.


7-13: LGTM: TestResult dataclass is well-structured.

The TestResult dataclass is correctly implemented with appropriate attributes and type annotations. It follows Python naming conventions and provides a clear structure for representing test results.


1-31: Overall, excellent foundation for the testing framework.

This file provides a solid base for the testing framework with well-structured TestResult and TestBase classes. The code is clean, follows Python conventions, and uses appropriate design patterns (dataclass, abstract base class).

A few minor type annotation improvements were suggested, but these are not critical. The structure allows for easy extension and implementation of specific test cases.

Great job on laying the groundwork for the testing framework!

🧰 Tools
🪛 Ruff

17-17: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


17-17: Missing type annotation for self in method

(ANN101)


30-30: Missing type annotation for self in method

(ANN101)

sema/check/sinks/html_sink.py (1)

3-4: LGTM: Imports are correct and necessary.

The import statements are appropriate for the function implementation. They import the required List type hint and the TestResult class.

sema/check/sinks/csv_sink.py (1)

1-6: File structure and imports look good.

The file is well-structured with appropriate imports for the CSV writing functionality. The use of type hinting with List and the relative import of TestBase indicate good coding practices.

tests/check/test_sema_check_cli.py (1)

1-6: LGTM: Imports and logger setup are appropriate.

The necessary modules are imported, and the logger is correctly set up using the module name. This follows good practices for Python testing and logging.

sema/check/testing/test_example.py (2)

1-6: LGTM: Import and class definition look good.

The import statement and class definition are correctly implemented. The ExampleTest class inherits from TestBase, which is appropriate for a test class.


1-42: Overall, good foundation for the testing framework

This file provides a solid starting point for the testing framework. It introduces a basic test structure with ExampleTest and outlines plans for future tests. The code follows Python conventions and implements basic error handling.

Key strengths:

  1. Clear class structure inheriting from TestBase.
  2. Basic error handling in place.
  3. Comprehensive TODO list for future test implementations.

Areas for improvement:

  1. Enhance error handling as suggested earlier.
  2. Improve task management for TODOs.
  3. Consider adding more detailed documentation for the ExampleTest class and its methods.

Great job on setting up this foundation! As you proceed with implementing the planned tests, remember to keep the code well-documented and maintain consistent error handling across all test cases.

🧰 Tools
🪛 Ruff

7-7: Missing type annotation for self in method

(ANN101)


11-11: Trailing comma missing

Add trailing comma

(COM812)


18-18: Do not catch blind exception: Exception

(BLE001)

tests/check/test_service.py (1)

1-61: Overall, well-structured and comprehensive test suite.

This test file provides good coverage for the main functionalities of the service module. The tests are clear, concise, and effective in verifying the expected behavior of the module.

Main points for improvement:

  1. Add return type annotations to all test functions for consistency.
  2. Consider using pytest-specific assertions and patterns, such as pytest.raises() for exception testing.
  3. In test_run_tests, consider adding more specific assertions about the content of the test results.

These improvements will enhance the maintainability and robustness of your test suite. Great job on creating a solid foundation for testing the service module!

🧰 Tools
🪛 Ruff

1-1: sys imported but unused

Remove unused import: sys

(F401)


9-9: Missing return type annotation for public function test_load_yaml_file

Add return type annotation: None

(ANN201)


29-29: Missing return type annotation for public function test_instantiate_tests

Add return type annotation: None

(ANN201)


41-41: Missing return type annotation for public function test_instantiate_unknown_type

Add return type annotation: None

(ANN201)


49-49: Assertion always fails, replace with pytest.fail()

(PT015)


49-49: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)


51-51: Found assertion on exception e in except block, use pytest.raises() instead

(PT017)


54-54: Missing return type annotation for public function test_run_tests

Add return type annotation: None

(ANN201)

sema/check/__main__.py (3)

13-39: Well-structured argument parser setup

The get_arg_parser function is well-implemented. It provides clear documentation, sets up required and optional arguments correctly, and uses appropriate argument types and choices. Good job on following CLI best practices!


42-46: Concise and well-typed service creation

The make_service function is implemented correctly. It efficiently creates a Check object using the parsed arguments, and includes appropriate type annotations. Well done!


1-72: Well-structured CLI implementation

The overall structure and functionality of this file are well-implemented. It follows good practices for Python CLI scripts:

  1. Clear separation of concerns with distinct functions for argument parsing, service creation, and main execution.
  2. Use of argparse for robust command-line argument handling.
  3. Proper logging setup and usage throughout the script.
  4. Appropriate error handling in the main execution flow.

The script is set up correctly to be run as a standalone program. Good job on creating a clean and maintainable CLI implementation!

🧰 Tools
🪛 Ruff

4-4: sys imported but unused

Remove unused import: sys

(F401)


49-49: Missing type annotation for *args_list

(ANN002)


50-50: Logging statement uses f-string

(G004)


58-58: Do not catch blind exception: Exception

(BLE001)


59-59: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


59-59: Logging statement uses f-string

(G004)


63-63: Missing return type annotation for public function main

Add return type annotation: None

(ANN201)


63-63: Missing type annotation for function argument args

(ANN001)


64-64: Logging statement uses f-string

(G004)


66-66: f-string without any placeholders

Remove extraneous f prefix

(F541)


66-66: Logging statement uses f-string

(G004)


67-67: Logging statement uses f-string

(G004)

sema/subyt/__main__.py (2)

162-162: LGTM: Correct usage of command-line arguments

The update to the main function invocation is correct and consistent with the new function signature. It properly passes all command-line arguments except the script name.


156-162: 🧹 Nitpick (assertive)

Consider updating documentation and tests

The changes to the main function and its invocation improve the flexibility of the command-line interface. To ensure completeness:

  1. Update any existing documentation that describes the usage of this script to reflect the new argument handling.
  2. Review and update any tests that involve the main function to ensure they cover the new argument passing mechanism.

To help verify the impact of these changes, you can run the following script:

This will help identify files that might need to be updated to reflect the changes in the command-line interface.

🧰 Tools
🪛 Ruff

156-156: Missing return type annotation for public function main

Add return type annotation: None

(ANN201)


156-156: Missing type annotation for *cli_args

(ANN002)

Comment on lines +7 to +11
__all__ = [
"write_csv",
"write_html",
"write_yml",
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

__all__ list is well-defined.

The __all__ list correctly includes all imported functions, which is a good practice for controlling the module's public interface.

Consider adding a docstring at the beginning of the file to briefly explain the purpose of this module and its contents. For example:

"""
This module provides write functions for different output formats.

It includes functions to write data in CSV, HTML, and YAML formats.
"""

Comment on lines +1 to +11
# your_submodule/sinks/__init__.py

from .csv_sink import write_csv
from .html_sink import write_html
from .yml_sink import write_yml

__all__ = [
"write_csv",
"write_html",
"write_yml",
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Imported write_* functions are missing in sink modules.

The __init__.py file imports write_csv, write_html, and write_yml from their respective sink modules. However, these functions were not found in csv_sink.py, html_sink.py, or yml_sink.py.

  • Ensure that each sink module (csv_sink.py, html_sink.py, yml_sink.py) defines the corresponding write_* function.
🔗 Analysis chain

Overall, the file structure is clean and well-organized.

This __init__.py file effectively sets up the public interface for the sinks package. It follows Python best practices for imports and module structure. The consistent naming of imported functions (write_*) suggests a well-thought-out API design.

To ensure consistency across the package, let's verify the existence and naming of the imported modules:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the existence of imported modules and their naming consistency

# Test 1: Check if the imported modules exist
echo "Checking for imported modules:"
fd --type f --glob '*_sink.py' sema/check/sinks

# Test 2: Verify that each *_sink.py file contains a write_* function
echo "Verifying write_* functions in sink modules:"
for file in $(fd --type f --glob '*_sink.py' sema/check/sinks); do
    echo "Checking $file:"
    ast-grep --lang python --pattern 'def write_$_($$$):' "$file"
done

Length of output: 1072

tests/check/out/output.yml Show resolved Hide resolved
tests/check/out/output.yml Show resolved Hide resolved
tests/check/out/output.yml Show resolved Hide resolved
sema/check/check.py Outdated Show resolved Hide resolved
sema/check/check.py Outdated Show resolved Hide resolved
sema/check/check.py Show resolved Hide resolved
sema/check/check.py Outdated Show resolved Hide resolved
sema/check/check.py Show resolved Hide resolved
Copy link
Contributor

@mpo-vliz mpo-vliz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good enough to get started,
reversed: not enough actual body to disagree atm

so actually also questioning why we don't keep this in a separate branch some longer?
(would be ok to wait for merge until things are more feature-complete)
(and keep the PR in draft?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion to use jinja and manage the produced html template separately

should get the Result object-model as parameter to be fed into the template?

results.csv Outdated Show resolved Hide resolved
@cedricdcc cedricdcc marked this pull request as draft October 9, 2024 13:20
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 28

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE

📥 Commits

Files that changed from the base of the PR and between bcf933e and 643dcf9.

📒 Files selected for processing (11)
  • sema/check/README.md (1 hunks)
  • sema/check/init.py (1 hunks)
  • sema/check/main.py (1 hunks)
  • sema/check/check.py (1 hunks)
  • sema/check/service.py (1 hunks)
  • sema/check/sinks/csv_sink.py (1 hunks)
  • sema/check/sinks/html_sink.py (1 hunks)
  • sema/check/sinks/yml_sink.py (1 hunks)
  • tests/check/test_sema_check_cli.py (1 hunks)
  • tests/check/test_service.py (1 hunks)
  • tests/check/test_sinks.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
sema/check/README.md

[uncategorized] ~19-~19: Loose punctuation mark.
Context: ...t with the following structure: - url: The target URL for the test (required) ...

(UNLIKELY_OPENING_PUNCTUATION)

🪛 Ruff
sema/check/__main__.py

45-45: Missing type annotation for *args_list

(ANN002)


54-54: Replace aliased errors with OSError

Replace with builtin OSError

(UP024)


55-55: Redundant exception object included in logging.exception call

(TRY401)

sema/check/check.py

14-14: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


14-14: Missing type annotation for self in method

(ANN101)


18-18: Missing type annotation for self in method

(ANN101)


25-25: Missing type annotation for self in method

(ANN101)


43-43: Use of assert detected

(S101)


44-44: Use of assert detected

(S101)


49-49: Missing type annotation for self in method

(ANN101)


53-53: Logging statement uses f-string

(G004)


60-60: Logging statement uses f-string

(G004)


62-62: Do not catch blind exception: Exception

(BLE001)


63-63: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


63-63: Logging statement uses f-string

(G004)


66-66: return inside finally blocks cause exceptions to be silenced

(B012)


66-66: Don't use return in try-except and finally

(SIM107)

sema/check/service.py

14-14: Missing return type annotation for public function load_yaml_files

(ANN201)


14-14: Missing type annotation for function argument input_folder

(ANN001)


15-15: Logging statement uses f-string

(G004)


19-19: Call endswith once with a tuple

Merge into a single endswith call

(PIE810)


23-23: Logging statement uses f-string

(G004)


25-25: Unnecessary open mode parameters

Remove open mode parameters

(UP015)


31-31: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


31-31: Logging statement uses f-string

(G004)


33-33: Logging statement uses f-string

(G004)


33-33: Redundant exception object included in logging.exception call

(TRY401)


34-34: Logging statement uses f-string

(G004)


38-38: Missing return type annotation for public function instantiate_test

(ANN201)


38-38: Missing type annotation for function argument rule

(ANN001)


39-39: Logging statement uses f-string

(G004)


41-41: Logging statement uses f-string

(G004)


49-49: Unnecessary else after return statement

Remove unnecessary else

(RET505)


50-50: Avoid specifying long messages outside the exception class

(TRY003)


53-53: Missing return type annotation for public function run_tests

(ANN201)


53-53: Missing type annotation for function argument input_folder

(ANN001)


55-55: Logging statement uses f-string

(G004)

sema/check/sinks/csv_sink.py

9-9: Missing return type annotation for public function write_csv

Add return type annotation: None

(ANN201)


23-23: Trailing comma missing

Add trailing comma

(COM812)

sema/check/sinks/html_sink.py

8-8: Missing return type annotation for public function write_html

Add return type annotation: None

(ANN201)

sema/check/sinks/yml_sink.py

10-10: Missing return type annotation for public function write_yml

Add return type annotation: None

(ANN201)

tests/check/test_service.py

10-10: Missing return type annotation for public function test_load_yaml_file

Add return type annotation: None

(ANN201)


30-30: Missing return type annotation for public function test_instantiate_tests

Add return type annotation: None

(ANN201)


42-42: Missing return type annotation for public function test_instantiate_unknown_type

Add return type annotation: None

(ANN201)


50-50: Assertion always fails, replace with pytest.fail()

(PT015)


50-50: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)


52-52: Found assertion on exception e in except block, use pytest.raises() instead

(PT017)


55-55: Missing return type annotation for public function test_run_tests

Add return type annotation: None

(ANN201)

tests/check/test_sinks.py

76-76: Unnecessary open mode parameters

Remove open mode parameters

(UP015)


96-96: Missing return type annotation for public function test_write_yml

Add return type annotation: None

(ANN201)


117-117: Unnecessary open mode parameters

Remove open mode parameters

(UP015)

🔇 Additional comments (8)
sema/check/sinks/yml_sink.py (2)

1-8: LGTM: Imports are appropriate for the functionality.

The imports are well-structured and provide the necessary components for the YAML sink functionality.


10-10: ⚠️ Potential issue

Add return type annotation to the function.

To improve type hinting and satisfy the static analysis tool, add a return type annotation to the write_yml function.

Apply this change:

-def write_yml(results: List[TestResult], output_file: str):
+def write_yml(results: List[TestResult], output_file: str) -> None:

Likely invalid or redundant comment.

🧰 Tools
🪛 Ruff

10-10: Missing return type annotation for public function write_yml

Add return type annotation: None

(ANN201)

sema/check/__init__.py (2)

1-8: LGTM: Module comment and imports are well-organized

The module comment correctly reflects the file path, and the imports are appropriately organized. They include all necessary components from various submodules, providing a clear structure for the module's functionality.


10-17: 🧹 Nitpick (assertive)

Consider reordering __all__ list for consistency

The __all__ list correctly includes all imported components, which is good practice for explicitly declaring the public interface of the module. However, for improved readability and consistency, consider reordering the items in the __all__ list to match the order of imports.

Apply this diff to reorder the list:

 __all__ = [
     "run_tests",
-    "TestBase",
-    "Check",
+    "Check",
+    "TestBase",
     "write_csv",
     "write_html",
     "write_yml",
 ]
sema/check/sinks/html_sink.py (1)

1-6: LGTM: Imports and file structure are appropriate.

The imports and file structure are clean and follow good practices. The necessary types are imported, and the file is well-organized.

sema/check/sinks/csv_sink.py (1)

1-7: Imports and structure look good.

The necessary modules are imported, and the overall structure of the file is clean and follows Python conventions.

tests/check/test_sema_check_cli.py (1)

1-7: LGTM: Imports and logger setup are well-structured.

The imports are appropriate for the test file, including necessary standard library modules and the main function from the module being tested. The logger setup using __name__ is a good practice for maintaining consistent logging across the project.

sema/check/README.md (1)

25-37: TestBase dataclass description looks great

The updated TestBase dataclass description is clear, concise, and informative. It addresses all the points raised in the previous review:

  1. Data types are now included for each field.
  2. The relationship between success and error fields in the result is clarified.
  3. A nested list structure is used for better readability.

The additional note about a test being unsuccessful without encountering an error is particularly helpful for understanding the nuances of test results.

Great job on these improvements!

Comment on lines 10 to 12
def write_yml(results: List[TestResult], output_file: str):
with open(output_file, "w") as f:
yaml.dump([result.__dict__ for result in results], f)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Enhance error handling and consider edge cases.

The current implementation lacks error handling for file operations and doesn't account for the case where the results list might be empty. Consider adding appropriate error handling and a check for empty results.

Here's an example of how you could improve the function:

def write_yml(results: List[TestResult], output_file: str) -> None:
    if not results:
        print("Warning: No results to write.")
        return

    try:
        with open(output_file, "w") as f:
            yaml.dump([result.__dict__ for result in results], f)
    except IOError as e:
        print(f"Error writing to file {output_file}: {e}")
    except yaml.YAMLError as e:
        print(f"Error in YAML serialization: {e}")

This version handles potential file I/O errors, YAML serialization errors, and warns if the results list is empty.

🧰 Tools
🪛 Ruff

10-10: Missing return type annotation for public function write_yml

Add return type annotation: None

(ANN201)

sema/check/sinks/html_sink.py Outdated Show resolved Hide resolved
Comment on lines +15 to +16
with open(output_file, "w") as f:
f.write(html_content)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add error handling for file operations.

The current implementation doesn't handle potential errors when writing to the file. Consider adding a try-except block to catch and handle IOErrors, as suggested in a past review comment.

Here's an example of how you could add error handling:

try:
    with open(output_file, "w") as f:
        f.write(html_content)
except IOError as e:
    print(f"Error writing to file {output_file}: {e}")
    # Consider raising an exception or returning a status code

sema/check/sinks/html_sink.py Outdated Show resolved Hide resolved
Comment on lines +19 to +20
# In the future add some way to visualise this with graphs or something
# For now this is good enough
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Consider creating a GitHub issue for future enhancement.

The comment about future visualization with graphs is a good idea. To ensure this doesn't get lost, consider creating a GitHub issue to track this enhancement.

Would you like me to create a GitHub issue for tracking the future enhancement of adding graph visualizations to the HTML output?

tests/check/test_sinks.py Outdated Show resolved Hide resolved
tests/check/test_sinks.py Outdated Show resolved Hide resolved
tests/check/test_sinks.py Outdated Show resolved Hide resolved
tests/check/test_sinks.py Show resolved Hide resolved
tests/check/test_sinks.py Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE

📥 Commits

Files that changed from the base of the PR and between 643dcf9 and db7d284.

📒 Files selected for processing (2)
  • sema/check/README.md (1 hunks)
  • sema/check/sinks/html_sink.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
sema/check/README.md

[uncategorized] ~22-~22: Loose punctuation mark.
Context: ...t with the following structure: - url: The target URL for the test (required) ...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~24-~24: Loose punctuation mark.
Context: ... the implemented test types. - options: A map of key-value pairs for additional...

(UNLIKELY_OPENING_PUNCTUATION)


[grammar] ~60-~60: This phrase is duplicated. You should probably use “the CLI” only once.
Context: ... sema-check process: 1. User invokes the CLI 2. The CLI module parses arguments and initiates t...

(PHRASE_REPETITION)


[uncategorized] ~64-~64: The preposition ‘in’ seems more likely in this position.
Context: ...s are collected 5. Results are output to various formats (CSV, HTML, YML) using ...

(AI_HYDRA_LEO_REPLACE_TO_IN)

🪛 Markdownlint
sema/check/README.md

1-1: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


2-2: Expected: 1; Actual: 0; Above
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


31-31: null
Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


34-34: null
Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


36-36: null
Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


67-67: null
Lists should be surrounded by blank lines

(MD032, blanks-around-lists)

🪛 Ruff
sema/check/sinks/html_sink.py

57-57: Replace aliased errors with OSError

Replace IOError with builtin OSError

(UP024)

sema/check/README.md Outdated Show resolved Hide resolved
sema/check/README.md Outdated Show resolved Hide resolved
sema/check/README.md Outdated Show resolved Hide resolved
sema/check/README.md Show resolved Hide resolved
sema/check/README.md Outdated Show resolved Hide resolved
Comment on lines +28 to +31
<tr>
<th>Success</th>
<th>Error</th>
<th>Message</th>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Adjust table headers for clarity

To improve the clarity of the HTML table, consider updating the headers to more accurately reflect the content of each column. For example, "Status" instead of "Success" and "Details" instead of "Message".

Apply this diff to update the table headers:

-                    <th>Success</th>
-                    <th>Error</th>
-                    <th>Message</th>
+                    <th>Status</th>
+                    <th>Error Message</th>
+                    <th>Details</th>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<tr>
<th>Success</th>
<th>Error</th>
<th>Message</th>
<tr>
<th>Status</th>
<th>Error Message</th>
<th>Details</th>

Comment on lines +57 to +58
except IOError as e:
print(f"Error writing to file {output_file}: {e}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Replace IOError with OSError

In Python 3, IOError is an alias of OSError. For clarity and to follow modern Python conventions, it's recommended to use OSError directly.

Apply this diff to update the exception handling:

-    except IOError as e:
+    except OSError as e:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except IOError as e:
print(f"Error writing to file {output_file}: {e}")
except OSError as e:
print(f"Error writing to file {output_file}: {e}")
🧰 Tools
🪛 Ruff

57-57: Replace aliased errors with OSError

Replace IOError with builtin OSError

(UP024)

Comment on lines +38 to +41
success_class = "success" if result.success else "error"
html_content += f"""
<tr>
<td class="{success_class}">{result.success}</td>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Display user-friendly success status

Currently, the result.success value is displayed directly, which will appear as True or False in the HTML table. For better readability, consider displaying "Success" or "Failure" instead.

Apply this diff to update the display:

-            <td class="{success_class}">{result.success}</td>
+            <td class="{success_class}">{'Success' if result.success else 'Failure'}</td>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
success_class = "success" if result.success else "error"
html_content += f"""
<tr>
<td class="{success_class}">{result.success}</td>
success_class = "success" if result.success else "error"
html_content += f"""
<tr>
<td class="{success_class}">{'Success' if result.success else 'Failure'}</td>

with open(output_file, "w") as f:
f.write(html_content)
except IOError as e:
print(f"Error writing to file {output_file}: {e}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Use logging instead of print for error messages

Using the logging module instead of print statements allows for better control over log levels and output destinations, which is especially important in larger applications.

Add the import at the top of the file:

+import logging

Update the exception handling:

-        print(f"Error writing to file {output_file}: {e}")
+        logging.error(f"Error writing to file {output_file}: {e}")

Ensure that the logging system is configured appropriately in your application to capture and handle these logs.

Committable suggestion was skipped due to low confidence.

Comment on lines +61 to +80
for result in results:
success_class = "success" if result.success else "error"
html_content += f"""
<tr>
<td class="{success_class}">{result.success}</td>
<td>{result.error}</td>
<td>{result.message}</td>
</tr>
"""

html_content += """
</tbody>
</table>
</body>
</html>
"""

with open(output_file, "w") as f:
f.write(html_content)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove duplicate code blocks to avoid redundancy

The code from lines 61-80 duplicates the previous code from lines 37-60. This redundancy can lead to maintenance issues and potential bugs. It's important to remove the duplicated code to keep the codebase clean and efficient.

Apply this diff to remove the duplicated code:

-        for result in results:
-            success_class = "success" if result.success else "error"
-            html_content += f"""
-                <tr>
-                    <td class="{success_class}">{result.success}</td>
-                    <td>{result.error}</td>
-                    <td>{result.message}</td>
-                </tr>
-            """
-
-        html_content += """
-            </tbody>
-        </table>
-    </body>
-    </html>
-        """
-
-        with open(output_file, "w") as f:
-            f.write(html_content)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for result in results:
success_class = "success" if result.success else "error"
html_content += f"""
<tr>
<td class="{success_class}">{result.success}</td>
<td>{result.error}</td>
<td>{result.message}</td>
</tr>
"""
html_content += """
</tbody>
</table>
</body>
</html>
"""
with open(output_file, "w") as f:
f.write(html_content)
with open(output_file, "w") as f:
f.write(html_content)
```
Note: This suggestion removes the entire duplicated code block as indicated in the diff. However, it's important to ensure that the remaining code (not shown in this snippet) still contains the necessary HTML structure and content generation that was removed here. The removal of this duplicate code assumes that the HTML table, body, and document closing tags, as well as the loop for generating result rows, are present earlier in the file.
Additionally, it's worth noting that the previous review comment suggested adding error handling for file operations. While this is not directly addressed in the current review, it would be beneficial to implement that suggestion as well, perhaps like this:
```suggestion
try:
with open(output_file, "w") as f:
f.write(html_content)
except IOError as e:
print(f"Error writing to file {output_file}: {e}")
# Consider raising an exception or returning a status code

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 35

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE

📥 Commits

Files that changed from the base of the PR and between db7d284 and 8b446d3.

📒 Files selected for processing (13)
  • sema/check/README.md (1 hunks)
  • sema/check/init.py (1 hunks)
  • sema/check/main.py (1 hunks)
  • sema/check/base.py (1 hunks)
  • sema/check/check.py (1 hunks)
  • sema/check/implementations/test_example.py (1 hunks)
  • sema/check/sinks/csv_sink.py (1 hunks)
  • sema/check/sinks/html_sink.py (1 hunks)
  • sema/check/sinks/yml_sink.py (1 hunks)
  • tests/check/out/test_results.html (1 hunks)
  • tests/check/test_check_service.py (1 hunks)
  • tests/check/test_sema_check_cli.py (1 hunks)
  • tests/check/test_sinks.py (1 hunks)
🧰 Additional context used
📓 Learnings (1)
sema/check/README.md (1)
Learnt from: cedricdcc
PR: vliz-be-opsci/py-sema#125
File: sema/check/README.md:75-114
Timestamp: 2024-10-16T07:43:49.737Z
Learning: In the `sema-check` module's `README.md`, titles will not be provided for Mermaid flowcharts unless explicitly requested.
🪛 LanguageTool
sema/check/README.md

[uncategorized] ~24-~24: Loose punctuation mark.
Context: ...t with the following structure: - url: The target URL for the test (required) ...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~26-~26: Loose punctuation mark.
Context: ... the implemented test types. - options: A map of key-value pairs for additional...

(UNLIKELY_OPENING_PUNCTUATION)


[grammar] ~100-~100: This phrase is duplicated. You should probably use “the CLI” only once.
Context: ...he sema-check process: 1. User invokes the CLI 2. The CLI module parses arguments and initiates t...

(PHRASE_REPETITION)


[uncategorized] ~104-~104: The preposition ‘in’ seems more likely in this position.
Context: ...lts are collected 5. Results are output to various formats (CSV, HTML, YML) using ...

(AI_HYDRA_LEO_REPLACE_TO_IN)

🪛 Ruff
sema/check/__main__.py

46-46: Missing type annotation for *args_list

(ANN002)

sema/check/base.py

17-17: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


17-17: Missing type annotation for self in method

(ANN101)


30-30: Missing type annotation for self in method

(ANN101)

sema/check/check.py

17-17: Missing return type annotation for public function load_yaml_files

(ANN201)


17-17: Missing type annotation for function argument input_folder

(ANN001)


18-18: Logging statement uses f-string

(G004)


22-22: Call endswith once with a tuple

Merge into a single endswith call

(PIE810)


26-26: Logging statement uses f-string

(G004)


28-28: Unnecessary open mode parameters

Remove open mode parameters

(UP015)


36-36: Do not catch blind exception: Exception

(BLE001)


37-37: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


37-37: Logging statement uses f-string

(G004)


39-39: Logging statement uses f-string

(G004)


40-40: Logging statement uses f-string

(G004)


44-44: Missing return type annotation for public function instantiatie_impl_checks

(ANN201)


44-44: Missing type annotation for function argument rule

(ANN001)


45-45: Logging statement uses f-string

(G004)


47-47: Logging statement uses f-string

(G004)


55-55: Unnecessary else after return statement

Remove unnecessary else

(RET505)


56-56: Avoid specifying long messages outside the exception class

(TRY003)


61-61: Logging statement uses f-string

(G004)


67-67: Logging statement uses f-string

(G004)


67-67: Trailing comma missing

Add trailing comma

(COM812)


72-72: Logging statement uses f-string

(G004)


72-72: Redundant exception object included in logging.exception call

(TRY401)


74-74: Logging statement uses f-string

(G004)


81-81: Missing type annotation for self in method

(ANN101)


85-85: Missing type annotation for self in method

(ANN101)


92-92: Missing type annotation for self in method

(ANN101)


111-111: Avoid specifying long messages outside the exception class

(TRY003)


113-113: Avoid specifying long messages outside the exception class

(TRY003)


117-117: Missing type annotation for self in method

(ANN101)


121-121: Logging statement uses f-string

(G004)


130-130: Do not catch blind exception: Exception

(BLE001)


131-131: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


131-131: Logging statement uses f-string

(G004)


133-133: return inside finally blocks cause exceptions to be silenced

(B012)

sema/check/implementations/test_example.py

9-9: Missing type annotation for self in method

(ANN101)

sema/check/sinks/csv_sink.py

9-9: Missing return type annotation for public function write_csv

Add return type annotation: None

(ANN201)


23-23: Trailing comma missing

Add trailing comma

(COM812)

sema/check/sinks/html_sink.py

57-57: Replace aliased errors with OSError

Replace IOError with builtin OSError

(UP024)

sema/check/sinks/yml_sink.py

10-10: Missing return type annotation for public function write_yml

Add return type annotation: None

(ANN201)

tests/check/test_check_service.py

55-55: Assertion always fails, replace with pytest.fail()

(PT015)


55-55: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)


57-57: Found assertion on exception e in except block, use pytest.raises() instead

(PT017)

tests/check/test_sinks.py

84-84: Missing return type annotation for public function test_write_yml

Add return type annotation: None

(ANN201)

🔇 Additional comments (23)
sema/check/sinks/yml_sink.py (2)

1-8: 🧹 Nitpick (assertive)

Consider adding a module-level docstring.

To improve code documentation and maintainability, it would be beneficial to add a module-level docstring explaining the purpose of this YAML sink module and its role in the larger system.

Example:

"""
This module provides functionality to write check results to a YAML file.

It defines a `write_yml` function that takes a list of CheckResult objects
and serializes them into a YAML format, writing the output to a specified file.
"""

10-12: 🛠️ Refactor suggestion

⚠️ Potential issue

Enhance error handling, consider edge cases, and improve type annotations.

The current implementation of write_yml can be improved in several ways:

  1. Add error handling for file operations and YAML serialization.
  2. Check for empty results list.
  3. Add a return type annotation.
  4. Consider a more robust serialization method for CheckResult objects.

Here's an example of how you could improve the function:

from typing import List
import yaml
from sema.check.base import CheckResult

def write_yml(results: List[CheckResult], output_file: str) -> None:
    if not results:
        print("Warning: No results to write.")
        return

    try:
        with open(output_file, "w") as f:
            yaml.dump([result.to_dict() for result in results], f)
    except IOError as e:
        print(f"Error writing to file {output_file}: {e}")
    except yaml.YAMLError as e:
        print(f"Error in YAML serialization: {e}")

# In the CheckResult class:
def to_dict(self):
    return {
        'attribute1': self.attribute1,
        'attribute2': self.attribute2,
        # ... other attributes
    }

This version:

  • Handles potential file I/O errors and YAML serialization errors.
  • Warns if the results list is empty.
  • Includes the return type annotation.
  • Uses a to_dict() method for more controlled serialization of CheckResult objects.
🧰 Tools
🪛 Ruff

10-10: Missing return type annotation for public function write_yml

Add return type annotation: None

(ANN201)

sema/check/__init__.py (3)

1-2: Module comment is correct and up-to-date.

The comment accurately reflects the file path, which addresses a previous review comment. Good job on keeping the documentation in sync with the code structure.


3-7: Imports are well-organized and follow best practices.

The imports are clearly structured, each on a separate line, and import specific components. This approach enhances readability and maintainability. Good job on following Python import best practices.


1-16: Overall, well-structured module initializer.

This __init__.py file effectively serves as an initializer for the check module. It properly imports and exposes the necessary components, following good Python practices. The only minor suggestion is to reorder the __all__ list for consistency with the import order. Great job on creating a clean and organized module structure!

sema/check/base.py (1)

1-5: LGTM: Imports are appropriate

The imports from abc and dataclasses modules are correctly used for defining the abstract base class and data class in this file.

sema/check/sinks/csv_sink.py (1)

1-8: Imports and structure look good.

The imports are appropriate for the CSV writing functionality, and the overall structure of the file is clean and follows good practices.

tests/check/test_sema_check_cli.py (3)

1-7: LGTM: Well-structured imports and logger setup.

The imports are appropriate for the test file's needs, and using Path from pathlib ensures cross-platform compatibility. The logger setup follows best practices.


1-37: LGTM: Well-structured test with minor suggestions for improvement.

Overall, this test file is well-structured and implements good testing practices. It effectively tests the CLI functionality across multiple output formats and includes proper setup and cleanup procedures. The suggestions provided in the previous comment will further enhance the robustness of the tests by ensuring both the creation and correct content of output files.


10-37: 🛠️ Refactor suggestion

Verify assertion logic and consider adding content checks.

  1. The assertion on line 22 checks for failure (not success). Please verify if this is intentional:
assert (
    not success
), f"sema-check failed for output format: {output}"

If you expect the query_main function to return True on success, you might want to change this to:

assert success, f"sema-check failed for output format: {output}"
  1. Consider adding more specific content checks for each output format. For example:
if output == "csv":
    assert output_file.read_text().startswith("column1,column2"), f"Unexpected CSV content for {output}"
elif output == "html":
    assert "<html" in output_file.read_text(), f"Unexpected HTML content for {output}"
elif output == "yml":
    assert output_file.read_text().startswith("---"), f"Unexpected YAML content for {output}"

These additional checks would ensure that the output files not only exist but also contain the expected content structure.

tests/check/out/test_results.html (3)

1-7: Well-structured HTML5 document with proper metadata

The HTML structure follows HTML5 standards, and includes essential metadata such as charset, viewport, and title. This is a good practice for ensuring proper rendering and SEO.


16-43: Well-structured and semantic table for test results

The body content is well-organized, using semantic HTML elements like <thead> and <tbody>. The use of CSS classes for success and error states is a good practice for visual feedback.


1-61: Overall improvement with room for minor enhancements

The current implementation of test_results.html is a significant improvement over the version mentioned in the past review comments. It addresses most of the concerns raised earlier, including proper HTML5 structure, semantic elements, and basic styling.

To further enhance the file:

  1. Consider moving the CSS to an external file for better separation of concerns.
  2. Remove the duplicate table rows at the end of the file.

These changes will make the code more maintainable and eliminate redundancy. Great job on the improvements made so far!

sema/check/implementations/test_example.py (2)

3-5: LGTM: Imports are appropriate and concise.

The imports are well-chosen for the functionality implemented in this file. Good job on keeping the imports minimal and relevant.


1-50: Overall: Good start, but needs implementation of actual test logic.

This file provides a good foundation for implementing tests, with appropriate error handling and a clear structure. However, there are a few key areas for improvement:

  1. Implement actual test logic in the run method instead of simulating success.
  2. Enhance error handling with more specific exception cases.
  3. Convert TODO comments into trackable GitHub issues.
  4. Add the missing type annotation for self in the run method.

Once these improvements are made, this will serve as a solid example for implementing other tests in the system.

🧰 Tools
🪛 Ruff

9-9: Missing type annotation for self in method

(ANN101)

tests/check/test_check_service.py (2)

1-10: LGTM: Imports and logger setup are appropriate.

The necessary functions are imported from the sema.check.check module, and a logger is set up for debugging purposes. This is a good practice for unit testing.


1-67: Overall, good test coverage with room for improvements.

This test file provides good coverage for the sema.check.check module. The tests are well-structured and cover the main functionalities. However, there are several opportunities for improvement:

  1. Correct the typo in the test_instantiatie_impl_checks function name.
  2. Remove the duplicate assertion in test_load_yaml_file.
  3. Refactor test_instantiate_unknown_type to use pytest.raises().
  4. Enhance test_run_tests with more comprehensive checks.

Implementing these suggestions will result in more robust and maintainable tests.

🧰 Tools
🪛 Ruff

55-55: Assertion always fails, replace with pytest.fail()

(PT015)


55-55: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)


57-57: Found assertion on exception e in except block, use pytest.raises() instead

(PT017)

tests/check/test_sinks.py (2)

1-7: LGTM: Well-organized import statements

The import statements are logically grouped and include all necessary modules and functions for the tests. Good job on following import best practices.


1-119: Overall excellent test coverage and structure

This test file provides comprehensive coverage for the CSV, HTML, and YAML writing functions. The tests are well-structured, following a consistent pattern of arranging test data, acting by calling the respective write function, and asserting the results.

Key strengths:

  1. Thorough testing of both successful and failed scenarios
  2. Appropriate use of the CheckResult class for test data
  3. Comprehensive assertions covering all relevant fields

The minor suggestions provided earlier will further improve the already high-quality test suite. Great job on creating these tests!

🧰 Tools
🪛 Ruff

84-84: Missing return type annotation for public function test_write_yml

Add return type annotation: None

(ANN201)

sema/check/README.md (2)

1-155: Excellent README for the Sema Check Module

This README provides a comprehensive and well-structured overview of the Sema Check Module. It effectively covers test configuration, base class structure, and process flow, including helpful YAML examples, Python code snippets, and a clear Mermaid flowchart.

The content is informative and easy to follow, with only minor suggestions for improvement in formatting and consistency. Great job on creating a thorough and user-friendly documentation!

🧰 Tools
🪛 LanguageTool

[uncategorized] ~24-~24: Loose punctuation mark.
Context: ...t with the following structure: - url: The target URL for the test (required) ...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~26-~26: Loose punctuation mark.
Context: ... the implemented test types. - options: A map of key-value pairs for additional...

(UNLIKELY_OPENING_PUNCTUATION)


[grammar] ~100-~100: This phrase is duplicated. You should probably use “the CLI” only once.
Context: ...he sema-check process: 1. User invokes the CLI 2. The CLI module parses arguments and initiates t...

(PHRASE_REPETITION)


[uncategorized] ~104-~104: The preposition ‘in’ seems more likely in this position.
Context: ...lts are collected 5. Results are output to various formats (CSV, HTML, YML) using ...

(AI_HYDRA_LEO_REPLACE_TO_IN)


22-26: 🧹 Nitpick (assertive)

Fix loose punctuation marks

There are some loose punctuation marks in the list structure.

Apply these changes to fix the issue:

-The YAML configuration for tests defines the parameters for each test to be executed. Each test is represented by a YAML object with the following structure:
+The YAML configuration for tests defines the parameters for each test to be executed. Each test is represented by a YAML object with the following structure:

-- `url`: The target URL for the test (required)
-- `type`: The type of test to be performed (required). Possible values depend on the implemented test types.
-- `options`: A map of key-value pairs for additional test parameters (optional)
+- `url`: The target URL for the test (required)
+- `type`: The type of test to be performed (required). Possible values depend on the implemented test types.
+- `options`: A map of key-value pairs for additional test parameters (optional)

Likely invalid or redundant comment.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~24-~24: Loose punctuation mark.
Context: ...t with the following structure: - url: The target URL for the test (required) ...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~26-~26: Loose punctuation mark.
Context: ... the implemented test types. - options: A map of key-value pairs for additional...

(UNLIKELY_OPENING_PUNCTUATION)

sema/check/__main__.py (1)

46-46: ⚠️ Potential issue

Add type annotation for *args_list in _main function

Adding type annotations enhances code readability and helps with static type checking. Please specify the expected type for *args_list.

Apply this diff to add the type annotation:

-def _main(*args_list) -> bool:
+def _main(*args_list: str) -> bool:
🧰 Tools
🪛 Ruff

46-46: Missing type annotation for *args_list

(ANN002)

sema/check/sinks/html_sink.py (1)

8-80: LGTM!

The write_html function is well-implemented and effectively generates the HTML report for the test results.

🧰 Tools
🪛 Ruff

57-57: Replace aliased errors with OSError

Replace IOError with builtin OSError

(UP024)

Comment on lines +9 to +16
__all__ = [
"run_tests",
"CheckBase",
"Check",
"write_csv",
"write_html",
"write_yml",
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Consider reordering __all__ list for consistency with imports.

While the __all__ list correctly includes all imported components, its order doesn't match the order of imports. For improved readability and consistency, consider reordering the items in the __all__ list to match the order of imports.

Here's a suggested reordering:

 __all__ = [
-    "run_tests",
-    "CheckBase",
-    "Check",
+    "CheckBase",
+    "Check",
+    "run_tests",
     "write_csv",
     "write_html",
     "write_yml",
 ]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
__all__ = [
"run_tests",
"CheckBase",
"Check",
"write_csv",
"write_html",
"write_yml",
]
__all__ = [
"CheckBase",
"Check",
"run_tests",
"write_csv",
"write_html",
"write_yml",
]

Comment on lines +7 to +14
@dataclass
class CheckResult:
success: bool
error: bool
message: str
url: str
type_test: str

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

LGTM with a suggestion: Consider simplifying boolean flags

The CheckResult dataclass is well-structured and uses appropriate types for its attributes. However, consider simplifying the boolean flags:

  • Having both success and error as separate boolean flags might lead to inconsistent states (e.g., both True or both False).
  • You could use a single status attribute instead, which could be an enum with values like SUCCESS, ERROR, and possibly PENDING or SKIPPED.

Would you like me to provide an example of how to refactor this using an enum?

Comment on lines +16 to +28
class CheckBase(ABC):
def __init__(self, url: str, type_test: str, options: dict):
self.url = url
self.type = type_test
self.options = options
self.result = CheckResult(
success=False,
error=False,
message="",
url=url,
type_test=type_test,
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

LGTM with a suggestion: Reconsider default CheckResult

The CheckBase class is well-structured as an abstract base class. However, consider the following suggestion:

  • The default CheckResult in __init__ sets success=False. This might not be ideal as it assumes failure by default. Consider using None or a separate state like PENDING to indicate that the check hasn't been run yet.

Would you like an example of how to implement this suggestion?

🧰 Tools
🪛 Ruff

17-17: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


17-17: Missing type annotation for self in method

(ANN101)

Comment on lines +29 to +31
@abstractmethod
def run(self) -> CheckResult:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Add a docstring to the run method

The run method lacks a docstring. Adding one would improve the clarity of the interface for implementers. For example:

@abstractmethod
def run(self) -> CheckResult:
    """
    Execute the check and return the result.

    This method should be implemented by subclasses to perform the actual check.

    Returns:
        CheckResult: The result of the check.
    """
    pass
🧰 Tools
🪛 Ruff

30-30: Missing type annotation for self in method

(ANN101)



class CheckBase(ABC):
def __init__(self, url: str, type_test: str, options: dict):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Add return type annotation to init method

To improve type hinting, add a return type annotation to the __init__ method:

def __init__(self, url: str, type_test: str, options: dict) -> None:

This addresses the Ruff hint ANN204.

🧰 Tools
🪛 Ruff

17-17: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


17-17: Missing type annotation for self in method

(ANN101)

log = logging.getLogger(__name__)


def load_yaml_files(input_folder):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add type annotations to the load_yaml_files function.

The function load_yaml_files lacks type annotations for its parameter input_folder and its return type. Adding type annotations improves code readability and assists with static analysis and type checking.

Apply this diff to add type annotations:

-def load_yaml_files(input_folder):
+def load_yaml_files(input_folder: str) -> List[Dict[str, Any]]:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def load_yaml_files(input_folder):
def load_yaml_files(input_folder: str) -> List[Dict[str, Any]]:
```
Note: To use this suggestion, you'll need to ensure that the necessary imports are present at the top of the file, such as:
```python
from typing import List, Dict, Any
🧰 Tools
🪛 Ruff

17-17: Missing return type annotation for public function load_yaml_files

(ANN201)


17-17: Missing type annotation for function argument input_folder

(ANN001)

Comment on lines +55 to +56
else:
raise ValueError(f"Unknown test type: {test_type}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Remove unnecessary else after return statement.

Since the if block ends with a return, the subsequent else block is unnecessary. Removing it simplifies the code and improves readability.

Apply this diff:

-    else:
-        raise ValueError(f"Unknown test type: {test_type}")
+    raise ValueError(f"Unknown test type: {test_type}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
else:
raise ValueError(f"Unknown test type: {test_type}")
raise ValueError(f"Unknown test type: {test_type}")
🧰 Tools
🪛 Ruff

55-55: Unnecessary else after return statement

Remove unnecessary else

(RET505)


56-56: Avoid specifying long messages outside the exception class

(TRY003)

yaml_files = [
f
for f in os.listdir(input_folder)
if f.endswith(".yaml") or f.endswith(".yml")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Simplify endswith by using a tuple of suffixes.

Instead of checking each file extension separately, you can pass a tuple of suffixes to endswith for more concise and efficient code.

Apply this diff:

-        if f.endswith(".yaml") or f.endswith(".yml"):
+        if f.endswith((".yaml", ".yml")):
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if f.endswith(".yaml") or f.endswith(".yml")
if f.endswith((".yaml", ".yml")):
🧰 Tools
🪛 Ruff

22-22: Call endswith once with a tuple

Merge into a single endswith call

(PIE810)

result = test.run()
results.append(result)
except Exception as e:
log.exception(f"Error running test {i}/{len(rules)}: {e}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Remove redundant exception information in log.exception call.

When using log.exception(), the exception information is automatically included in the log message and traceback. Including the exception object e in the message is redundant.

Apply this diff:

-            log.exception(f"Error running test {i}/{len(rules)}: {e}")
+            log.exception(f"Error running test {i}/{len(rules)}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
log.exception(f"Error running test {i}/{len(rules)}: {e}")
log.exception(f"Error running test {i}/{len(rules)}")
🧰 Tools
🪛 Ruff

72-72: Logging statement uses f-string

(G004)


72-72: Redundant exception object included in logging.exception call

(TRY401)

Comment on lines +36 to +39
except Exception:
log.error(f"Error parsing {file}")
except Exception:
log.exception(f"Error loading {file}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Avoid catching broad exceptions without specifying the exception type.

Catching Exception can mask unexpected errors and make debugging difficult. It's better to catch specific exceptions that you anticipate, such as yaml.YAMLError for YAML parsing errors or OSError for file-related errors.

Consider updating the except clauses to catch specific exceptions:

-            except Exception:
+            except yaml.YAMLError:

And:

-        except Exception:
+        except OSError:

Committable suggestion was skipped due to low confidence.

🧰 Tools
🪛 Ruff

36-36: Do not catch blind exception: Exception

(BLE001)


37-37: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


37-37: Logging statement uses f-string

(G004)


39-39: Logging statement uses f-string

(G004)

@cedricdcc cedricdcc requested a review from mpo-vliz October 22, 2024 11:43
@cedricdcc cedricdcc marked this pull request as ready for review October 22, 2024 11:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants