-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add WRROC models, validators & unit tests #20
base: main
Are you sure you want to change the base?
Changes from 9 commits
e0fe7e2
92f4814
9eca261
069397f
3dc009a
7e85db7
ef290a5
0efb216
71b1143
9f5b887
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If I'm not mistaken, all the changes introduced here are linting/style-related changes. They are all fine, and I do think they should be made, but please not in this PR. Please create a separate PR for linting/style changes, where you can do reformatting changes across the entire codebase. I suggest you create that one ASAP, we quickly review and merge it into the default branch, and then you can merge the default branch into this branch to get rid of many of the suggested changes in this PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I’ll create a separate PR for the linting and style-related changes in |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,30 +1,37 @@ | ||
import click | ||
import json | ||
|
||
import click | ||
|
||
from crategen.converter_manager import ConverterManager | ||
|
||
|
||
@click.command() | ||
@click.option('--input', prompt='Input file', help='Path to the input JSON file.') | ||
@click.option('--output', prompt='Output file', help='Path to the output JSON file.') | ||
@click.option('--conversion-type', prompt='Conversion type', type=click.Choice(['tes-to-wrroc', 'wes-to-wrroc']), help='Type of conversion to perform.') | ||
@click.option("--input", prompt="Input file", help="Path to the input JSON file.") | ||
@click.option("--output", prompt="Output file", help="Path to the output JSON file.") | ||
@click.option( | ||
"--conversion-type", | ||
prompt="Conversion type", | ||
type=click.Choice(["tes-to-wrroc", "wes-to-wrroc"]), | ||
help="Type of conversion to perform.", | ||
) | ||
def cli(input, output, conversion_type): | ||
""" | ||
Command Line Interface for converting TES/WES to WRROC. | ||
""" | ||
"""Command Line Interface for converting TES/WES to WRROC.""" | ||
manager = ConverterManager() | ||
|
||
# Load input data from JSON file | ||
with open(input, 'r') as input_file: | ||
with open(input) as input_file: | ||
data = json.load(input_file) | ||
|
||
# Perform the conversion based on the specified type | ||
if conversion_type == 'tes-to-wrroc': | ||
if conversion_type == "tes-to-wrroc": | ||
result = manager.convert_tes_to_wrroc(data) | ||
elif conversion_type == 'wes-to-wrroc': | ||
elif conversion_type == "wes-to-wrroc": | ||
result = manager.convert_wes_to_wrroc(data) | ||
|
||
# Save the result to the output JSON file | ||
with open(output, 'w') as output_file: | ||
with open(output, "w") as output_file: | ||
json.dump(result, output_file, indent=4) | ||
|
||
if __name__ == '__main__': | ||
|
||
if __name__ == "__main__": | ||
cli() |
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Style/format change unrelated to this PR: See above There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I’ll move the style and formatting changes from |
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Mostly just style/format changes: See above Exception: The There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I’ll retain the |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,11 @@ | ||
from abc import ABC, abstractmethod | ||
|
||
|
||
class AbstractConverter(ABC): | ||
@abstractmethod | ||
def convert_to_wrroc(self, data): | ||
"""Convert data to WRROC format""" | ||
|
||
@abstractmethod | ||
def convert_from_wrroc(self, wrroc_data): | ||
def convert_from_wrroc(self, data): | ||
"""Convert WRROC data to the original format""" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,52 +1,113 @@ | ||
from datetime import datetime | ||
|
||
from pydantic import AnyUrl, ValidationError | ||
|
||
from ..models.tes_models import ( | ||
TESData, | ||
TESExecutor, | ||
TESInput, | ||
TESOutput, | ||
TESState, | ||
TESTaskLog, | ||
) | ||
from ..models.wrroc_models import WRROCDataTES | ||
from ..validators import validate_wrroc_tes | ||
from .abstract_converter import AbstractConverter | ||
uniqueg marked this conversation as resolved.
Show resolved
Hide resolved
|
||
from .utils import convert_to_iso8601 | ||
|
||
|
||
class TESConverter(AbstractConverter): | ||
def convert_to_wrroc(self, data: dict) -> dict: | ||
""" | ||
Convert TES data to WRROC format. | ||
|
||
Args: | ||
data: The input TES data. | ||
|
||
Returns: | ||
The converted WRROC data. | ||
|
||
Raises: | ||
ValidationError: If TES data is invalid. | ||
""" | ||
# Validate TES data | ||
uniqueg marked this conversation as resolved.
Show resolved
Hide resolved
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please remove all comments from the code in this entire module, I think the code is trivial enough to understand without comments (and the comments are distracting and sometimes misleading). Please also remove all trivial comments from any other modules. I would say that this should be all or almost all comments. There are no complex algorithms in the project and no workarounds for bugs in other libraries etc so far (as far as I remember). If you feel that your code is too complex to understand just by reading it and the associated docstring, consider refactoring it to make it simpler. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I’ll remove the comments from |
||
try: | ||
data_tes = TESData(**data) | ||
except ValidationError as e: | ||
raise ValueError(f"Invalid TES data: {e.errors()}") from e | ||
|
||
executors = data_tes.executors | ||
end_time = data_tes.logs[0].end_time if data_tes.logs else None | ||
|
||
def convert_to_wrroc(self, tes_data): | ||
# Validate and extract data with defaults | ||
id = tes_data.get("id", "") | ||
name = tes_data.get("name", "") | ||
description = tes_data.get("description", "") | ||
executors = tes_data.get("executors", [{}]) | ||
inputs = tes_data.get("inputs", []) | ||
outputs = tes_data.get("outputs", []) | ||
creation_time = tes_data.get("creation_time", "") | ||
end_time = tes_data.get("logs", [{}])[0].get("end_time", "") # Corrected to fetch from logs | ||
|
||
# Convert to WRROC | ||
wrroc_data = { | ||
"@id": id, | ||
"name": name, | ||
"description": description, | ||
"instrument": executors[0].get("image", None) if executors else None, | ||
"object": [{"@id": input.get("url", ""), "name": input.get("path", "")} for input in inputs], | ||
"result": [{"@id": output.get("url", ""), "name": output.get("path", "")} for output in outputs], | ||
"startTime": convert_to_iso8601(creation_time), | ||
"endTime": convert_to_iso8601(end_time), | ||
"@id": data_tes.id, | ||
"name": data_tes.name, | ||
"description": data_tes.description, | ||
"instrument": executors[0].image if executors else None, | ||
uniqueg marked this conversation as resolved.
Show resolved
Hide resolved
uniqueg marked this conversation as resolved.
Show resolved
Hide resolved
|
||
"object": [ | ||
{"@id": input.url, "name": input.path} for input in data_tes.inputs | ||
], | ||
"result": [ | ||
{"@id": output.url, "name": output.path} for output in data_tes.outputs | ||
], | ||
"startTime": data_tes.creation_time, | ||
"endTime": end_time, | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Don't you want to validate your There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch! I’ll add a validation step for |
||
|
||
validate_wrroc_tes(wrroc_data) | ||
return wrroc_data | ||
|
||
def convert_from_wrroc(self, wrroc_data): | ||
# Validate and extract data with defaults | ||
id = wrroc_data.get("@id", "") | ||
name = wrroc_data.get("name", "") | ||
description = wrroc_data.get("description", "") | ||
instrument = wrroc_data.get("instrument", "") | ||
object_data = wrroc_data.get("object", []) | ||
result_data = wrroc_data.get("result", []) | ||
start_time = wrroc_data.get("startTime", "") | ||
end_time = wrroc_data.get("endTime", "") | ||
|
||
# Convert from WRROC to TES | ||
tes_data = { | ||
"id": id, | ||
"name": name, | ||
"description": description, | ||
"executors": [{"image": instrument}], | ||
"inputs": [{"url": obj.get("@id", ""), "path": obj.get("name", "")} for obj in object_data], | ||
"outputs": [{"url": res.get("@id", ""), "path": res.get("name", "")} for res in result_data], | ||
"creation_time": start_time, | ||
"logs": [{"end_time": end_time}], # Added to logs | ||
} | ||
return tes_data | ||
def convert_from_wrroc(self, data: dict) -> dict: | ||
""" | ||
Convert WRROC data to TES format. | ||
|
||
Args: | ||
data: The input WRROC data. | ||
|
||
Returns: | ||
The converted TES data. | ||
|
||
Raises: | ||
ValidationError: If WRROC data is invalid. | ||
""" | ||
# Validate WRROC data | ||
try: | ||
data_wrroc = WRROCDataTES(**data) | ||
except ValidationError as e: | ||
raise ValueError(f"Invalid WRROC data: {e.errors()}") from e | ||
|
||
# Convert URL strings to AnyUrl | ||
tes_inputs = [TESInput(url=AnyUrl(url=obj.id), path=obj.name) for obj in data_wrroc.object] | ||
tes_outputs = [TESOutput(url=AnyUrl(url=res.id), path=res.name) for res in data_wrroc.result] | ||
|
||
# Ensure 'image' and 'command' fields are provided | ||
tes_executors = [TESExecutor(image=data_wrroc.instrument or "", command=[])] # Provide default empty list for command | ||
|
||
# Ensure correct type for end_time (datetime) | ||
end_time = datetime.fromisoformat(data_wrroc.endTime) if data_wrroc.endTime else None | ||
|
||
tes_logs = [ | ||
TESTaskLog( | ||
logs=[], | ||
metadata=None, | ||
start_time=None, | ||
end_time=end_time, | ||
outputs=[], | ||
system_logs=None | ||
) | ||
] | ||
|
||
tes_data = TESData( | ||
id=data_wrroc.id, | ||
name=data_wrroc.name, | ||
description=data_wrroc.description, | ||
executors=tes_executors, | ||
inputs=tes_inputs, | ||
outputs=tes_outputs, | ||
creation_time=None, | ||
logs=tes_logs, | ||
state=TESState.UNKNOWN | ||
) | ||
|
||
# Validate TES data before returning | ||
tes_data = TESData(**tes_data.dict()) | ||
return tes_data.dict() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine, but with the exception of the last couple of lines, the changes here are not related to this PR. And given that the PR is already huge and difficult to review, I would strongly suggest not to blow it up with anything that isn't directly related to its core focus.
My suggestion:
ci: reformat, reset branches & upgrade actions
(or similar) from a branch created from the default branch that introduces all of the changes introduced here, except for adding the job for running the unit tests (however, do not include the commented stuff, just remove it for now). We can then merge this PR immediately and you can merge the changes into this branch.In this way, we already merge some unrelated changes from this PR and make it a bit smaller.
I will probably suggest some more such points, because this PR is dragging along waaay too long, which often happens when there are too many different things being cobbled together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ll move the changes related to .github/workflows/ci.yml into a new PR, as suggested. I’ll only include the job for running the unit tests in this current PR. This should make the current PR smaller and easier to review.