Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scripts to convert tests from old format to new format and vice-versa #97

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

srikrishnak
Copy link

No description provided.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New and old are ephemeral terms. Five years from now they'll both be old. One is YAML-based, the other is Substrait test-based (or antlr4 based). Perhaps use those as names instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this run on every PR?

return not diff


def compare_directories(original_dir, roundtrip_dir):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document that this compares either format to a Substrait test format directory.

intermediate_dir = f"{temp_dir}/substrait_cases"
roundtrip_dir = f"{temp_dir}/cases"
uri_prefix = (
"https://github.com/substrait-io/substrait/blob/main/extensions/substrait"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By referring to the GitHub directory this test will fail when the test cases and functions change breaking the CI here. Probably better to refer to the tests in the local submodule (when the tests are available).

"https://github.com/substrait-io/substrait/blob/main/extensions/substrait"
)

# Step 1: Convert from `../../cases` to `./temp/substrait_cases/`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're converting from initial cases, the variable may change especially if you end up pulling those locations from the command line (so CI can control where they are located).

if value_type != "str":
return None
pattern = r"^(?:(\d+)\s+days?,\s*)?(\d+):(\d+):(\d+)$"
return re.match(pattern, value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning a match object breaks the encapsulation -- usually the caller doesn't need to know how a function did its work. If the function later changes to not use regular expressions to parse the caller would need to be updated because the contract would have to change even though it would essentially be doing the same work -- breaking a time into parts.

Check if a string is in the format of 'X days, HH:MM:SS'.
Returns a match object if successful, or None if it doesn't match.
"""
if value_type != "str":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring says it matches strings -- should the caller be making that determination. If so, value_type isn't needed in the signature.


def format_timestamp_tz(timestamp_with_tz):
"""Convert a timestamp with timezone abbreviation to ISO 8601 with offset."""
if "-" in timestamp_with_tz and not has_last_colon_after_last_dash(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line appears to be wanting a function to test whether the provided timestamp has a particular property such as being a timestamp with an explicit timezone in it. Perhaps create a function for that purpose?

iso_interval = "".join(iso_parts)

# Determine whether it's a 'iyear' or 'iday' based on units
if iyear:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't hold for compound intervals, but we won't have those in YAML tests. Add a comment to state that we only will see these two types?

return f"{day_str}, {time_str}" if day_str else time_str


def convert_to_old_value(value, value_type, level=0):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

old is yaml?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants